P3540R3
#embed offset parameter

Published Proposal,

Authors:
Latest:
https://thephd.dev/_vendor/future_cxx/papers/d3540.html
Paper Source:
GitHub ThePhD/future_cxx
Implementation:
GitHub ThePhD/embed
Project:
ISO/IEC 14882 Programming Languages — C++, ISO/IEC JTC1/SC22/WG21
Audience:
CWG

Abstract

An additional, user-requested #embed parameter, already implemented in Clang and GCC, for providing an offset into the embedded resource.

1. Changelog

1.1. Revision 3 - June 12th, 2026

1.2. Revision 2 - June 5th, 2025

1.3. Revision 1 - February 14th, 2025

1.4. Revision 0 - December 13th, 2024

2. Introduction and Motivation

The goal is to add the extremely popular and already-implemented gnu::offset and clang::offset parameters as a standard parameter. That is the only motivation of this proposal: to standardize existing practice.

Users asked for this parameter only after C23 was standardized. Because the request came at that very late stage, the parameter has to be added separately. This proposal aims to standardize what users have asked for, and what Clang and GCC have implemented.

3. Design

The design of offset(some-preprocessor-constant-value) is straightforward:

These are the only tenets of the design, and match the practice for existing implementations gnu::offset and clang::offset. It is also how the original author envisioned this when it was first PR’d to Clang, and the original tests (plus new ones) still pass in the LLVM/clang and gnu/gcc repositories.

There was some discussion on June 4th, 2025 about the order of parameters. That turned into a significant enough change that it is now a separate paper. No additional discussion or changes are proposed for this paper: it is exactly a standardization of existing practice and what was approved in Austria.

4. Wording

This wording is relative to C++'s latest working draft.

4.1. Intent

The intent of the wording is to provide a preprocessing directive that:

4.2. Feature Test Macro

The feature test macro __cpp_pp_embed should be increased from 202502L to ${PROPER_VALUE}, where PROPER_VALUE is determined by the Editors.

4.3. Proposed Language Wording

4.3.1. Add to the embed-standard-parameter production in §15.1 Preamble [cpp.pre] a new grammar alternative for offset

embed-standard-parameter:

limit ( pp-balanced-token-seq )

offset ( pp-balanced-token-seq )

prefix ( pp-balanced-token-seq_opt )

suffix ( pp-balanced-token-seq_opt )

if_empty ( pp-balanced-token-seq_opt )

4.3.2. Modify and add paragraphs in §15.4.1 General [cpp.embed.gen]

15.4.1 General [cpp.embed.gen]

...

...

A resource is a source of data accessible from the translation environment. A resource has an implementation-resource-width, which is the implementation-defined size in bits of the resource. If the implementation-resource-width is not an integral multiple of CHAR_BIT, the program is ill-formed. Let implementation-resource-count be implementation-resource-width divided by CHAR_BIT. Every resource has a resource-offset, which is

  • the value as computed from the optionally-provided offset embed-parameter ([cpp.embed.param.offset]), if present;

  • otherwise, 0.

Every resource also has a resource-count, which is

  • max ( min ( limit-value , implementation-resource-count - resource-offset ) , 0 ) if the limit embed-parameter ([cpp.embed.param.limit]) is present, where the value computed from the limit embed-parameter is limit-value (replacing: "the value as computed from the optionally-provided limit embed-parameter ([cpp.embed.param.limit]), if present");
  • otherwise, max ( implementation-resource-count - resource-offset , 0 ) (replacing: "the implementation-resource-count").

A resource is empty if the resource-count is zero.
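
As a non-normative illustration of the two rules above, the arithmetic can be sketched as a small constexpr function. The name resource_count and its parameter names are hypothetical and chosen only for this sketch; signed integers are assumed so that the subtraction can go negative before it is clamped to zero.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>

// Hypothetical helper (not part of the proposed wording) that mirrors the
// resource-count rules: subtract the offset, apply the optional limit,
// and clamp the result so it never drops below zero.
constexpr std::int64_t resource_count(
    std::int64_t implementation_resource_count,
    std::int64_t resource_offset = 0,
    std::optional<std::int64_t> limit_value = std::nullopt) {
  const std::int64_t remaining =
      implementation_resource_count - resource_offset;
  if (limit_value) {
    // limit present: max(min(limit-value, count - offset), 0)
    return std::max<std::int64_t>(std::min(*limit_value, remaining), 0);
  }
  // no limit: max(count - offset, 0)
  return std::max<std::int64_t>(remaining, 0);
}

static_assert(resource_count(4) == 4);         // no offset, no limit
static_assert(resource_count(4, 2) == 2);      // offset(2)
static_assert(resource_count(4, 1, 1) == 1);   // offset(1) limit(1)
static_assert(resource_count(4, 1, 10) == 3);  // limit larger than what remains
static_assert(resource_count(1, 1) == 0);      // offset consumes everything: empty
static_assert(resource_count(4, 9) == 0);      // offset past the end: still empty
```

Under this reading, an offset at or beyond the end of the resource simply yields an empty resource rather than an error.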

...

...

The integer literals in the comma-separated list correspond to resource-count consecutive calls to std::fgetc ([cstdio.syn]) from the resource, as a binary file. First, resource-offset calls to std::fgetc from the resource as a binary file have their result discarded and ignored. Next, resource-count consecutive calls to std::fgetc produce the elements of the comma-separated list of integer literals, in order. If any of the resource-count calls to std::fgetc returns EOF, the program is ill-formed (replacing: "If any call to std::fgetc returns EOF, the program is ill-formed.").
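
The discard-then-read model in this paragraph can be sketched, non-normatively, as follows. The byte_source type is a hypothetical in-memory stand-in for a resource opened as a binary file, and its get member only mimics the return convention of std::fgetc; expand and its use of std::nullopt to signal "ill-formed" are likewise assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>  // EOF
#include <optional>
#include <vector>

// Hypothetical in-memory stand-in for a resource read as a binary file.
struct byte_source {
  const unsigned char* data;
  std::size_t size;
  std::size_t pos = 0;

  // Mimics std::fgetc: the next byte as an int, or EOF once exhausted.
  int get() { return pos < size ? static_cast<int>(data[pos++]) : EOF; }
};

// Returns the elements of the comma-separated list, or std::nullopt in the
// case the wording describes as ill-formed.
inline std::optional<std::vector<int>> expand(byte_source src,
                                              std::size_t resource_offset,
                                              std::size_t resource_count) {
  // First: resource-offset reads whose results are discarded and ignored.
  // An EOF seen here is not treated as an error.
  for (std::size_t i = 0; i < resource_offset; ++i) {
    (void)src.get();
  }
  // Next: resource-count reads that produce the elements, in order.
  std::vector<int> elements;
  elements.reserve(resource_count);
  for (std::size_t i = 0; i < resource_count; ++i) {
    const int c = src.get();
    if (c == EOF) {
      return std::nullopt;  // one of the resource-count reads hit EOF
    }
    elements.push_back(c);
  }
  assert(elements.size() == resource_count);
  return elements;
}
```

For a four-byte source, expand(src, 2, 2) yields the last two bytes, while expand(src, 3, 2) reports the ill-formed case because the second counted read reaches the end of the data. In practice the resource-count formula clamps the count so that the second situation does not arise from offset alone.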

4.3.3. Add a new sub-clause §15.4.2.✨ under Resource Inclusion for Embed parameters for the new offset parameter [cpp.embed.param.offset]

15.4.2.✨ offset parameter [cpp.embed.param.offset]

An embed-parameter of the form offset ( pp-balanced-token-seq ) denotes the number of elements to be skipped from the resource. It shall appear at most once in the embed-parameter-seq.

The pp-balanced-token-seq is evaluated as a constant-expression using the rules as described in conditional inclusion ([cpp.cond]), but without being processed as in normal text an additional time.

The constant-expression shall be an integral constant expression whose value is greater than or equal to zero.

[Example:

constexpr const unsigned char arr[] = {
  // a hypothetical resource capable of expanding to
  // four or more elements
#embed <sdk/jump.wav>
};

constexpr const unsigned char offset_arr[] = {
  // the same hypothetical resource capable of expanding
  // to four or more elements
#embed <sdk/jump.wav> offset(2)
};

constexpr const unsigned char offset_limit_arr[] = {
  // the same hypothetical resource capable of expanding
  // to four or more elements
#embed <sdk/jump.wav> offset(1) limit(1)
};

static_assert(arr[2] == offset_arr[0]);
static_assert(arr[3] == offset_arr[1]);
static_assert(arr[1] == offset_limit_arr[0]);

end example]

4.4. Remove the following text from [cpp.embed.param.limit], ❡3

The constant-expression shall be an integral constant expression whose value is greater than or equal to zero. The resource-count ([cpp.embed.gen]) becomes implementation-resource-count, if the value of the constant-expression is greater than implementation-resource-count; otherwise, the value of the constant-expression.

4.5. Add a new example to the if_empty embed parameter [cpp.embed.param.if.empty] section

[Example: Given a resource <single_byte> that has an implementation-resource-count of 1, the following directive:

#embed <single_byte> offset(1) if_empty(42203)

is replaced with:

42203

end example]