1. Changelog
1.1. Revision 5 - February 2nd, 2022
- WG14 Homework! Changes between N2724 and this revision are in a special purple color.
- The _Atomic T vs. _Atomic(T) change was dropped and both are considered semantically the same.
- Use the term "variably modified type" in the right place instead, which is the decision from an e-mail discussion. Jens Gustedt’s paper about VLA/VMT evaluation (N2838, "Types and Sizes") was not accepted, so the wording should stay as "variably modified type".
1.2. Revision 4 - January 1st, 2022
- Add a small fix to specify "variable length array" for evaluation rather than "variably modified types": not all variably modified types need evaluation.
1.3. Revision 3 - May 15th, 2021
- Make sure we mention the old C99 Rationale and Nick Stoughton’s previous evaluation of typeof in the Appendix.
- Added final direction based on the March 2021 Virtual Standard Meeting’s vote. The numbers listed are in the form Yes / No / Abstain to the given question / option.
Keyword Options:
- Use _Typeof keyword, with <stdtypeof.h> header. 6/7/5
- Use typeof keyword, no header. 16/2/1
- Use some other spelling (qualified_typeof, or similar). 1/14/3
This was very strong direction to use the keywords directly, and not use an alternate spelling.
On the subject of using expressions / types within typeof / remove_quals:
- typeof with type names going in, in addition to expressions (voting "No" means no type names, just expressions). 17/1/4
- remove_quals applied to expressions, in addition to type names (voting "No" means no expressions are allowed). 11/2/5
This was very strong direction to allow both types and expressions in both constructs.
1.4. Revision 2 - March 7th, 2021
- Focus on remove_quals spelling.
- Give equal choice in keyword token for remove_quals (to match the other declarations).
- Fix up some of the section talking about macro-generic facilities for later.
1.5. Revision 1 - December 5th, 2020
- Completely reformulate paper based on community, GCC, and LLVM implementation feedback.
- Address major implementation contention of qualifiers with both _Typeof (or appropriate flavor) and _Remove_quals.
- Note that variably modified types are their own special nightmare.
- Add section about not using C++'s decltype identifier for this and other compatibility issues.
- Completely rewrite the wording section.
1.6. Revision 0 - October 25th, 2020
- Initial release.
2. Introduction & Motivation
typeof is an extension featured in many implementations of the C standard to get the type of an expression. It works similarly to sizeof, which processes the expression in an "unevaluated context" to determine its final type and thus produce a size. typeof stops before producing a byte size and instead just yields a type name, usable in all the places a type can currently appear in the C grammar.
There are many uses for typeof that have come up over the intervening decades since its first introduction in a few compilers, most notably GCC. It can, for example, help produce a type-safe generic printing function that even has room for user extension (see example implementation). It can also help write code that uses the expansion of a macro expression as the return type for a function, or be used within a macro itself to correctly cast to the type of a specific computation’s result (for width and precision purposes). The use cases are numerous, and many people have been locking themselves into implementation-specific vendor extensions. This keeps their code out of other compilers (for example, Microsoft’s Visual C Compiler) and weakens the ISO C ecosystem overall.
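As a minimal sketch of the macro use cases described above, the following relies on the existing GCC/Clang extension spelling __typeof__ (an assumption, chosen so the code compiles with today's compilers; it is not the keyword proposed here). The names SWAP_VALUES, CAST_LIKE, swap_demo, and cast_demo are hypothetical and exist only for illustration.

```c
#include <assert.h>

/* Swap two lvalues of the same type; the temporary's type is deduced from `a`.
   __typeof__ is the GCC/Clang extension spelling, assumed to be available. */
#define SWAP_VALUES(a, b)               \
    do {                                \
        __typeof__(a) swap_tmp_ = (a);  \
        (a) = (b);                      \
        (b) = swap_tmp_;                \
    } while (0)

/* Cast `value` to whatever type `target` has, without naming that type. */
#define CAST_LIKE(target, value) ((__typeof__(target))(value))

/* Swaps two doubles and returns the new value of the first one. */
double swap_demo(void) {
    double first = 1.5;
    double second = 2.5;
    SWAP_VALUES(first, second);
    assert(second == 1.5);
    return first;
}

/* Narrows a double computation to the width of an existing int object. */
int cast_demo(void) {
    int narrow = 0;
    narrow = CAST_LIKE(narrow, 3.9 + 4.0);
    return narrow;
}
```

Neither macro needs to be told the operand type, which is what makes them reusable across int, double, pointer, and structure operands alike.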
3. Implementation & Existing Practice
Every implementation in existence since C89 has an implementation of typeof. Some compilers (GCC, Clang, EDG, tcc, and many, many more) expose this with the implementation extension typeof. But the Standard already requires typeof to exist. Notably, with emphasis (not found in the standard) added:
The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type. The size is determined from the type of the operand. -- [N2596, Programming Languages C - Working Draft, §6.5.3.4 The sizeof and _Alignof operators, Semantics](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2596.pdf)
Any implementation that can process sizeof is already doing typeof internally. This feature is the most "existing practice"-iest feature to be proposed to the C Standard, possibly in the entire history of the C standard. The feature was also mentioned in an "extension round up" paper that went over the state of C Extensions in 2007[^N1229]. typeof was also considered an important extension during the discussion of that paper, but nobody previously brought forth a paper to make it a reality.
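The relationship between sizeof and typeof can be sketched as follows, assuming a compiler that provides the GCC/Clang __typeof__ extension. The function names next_ticket and unevaluated_operand_demo are hypothetical. In both operators the compiler only needs the operand's type, so the call in the operand is never made.

```c
#include <assert.h>
#include <stddef.h>

static int call_count = 0;

/* Has a visible side effect so that evaluation can be detected. */
long next_ticket(void) {
    call_count += 1;
    return call_count;
}

/* Returns how many times next_ticket() actually ran. */
int unevaluated_operand_demo(void) {
    size_t from_expression = sizeof(next_ticket());        /* type found, call not made */
    size_t from_type = sizeof(__typeof__(next_ticket()));  /* same type, named explicitly */
    __typeof__(next_ticket()) ticket = 0;                  /* declares a long */

    assert(from_expression == from_type);
    assert(sizeof ticket == sizeof(long));
    return call_count;
}
```

The counter stays at zero because none of the three operands is a variably modified type, so none of them is evaluated.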
3.1. Corner cases: Variably Modified Types and VLAs
Passing a normal or VLA type to typeof results in an idempotent type computation that simply yields that type in most implementations that support the feature. If the compiler supports Variable Length Arrays, then typeof -- if it is similar to GCC, Clang, tcc, and others -- is already supported with these semantics. These semantics also match how sizeof would behave (computing the expression or having an internal placeholder "VLA" type), so we propagate that same ability in an identical manner.
Notably, this is how current implementations evaluate the semantics as well. The standard states that whether or not any computation for Variably Modified Types -- side effects included -- is actually performed is unspecified, so there are no additional guarantees about the evaluation for such types.
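A small sketch of the VLA case, assuming a compiler with both Variable Length Array support and the GCC/Clang __typeof__ extension; the function name vla_twin_size is hypothetical. It deliberately avoids side effects in the operand, since their evaluation is the part the standard leaves unspecified.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the size in bytes of an array whose type is copied from a VLA. */
size_t vla_twin_size(int n) {
    assert(n >= 0);
    char original[n + 3];        /* variable length array */
    __typeof__(original) twin;   /* same variably modified type: char[n + 3] */
    return sizeof(twin);         /* computed at execution time */
}
```

Calling vla_twin_size(10) yields 13, mirroring how sizeof already treats the same array.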
3.2. Taking both expressions and types
The goal was to be compatible with sizeof, which takes both expressions and types. Existing typeof extensions also make this design choice. We see this as a good thing, since it is compatible with the usage of typeof extensions in existing macros and code, where programmers occasionally pass type names directly into these macros with the foreknowledge that they will be used exclusively in sizeof or typeof operations.
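The kind of macro described above might look like the following sketch, which assumes the GCC/Clang __typeof__ extension. NEW_ARRAY and new_array_demo are hypothetical names. Because the argument is only ever used inside sizeof and __typeof__, callers can pass either a type name or an expression.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocates `count` zeroed elements; the argument may be a type or an expression. */
#define NEW_ARRAY(type_or_expr, count) \
    ((__typeof__(type_or_expr) *)calloc((count), sizeof(type_or_expr)))

/* Returns 1 if both spellings produced usable, correctly typed arrays. */
int new_array_demo(void) {
    double sample = 0.0;
    int *from_type = NEW_ARRAY(int, 4);        /* argument is a type name */
    double *from_expr = NEW_ARRAY(sample, 4);  /* argument is an expression */
    int ok = from_type != NULL && from_expr != NULL;

    if (ok) {
        from_type[3] = 7;
        from_expr[3] = 0.5;
        ok = from_type[3] == 7 && from_expr[3] == 0.5;
    }
    free(from_type);
    free(from_expr);
    return ok;
}
```

If typeof accepted only expressions, the NEW_ARRAY(int, 4) call would stop compiling even though the sizeof half of the macro accepts it.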
3.3. Why not "decltype"?
C++ has a feature it calls decltype, which serves most of the same purpose. "Most" is because it has a subtle difference which would wreak havoc on C code if it was employed in shared header code:
    int value = 20;
    #define GET_TARGET_VALUE (value)
    inline decltype(GET_TARGET_VALUE) g() {
        return value;
    }
    int main() {
        int& r = g();
        return r;
    }
The return type of g would be int& in C++, and int in C. Other expressions, such as array indexing and pointer dereferencing, also have this same issue. This is due to the parentheses in the expression. Macros in both languages frequently see extra parentheses employed around expressions to prevent mixing of precedence or other shenanigans from token-based macro expansion and subsequent language parsing; this would be a footgun of large proportions for C and C++ users, and create a divergence in standard use that would rise to the level of a liaison issue that may become unfixable. This is also part of the reason why decltype was given that keyword in C++, and not typeof: they did not want this kind of subtle and brutal change to afflict C and C++ code.
typeof does not have this problem because -- if a Sister Paper ever proposes it for C++ -- it will have identical behavior to std::remove_reference_t<decltype(T)>.
This was also addressed when C++ was itself trying to introduce decltype and competing with typeof in WG21 for C++.
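For contrast, here is a sketch of the same macro pattern in C, assuming the GCC/Clang __typeof__ extension and C11 _Generic. The names target_value, get_target_value, and parentheses_demo are hypothetical. The extra parentheses contributed by the macro do not change the type that is produced.

```c
#include <assert.h>

int target_value = 20;
#define GET_TARGET_VALUE (target_value)

/* The parenthesized operand still yields plain `int` as the return type. */
__typeof__(GET_TARGET_VALUE) get_target_value(void) {
    return target_value;
}

/* Returns 1 when both the parenthesized and the bare operand name `int`. */
int parentheses_demo(void) {
    int with_parens = _Generic((__typeof__((target_value)) *)0, int *: 1, default: 0);
    int without_parens = _Generic((__typeof__(target_value) *)0, int *: 1, default: 0);
    assert(get_target_value() == 20);
    return with_parens && without_parens;
}
```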
3.4. C++ Compatibility
A similar feature should be proposed in C++, albeit it will likely take the keyword name typeof rather than _Typeof. This paper intends to have a similar paper brought before the C++ Committee -- WG21 -- through its Liaison Study Group, if this paper is successful.
3.5. Qualifiers
There is some discussion about what happens with qualifiers, both standard and implementation-defined. For example, "Named Address Space" qualifiers are subject to issues with GCC’s typeof extension, as shown here. The intention of one of the GCC maintainers from that thread is:
Well, I think we should fix typeof to not retain the address space. It’s probably our implementation detail of having those in TYPE_QUALS that exposes the issue... — Richard Biener, GCC Maintainer, November 5th, 2020
There is also some disagreement between implementations about which qualifiers are worth keeping with respect to typeof. Therefore, typeof as proposed does not strip all qualifiers from the computed type result. The reason for this is that a user can add specifiers and qualifications to a type, but cannot take them away once they are part of the expression. For example, consider the specification of <complex.h>, which contains macro-provided constants like _Complex_I. These constants have the type const float _Complex: should all typeof(_Complex_I) expressions therefore result in a const float _Complex, or a float _Complex? What about other qualifiers? And so on, and so forth.
There is an argument to strip all type qualifiers (const, volatile, restrict, and _Atomic) from the final type expression because they can be added back by the programmer easily. However, the opposite is not true: once stripped, the original qualifiers cannot be recovered, and you cannot create macros where those qualifiers are taken in as parameters and re-applied. This does leave some room to be desired: some folk may want to deliberately propagate the const-ness, volatile-ness, or _Atomic-ness of an expression to its end users.
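The asymmetry between adding and removing qualifiers can be sketched as follows, assuming the GCC/Clang __typeof__ extension (which keeps qualifiers on lvalue operands) and C11 _Generic. The name qualifier_demo is hypothetical.

```c
#include <assert.h>

/* Returns how many of the two declared objects ended up const-qualified. */
int qualifier_demo(void) {
    const int limit = 10;
    int plain = 3;

    __typeof__(limit) kept = limit;         /* const survives the type computation */
    const __typeof__(plain) added = plain;  /* adding a qualifier is trivial */

    int kept_is_const = _Generic(&kept, const int *: 1, default: 0);
    int added_is_const = _Generic(&added, const int *: 1, default: 0);

    assert(kept == 10 && added == 3);
    return kept_is_const + added_is_const;
}
```

Nothing in this sketch can turn the type of kept back into a plain int without naming int explicitly, which is the gap a qualifier-removing operator is meant to fill.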
3.5.1. Qualifiers - The Solution
Originally, the idea of a qualifier-preserving typeof paired with a separate qualifier-stripping typeof was explored. This was a tempting direction but ultimately unsuitable, as it duplicated functionality with a slight caveat and did not have a targeted purpose. A much better set of names for the functionality is typeof and remove_quals. typeof is an all-qualifier-preserving type reproduction of the expression (or pass-through if a type is given). It suitably envelops the total space of existing practice. The only reason a second operator would exist is to... well, remove qualifiers. It only makes sense to just name it appropriately by using remove_quals as a keyword. The benefits of choosing this name are also clear:
- there are no search hits for remove_quals in searching the ACT database (catalogue of Debian/Fedora/etc. open source packages and their code) (December 5th, 2020); and,
- there are no search hits for remove_quals in the entirety of GitHub save for 4 instances of Python Code (December 5th, 2020).
This means that we need not entertain the idea of needing a header or some other choice and can simply directly name remove_quals as a keyword in the code instead, saving ourselves a massive debate about what should and should not be a keyword.
3.5.2. In General
Separately, we should consider a Macro Programming facility for C that can address larger questions. This paper strives to focus on the material gains from existing practice and the pitfalls of said existing practice. Therefore, this paper proposes only typeof and remove_quals.
After this paper is handled, further research should be given to handling qualifiers, function types, and arrays in Macros for generic programming. This paper focuses only on what we can find existing practice for.
4. Proposed Changes
The below changes are for adding the two keywords.
4.1. Proposed Wording
The following wording is relative to [N2596].
4.1.1. Modify §6.3.2.1 Lvalues, arrays, and function designators, paragraphs 3 and 4 with footnote 68:
Except when it is the operand of the sizeof, or typeof operators, or the unary & operator, or is a string literal used to initialize an array, an expression that has type "array of type" is converted to an expression with type "pointer to type" that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.
A function designator is an expression that has function type. Except when it is the operand of the sizeof operator, a typeof operator,69) or the unary & operator, a function designator with type "function returning type" is converted to an expression that has type "pointer to function returning type".
69) Because this conversion does not occur, the operand of the sizeof operator remains a function designator and violates the constraints in 6.5.3.4.
4.1.2. Add a keyword to the §6.4.1 Keywords:
_Thread_local
typeof
remove_quals
4.1.3. Modify §6.6 Constant expressions, paragraphs 6 and 8:
An integer constant expression125) shall have integer type and shall only have operands that are integer constants, enumeration constants, character constants, sizeof expressions whose results are integer constants, _Alignof expressions, and floating constants that are the immediate operands of casts. Cast operators in an integer constant expression shall only convert arithmetic types to integer types, except as part of an operand to the typeof operators, sizeof operator, or _Alignof operator.
...
An arithmetic constant expression shall have arithmetic type and shall only have operands that are integer constants, floating constants, enumeration constants, character constants, sizeof expressions whose results are integer constants, and _Alignof expressions. Cast operators in an arithmetic constant expression shall only convert arithmetic types to arithmetic types, except as part of an operand to the typeof operators, sizeof operator, or _Alignof operator.
4.1.4. Adjust the footnote 131) in §6.7.1 Storage-class specifiers:
131) Thus, the only operator that can be applied to an array declared with storage-class specifier register is sizeof and the typeof operators.
4.1.5. Adjust the Syntax grammar of §6.7.2 Type specifiers, the paragraph 2 list, and paragraph 4 Semantics:
type-specifier:
void
...
typedef-name
typeof-specifier
...
- enum specifier
- typedef name
- typeof specifier
Specifiers for structures, unions, enumerations, atomic types, and typeof specifiers are discussed in 6.7.2.1 through 6.7.2.5. Declarations of typedef names are discussed in 6.7.8. The characteristics of the other types are discussed in 6.2.5.
4.1.6. Adjust the footnote 133) in §6.7.2.1 Structure and union specifiers:
133) As specified in 6.7.2 above, if the actual type specifier used is int or a typedef-name defined as int, then it is implementation-defined whether the bit-field is signed or unsigned. This includes an int type specifier produced by the use of the typeof specifier (6.7.2.5).
4.1.7. Add a new §6.7.2.5 The Typeof specifiers:
§6.7.2.5 The Typeof specifiers
Syntax
typeof-specifier:
    typeof ( typeof-specifier-argument )
    remove_quals ( typeof-specifier-argument )
typeof-specifier-argument:
    expression
    type-name
The typeof and remove_quals tokens are collectively called the typeof operators.
Constraints
The typeof operators shall not be applied to an expression that designates a bit-field member.
Semantics
The typeof-specifier applies the typeof operators to an expression (6.5) or a type-name. If the typeof operators are applied to an expression, they yield the type-name representing the type of their operand110). Otherwise, they produce the type-name with any nested typeof-specifier evaluated111). If the type of the operand is a variably modified type, the operand is evaluated; otherwise, the operand is not evaluated.
All qualifiers (6.7.3) on the type from the result of a remove_quals operation are removed, including the _Atomic qualifier112). Otherwise, for typeof operations, all qualifiers are preserved.
110) When applied to a parameter declared to have array or function type, the typeof operator yields the adjusted (pointer) type (see 6.9.1).
111) If the typeof-specifier-argument is itself a typeof-specifier, the operand will be evaluated before evaluating the current typeof operation. This happens recursively until a typeof-specifier is no longer the operand.
112) _Atomic ( type-name ), with parentheses, is considered an _Atomic-qualified type.
4.1.8. Add the following examples to new §6.7.2.5 The Typeof specifier:
EXAMPLE 1 Type of an expression.
The following program:
    typeof(1 + 1) main() {
        return 0;
    }
is equivalent to this program:
    int main() {
        return 0;
    }
EXAMPLE 2 Types and qualifiers.
The following program:
    const _Atomic int purr = 0;
    const int meow = 1;
    const char* const mew[] = { "aardvark", "bluejay", "catte", };
    remove_quals(meow) main(int argc, char* argv[]) {
        remove_quals(purr) plain_purr;
        typeof(_Atomic typeof(meow)) atomic_meow;
        typeof(mew) mew_array;
        remove_quals(mew) mew2_array;
        return 0;
    }
is equivalent to this program:
    const _Atomic int purr = 0;
    const int meow = 1;
    const char* const mew[] = { "aardvark", "bluejay", "catte", };
    int main(int argc, char* argv[]) {
        int plain_purr;
        const _Atomic int atomic_meow;
        const char* const mew_array[3];
        const char* mew2_array[3];
        return 0;
    }
EXAMPLE 3 Equivalence of sizeof and typeof.
    int main(int argc, char* argv[]) {
        // this program has no constraint violations
        _Static_assert(sizeof(typeof('p')) == sizeof(int));
        _Static_assert(sizeof(typeof('p')) == sizeof('p'));
        _Static_assert(sizeof(typeof((char)'p')) == sizeof(char));
        _Static_assert(sizeof(typeof((char)'p')) == sizeof((char)'p'));
        _Static_assert(sizeof(typeof("meow")) == sizeof(char[5]));
        _Static_assert(sizeof(typeof("meow")) == sizeof("meow"));
        _Static_assert(sizeof(typeof(argc)) == sizeof(int));
        _Static_assert(sizeof(typeof(argc)) == sizeof(argc));
        _Static_assert(sizeof(typeof(argv)) == sizeof(char**));
        _Static_assert(sizeof(typeof(argv)) == sizeof(argv));
        _Static_assert(sizeof(remove_quals('p')) == sizeof(int));
        _Static_assert(sizeof(remove_quals('p')) == sizeof('p'));
        _Static_assert(sizeof(remove_quals((char)'p')) == sizeof(char));
        _Static_assert(sizeof(remove_quals((char)'p')) == sizeof((char)'p'));
        _Static_assert(sizeof(remove_quals("meow")) == sizeof(char[5]));
        _Static_assert(sizeof(remove_quals("meow")) == sizeof("meow"));
        _Static_assert(sizeof(remove_quals(argc)) == sizeof(int));
        _Static_assert(sizeof(remove_quals(argc)) == sizeof(argc));
        _Static_assert(sizeof(remove_quals(argv)) == sizeof(char**));
        _Static_assert(sizeof(remove_quals(argv)) == sizeof(argv));
        return 0;
    }
EXAMPLE 4 Nested typeof(...).
The following program:
    int main(int argc, char*[]) {
        float val = 6.0f;
        return (typeof(remove_quals(typeof(argc))))val;
    }
is equivalent to this program:
    int main(int argc, char*[]) {
        float val = 6.0f;
        return (int)val;
    }
EXAMPLE 5 Variable Length Arrays and typeof operators.
    #include <stddef.h>
    size_t vla_size(int n) {
        typedef char vla_type[n + 3];
        vla_type b; // variable length array
        return sizeof(
            remove_quals(b)
        ); // execution-time sizeof, translation-time typeof operation
    }
    int main() {
        return (int)vla_size(10); // vla_size returns 13
    }
EXAMPLE 6 Nested typeof operators, arrays, and pointers.
    int main() {
        typeof(typeof(const char*)[4]) y = {
            "a", "b", "c", "d"
        }; // 4-element array of "pointer to const char"
        return 0;
    }
EXAMPLE 7 Function types, pointer types, and array types.
    void f(int);
    typeof(f(5)) g(double x) { // g has type "void(double)"
        printf("value %g\n", x);
    }
    typeof(g)* h; // h has type "void(*)(double)"
    typeof(true ? g : NULL) k; // k has type "void(*)(double)"
    void j(double A[5], typeof(A)* B); // j has type "void(double*, double**)"
    extern typeof(double[]) D; // D has an incomplete type
    typeof(D) C = { 0.7, 99 }; // C has type "double[2]"
    typeof(D) D = { 5, 8.9, 0.1, 99 }; // D is now completed to "double[4]"
    typeof(D) E; // E has type "double[4]" from D’s completed type
4.1.9. Modify §6.7.3 Type qualifiers, paragraph 6:
If the same qualifier appears more than once in the same specifier-qualifier list or as declaration specifiers, either directly, via one or more typeof specifiers, or via one or more typedefs, the behavior is the same as if it appeared only once. If other qualifiers appear along with the _Atomic qualifier the resulting type is the so-qualified atomic type.
4.1.10. Modify §6.7.6.2 Array declarators, paragraph 5:
If the size is an expression that is not an integer constant expression: if it occurs in a declaration at function prototype scope, it is treated as if it were replaced by *; otherwise, each time it is evaluated it shall have a value greater than zero. The size of each instance of a variable length array type does not change during its lifetime. Where a size expression is part of the operand of a typeof or sizeof operator and changing the value of the size expression would not affect the result of the operator, it is unspecified whether or not the size expression is evaluated. Where a size expression is part of the operand of an _Alignof operator, that expression is not evaluated.
4.1.11. Modify §6.9 External definitions, paragraphs 3 and 5:
There shall be no more than one external definition for each identifier declared with internal linkage in a translation unit. Moreover, if an identifier declared with internal linkage is used in an expression there shall be exactly one external definition for the identifier in the translation unit, unless it is:
- part of the operand of a sizeof operator whose result is an integer constant;
- part of the operand of a _Alignof operator whose result is an integer constant;
- or, part of the operand of any typeof operator whose result is not a variably modified type.
...
An external definition is an external declaration that is also a definition of a function (other than an inline definition) or an object. If an identifier declared with external linkage is used in an expression (other than as a part of the operand of a typeof operator whose result is not a variably modified type, or a sizeof or _Alignof operator whose result is an integer constant), somewhere in the entire program there shall be exactly one external definition for the identifier; otherwise, there shall be no more than one.173)
5. Appendix
The following are old sections or references related to older parts of the proposal that have since been superseded, along with other interesting, but not critical, information.
5.1. Prior Art in Standardization
The C99 rationale states that:
A proposed typeof operator was rejected on the grounds of insufficient utility.
Times have since changed drastically: typeof became powerfully useful and proved itself. Therefore, we are happy to include it. Another paper closer to the release of C11/C17 also came out: [N1229], an omnibus that listed all of the different extensions and evaluated them. There, support was greater for typeof, but nobody came forward with a paper to follow up on Nick Stoughton’s work.
This paper closes the loop on the request that Nick Stoughton made in that analysis, as well as on many user requests over the intervening decade and more.
5.2. Keyword Name Ideas (from Revision 2)
There are 3 options for names. We have wording for the options using find-and-replace on the typeof keyword as well as the remove_quals keyword. The option that provides the most consensus will be what is chosen:
5.2.1. Option 1: _Typeof keyword, <stdtypeof.h> header
- _Typeof for the type of keyword
- remove_quals for the remove qualifications keyword
This is the relatively conservative option that uses a _Typeof keyword plus <stdtypeof.h> to get access to the convenient spelling. It prevents implementations that have already settled on the typeof keyword in their extension modes from having to warn users of breakage or deal with that problem. Many have raised issues with this, annoyed at the constant spelling of keywords in fundamentally awkward and strange ways while requiring headers to fix up usage. This is consistent with other new keywords introduced in the Standard to avoid breakage at all costs, but suffers from strong lamentations in needing a header to access a common spelling.
This is the authors' status quo and compromise position.
5.2.2. Option 2: typeof keyword
- typeof for the type of keyword
- remove_quals for the remove qualifications keyword
This is the relatively aggressive (but still milquetoast, overall) option. It takes over the extension that is used in non-conforming C modes in a few compilers, such as XL C and GCC. Maintainers/implementers from GCC and Clang have noted their approval for this option, but e.g. XL C maintainers and implementers are less enthused.
The reason some folks are against this change is that there are "bugs" in the implementation where some qualifiers are preserved, but other implementation-defined qualifiers are not. Most implementations agree that the standard qualifiers should be preserved (and the compiler that did not implement it this way acknowledged that it was, more or less, a mistake). There are also qualifiers that are dropped on some implementations for their vendor-specific extensions. An argument can be made that implementations can continue to do whatever they want with implementation-defined qualifiers as far as typeof is concerned, as long as they preserve the standard qualifiers.
This option is the authors' overwhelmingly strong preference.
5.2.3. Option 3: Use a completely new keyword spelling
This uses a completely novel name to avoid the problem altogether. These names take no interesting space from users or implementers and it is the safest option, though it risks obscurity in what is a commonly anticipated feature. Names for this include:
- qual_typeof + remove_quals
- qualified_typeof + remove_qualifiers
- typeof_qual + remove_quals
- typeof_qualified + remove_qualifiers
Choosing this option means picking one of these novel keywords and substituting it for the typeof spelling in the wording above (not applicable any longer).
This is the authors' least favorite option.