There are some tricks, quirks and features which seems to throw even experienced developers off the track. Thus I periodically revisit this blogpost, to do a sloppy job of gathering them in an unordered list with some short explanations, examples, quotes, or references.
Obviously though, I'm not gonna list absolutely everything, for facts like
"function nan()
cannot set errno
as it shall behave like strtod()
in
certain cases" ain't so much fun.
Resources:
- Advanced C: The UB and optimizations that trick good programmers.
- C99 with Technical corrigenda TC1, TC2, and TC3 included
- Deep C (and C++) by Olve Maudal and Jon Jagger
- Hidden features of C - Stack Overflow
- Interesting ways to use C? : r/C_Programming
- Lesser known C features
- Let's Destroy C
- Mildly interesting quirks of C | Hacker News
- Rob's Programming Blog: How Well Do You Know C?
- Some dark corners of C
- The Preprocessor Iceberg Meme
- What your weirdest C feature? : r/C_Programming
- Array pointers
- Comma operator
- Digraphs, trigraphs and alternative tokens
- Designated initializer
- Compound literals
- Compound literals are lvalues
- Escaping shadowing
- Multi-character constants
- Bit fields
- 0 bit fields
volatile
type qualifierrestrict
type qualifierregister
type qualifier- Flexible array member
%n
format specifier%.*
(minimum field width) format specifier- Other less known format specifiers
- Interlacing syntactic constructs
-->
"operator"idx[arr]
- Negative array indexes
- Constant string concatenation
- Backslash line splicing
- Using
&&
and||
as conditionals - Compile time assumption checking using
enum
s - Ad hoc
struct
declaration in the return type of a function - "Nested"
struct
definition is not kept nested - Flat initializer lists
- Implicit casting of
void
pointers - Static array indices in function parameter declarations
- Macro Overloading by Argument List Length
typedef
syntax is like other specifiers- Function types
- The oddities of relationship between function designators and pointers
- X-Macros
- X-Files
- Named function parameters
- Combining default, named and positional arguments
- Abusing unions for grouping things into namespaces
- Unity builds
- Matching character classes with
sscanf()
- Garbage collector
- Cosmopolitan Libc
- Inline assembly
- Coroutines
- Evaluate
sizeof
at compile time by causing duplicate case error - Detecting constant expressions
- Object Oriented Programming
- Safe(ish) variadic functions
- Metaprogramming
- Preprocessor is a language of its own
- CCAN
- Function pointers to match arrays in
_Generic
Array pointers #
Decay-to-pointer makes regular pointers to array usually not needed:
int arr[10];
int *ap0 = arr; // array decay-to-pointer
// ap0[2] = ...
int (*ap1)[10] = &arr; // proper pointer to array
// (*ap1)[2] = ...
But ability to allocate a big multi-dimensional array on heap is nice:
int (*ap3)[90000][90000] = malloc(sizeof *ap3);
With pointers even VLA can find its use (more here):
int (*ap4)[n] = malloc(sizeof *ap4);
Comma operator #
The comma operator is used to separate two or more expressions that are included where only one expression is expected. When the set of expressions has to be evaluated for a value, only the right-most expression is considered.
For example: b = (a=3, a+2);
– this code would firstly assign value 3
to a
, and then a+2
would be assigned to variable b
. So, at the end,
b
would contain value 5 while variable a
would be 3.
On Wikipedia we can find few more examples.
Digraphs, trigraphs and alternative tokens #
C code may not be portable, but the language itself is probably more portable than any other; there are system using e.g. EBCDIC encoding instead of ASCII, to support them C has digraphs and trigraphs – multi-character sequences treated by the compiler as other characters.
Digraph | Trigraph | iso646.h | |||||
---|---|---|---|---|---|---|---|
<: |
[ |
??= |
# |
and |
&& |
||
:> |
] |
??( |
[ |
and_eq |
&= |
||
<% |
{ |
??/ |
\ |
bitand |
& |
||
%> |
} |
??) |
] |
bitor |
| |
||
%: |
# |
??' |
^ |
compl |
~ |
||
%:%: |
## |
??< |
{ |
not |
! |
||
——– | ———– | ??! |
| |
not_eq |
!= |
||
——– | ———– | ??> |
} |
or |
|| |
||
——– | ———– | ??- |
~ |
or_eq |
|= |
||
——– | ———– | ——– | ———– | xor |
^ |
||
——– | ———– | ——– | ———– | xor_eq |
^= |
- Mini-post: Digraphs and Trigraphs | ENOSUCHBLOG
- C alternative tokens - Wikipedia
- Why are there digraphs in C and C++? - Stack Overflow
- Purpose of Trigraph sequences in C++?
- A brief description of Normative Addendum 1
Designated initializer #
These allow you to specify which elements of an object (array, structure, union) are to be initialized by the values following. The order does not matter!
struct Foo {
int x, y;
const char *bar;
};
void f(void)
{
int arr[] = { 1, 2, [5] = 9, [9] = 5, [8] = 8 };
struct Foo f = { .y = 23, .bar = "barman", .x = -38 };
struct Foo arr[] = {
[10] = { 8, 8, 9 },
[8] = { 1, 8, bar3 },
[12] = { .x = 9, .z = 8 },
};
struct {
int sec, min, hour, day, mon, year;
} z = {
.day = 31, 12, 2014,
.sec = 30, 15, 17
}; // initializes z to { 30, 15, 17, 31, 12, 2014 }
}
- Designated Inits (Using the GNU Compiler Collection (GCC))
- Designated initializers for aggregate types (C only) - IBM Documentation
- What is a designated initializer in C? - Stack Overflow
Compound literals #
A compound literal looks like a cast of a brace-enclosed initializer list. Its value is an object of the type specified in the cast, containing the elements specified in the initializer.
#include <stdio.h>
struct Foo { int x, y; };
void bar(struct Foo p)
{
printf("%d, %d", p.x, p.y);
}
int main(void)
{
bar((struct Foo){2, 3});
return 0;
}
Compound literals are lvalues #
(struct Foo){};
((struct Foo){}).x = 4;
&(struct Foo){};
func(&(struct Foo){.x = 2});
Even if you already knew about compound literals, there's a high chance you've never consciously noticed them being lvalues. And it's important, because when a value is an lvalue, we can get its address (and e.g. pass it to function).
Escaping shadowing #
The following code will return 42, not 3840!
int x = 42;
int func() {
int x = 3840;
{
extern int x;
return x;
}
}
Multi-character constants #
They are implementation dependent and even the standard itself to usually
best avoid them. That being said, using them as self-documenting enum
s
can be quite handy when you may need to deal with raw memory dumps later on.
enum state {
waiting = 'WAIT',
running = 'RUN!',
stopped = 'STOP',
};
For example, on my machine I could localize 'WAIT'
like here:
00001120: c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 .ff...........@.
00001130: f3 0f 1e fa e9 67 ff ff ff 55 48 89 e5 48 83 ec .....g...UH..H..
00001140: 10 c7 45 fc 54 49 41 57 8b 45 fc 89 c6 48 8d 05 ..E.TIAW.E...H..
00001150: b0 0e 00 00 48 89 c7 b8 00 00 00 00 e8 cf fe ff ....H...........
00001160: ff b8 00 00 00 00 c9 c3 f3 0f 1e fa 48 83 ec 08 ............H...
Bit fields #
Declares a member with explicit width, in bits. Adjacent bit field members may be packed to share and straddle the individual bytes.
struct cat {
unsigned int legs : 3; // 3 bits for legs (0-4 fit in 3 bits)
unsigned int lives : 4; // 4 bits for lives (0-9 fit in 4 bits)
};
0 bit fields #
- What does an unnamed zero length bit-field mean in C? - Stack Overflow
- Practical Use of Zero-Length Bitfields - Stack Overflow
- IBM Documentation - IBM Documentation
- Bit fields - cppreference.com
Description from Arm Compiler 6 docs:
A zero-length bit-field can be used to make the following changes:
- Creates a boundary between any bit-fields before the zero-length bit-field and any bit-fields after the zero-length bit-field. Any bit-fields on opposite sides of the boundary are treated as non-overlapping memory locations. This has a consequence for C and C++ programs. The C and C++ standards require both load and store accesses to a bit-field on one side of the boundary to not access any bit-fields on the other side of the boundary.
- Insert padding to align any bit-fields after the zero-length bit-field to the next available natural boundary based on the type of the zero-length bit-field. For example,
char:0
can be used to align to the next available byte boundary, andint:0
can be used to align to the next available word boundary.
An example taken from the SO answer (with slight changes):
struct bar { unsigned char x : 5; unsigned short : 0; unsigned char y : 7; }
The above in memory would look like this (assuming 16-bit
short
, ignoring endian):char pad pad short boundary | | | | v v v v xxxxx000 00000000 yyyyyyy0
The zero-length bit field causes the position to move to next
short
boundary (or: be placed on the nearest natural alignment for the target platform). We definedshort
to be 16-bit, so 16 minus 5 gives 11 bits of padding.
volatile
type qualifier #
This qualifier tells the compiler that a variable may be accessed by other means than the current code (e.g. we are dealing with MMIO device), thus to not optimize away reads and writes to this resource.
- Why is volatile needed in C? - Stack Overflow
- Advanced C: The UB and optimizations that trick good programmers.
- volatile type qualifier - cppreference.com
- volatile (computer programming) - Wikipedia
- Volatile: Almost Useless for Multi-Threaded Programming
restrict
type qualifier #
By adding this type qualifier, a programmer hints to the compiler that for the lifetime of the pointer, no other pointer will be used to access the object to which it points. This allows the compiler to make optimizations (for example, vectorization) that would not otherwise have been possible.
- restrict - Wikipedia
- restrict type qualifier - cppreference.com
- The restrict type qualifier - IBM Documentation
register
type qualifier #
It suggests that the compiler stores a declared variable in a CPU register
(or some other faster location) instead of in random-access memory.
The location of a variable declared with this qualifier cannot be accessed
(but the sizeof
operator can be applied).
Nowadays register
is usually meaningless as modern compilers place variables
in a register if appropriate regardless of whether the hint is given. Sometimes
may it be useful on embedded systems, but even then compiler will probably
provide better optimizations.
Flexible array member #
From Wikipedia:
struct vectord {
short len; // there must be at least one other data member
double arr[]; // the flexible array member must be last
// The compiler may reserve extra padding space here,
// like it can between struct members.
};
struct vectord *vector = malloc(...);
vector->len = ...;
for (int i = 0; i < vector->len; ++i) {
vector->arr[i] = ...; // transparently uses the right type (double)
}
- flexible array member – Jens Gustedt's Blog
- Zero Length (Using the GNU Compiler Collection (GCC))
- The benefits and limitations of flexible array members | Red Hat Developer
%n
format specifier #
This StackOverflow answer presents it reasonably well:
%n
returns the current position of the imaginary cursor used whenprintf()
formats its output.int pos1, pos2; const char *str_of_unknown_len = "we don't care about the length of this"; printf("Write text of unknown %n(%s)%n length\n", &pos1, str_of_unknown_len, &pos2); printf("%*s\\%*s/\n", pos1, " ", pos2-pos1-2, " "); printf("%*s", pos1+1, " "); for (int i = pos1+1; i < pos2-1; ++i) { putc('-', stdout); } putc('\n', stdout);
will have following output
Write text of unknown (we don't care about the length of this) length \ / --------------------------------------
Granted a little bit contrived but can have some uses when making pretty reports.
%.*
(minimum field width) format specifier #
Instead of this:
char fmt_buf[MAX_BUF];
snprintf(fmt_buf, MAX_BUF, "%%.%df", prec);
printf(fmt_buf, num);
do this:
printf("%.*f", prec, num);
when you want to pad with variable number of characters.
Other less known format specifiers #
Have a look at §7.21.6.1
and §7.21.6.2
of the draft of C11 standard. You'll find %#
, %e
, %-
, %+
, %j
, %g
, %a
and few other interesting specifiers.
Interlacing syntactic constructs #
The following is syntactically correct C code:
#include <stdio.h>
int main()
{
int n = 3;
int i = 0;
switch (n % 2) {
case 0:
do {
++i;
case 1:
++i;
} while (--n > 0);
}
printf("%d\n", i); // 5
}
I know goto
phobic programmers using it like this:
switch (x) {
case 1:
// 1 specific code
if (0) {
case 2:
// 2 specific code
}
// common for 1 and 2
}
The most famous usage of this quirk/"feature" is Duff's device:
send(to, from, count)
register short *to, *from;
register count;
{
register n = (count + 7) / 8;
switch (count % 8) {
case 0: do { *to = *from++;
case 7: *to = *from++;
case 6: *to = *from++;
case 5: *to = *from++;
case 4: *to = *from++;
case 3: *to = *from++;
case 2: *to = *from++;
case 1: *to = *from++;
} while (--n > 0);
}
}
-->
"operator" #
The following is correct C code:
size_t n = 10;
while (n --> 0) {
printf("%d\n", n);
}
You may ask, since when C has such operator and the answer is: since never.
-->
is not an operator, but two separate operators --
and >
written
in a way they look like one. It's possible, because C cares less than more
about whitespace.
n --> 0
is equivalent of (n--) > 0
idx[arr]
#
Square brace notation of accessing array elements is a syntactic sugar for pointer arithmetics:
arr[5]
≡ *(arr + 5)
≡ *(5 + arr)
≡ 5[arr]
You absolutely must never use this in actual code… but it's hella fun otherwise!
// array[index]
boxes[products[myorder.product].box].weight;
// index[array]
myorder.product[products].box[boxes].weight;
Negative array indexes #
For quick and dirty debugging purposes I wanted to check if padding at the end of an array is filled with correct value, but I didn't know where the padding starts. Thus I did the following:
int *end = arr + (len - 1);
if (end[0] == VAL && end[-1] == VAL && end[-5] == VAL) {
puts("Correct padding");
}
Constant string concatenation #
You don't need sprintf()
(nor strcat()
!) to concatenate strings literals:
#define WORLD "World!"
const char *s = "Hello " WORLD "\n"
"It's a lovely day, "
"innit?";
Backslash line splicing #
Each instance of a backslash character \
immediately followed by a new-line
character is deleted, splicing physical source lines to form logical source lines.
#define I_AM_O\
NE_MACRO 123
// I am a comment. \
I'm stil the same comment. \
I'm a so-called ONE-LINE comment!
int fun()
{
if (drive == 2) // drive 2 is C:\
return 1; <-- my firend here is still part of a COMMENT!!
writestuff();
return 0;
}
int main()
{
int x = I_AM_ONE_MACRO; // correctly expands to 123
int same_\
variable = 1;
same_variable = 1;
const char *p = "String with\
so many spaces in the MIDDLE!";
puts(p); // "String with so many spaces in the MIDDLE!"
return 0;
}
Using &&
and ||
as conditionals #
If you write Shell scripts, you know what I mean.
#include <ctype.h>
#include <stdio.h>
#include <stdbool.h>
int main(void)
{
1 && puts("Hello");
0 && puts("I won't");
1 && puts("World!");
0 && puts("be printed");
1 || puts("I won't be printed either");
0 || puts("But I will!");
true && (9 > 2) && puts("9 is bigger than 2");
isdigit('9') && puts("9 is a digit");
isdigit('n') && puts("n is a digit") || puts("n is NOT a digit!");
isalpha('a') && !puts("I'll be printed") || puts("But me AS WELL!");
return 0;
}
The compiler will probably scream warnings at you as it's really uncommon to do this in C code.
Compile time assumption checking using enum
s #
#define D 1
#define DD 2
enum CompileTimeCheck
{
MAKE_SURE_DD_IS_TWICE_D = 1/(2*(D) == (DD)),
MAKE_SURE_DD_IS_POW2 = 1/((((DD) - 1) & (DD)) == 0)
};
Can be useful for libraries with compile-time configurable constants.
Ad hoc struct
declaration in the return type of a function #
You can define struct
s in very (at first glance) random places:
#include <stdio.h>
struct Foo { int a, b, c; } make_foo(void) {
struct Foo ret = { .c = 3 };
ret.a = 11 + ret.c;
ret.b = ret.a * 3;
return ret;
}
int main()
{
struct Foo x = make_foo();
printf("%d\n", x.a + x.b + x.c);
return 0;
}
"Nested" struct
definition is not kept nested #
#include <stdio.h>
struct Foo {
int x;
struct Bar {
int y;
};
};
int main()
{
struct Bar s = { 34 }; // correct
// struct Foo.Bar s; // wrong
printf("%d\n", s.y);
return 0;
}
Flat initializer lists #
int arr[3][3] = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
// = { {1,2,3}, {4,5,6}, {7,8,9} };
struct Foo {
const char *name;
int age;
};
struct Foo records[] = {
"John", 20,
"Bertha", 40,
"Andrew", 30,
};
Implicit casting of void
pointers #
C11 §6.3.2.3 ¶1:
A pointer to
void
may be converted to or from a pointer to any object type. A pointer to any object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer.
C11 §6.5.16.1 ¶1:
—the left operand has atomic, qualified, or unqualified pointer type, and (considering the type the left operand would have after lvalue conversion) one operand is a pointer to an object type, and the other is a pointer to a qualified or unqualified version ofvoid
, and the type pointed to by the left has all the qualifiers of the type pointed to by the right;
void*
was added to C89 because of a need for generic pointer
type which can be implicitly casted back and forth.
In fact, explicitly casting void
pointers has the following problems:
- it is unnecessary, as
void*
is automatically and safely promoted to any other pointer type; - it adds clutter to the code, casts are not very easy to read (especially if the pointer type is long);
- it makes you repeat yourself;
- it can hide an error would return type change from
void*
to something more concrete.
Static array indices in function parameter declarations #
Except in certain contexts, an unsubscripted array name (for example,
region
instead ofregion[4]
) represents a pointer whose value is the address of the first element of the array, provided that the array has previously been declared. An array type in the parameter list of a function is also converted to the corresponding pointer type. Information about the size of the argument array is lost when the array is accessed from within the function body.To preserve this information, which is useful for optimization, C99 allows you to declare the index of the argument array using the static keyword. The constant expression specifies the minimum pointer size that can be used as an assumption for optimizations. This particular usage of the static keyword is highly prescribed. The keyword may only appear in the outermost array type derivation and only in function parameter declarations. If the caller of the function does not abide by these restrictions, the behavior is undefined.
The following examples show how the feature can be used.
int n; void foo(int arr[static 10]); // arr points to the first of at least 10 ints void foo(int arr[const 10]); // arr is a const pointer void foo(int arr[const]); // const pointer to int void foo(int arr[static const n]); // arr points to at least n ints (VLA)
void foo(int p[static 1]);
is effectively a standard
way to declare that p
must be non-null pointer.
Macro Overloading by Argument List Length #
#include <stdio.h>
#include "cmoball.h"
#define NoA(...) CMOBALL(FOO, __VA_ARGS__)
#define FOO_3(x,y,z) "Three"
#define FOO_2(x,y) "Two"
#define FOO_1(x) "One"
#define FOO_0() "Zero"
int main()
{
puts(NoA());
puts(NoA(1));
puts(NoA(1,1));
puts(NoA(1,1,1));
return 0;
}
typedef
syntax is like other specifiers #
Usually we declare new types using the following syntax:
typedef unsigned char byte;
typedef struct {
int x;
int y;
const char *p;
} Record;
But other placements of typedef
keyword are possible:
unsigned typedef char byte;
struct {
int x;
int y;
const char *p;
} typedef Record;
And we can still declare multiple types in one go:
struct {
int x;
int y;
const char *p;
} typedef Record, record, *record_ptr;
Function types #
Function pointers ought to be well known, but as we know the syntax is bit awkward.
On the other hand, less people know you can (as with most objects in C) create
a typedef
for function type.
#include <stdio.h>
int main()
{
typedef double fun_t(double);
fun_t sin, cos, sqrt;
fun_t *ftpt = &sqrt;
printf("%lf\n", ftpt(4)); // 2.000000
return 0;
}
The oddities of relationship between function designators and pointers #
Example by u/AnonymouX47 from Reddit post What your weirdest C feature?:
Let's have a simple function prototype:
void f(void);
Those lines are equivalent to each other:
void (*fp)(void) = f; void (*fp)(void) = *f; void (*fp)(void) = &f; void (*fp)(void) = ******f; void (*fp)(void) = &***********f; void (*fp)(void) = ***&***f; void (*fp)(void) = &**&***&***&f;
Those too are equivalent to each other:
f(); (*f)(); (&f)(); (*&f)(); fp(); (*fp)(); (*&fp)(); (****fp)(); (&******fp)(); (**&**fp)(); (*&*&*&*fp)();
But
(&fp)()
or(&*&*&fp)()
won't work.
X-Macros #
- X Macro - Wikipedia
- Real-world use of X-Macros - Stack Overflow
- What are X-macros? – Arthur O'Dwyer
- X macro: most epic C trick or worst abuse of preprocessor? / Arch Linux Forums
- The Most Elegant Macro – Phillip Trudeau
- The X Macro - Digital Mars
- The New C: X Macros - Dr Dobb's
X-Files #
This technique is quite simple to come up with: just think how to make templates
in C using #include
. Despite seeing it from time to time in the wild (maybe
even more frequently than X-Macros), seems that until very recently it didn't
even have a name.
Here's an article (archive) by Chris Bazley, showcasing a bit of this technique.
Named function parameters #
struct _foo_args {
int num;
const char *text;
};
#define foo(...) _foo((struct _foo_args){ __VA_ARGS__ })
int _foo(struct _foo_args args)
{
puts(args.text);
return args.num * 2;
}
int main(void)
{
int result = foo(.text = "Hello!", .num = 8);
return 0;
}
Combining default, named and positional arguments #
Using compound literals and macros to create named arguments (…):
typedef struct { int a,b,c,d; } FooParam; #define foo(...) foo((FooParam){ __VA_ARGS__ }) void (foo)(FooParam p);
adding default arguments is also quite easy:
#define foo(...) foo((FooParam){ .a=1, .b=2, .c=3, .d=4, __VA_ARGS__})
But now positional arguments don't work anymore, and there may be situations where you want to support both options. But I recently realized, that you can make them work by adding a dummy parameter:
typedef struct { int _; int a,b,c,d; } FooParam; #define foo(...) foo((FooParam){ .a=1, .b=2, .c=3, .d=4, ._=0, __VA_ARGS__})
Now, foo can be called in the following ways:
foo(); // a=1, b=2, c=3, d=4 foo(.a=4, .b=5); // a=4, b=5, c=3, d=5 foo(4, 5); // a=4, b=5, c=3, d=5 foo(4, 5, .d=8); // a=4, b=5, c=3, d=8
The dummy parameter isn't needed when you have arguments that are required to be passed by name:
typedef struct { int alwaysNamed; int a,b,c,d; } FooParam; #define foo(...) foo((FooParam){.a=1,.b=2,.c=3,.d=4, .alwaysNamed=5, __VA_ARGS__})
Abusing unions for grouping things into namespaces #
Suppose that you have a
struct
with a bunch of fields, and you want to deal with some of them all together at once under a single name; perhaps you want to conveniently copy them as a block throughstruct
assignment.By using unions you can access both
a.field2
anda.sub
(anda.field2
is the same asa.sub.field2
) without any macros.struct a { int field1; union { struct { int field2; int field3; }; struct { int field2; int field3; } sub; }; };
Unity builds #
Because #include
mechanism is essentially a primitive copy-pasting
contents of included file into the current code, C allows us to make
so called unity builds,
where we dump everything into one translation unit.
Applying this technique sometimes may lead to faster compile times, simplified build process or provide opportunity for optimizations. Unfortunately, is doesn't scale well at all as it doesn't mix with parallel and incremental builds. It also hinders modularization/encapsulation of code.
Matching character classes with sscanf()
#
From this comment on Reddit:
sscanf()
can be used as an ersatz "regex" (not really, only character classes) matcher. For example, one can write something like this to check if the input consists of letters or underscores:int len = 0; char buf[256]; int read_token = sscanf(input, "%255[a-zA-Z_]", buf, &len); if (read_token) { /* do something */ }
or skip whitespace characters:
int len = 0; char buf[256]; sscanf(input, "%255[\r\n]%n", buf, &len); input += len;
Garbage collector #
Boehm GC is a library providing garbage collector for C and C++
Cosmopolitan Libc #
Description from project's website:
Cosmopolitan Libc makes C a build-once run-anywhere language, like Java, except it doesn't need an interpreter or virtual machine. Instead, it reconfigures stock GCC and Clang to output a POSIX-approved polyglot format that runs natively on Linux + Mac + Windows + FreeBSD + OpenBSD + NetBSD + BIOS with the best possible performance and the tiniest footprint imaginable.
Inline assembly #
For a high-level language C communicates quite well with low-level world. You
can write Assembly code and link it against program written in C quite easily.
In addition to that, many compilers offer as an extension (listed as common
in Annex J of the C Standard) a feature called inline assembly, typically
introduced to the code by the asm
keyword.
- Inline Assembly - OSDev Wiki
- Inline assembly - cppreference.com
- Inline Assembler (C) | Microsoft Learn
- Writing inline assembly code - Arm Compiler for Embedded User Guide
- Inline Assembly in C/C++ - University of Alaska Fairbanks
Coroutines #
Apparently there are methods for implementing coroutines in C:
- Coroutines in C
- Coroutine#C - Wikipedia
- Does C support coroutines? - Quora
- c-coroutine
- Libaco
- Libtask
Evaluate sizeof
at compile time by causing duplicate case error #
Assume you are working on embedded system or generally on something
where getting a printf()
output may not be trivial task.
int foo(int c)
{
switch (c) {
case sizeof (struct Foo): return c + 1;
case sizeof (struct Foo): return c + 2;
}
}
Adding such simple function anywhere in your code may (depending on compiler)
produce an error message telling us the result of sizeof
operator.
error: duplicate case value '16'
case sizeof(struct Foo): return c + 2;
^
Detecting constant expressions #
#define ICE_P(x) _Generic((1 ? ((void*)((x)*(uintptr_t)0)) : &(int){1}), int*: 1, void*: 0)
Source: I think I found a C11 compliant way to detect constant expressions : r/C_Programming
TL;DR: Calling
ICE_P
will evaluate to true if the argument is a constant expression and otherwise evaluate to false, so the following:int x = 3; printf("%d %d\n", ICE_P(x), ICE_P(3));
should print
0 1
.
Object Oriented Programming #
- Object-Oriented Programming in C - Quantum Leaps
- Object-orientation in C - Stack Overflow
- Object-Oriented Programming With ANSI C
- C Traps and Pitfalls by Andrew Koenig
- "you can have something like interfaces and virtual methods by using function pointers"
Safe(ish) variadic functions #
Variadic functions are often sources of subtle bugs because theirs arguments are not type-checked and can have unintuitive promotion/conversion rules.
There's also a problem that variadic function doesn't know how many arguments have been passed to it, thus the caller needs to do it manually, either via some sentinel value that marks the end of arguments (e.g
execl()
usingNULL
to mark the end) or by passing the size via another parameter.With some clever usage of C99's variadic macros, combined with compound literals, these shortcomings can be worked around.
While the below example only works when all parameters are of the same type, in cases where different types might be needed, it should be possible to extend this trick with "tagged union" and some usage of C11's_Generic
.
#include <stdio.h>
void print_vec(FILE *f, const int *v, size_t n)
{
for (size_t i = 0; i < n; ++i) {
fprintf(f, "%d\n", p[i]);
}
}
#define print_vec(fstream, ...) \
print_vec((fstream), \
(const int[]){ __VA_ARGS__ }, \
(sizeof (int[]){ __VA_ARGS__ } / sizeof (int)) )
int main(void)
{
print_vec(stdout, 1);
print_vec(stdout, 1, 2, 3);
print_vec(stdout, 1, 2, 3, 4, 5);
return 0;
}
Metaprogramming #
C11 added _Generic
to language, but turns out metaprogramming
by inhumanely abusing the preprocessor is possible even in pure C99:
meet Metalang99 library.
#include <datatype99.h>
datatype(
BinaryTree,
(Leaf, int),
(Node, BinaryTree *, int, BinaryTree *)
);
int sum(const BinaryTree *tree) {
match(*tree) {
of(Leaf, x) return *x;
of(Node, lhs, x, rhs) return sum(*lhs) + *x + sum(*rhs);
}
return -1;
}
Preprocessor is a language of its own #
I've already mentioned some preprocessor tricks, but there's way more! In fact, I could easily make such a list out of preprocessor oddities alone. After all, it is a full fledged, Turing-complete language with its own rules, grammar and caveats; heck, it's not even strictly exclusive to C - there are madlads using it in conjunction with e.g. JavaScript
Thankfully, Tima "Hirrolot" Kinsart - developer of aforementioned Metalang99, already prepared awesome-c-preprocessor list with sane and insane deeds possible to do in C preprocessor.
CCAN #
Comprehensive C Archive Network, modeled after Perl's CPAN (which in turn was modeled after CTAN) is a repository of C code snippets. While it's not such a necessity like for Perl, nor is it officially endorsed or widely used, I think it's existence is interesting enough to warrant a mention.
Function pointers to match arrays in _Generic
#
Right now _Generic
doesn't allow for matching arrays, but document
N3348
by Martin Uecker mentions a trick
to make them match by wrapping the type into function pointer:
#include <stdio.h>
#define LENGTHOF(arr) (sizeof (arr) / sizeof (arr)[0])
#define FOO(n,m) \
_Generic(void (*)(int(*)[n][m]), \
void (*)(int(*)[3][6]) : 1, \
void (*)(int(*)[3][3]) : 2, \
void (*)(int(*)[2][8]) : 3, \
void (*)(int(*)[4][*]) : 4, \
void (*)(int(*)[8][*]) : 5 \
)
#define BAR(arr) FOO(LENGTHOF(arr), LENGTHOF(arr[0]))
int main()
{
printf("%d\n", FOO(3,6)); // 1
printf("%d\n", FOO(3,3)); // 2
printf("%d\n", FOO(2,8)); // 3
printf("%d\n", FOO(4,0)); // 4
printf("%d\n", FOO(4,1)); // 4
printf("%d\n", FOO(4,2)); // 4
int arr[8][2];
printf("%d\n", BAR(arr)); // 5
// printf("%d\n", FOO(5,2)); // error
return 0;
}