Suppose I want to generate parities at compile time. The parity calculation is given literal constants, and with any decent optimizer it will boil down to a single constant itself. Now look at the following parity calculation with the C preprocessor:
#define PARITY16(u16) (PARITY8((u16)&0xff) ^ PARITY8((u16)>>8))
#define PARITY8(u8) (PARITY4((u8)&0x0f) ^ PARITY4((u8)>>4))
#define PARITY4(u4) (PARITY2((u4)&0x03) ^ PARITY2((u4)>>2))
#define PARITY2(u2) (PARITY1((u2)&0x01) ^ PARITY1((u2)>>1))
#define PARITY1(u1) (u1)
int message[] = { 0x1234, 0x5678, PARITY16(0x1234^0x5678) };
This will calculate the parity at compile time, but it produces an enormous amount of intermediate code, expanding to 16 instances of the expression u16, which itself can be an arbitrarily complex expression. The problem is that the C preprocessor can't evaluate intermediate expressions; in the general case it only expands text (you can force it to do integer arithmetic in situ, but only for trivial cases, or with gigabytes of #defines).
I have found that the parity of 3 bits can be generated at once by an arithmetic expression: for x in [0..7], the parity is the lowest bit of (x*3+1)/4, i.e. bit 2 of x*3+1. This reduces the 16-bit parity to the following macro:
#define PARITY16(u16) ((4 & ((((u16)&7)*3+1) ^ \
((((u16)>>3)&7)*3+1) ^ \
((((u16)>>6)&7)*3+1) ^ \
((((u16)>>9)&7)*3+1) ^ \
((((u16)>>12)&7)*3+1) ^ \
((((u16)>>15)&1)*3+1))) >> 2)
which expands u16 only 6 times. Is there an even cheaper way (in terms of the number of expansions), e.g. a direct formula for a 4-bit, 5-bit, etc. parity? I couldn't find a linear expression of the form (x*k+d)/m with acceptable (non-overflowing) values of k, d, m for a range wider than 3 bits. Does anyone have a more clever shortcut for preprocessor parity calculation?
Is something like this what you are looking for? The following "PARITY16(u16)" preprocessor macro can be used as a literal constant in structure assignments, and it only evaluates the argument once.
/* parity.c
* test code to test out bit-twiddling cleverness
* 2013-05-12: David Cary started.
*/
// works for all 0...0xFFFF
// and only evaluates u16 one time.
#define PARITYodd33(u33) \
( \
((((((((((((((( \
(u33) \
&0x555555555)*5)>>2) \
&0x111111111)*0x11)>>4) \
&0x101010101)*0x101)>>8) \
&0x100010001)*0x10001)>>16) \
&0x100000001)*0x100000001)>>32) \
&1)
#define PARITY16(u16) PARITYodd33(((unsigned long long)(u16))*0x20001)
// works for all 0...0xFFFF
// but, alas, generates 16 instances of u16.
#define PARITY_16(u16) (PARITY8((u16)&0xff) ^ PARITY8((u16)>>8))
#define PARITY8(u8) (PARITY4((u8)&0x0f) ^ PARITY4((u8)>>4))
#define PARITY4(u4) (PARITY2((u4)&0x03) ^ PARITY2((u4)>>2))
#define PARITY2(u2) (PARITY1((u2)&0x01) ^ PARITY1((u2)>>1))
#define PARITY1(u1) (u1)
int message1[] = { 0x1234, 0x5678, PARITY16(0x1234^0x5678) };
int message2[] = { 0x1234, 0x5678, PARITY_16(0x1234^0x5678) };
#include <stdio.h>
int main(void){
    int errors = 0;
    int i=0;
    printf(" Testing parity ...\n");
    printf(" 0x%x = message with PARITY16\n", message1[2] );
    printf(" 0x%x = message with PARITY_16\n", message2[2] );
    /* compare both macros over every 16-bit value */
    for(i=0; i<0x10000; i++){
        int left = PARITY_16(i);
        int right = PARITY16(i);
        if( left != right ){
            printf(" 0x%x: (%d != %d)\n", i, left, right );
            errors++;
        }
    }
    printf(" 0x%x errors detected. \n", errors );
    return errors != 0; /* nonzero exit status if any mismatch was found */
} /* vim: set shiftwidth=4 expandtab ignorecase : */
Much like the original code you posted, it pairs up bits and (in effect) calculates the XOR between each pair, then from the results it pairs up the bits again, halving the number of bits each time until only a single parity bit remains.
Many people say they are calculating "the parity" of a message. But in my experience, most of the time they are really generating an error-detection code bigger than a single parity bit: an LRC, a CRC, a Hamming code, etc.
If the current system is compiling in a reasonable amount of time, and it's giving the correct answers, I would leave it alone. Refactoring "how the pre-processor generates some constant" will produce a bit-for-bit identical runtime executable. I'd rather have easy-to-read source even if it takes a full second longer to compile.
Many people use a language that is easier to read than the standard C preprocessor to generate C source code. See pycrc, the character set extractor, "using Python to generate C", etc.
If the current system is taking way too long to compile, rather than tweak the C preprocessor, I would be tempted to put that message, including the parity, in a separate ".h" file with hard-coded constants (rather than force the C pre-processor to calculate them every time), and "#include" that ".h" file in the ".c" file for the embedded system.
Then I would make a completely separate program (perhaps in C or Python) that does the parity calculations and prints out the contents of that ".h" file as pre-calculated C source code, something like
printf("int message[] = { 0x%x, 0x%x, 0x%x };\n",
M[0], M[1], parity( M[0]^M[1] ) );
and tweak my Makefile to run that Python (or whatever) program to regenerate that ".h" file if, and only if, it is necessary.