Is there a general strategy to create an efficient bit permutation algorithm. The goal is to create a fast branch-less and if possible LUT-less algorithm. I'll give an example:
A 13 bit code is to be transformed into another 13 bit code according to the following rule table:
BIT | INPUT (DEC) | INPUT (BIN) | OUTPUT (BIN) | OUTPUT (DEC) |
---|---|---|---|---|
0 | 1 | 0000000000001 | 0000100000000 | 256 |
1 | 2 | 0000000000010 | 0010000000000 | 1024 |
2 | 4 | 0000000000100 | 0100000000000 | 2048 |
3 | 8 | 0000000001000 | 1000000000000 | 4096 |
4 | 16 | 0000000010000 | 0000001000000 | 64 |
5 | 32 | 0000000100000 | 0000000100000 | 32 |
6 | 64 | 0000001000000 | 0001000000000 | 512 |
7 | 128 | 0000010000000 | 0000000010000 | 16 |
8 | 256 | 0000100000000 | 0000000001000 | 8 |
9 | 512 | 0001000000000 | 0000000000010 | 2 |
10 | 1024 | 0010000000000 | 0000000000001 | 1 |
11 | 2048 | 0100000000000 | 0000000000100 | 4 |
12 | 4096 | 1000000000000 | 0000010000000 | 128 |
Example: If the input code is 1+2+4096=4099 the resulting output would be 256+1024+128=1408
A naive approach would be:
OUTPUT = ((INPUT AND 0000000000001) << 8) OR ((INPUT AND 0000000000010) << 9) OR ((INPUT AND 0000000000100) << 9) OR ((INPUT AND 0000000001000) << 9) OR ...
It means we have 3 instructions per bit (AND, SHIFT, OR) = 39-1 (last OR omitted) instructions for the above example. Instead we could also use a combination of left and right shifts to potentially reduce code size (depends on target platform), but this will not decrease the amount of instructions.
When inspecting the example table, you will of course notice a few obvious possibilities for optimization, for example in line 2/3/4 which can be combined as ((INPUT AND 0000000000111) << 9)
. But beside that it is becoming a difficult tedious task.
Are the general strategies? I think using Karnaugh-Veitch Map's to simplify the expression could be one approach? However it is pretty difficult for 13 input variables. Also the resulting expression would only be a combination of OR's and AND's.
For bit permutations, several strategies are known that work in certain cases. There's a code generator at https://programming.sirrida.de/calcperm.php which implements most of them. However, in this case, it seems to find only basically the strategy you suggested, indicating that it seems hard to find any pattern to exploit in this permutation.