Shuffling Bits
Background
NOTE: This article is from chapter 7.2 in Hackers’ Delight. I write this post because:
- To have a better understanding of the algorithm
- Need to apply it in a similar situation
Resources sometimes can be scarce in MCU. We may tend to use as little resources as possible and do calculations as fast as we could when the performance of MCU just barely meets our demands.
Besides when talking about calculations in MCU, many people would think about bit operation. For example, in normal C program, we will write:
1 | int a = 8 / 2; |
However, for program running in MCU, people usually write:
1 | int32_t a = 8 >> 1; |
Because the bit operation is faster than arithmetic operation in most situations.
Now cames the problem:
How can we shuffle bits efficiently?
Example
Suppose we need to shuffle a 32-bit word:
1 | abcd efgh ijkl mnop ABCD EFGH IJKL MNOP |
into
1 | aAbB cCdD eEfF gGhH iIjJ kKlL mMnN oOpP |
Step 1
To solve to problem, we can do shuffles like:
1 | abcd efgh ijkl mnop ABCD EFGH IJKL MNOP |
These operations can be done with basic RISC instructions in log2(w/2)
steps. w
is the how many bits the word has. In this example, w = 32
.
1 | x = (x & 0x0000ff00) << 8 | (x >> 8) & 0x0000ff00 | x & 0xff0000ff; |
Step 2
The code above seems quit efficient. But let’s think about how to exchange 2 variables value with XOR
:
1 | a = a ^ b; |
You see, by using XOR
,the exchange becomes much more easier, can we apply the similar mathod to the code above?
1 | t = (x ^ (x >> 8)) & 0x0000ff00; x = x ^ t ^ (t << 8); |
Here is an interesting trick: To unshuffle the word, we could simply perform the swaps in reverse order
1 | t = (x ^ (x >> 1)) & 0x22222222; x = x ^ t ^ (t << 1); |
Bundle
Here comes another question:
Shuffle
1 | 0000 0000 0000 0000 ABCD EFGH IJKL MNOP |
to
1 | 0A0B 0C0D 0E0F 0G0H 0I0J 0K0L 0M0N 0O0P |
With the knowledge above, this becames easy to do:
1 | x = ((x & 0xff00) << 8) | (x & 0x00ff); |
To unshuffle
1 | x = x & 0x55555555; |