Lesson #29

Bit Manipulation Operators

Overview

In this lesson I will cover the following topics

The binary numbering system.
The bitwise AND, OR, and XOR operators.
The bit shift operators.

Body

Overview

In this lesson I am going to talk about one of the low level features of C: its ability to manipulate individual bits in memory. Many applications never need to use these features. However, any program that needs to interact directly with hardware devices, such a device drivers and embedded systems, will make extensive use of these capabilities. Those of you who are computer technology majors at VTC will need to be familiar with these features when you take, for example, the microprocessor course and the interfacing course.

Some of you taking this course may have already taken VTC's digital electronics course where the binary numbering system is discussed in detail. However, that is not true of all of you. Thus I will write this lesson assuming that you have not seen binary before. If you are familiar with binary then some of this lesson will be a review.

Binary

In our culture we normally count using the base 10. There are 10 digits, 0 to 9, and numbers are written in terms of the powers of 10. For example a number like 1234 is really

(1 * 10^3)  + (2 * 10^2) + (3 * 10^1) + (4 * 10^0)

Here I'm using "^" to indicate "to the power of". Keep in mind also that 10^0 is just one.

Binary is a counting system that just uses two digits: 0 and 1. When you write a binary number you use the same idea as with decimal (base 10) numbers except that everything is written in powers of 2.

10110 = (1 * 2^4) + (0 * 2^3) + (1 * 2^2) + (1 *2^1) + (0 * 2^0)

So in this example I have

      = (1 * 16) + (0 * 8) + (1 * 4) + (1 * 2) + (0 * 1)
      = 16 + 4 + 2
      = 22

In other words 10110 in binary is 22 in decimal. It is very handy to know the powers of two when converting from binary to decimal.

2^0 = 1
2^1 = 2
2^2 = 4
2^3 = 8
2^4 = 16
2^5 = 32
2^6 = 64
2^7 = 128

If you are going to do much work with computers and especially computer electronics, you should probably memorize all the powers of two up to at least 2^16. One handy thing to remember is that 2^10 is 1024; a quantity usually called "1 K". In any event, using the chart above makes converting a eight bit number into decimal fairly easy.

10110010   <--- The number.
76543210   <--- Bit position.

128 + 32 + 16 + 2    <--- Add in the decimal value associated with
                          the one bits.

178        <--- The decimal equivalent.

Converting from decimal into binary can be done in various ways. One method is to repeatedly divide the number by 2 and use the remainder as the next bit going from right to left in the number. We don't need to worry about doing that here.

Hex

Writing numbers down in binary is a major pain in the neck. Here is a typical 64 bit binary number:

1001101010100011110000110101011010000110111011100011010101100110

Ugly, isn't it?

It turns out that it is much nicer to write numbers using the base 16 (hexadecimal or just "hex") than in base 2 or, for that matter, in base 10. To write hex numbers you need 16 digits. We normally only have 10 digits, so computer people just use the letters A through F as six additional digits. Counting in hex goes like this: 0, 1, 2, 3, ..., 9, A, B, C, D, E, F, 10, 11, 12, ..., 19, 1A, 1B, 1C, 1D, 1E, 1F, 20, 21, 22, ..., 99, 9A, 9B, ..., 9F, A0, A1, A2, ..., FD, FE, FF, 100, 101, 102, ...

Look over that sequence above and make sure it makes sense to you. It uses the same sort of cycle as counting in base 10 uses. Just think of the letters A-F as extra digits you have to cycle through. Decoding a hex number into decimal is the same as I've already described.

1F3 = (1 * 16^2) + (15 * 16^1) + (3 * 16^0)
    = 256 + 240 + 3
    = 499

Here I've used the fact that the decimal value of the hex digit 'F' is 15.

So why do we care about hex? It turns out that because 16 is a nice round power of two—in particular it is 2^4—it is very easy to convert from binary to hex and back again. You just have to memorize the binary patterns that correspond to the hex digits. Here they are:

Binary	Hex
0000	0
0001	1
0010	2
0011	3
0100	4
0101	5
0110	6
0111	7
1000	8
1001	9
1010	A
1011	B
1100	C
1101	D
1110	E
1111	F

To convert a binary number to hex, partition the number into groups of four bits, starting at the right side, and then convert each group one at a time. There is no multiplying or dividing necessary. You can do it by just looking at the number. Here's an example

10011100        <--- The number
1001 1100       <--- The number broken into groups of four bits.
   9    C       <--- The hex values that go with each group.
9C              <--- The hex equivalent of the binary 10011100.

Going the other way is just as easy

A7              <--- The number
   A    7       <--- Think about each digit one at a time.
1010 0111       <--- The binary values that go with each digit.
10100111        <--- The binary equivalent of hex A7.

Now here is a typical 64 bit number expressed in hex

3F1C00902A11039B

That's still pretty big and ugly, but it's a lot nicer looking than it would be in binary!

The other nice thing about hex is that "nice round numbers" (at least in the eyes of computers) look round. For example, 64 K is 65,536 in decimal but 10000 in hex. Basically, computers think in binary and hex is just an easier way to write binary numbers.

Back to C

It turns out that you can write integer literals in a C program using hex if you want. To do this, you must put a "0x" in front of the number to alert the compiler that what it is about to see is a hexadecimal value. For example

int number = 0x11CF;

This stores 4559 into number since 4559 is the decimal equivalent of hex 11CF. Either way the same bit pattern is being stored into the integer number. In some cases, however, it makes more sense to express a number in hex rather than decimal.

If you do write hex integers in your program it actually doesn't matter if you use A-F or a-f for the hex digits. This is one of the few areas where C is not case sensitive. I personally like using uppercase letters in hex numbers because I think they look more like the other digits. That's just my preference.

You can print out numbers in hex using the %x or %X format specifier in printf. If you use %x, printf will print out the extra digits in lowercase (a-f). If you use %X, printf will use uppercase (A-F). There are situations when you want to see your results in hex rather than decimal.

The bit manipulation operators

C contains several operators that manipulate the bits of an integer. These operators can only be applied to the integral types. That is not much of a restriction, however, because those are the only types people ever feel tempted to use these operators on.

Let me first explain how the "bitwise AND" operator works. It looks like this

x = y & z;

Unlike the logical AND operator you use in conditional tests, the bitwise AND operator is just a single ampersand. To understand what it does, consider the following two eight bit binary numbers (integers are typically 32 bits, but 8 bits should be enough to illustrate what is happening).

1011,1100       <--- The first number.
0000,1111       <--- The second number.

I'm showing a comma between groups of four bits to make the number a bit easier to read. It's the same idea as why commas are used in large decimal numbers like 1,358,412. I put the commas between groups of four bits and not three to make it easier to think about the hexidecimal representation if necessary.

Take a look at the least significant bits of each number. Those are the bits on the extreme right hand side. The first number has a 0 bit in that position. The second number has a 1 bit in that position. Since 1 (true) AND 0 (false) is false, the result of a bitwise AND would be 0 in that position. The only way you can get a 1 result is if both bits are 1. In order for X AND Y to be true, X must be true AND Y must be true. To get the overall result of the bitwise AND I have to process each pair of bits. I get this

1011,1100       <--- The first number.
0000,1111       <--- The second number.
---------
0000,1100       <--- The bitwise AND of the above numbers.

Notice that since 0 AND anything is 0, all the zeros in the second number effectively blot out the 1s in the first. Furthermore the 1s in the second number allow the bits of the first number to "come through". This is the point of bitwise AND. It can be used to erase individual bits. Watch...

x = y & 0x0000000F;

The number "F" has "1111" as the least significant four bits. All other bits are zeros. Thus doing y & 0x0000000F has the effect of erasing all bits except the lower four.

Suppose I wanted to see if bit number 6 in the value stored in y is a 1. I could do this

if (y & 0x00000040) { ....

To understand how this works, expand that hex number into binary. It's a 32 bit number.

0000,0000,0000,0000,0000,0000,0100,0000

Now let me number the bits. They are numbered from right to left starting with zero. Since there are 32 of them, I will use two lines to show the bit numbers.

 0000,0000,0000,0000,0000,0000,0100,0000  <--- The number.
 1098 7654 3210 9876 5432 1098 7654 3210  <--- The bit number.
  3           2            1           0  <--- Second digit of bit number.

As you can see, my number has a single 1 bit in position #6. When I AND this number into y the result will absolutely for sure have zeros in all bit positions except perhaps position 6. The result will be zero in position 6 too if y has a zero in that bit position. If y has a 1 in that bit position, the result will have a one there as well. Since C regards any non-zero value as "true" the result of y & 0x00000040 will be "true" only if there is a 1 in bit position 6 of y.

You can also use the AND operator to force a bit to take on a value of zero. Here is how it might be done using 32 bit integers.

x = x & 0xFFFFFFBF;

To see how this works write out 0xFFFFFFBF in binary. You will see that it has a zero in bit position 6. This will erase that bit in x and put the result back into x. C has a special operator that combines the AND operator and the assignment. It works just like some of the others you have seen already.

x &= 0xFFFFFFBF;

In addition to bitwise AND, C provides a bitwise OR and a bitwise exclusive-OR operator. The OR operator yields a 1 bit at a particular position if either operand has a 1 bit at that position. The exclusive-OR operator yields a 1 bit at a particular position if either operand, but not both, has a 1 bit at that position. You can use the bitwise OR operator to force bits to 1 and you can use the bitwise exclusive-OR operator to reverse (invert) bits. Here's how

x = x | 0x00000040;     // Uses bitwise OR to force bit #6 to 1.
x = x ^ 0x00000040;     // Uses bitwise XOR to invert bit #6.

As you might guess these examples can be simplified to

x |= 0x00000040;
x ^= 0x00000040;

These bitwise operators may seem rather strange. However, they are very basic and very, very fast. Typically a bit operation can be translated by the compiler into a single machine instruction. Such operations execute in only a few nanoseconds (billionths of a second) on modern machines. Hardware devices often are arranged electronically so that they appear like odd memory locations to the machine. A single bit in such a location can have significant meaning such as, "the motor is on" or "the lock is enabled". A program that has to manipulate the hardware is very interested in checking the value of individual bits and in changing those bits. For example, here is a function that activates a motor, but checks first to make sure the lock is off.

#define MOTOR_PORT      0xFFFF001E  // Address of motor control port.
#define LOCK_BIT        0x01        // Bit #0 shows lock status.
#define POWER_BIT       0x02        // Bit #1 shows motor power status.

...

void activate_motor(void)
{
  char *p = (char *)MOTOR_PORT;
    // Convince compiler to treat an integer as an address.

  if (*p & LOCK_BIT) {
    printf("Error: Can't activate the motor with the lock on!!\n");
  }
  else {
    *p |= POWER_BIT;
  }
}

Here I'm using some object-like macros to make nice names for some otherwise very ugly numbers. This is a good idea because it also makes it easy to upgrade the program should the hardware change to a different address or be reorganized so that the status information appears on different bits.

Using a type conversion operator (described in more detail in Lesson #30) I force the compiler to accept a literal integer as a pointer. This is normally a bad idea, but if you know for sure what is at a certain address, perhaps because you built the hardware yourself, then you can get away with it. C allows you to do this sort of thing and that is one reason why C is the language of choice for hardware control applications.

Next I use a bitwise AND operation to look at only the lock bit in the motor control port. If it's a one (true) then I print an error message and avoid turning on the motor if the lock is enabled. This might be very important. We could be talking about a huge, expensive, and very powerful motor. The last thing we want to do is to try and run it while it is locked! If the lock is not on, then I use a bitwise OR operation to force the power bit to the ON state. Here I'm assuming that the hardware has connected that bit to a suitable electronic device such as a relay.

This is not a course about hardware control. However, if you are a VTC computer engineering technology student you will see these issues again in later courses where we will talk about such matters. However, I want to give all of you a feeling for the significance of these bitwise operators and why they exist in the language.

The shift operators

There are two other bitwise operators that you should know about. They are the shift operators. They work by moving the bits of a number a number of positions to the left or the right. Here is how they look.

x = x << 3;

This shifts the bits of x to the left by 3 positions. Suppose x contained

0011,1000,0000,1111,0000,0000,0000,1111

After shifting left 3 positions the result would be

1100,0000,0111,1000,0000,0000,0111,1000

The pattern of 1s and 0s is the same, but now it's just shifted over. Notice how three zeros were introduced on the right hand end of the number. Notice also how the bits just "fell off" the left hand end.

The shift right operator looks similar

x = x >> 3;

As you might expect, you can also write

x >>= 3;
x <<= 3;

There is a good deal more that could be said about the shift operators. For example their exact action depends on if x is signed or unsigned. However, we don't need to look at those issues here.

Summary

The binary numbering system uses only the digits 0 and 1. Otherwise binary numbers work in the same way as "normal" decimal numbers. Each binary digit (bit) in a binary number carries a weight that is a factor of two greater than the bit to its right.
```
1011 = (1 * 2^3) + (0 * 2^2) + (1 * 2^1) + (1 * 2^0)
```
It is often easier to write binary numbers using the base 16 (hexidecimal) because each hex digit corresponds to exactly four binary digits. C allows you to write numbers in hex using a "0x" prefix like this: 0xFF13.
In C you can combine integers by ANDing together their corresponding bits using the & operator. You can OR together the bits of two integers using the | operator and you can exclusively or (XOR) the bits of two intergers using the ^ operator. Note that in C, the ^ operator is used for bitwise exclusive oring. In some other languages ^ is used to raise numbers to a power. C does not have a "raise to the power of" operator.
You can shift the bits of an integer to the left with the << operator and you can shift the bits of an integer to the right with the >> operator.