Pointers are one of the most important features of C++. They are a feature that is (purposely) missing from, or heavily restricted in, other popular object-oriented languages like Java and C#.
Pointers are powerful: they give you direct access to computer memory. However, with great power comes great responsibility, since a large share of programming errors in C++ are related to pointers.
In this post we are going to explore pointers and memory and update our game with new features.
Pointers are variables, just like variables of type int or char, but the values they hold are memory addresses. For every available type in C++, you can have a pointer to that type. For example, you can have a pointer to int or a pointer to float. You declare a pointer like this:
int * px;
float * pf;
char *c, b, *d;
On the first line we declare a pointer to int. Similarly, on the second we declare a pointer to float. The third line is a little tricky; the * is attached to the name of the variable and not the type in front, so c and d are pointers to char but b is a char.
You can get the address of any variable using the & operator. The returned type of the operator is a pointer to the type of the variable:
int * px;
int x;
px = &x;//(return type is int *)
Pointers are types too, so you can have a pointer to a pointer, a pointer to a pointer to a pointer, and so on. Example:
int ** px;
int * x;
px = &x;//(return type is int **)
Now px is a pointer to pointer to int. As you can see from the example above, for every pointer level we add a * to the type.
There is also the reverse operation; reading the contents of the memory address the pointer points to:
int * px;
int x = 5;
px = &x;
std::cout << (std::uintptr_t)px << ": " << *px;
On the last line you can see we are converting the pointer to an integer (std::uintptr_t, from the <cstdint> header) in order to get its address in numerical form. A plain int is too small to hold an address on most 64-bit systems, so a cast to int may be rejected by the compiler or cut the address short. Converting pointers to integers is usually a thing you don’t want to do. The number will likely be different each time you run the program. With the * operator, we get the contents of the address stored in px, and the type of the result is int. This operation is called dereferencing a pointer. For every dereference we “remove” a pointer level (*) from the type.
int ** px;
*px;// int *
**px;// int
By now you may be asking: why does a memory address need to have a type? Isn’t a memory address just a number representing the position of a cell in memory? Yes, it is. But at some point you are going to dereference the pointer and interpret the bits stored there. The type of the pointer is what the compiler uses to determine how many bytes to read and how to interpret them. In order to fully understand this concept, we have to take a look at computer memory.
Computer memory is nothing more than a sequence of bits. It may be organized and managed in a lot of complicated and useful ways, but it is sufficient to think of it as just a serial sequence of bytes, where every byte is accessed by its address, which is its offset from the start of the memory.
In the image above you can see an example memory layout. At address 02, for example, we have the byte 0x32, which is 32 in hexadecimal and 0011 0010 in binary. We could interpret that byte as a char, unsigned or signed, or use the 3 bytes next to it and interpret all four together as an int. The computer doesn’t know what those bits mean; we just play around with bits and bytes under the hood every time we run a program. With pointers it is more complicated, since on typical platforms all pointers have the same representation internally, so in order to read memory back through them we encode the byte size into the pointer type.
Pointer size is determined by the architecture: on 32-bit systems a pointer is 32 bits and on 64-bit systems it is 64 bits. An int is not guaranteed to be the same size as a pointer; on most 64-bit platforms an int is still 32 bits. Since memory is addressed through pointers, a 32-bit pointer limits addressable memory to 2^32 bytes, which is roughly 4 billion bytes or 4GB. 64-bit addresses raise that limit to about 4 billion times that (18446744073709551616 bytes)!
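You can get the size in bytes of a type or a variable by using the sizeof operator:
int * px;
float * pf;
char *c, b, *d;
std::cout << sizeof(int) << " " << sizeof(px) << "\n";// size of an int, then size of a pointer to int
std::cout << sizeof(b) << " " << sizeof(c) << "\n";// size of a char (always 1), then size of a pointer to char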
Hexadecimal Aside
The 0x prefix denotes that the literal following it is in hexadecimal. In hexadecimal every digit has a value from 0 to 15, with the values 10 to 15 written as letters: 10: A, 11: B, 12: C, 13: D, 14: E, 15: F. It is very useful for representing bytes, since any byte can be written using 2 digits: every digit in hex accounts for 4 bits in binary. There are a lot of hex to binary to decimal converters available, so you don’t actually have to learn hexadecimal conversions by hand. It would be useful though if you get into anything low level that requires examining raw binary data.
Null pointer
There is one memory address that you cannot use: address 0, more commonly known as the NULL pointer. NULL is a constant that equals zero. Since it equals zero, converting a NULL pointer to a boolean yields false. You can mark a pointer as pointing to nothing by assigning NULL/0 to it, and check for that later.
Since C++11 you can also use the keyword nullptr to denote a NULL pointer. Because the compiler then knows that you are talking about a pointer and not an integer (the preprocessor typically replaces NULL with 0), you get richer error checking at compile time.
Calling functions by reference
In the previous post about functions, we saw that the arguments we provide to a function are mere copies. That means that changing them has no effect outside the scope of the function call. The reason is that the copy and the original argument occupy different places in memory. However, if we provide a memory location as an argument (a pointer), then the function receives a copy of the address. It therefore knows which memory location to alter, and so we can pass arguments by reference.
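#include <iostream>
// Receives a copy of the address of the caller's variable.
void call_by_pointer(int * x){
    *x = 5;// writes to the caller's variable
}
// Receives a copy of the caller's value.
void call_by_value(int x){
    x = 15;// changes only the local copy
}
int main(){
    int y = 7;
    std::cout << y << "\n";// prints 7
    call_by_pointer(&y);
    std::cout << y << "\n";// prints 5
    call_by_value(y);
    std::cout << y << "\n";// still prints 5
    return 0;
}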
Note that the pointer argument is itself a copy of a pointer, so changing the pointer itself, rather than the memory it points to, has no effect outside the call. That means that if we want to change the caller’s pointer, we have to pass a pointer to a pointer. This is very important for dynamic memory allocation, which we will see in a future post.