Basics of Memory Addresses in C

http://denniskubes.com/2012/08/17/basics-of-memory-addresses-in-c/


Memory Addresses

It is helpful to think of everything in C in terms of computer memory. Let’s think of computer memory as an array of bytes where each address in memory holds 1 byte. If our computer has 4K of memory, for example, the memory array would have 4096 elements. When we talk about a pointer storing an address, we are talking about a pointer storing an index to an element in the memory array. Dereferencing a pointer means getting the value at that index in the array. All of this is of course a lie. How operating systems handle memory is much more complex than this. Memory is not necessarily contiguous and it is not necessarily handed out sequentially. But the analogy provides an easy way to think about memory in C to get started.

Confused about pointers, addresses and dereferencing? Take a look at this 5-Minute Guide to Pointers.

Say our computer has 4K of memory and the next open address is index 2048. We declare a new char variable i = ‘a’. When the variable is declared, memory is set aside for its value and the variable name is linked to that location in memory. Our char i has the value ‘a’ stored at address 2048. Our char is a single byte so it only takes up index 2048. If we use the address-of operator (&) on our variable i it would return the address 2048. If the variable were a different type, int for instance, it would typically take up 4 bytes and use elements 2048-2051 in the array. Using the address-of operator would still return 2048 though, because the int starts at that index even though it takes up 4 bytes. Let’s look at an example.

// initialize a char variable, print its address and the next address
char charvar = '\0';
printf("address of charvar = %p\n", (void*)(&charvar));
printf("address of charvar - 1 = %p\n", (void*)(&charvar - 1));
printf("address of charvar + 1 = %p\n", (void*)(&charvar + 1));

// initialize an int variable, print its address and the next address
int intvar = 1;
printf("address of intvar = %p\n", (void*)(&intvar));
printf("address of intvar - 1 = %p\n", (void*)(&intvar - 1));
printf("address of intvar + 1 = %p\n", (void*)(&intvar + 1));

Running that you should get output like the following:

address of charvar = 0x7fff9575c05f
address of charvar - 1 = 0x7fff9575c05e
address of charvar + 1 = 0x7fff9575c060
address of intvar = 0x7fff9575c058
address of intvar - 1 = 0x7fff9575c054
address of intvar + 1 = 0x7fff9575c05c

In the first example we declare a char variable, print out the address of the char, and then print out the addresses just before and just after the char in memory. We get the addresses before and after by using the & operator and then subtracting or adding one. In the second example we do the same thing except this time we declare an int variable, printing out its address and the addresses right before and after it.

In the output we see the addresses in hexadecimal. What is important to notice is that the char addresses are 1 byte before and after, while the int addresses are 4 bytes before and after. Math on memory addresses, pointer math, is based on the sizeof the type being referenced. The size of a given type is platform dependent, but for this example our char takes 1 byte and our int takes 4 bytes. Subtracting 1 from the address of a char gives a memory address that is 1 byte previous, while subtracting 1 from the address of an int gives a memory address that is 4 bytes previous.

Even though in our example we were using the address-of operator to get the addresses of our variables, the operations are the same when using pointers that hold the address of a variable.

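Some commenters have brought up that computing &charvar - 1, an invalid address because it is before the start of the object, is technically undefined behavior. This is true. The C standard only allows pointer arithmetic that stays within an object or lands one element past its end, so &charvar + 1 is fine to compute (though not to dereference) while &charvar - 1 is not, and on some platforms even computing or storing an invalid address will cause an error.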

Array Addresses

Arrays in C are contiguous memory areas that hold a number of values of the same data type (int, long, char *, etc.). Many programmers, when they first use C, think arrays are pointers. That isn’t true. A pointer stores a single memory address, while an array is a contiguous area of memory that stores multiple values.

// initialize an array of ints
int numbers[5] = {1,2,3,4,5};
int i = 0;

// print the address of the array variable
printf("numbers = %p\n", (void*)numbers);

// print addresses of each array index
do {
    printf("numbers[%d] = %p\n", i, (void*)(&numbers[i]));
    i++;
} while(i < 5);

// print the size of the array
printf("sizeof(numbers) = %zu\n", sizeof(numbers));

Running that you should get output like the following:

numbers = 0x7fff0815c0e0
numbers[0] = 0x7fff0815c0e0
numbers[1] = 0x7fff0815c0e4
numbers[2] = 0x7fff0815c0e8
numbers[3] = 0x7fff0815c0ec
numbers[4] = 0x7fff0815c0f0
sizeof(numbers) = 20

In this example we initialize an array of 5 ints. We then print the address of the array itself. Notice we didn’t use the address-of & operator. This is because in most expressions the array variable already decays to the address of the first element in the array. As you can see, the address of the array and the address of the first element in the array are the same. Then we loop through the array and print out the memory address at each index. Each int is 4 bytes on our computer and array memory is contiguous, so the int addresses are 4 bytes away from each other.

In the last line we print the size of the array. The size of an array is the sizeof(type) * number of elements in the array. Here the array holds 5 ints, each of which takes up 4 bytes. The entire array is 20 bytes.

Struct Addresses

Structs in C are contiguous memory areas, though there may be padding bytes between their members. Like arrays they hold multiple values, but unlike arrays those values can be of different data types.

struct measure {
  char category;
  int width;
  int height;
};

// declare and populate the struct
struct measure ball;
ball.category = 'C';
ball.width = 5;
ball.height = 3;

// print the addresses of the struct and its members
printf("address of ball = %p\n", (void*)(&ball));
printf("address of ball.category = %p\n", (void*)(&ball.category));
printf("address of ball.width = %p\n", (void*)(&ball.width));
printf("address of ball.height = %p\n", (void*)(&ball.height));

// print the size of the struct
printf("sizeof(ball) = %zu\n", sizeof(ball));

Running that you should get output like the following:

address of ball = 0x7fffd1510060
address of ball.category = 0x7fffd1510060
address of ball.width = 0x7fffd1510064
address of ball.height = 0x7fffd1510068
sizeof(ball) = 12

In this example we have our struct definition. Then we declare an instance ball of the struct measure and we populate its category, width, and height members with values. Then we print out the address of the ball variable. Unlike an array, a struct variable does not decay to an address, so we use the address-of operator; the address of a struct is the same as the address of its first member. We then print out the address of each of the struct members. Category is the first member and we see that it has the same address as the ball variable. The width member is next, followed by the height member. Both have addresses higher than the category member.

You might think that because category is a char and chars take up 1 byte, the width member should be at an address 1 byte higher than the start. As you can see from the output this isn’t the case. According to the C99 standard (C99 §6.7.2.1), a C implementation can add padding bytes after members for alignment on byte boundaries. It cannot reorder the data members, and it cannot add padding before the first member, but it can add padding bytes between members and at the end of the struct. In practice most compilers align each member to the boundary its own type requires, which here places the int members on 4-byte boundaries, but this is entirely implementation specific.

In our example you can see that the char together with its padding takes up 4 bytes and the struct takes a total of 12 bytes. What to take away?

  • The address of a struct variable is the same as the address of the first member in the struct.
  • Don’t assume that struct members will be a specific number of bytes away from another field; they may be separated by padding bytes depending on the implementation. Use the address-of (&) operator on the member to get its address.
  • And use sizeof(struct instance) to get the total size of the struct. Don’t assume it is just the sum of its member fields, as it may have padding.

Conclusion

Hope this post helps you to understand more about how addresses operate on different data types in C. In a future post we will go over some basics on pointers and arrays in C.

Update 1: Thanks to Sorito, I added a link back to the blog post about pointers, addresses, and dereferencing.
Update 2: Thanks to Keith Thompson and tjoff from Hacker News for helping clarify struct addresses and memory. I reworked the example code to be more clear about memory.