I wrote this for Lancelot and he said it was so good I should
post it here. This is of interest only to C/C++ programmers.
Pointers are variables that don't hold "ordinary"
program data like integers or characters. Instead, they
contain the addresses of other variables. A pointer variable
that contains the address of another variable is said to
"point at" that variable.
Two operators are strongly connected with pointers:
"&" and "*".
"&" finds the address of a variable. It does
some more in C++, but this is about C. This address is then
usually put into a pointer variable. Examples a bit later.
"*" in a declaration declares a variable as a
pointer to the given type, rather than a normal variable of
that type.
"*" in front of a pointer variable, outside a declaration,
accesses the value of the variable that the pointer points at.
Here is the example:
char c1, dog, egg, foo;
// all ordinary character variables
c1 = 'X';
// c1 now contains character 'X'
char *p;
// p is a pointer to a variable of type char
p = NULL;
// p points at absolutely nothing. NULL is pretty much the only
// constant that can be assigned to a pointer. NULL almost
// always means the address 0 (which is never used for anything),
// so a pointer containing NULL is a pointer that is not being used.
p = c1;
// ILLEGAL STATEMENT! p is a pointer to char, c1 is a char.
// These are two very different types. The compiler will complain.
p = &c1;
// now p points at c1 and can be used to access c1's value.
dog = *p;
// the "*" is used to access the value of the variable to which
// p points (c1 in this case). So now dog = 'X'.
*p = 'Z';
// the "*" can also be used to change the value in the pointed-at
// variable (c1). In this case, c1 becomes 'Z'.
p = &foo;
// now p points at another variable, foo.
There... those were all the wonderful things that can be done
with pointer variables pointing at "normal" scalar
variables. Scalar variables are variables that are neither
structures nor arrays.
Note that, as long as p == &c1, *p means and does
(almost) exactly the same as c1.
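To see reading and writing through a pointer in one place, here is a small sketch. The function name replace_with_z is made up for this post, not something from the C library. It reads the pointed-at character, overwrites it, and refuses to touch a NULL pointer.

```c
#include <stddef.h>

/* Hypothetical helper: writes 'Z' into the char that p points at.
   Returns the character that was there before, or '\0' if p is NULL. */
char replace_with_z(char *p)
{
    char old;

    if (p == NULL)      /* a NULL pointer points at nothing: don't use "*" on it */
        return '\0';
    old = *p;           /* read the pointed-at variable */
    *p = 'Z';           /* change the pointed-at variable */
    return old;
}
```

If c1 contains 'X', then replace_with_z(&c1) hands back 'X' and leaves c1 containing 'Z', just as if the function had worked on c1 directly.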
Of course, character pointers are only a simple example.
You can have pointers to int, for example.
int istart, ifinish;
int *p2;
p2 = &istart;
*p2 = 1;
p2 = &ifinish;
*p2 = 10;
will set istart to 1 and ifinish to 10.
Again, using pointers is very much like using the
pointed-at variables themselves:
printf("the value of ifinish is: %d", ifinish);
printf("the value of ifinish is: %d", *p2);
do the very same thing, assuming that p2 has not been changed
since setting it to &ifinish.
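One place where pointers to int earn their keep is letting a function change its caller's variables. The sketch below is only an illustration; swap_ints is a name I made up. It exchanges two ints by working on *a and *b instead of on copies.

```c
/* Hypothetical helper: exchanges the two ints that a and b point at. */
void swap_ints(int *a, int *b)
{
    int tmp = *a;   /* read the int that a points at */
    *a = *b;        /* overwrite it with the int that b points at */
    *b = tmp;       /* store the saved value through b */
}
```

Calling swap_ints(&istart, &ifinish); with istart at 1 and ifinish at 10 leaves istart at 10 and ifinish at 1. Note the "&" at the call: the function wants addresses, not values.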
Using pointers with single variables like these is sometimes
useful, but pointers are most effective when used with arrays.
int arr[4] = { 2, 4, 6, 0 };
int *pp = &arr[0];
Now pp points at arr[0] (which contains a 2). *pp can be used
to read or change the value of arr[0]. (yawn! so what?) What
is really useful is that incrementing and decrementing pp will
move it around in the array! You can say
pp = pp + 1;
Notice that there are no stars, so you aren't changing
anything in arr. You are changing pp to point at the next
element of arr, or arr[1]. pp++; will do the same thing.
Having a pointer to an array lets you manipulate the array
without using subscripts, and often without knowing which
array you are manipulating or exactly where you are in the
array. This is one of the big concepts of C.
Look at this code:
pp = &arr[0];
while (*pp != 0) {
*pp = *pp + 5;
pp++;
}
This is a typical array manipulation sequence: pp starts at
arr[0] and is used to add 5 to every element. pp is
incremented to get to the next array element. When pp reaches
the element that has a value of 0, the loop stops.
Several things can be shortened in the above code:
- The name of an array, used like this, gives the address of
its first element (element 0). So it's always possible to say
pp = arr; instead of pp = &arr[0];
- Adding a value to a variable can be abbreviated using
the += operator;
- Initialization, testing and incrementing are ideal
candidates for a for loop:
for (pp=arr; *pp!=0; pp++) *pp += 5;
That's a lot of power for a one line statement!
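Here is the same loop wrapped up as a function, as a rough sketch. The name add_five and the returned count are my own additions for illustration. It assumes the array really does contain a 0 somewhere. If it doesn't, the loop runs off the end of the array.

```c
/* Hypothetical helper: adds 5 to every element before the 0 marker.
   Returns how many elements were changed. */
int add_five(int *pp)
{
    int changed = 0;

    for (; *pp != 0; pp++) {    /* stop at the element containing 0 */
        *pp += 5;               /* change the element pp points at */
        changed++;
    }
    return changed;
}
```

Called on a four-element array holding 2, 4, 6, 0, it changes the first three elements to 7, 9 and 11, leaves the 0 alone, and returns 3.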
Now you see why character strings in C end in '\0':
The C library has many loops similar to the one above that
depend on a 0 value to tell them they have reached the end of
the string.
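A loop of that kind might look roughly like the sketch below. It is not the library's actual code, and count_chars is a made-up name. It does the same job as the standard strlen() by walking a pointer along the string until it finds the '\0'.

```c
#include <stddef.h>

/* Hypothetical helper: counts the characters before the terminating '\0'. */
size_t count_chars(const char *s)
{
    const char *p;      /* const: we only read through this pointer */
    size_t n = 0;

    for (p = s; *p != '\0'; p++)    /* stop at the end-of-string marker */
        n++;
    return n;
}
```

So count_chars("hello") gives 5, and an empty string gives 0 because the very first character is already the '\0'.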
Now for some more magical abbreviations in C:
You can use a pointer like an array, except in some rare
cases where the size of the array is significant. So, you can
say *pp or pp[0] interchangeably. Both access the integer that
pp is currently pointing at. Note that the subscript has
nothing to do with the subscripts of the array being pointed
at! If you do pp = &arr[3]; (assuming arr has at least six
elements) then pp[2] does not access arr[2] but rather arr[5]!
The subscript after a pointer is
simply added to the place the pointer is currently pointing
to, and the new location calculated from that.
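A tiny sketch makes the point about subscripts on pointers. The function element_at is invented for this example. It shows that pp[offset] is counted from wherever pp points, not from the start of the array.

```c
/* Hypothetical helper: returns the element 'offset' places past
   the one pp currently points at. */
int element_at(const int *pp, int offset)
{
    return pp[offset];      /* means the same as *(pp + offset) */
}
```

With a six-element array big[] and pp = &big[3];, element_at(pp, 2) returns big[5], and element_at(pp, 0) returns the same value as *pp.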
You can also use an array like a pointer, except that you
can't move it around. In particular, you can pass an array to
a function that expects a pointer, and (usually) a pointer to
a function that expects an array. It's just important that the
type that the pointer points to is the same as the type of the
array. The compiler will usually catch mistakes if you get
this wrong.
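Here is a sketch of passing an array to a function that expects a pointer. The name sum_ints is made up. The function takes the element count as a second argument because this is one of those cases where the size of the array matters: inside the function there is only a pointer, and the array's size is no longer known.

```c
/* Hypothetical helper: adds up 'count' ints starting at 'values'. */
int sum_ints(const int *values, int count)
{
    int total = 0;
    int i;

    for (i = 0; i < count; i++)
        total += values[i];     /* pointer used like an array */
    return total;
}
```

You can call it with an array name, as in sum_ints(arr, 4), or with a pointer into the middle of an array, as in sum_ints(&arr[1], 2), which adds up only arr[1] and arr[2].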
This is practically all there is to know about pointers. I
will hint at some advanced topics concerning pointers but
leave them to your imagination:
- pointers can also point at structures. The FILE *file
usage is a typical example of a pointer to a structure.
You can look at it in stdio.h.
- pointers can also point at other pointers. This is
sometimes useful and necessary but it hurts the brain.
You can usually tell these multiple-level pointers by
the fact that they are used with multiple stars, like
this: **pp3.
- pointers can also point at multi-dimensional arrays.
To be used like this, they need to be either pointers at
pointers or pointers at arrays of one less dimension.
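For the curious, here is a small taste of the first two topics, as a sketch only. The structure point and both function names are invented for this example. The "->" operator is C's shorthand for getting at a member through a pointer to a structure.

```c
struct point { int x; int y; };     /* hypothetical structure */

/* Moves the point that p points at. p->x is shorthand for (*p).x. */
void move_right(struct point *p, int dx)
{
    p->x += dx;
}

/* Makes the caller's pointer point at the larger of two ints.
   pp3 is a pointer to a pointer to int, so *pp3 is the caller's
   pointer and **pp3 would be the int that pointer points at. */
void point_at_larger(int **pp3, int *a, int *b)
{
    *pp3 = (*a > *b) ? a : b;
}
```

A call like point_at_larger(&p, &a, &b); changes p itself, which is why the function needs the address of the pointer, not just the pointer.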
An important example of multiple-level pointering is the
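main() argument argv[]. argv[] is an array of strings. But we
know that a string is really an array of characters, and that
an array can be handled through a pointer to its first element.
So each string is a char *, and argv is an array of those. This
is why main() is sometimes declared as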
int main(int argc, char *argv[]);
and sometimes as
int main(int argc, char **argv);
both of which basically say the same thing.
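To show both levels at work, here is a sketch of a function that takes the same two arguments as main(). The name count_options is made up. It counts the arguments that begin with a '-' character.

```c
/* Hypothetical helper: counts the arguments that start with '-'.
   Skips argv[0], which by convention holds the program name. */
int count_options(int argc, char **argv)
{
    int i;
    int n = 0;

    for (i = 1; i < argc; i++) {
        /* argv[i] is a char * (one string),
           argv[i][0] is a char (its first character) */
        if (argv[i][0] == '-')
            n++;
    }
    return n;
}
```

From main() you would simply call count_options(argc, argv). The first subscript picks a string, the second picks a character inside it.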
Declarations of pointer variables take some getting used
to. You usually find the variable name in the middle of a mess
of stars and brackets. You can either get used to it from
examples or look up the rules in a book. I'm not going to
write down any rules about this without a lawyer.