Lesson #2

References and const

Overview

In this lesson I will cover the following topics:

  1. How to write and compile the basic "Hello, World!" program in C++.

  2. The std namespace.

  3. References and reference function parameters.

  4. The importance of references to constants.

Body

"Hello, World!" in C++

To start things off, I will show you a program that prints "Hello, World!" on the terminal using C++. Although this program is very simple, it is always a good idea, when working with a new language or environment, to write this simple program first. There is no sense trying to write anything more complicated until you can get something very simple working.

Since C++ is a superset of C (approximately), the basic "Hello, World!" program in C will work. However, C++ comes with an entirely different library of I/O functions that you should get used to using. This library is called IOStreams. Here is the "Hello, World!" program in C++ using the IOStreams library:

  #include <iostream>

  int main( )
  {
      std::cout << "Hello, World!\n";
      return 0;
  }

Let me point out a few things about this program.

  1. To declare the facilities in the IOStreams library you need to #include the header <iostream>. Notice that there is no .h at the end of that header's name. This is normal. The C++ standard library is declared in header files that do not have any extension. Note, however, that it is common for programmers to use .h or .hpp extensions on the header files they create.

  2. Unlike in C, you do not need to declare main as taking a void parameter. The empty parentheses indicate that there are no parameters. C++ compilers will accept void in the parameter list, but it is not necessary and is commonly left out.

  3. The actual I/O operations are all done on the object named cout. The cout object represents the standard output device, which is normally the screen of your console. You can write characters to that device using the << operator. The characters get written out one after another in a sequential stream, and the << operator really just inserts characters into that stream. For this reason it is often called an inserter.

Now I want to say a few words about namespaces.

C++ is designed for very large programs. One of the problems very large programs have is coordinating names. In C, every function in a program must have a unique name. In a large program there might be thousands of functions written by people who don't know each other (for example in competing libraries). If any two of those functions have the same name there are problems.

To avoid naming problems, C++ allows you to put functions and anything else with a name into a "namespace". Each namespace might contain hundreds of names, but since a single library supplier controls the entire namespace associated with that library, that supplier can ensure that no two names in it collide.

An application might use several different namespaces. If two functions in different namespaces happen to have the same name, it is not an error because the functions can be distinguished by which namespace they are from. This is exactly the same reason why operating systems allow users to create directories. Two files with the same name can exist on the disk provided they are in different directories.

It happens that the entire standard C++ library is in a namespace called std. Thus whenever I wish to use a name from the library I must prefix that name with a suitable namespace qualifier. That's why I have std:: in front of cout in my sample program above. That tells the compiler (and the programmer) that I'm interested in the cout name in the std namespace.
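
To make the idea concrete, here is a minimal sketch. The namespaces audio and network and the function describe are made-up names standing in for two independent libraries. They are not part of any real library. Both namespaces define a function with the same name, and the qualifier picks the one that is wanted.

```cpp
#include <string>

// Two hypothetical libraries. Each supplier controls its own namespace.
namespace audio {
    std::string describe( ) { return "audio library"; }
}

namespace network {
    // Same function name as in audio, but there is no conflict.
    std::string describe( ) { return "network library"; }
}

// The namespace qualifier selects which describe is meant.
std::string both( )
{
    return audio::describe( ) + " and " + network::describe( );
}
```

Without the qualifiers the two functions would collide. With them, each call is unambiguous, in the same way that a full path distinguishes two files with the same name.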

It might seem like an annoyance to prefix names from the standard library with std::. In fact, at times it is. However, for reasons that will become clear later, you don't need to use namespace qualifiers as much as you might think. If you find yourself accessing names in a particular namespace frequently, you can include a using directive at the top of your source file. For example:

  #include <iostream>

  using namespace std;

  int main( )
  {
      cout << "Hello, World!\n";
      return 0;
  }

In this version I tell the compiler that it should treat all names in namespace std as if they had been declared in the global namespace. Such names can then be accessed without the std:: qualifier. Note that the using directive should come after the #include <iostream>. If you put it before the #include the compiler will not know about the name std when it sees the using directive and it will produce an error message.
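
A narrower alternative, not discussed above, is a using declaration. It brings one name into scope instead of the whole namespace. The sketch below assumes that only cout and string are wanted. The functions greeting and say_hello are hypothetical names chosen for illustration.

```cpp
#include <iostream>
#include <string>

using std::cout;     // Makes only std::cout usable without a qualifier.
using std::string;   // Likewise for std::string.

// Returns the text to print.
string greeting( )
{
    return "Hello, World!\n";
}

// Writes the greeting to standard output.
void say_hello( )
{
    cout << greeting( );
}
```

Other names from the library, such as std::cin, would still need the std:: qualifier in this file.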

In some older texts you might see the "Hello, World" program like this:

  #include <iostream.h>

  int main( )
  {
      cout << "Hello, World!\n";
      return 0;
  }

This version #includes <iostream.h> instead of <iostream> and does not bother with the namespace stuff at all. This may work, particularly if you are using an old compiler. Namespaces were a late addition to the language and were standardized with the first C++ standard in 1998, so compilers that predate the standard do not support them. To maintain compatibility with old programs, some C++ compilers distinguish between headers like <iostream> and <iostream.h>. The header with the .h declares things in the global namespace as it was done in the days before namespaces. The header without the .h declares things in namespace std. In this tutorial I will follow the official C++ standard and use namespace std as appropriate. You should do the same in your own programs unless you have some specific reason to do otherwise.

Compiling C++ programs

To compile the basic "Hello, World!" program, you must first put the program into a file. While C source files conventionally end with a .c extension, the convention for C++ programs is less well defined. Different systems tend to use different extensions, although the most common extensions are .cpp, .cc, .cxx, and .C (note the uppercase letter). In the Windows world, the .cpp extension is pretty universal. That is also fine to use on many Unix systems so that is what I will use in this tutorial. Use your favorite text editor to create the file hello.cpp and type in the basic "Hello, World!" program I showed above.

To compile it on a Unix system using gcc do something like:

  $ g++ -o hello hello.cpp

The C++ compiler in the GNU Compiler Collection is called g++. The -o hello tells the compiler to put the executable output into the file named hello. After the compiler finishes, do:

  $ ./hello

to run the program. The ./ prefix tells the shell to look for hello in the current directory, which on many Unix systems is not on the search path. The program should print "Hello, World!" on the terminal. Congratulations, you have written a C++ program!

References

There are two C++ features that I want to introduce right away. We will use them quite a bit. The first feature is references.

A reference is really just another name for something else. It is like an alias. Here is a simple example.

  //
  // The following function illustrates a simple way to use a reference.
  //
  void example( )
  {
      int  number;        // Your typical integer.
      int &ref = number;  // This declares ref to be of type "reference to int"
                          //   and "binds" it to the variable number.

      number = 1;    // Puts 1 into number.
      ref    = 2;    // Puts 2 into number!
  }

The expression ref = 2 puts 2 into the variable number because ref "refers" to number. The reference ref is like an alternate name for number. Anything that you do to ref, you are really doing to number. In my example above there is really only one variable: number. The reference does not actually exist as an entity in memory. It is just a label for something that does exist.

When you declare a reference like I did above you must "bind" it to a real variable. You must tell the compiler to which variable the reference refers. It is an error to declare a reference and not attach it to a variable. Null references do not exist.
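
The following sketch checks these claims directly. The function name reference_is_alias and the variable other are hypothetical additions for illustration. Note in particular that assigning to a reference after it is bound does not make it refer to something else. It writes to the object the reference was bound to.

```cpp
// Returns true if ref really is just another name for number.
bool reference_is_alias( )
{
    int  number = 1;
    int  other  = 10;
    int &ref    = number;   // ref is bound to number here, once and for all.

    // int &unbound;        // Error if uncommented: a reference must be bound.

    ref   = other;          // Does NOT rebind ref. It copies 10 into number.
    other = 20;             // Has no effect on number or ref.

    // Both names have the same address and the same value.
    return &ref == &number && number == 10 && ref == 10;
}
```

Taking the address of ref gives the address of number, which is another way of saying that there is only one object here.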

My example above is not very interesting. References are hardly ever used this way. In real life, references are mostly used as function parameters. Consider this example:

  //
  // This function has reference parameters. This is very common.
  //
  void swap( int &A, int &B )
  {
      int temp = A;   // Save a copy of A aside for later.

      A = B;          // Put B into A.
      B = temp;       // Put the old A into B.
  }

This function might be used like this:

  int main( )
  {
      int X, Y;

      // ... Put values into X and Y.

      swap( X, Y );
      return 0;
  }

When swap is called, the parameters A and B are bound to the real variables X and Y. Inside the function A refers to the caller's X and B refers to the caller's Y. The swap function exchanges the values stored in X and Y (via the references) as desired and then returns.
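
Here is the same function again with concrete values so the effect can be traced. The values 3 and 7 and the helper swap_exchanges_values are arbitrary choices made for this sketch.

```cpp
// Exchanges the caller's two integers through reference parameters.
void swap( int &A, int &B )
{
    int temp = A;   // Save a copy of A aside for later.

    A = B;          // Put B into A.
    B = temp;       // Put the old A into B.
}

// Hypothetical helper that reports whether the exchange took place.
bool swap_exchanges_values( )
{
    int X = 3, Y = 7;

    swap( X, Y );   // A is bound to X and B is bound to Y.

    return X == 7 && Y == 3;
}
```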

To understand the significance of this, look at this version of swap:

  //
  // This function takes integer parameters
  //
  void swap( int A, int B )
  {
      // Same as before
      int temp = A;

      A = B;
      B = temp;
  }

This does not work as expected. When the caller does swap(X, Y), the values of arguments X and Y are copied into the parameters A and B. The function exchanges the copies, but does absolutely nothing with the original variables. C++, like C, passes arguments to functions "by value." This means the arguments are copied into the function parameters when the function is called.

However, reference parameters work differently. When a function with a reference parameter is called, the reference is bound to the argument and no copy of anything is made. Inside the function the reference refers back to the variable that was used by the caller. This facility is often called "pass by reference". In some languages it is the only way parameters are passed.
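
A function can mix both kinds of parameter. In the sketch below (divide is a hypothetical function, not something from the standard library) the two inputs are passed by value, since the function only needs their values, while the two results are passed by reference so the function can store its answers in the caller's objects.

```cpp
// Computes the quotient and remainder of dividend / divisor.
// Assumes divisor is not zero.
void divide( int dividend, int divisor, int &quotient, int &remainder )
{
    quotient  = dividend / divisor;   // Written into the caller's object.
    remainder = dividend % divisor;   // Likewise.

    dividend = 0;   // Changes only the local copy. The caller never sees this.
}
```

A caller might declare int q, r; and then call divide( 17, 5, q, r ). Afterward q and r hold the results, and whatever was passed as the dividend is unchanged.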

You can get a similar effect in plain C using pointers like this:

  //
  // This function takes pointer parameters
  //
  void swap( int *A, int *B )
  {
      // Same idea as before.
      int temp = *A;

      *A = *B;
      *B = temp;
  }

In this example A and B are pointers to integers that exist elsewhere. Inside the function I have to use the indirection operator (the *) to get at the integers pointed at by those pointers. Notice that temp must still be an integer (not a pointer) since it has to hold a temporary integer value. I am not trying to swap pointers after all! (If that does not make sense to you, reread the above paragraph and think about it awhile until it does.)

With my pointer-oriented swap function I have to call it like this:

  int main( )
  {
      int X, Y;

      // Put values into X and Y.

      swap( &X, &Y );
      return 0;
  }

This works, but it is annoying to have to put those ampersands ("address of") in there. With references the call is more natural. You will see later that reference parameters are very important in C++. There are certain areas where you must use them to get the effects you want.
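
There is one more practical difference worth sketching. A pointer can be null, whereas a reference is always bound to something. A careful pointer version therefore has to check its arguments. The function swap_if_valid below is a hypothetical variation written for this illustration, and it uses nullptr, which assumes a C++ 2011 or later compiler.

```cpp
// Pointer version that refuses to dereference a null pointer.
// Returns true if the exchange was performed.
bool swap_if_valid( int *A, int *B )
{
    if( A == nullptr || B == nullptr ) {
        return false;   // Nothing to swap with.
    }

    int temp = *A;

    *A = *B;
    *B = temp;
    return true;
}
```

The reference version needs no such check because the caller cannot pass "nothing" where a reference is expected.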

It is also possible in C++ to return references from functions. This is handy at times too. We will talk more about how that works in a later lesson.

const

The second C++ feature I'd like to introduce now is C++'s handling of const, particularly how it interacts with references.

Both C and C++ allow you to declare constants by putting the word const in front of the declaration.

  const int MAX = 100;

When you declare a constant you must initialize it. This is because you can't modify the value of a constant after you declare it so if you didn't initialize it there would be no way to give it a value.

  const int MAX;  // Error in C++. A constant must be initialized.
  MAX = 100;      // Error. Can't modify a constant.

In my example above I used all uppercase letters for the name of my constant. That is common, but not universal.
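
As a small sketch of a constant in use, the function below limits a value so it never exceeds MAX. The name clamp_to_max is hypothetical. The commented-out line shows the kind of statement the compiler rejects.

```cpp
const int MAX = 100;   // Must be given its value right here.

// Returns value, or MAX if value is larger than MAX.
int clamp_to_max( int value )
{
    // MAX = 200;      // Error if uncommented: MAX is a constant.

    if( value > MAX ) {
        return MAX;
    }
    return value;
}
```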

In many courses people speak of "variables". However, that word is imprecise. Not all so-called variables can vary. In my example above would you call MAX a constant variable? That really doesn't make any sense.

A more accurate term is "object". In my example above, MAX is a constant integer object. Using the word object may sound awkward and overly formal, but it really is better. From now on in this course I will say "object" where many texts would say "variable". The important thing for you to remember is that objects are things that occupy memory and that can hold values. Every object in a C++ program has a type and you have to specify that type when you create the object.

References are not objects, by the way, because they don't generally occupy any memory and they don't have values. The object a reference refers to has a value, but the reference itself does not. This distinction between references and objects is a subtle one but it is a distinction that will prove important in your later study.
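
One way to see the distinction is that the language gives you no way to ask about the reference itself. In the sketch below (the function name is hypothetical) both sizeof and the address-of operator, when applied to the reference, report on the object it refers to.

```cpp
// Returns true if questions asked of ref are answered by value.
bool reference_has_no_identity( )
{
    double  value = 3.5;
    double &ref   = value;

    // sizeof applied to a reference gives the size of the object referred to,
    // and the address of a reference is the address of that object.
    return sizeof( ref ) == sizeof( double ) && &ref == &value;
}
```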

Okay... declaring constant objects is neat, but the real beauty of const is as it applies to pointer and reference parameters. I'll leave pointers alone for the moment (but they are handled in an analogous manner) and just talk about references. Consider this example:

  //
  // This function annoys the user.
  //
  void silly( int &count )
  {
      int i;

      for( i = 0; i < count; i++ ) {
          std::cout << "I am silly!\n";
      }
  }

Suppose the main function looked like

  int main( )
  {
      silly( 100 );
      return 0;
  }

It seems like this would print the silly message 100 times. In fact, it produces a compile error. Why? The function silly takes its parameter by reference. The compiler assumes that the function might therefore want to modify the object referred to by that reference. Yet in my main function above, I am giving silly a literal number. That doesn't make sense. How could silly possibly modify the literal constant "100"? This issue doesn't come up when you pass parameters by value because in that case the 100 is copied into the parameter. The function couldn't change the original 100 even if it wanted to.

Now it happens that silly does not really try to modify its parameter. It only reads the value of count. It never writes to it. I can express this fact by defining silly so that it takes a reference to a const. It looks like this

  //
  // This function annoys the user.
  //
  void silly( const int &count )
  {
      int i;

      for( i = 0; i < count; i++ ) {
          std::cout << "I am silly!\n";
      }
  }

The only difference is in the const that appears in the parameter list. Read const int &count as "count is a reference to a constant int". This declaration tells the compiler that the function never writes to the object referred to by that reference. The compiler will enforce this; inside the function count is treated as a constant int. It also means that when you do silly(100) the compiler allows it since it knows for sure that silly will not try to modify the 100.
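
The same idea in a form that is easy to check: the hypothetical function doubled below takes a reference to a constant int and returns a result instead of printing. It accepts a literal, a named object, or an expression, and the commented-out line shows what the compiler refuses to allow inside it.

```cpp
// Returns twice the value referred to, without modifying it.
int doubled( const int &value )
{
    // value = 0;      // Error if uncommented: value refers to a constant int.

    return value * 2;
}
```

Calls such as doubled( 21 ), doubled( n ), and doubled( n + 1 ) all compile, where n is an ordinary int. With a plain int & parameter only the middle one would.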

This is all very fine, but you might wonder what happens when you try to give silly a "normal" (non-const) object like this

  int main( )
  {
      int annoyance = 100;

      silly( annoyance );
      return 0;
  }

Here annoyance is an ordinary integer object that could be modified. Yet when silly is called, it is not an error. Just because silly takes a reference to a constant does not mean it must be used only with constants. It just means that silly promises to not write to the object it is given. If the object is writable that is fine. The function silly doesn't care.
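
The promise belongs to the reference, not to the object. The sketch below uses a stand-alone reference to a constant (the names view and read_through_const_reference are hypothetical) to show that the object can still be changed under its own name, and that the reference sees the change.

```cpp
// Returns the value seen through a read-only reference after the
// underlying object has been changed.
int read_through_const_reference( )
{
    int        annoyance = 100;
    const int &view      = annoyance;   // A read-only name for annoyance.

    annoyance = 50;     // Fine. The object itself is not const.
    // view = 10;       // Error if uncommented: cannot write through view.

    return view;        // Reads the current value of annoyance.
}
```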

In general if you write a function with a reference parameter and if that function does not modify the object referred to by that parameter, you should declare that parameter as a reference to a const. You will see many, many examples of this in the future. In a later lesson I will explain why it is essential that you follow this rule.
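
As a sketch of the rule applied to something larger than an int, the hypothetical function below takes a std::string by reference to const. The std::string type comes from the standard library and has not been introduced in this lesson, so treat it as an assumption made for the sake of the example. Because the parameter is a reference no copy of the text is made, and because it refers to a const the function cannot alter the caller's string.

```cpp
#include <string>

// Counts the spaces in text without copying or modifying it.
int count_spaces( const std::string &text )
{
    int count = 0;

    for( std::string::size_type i = 0; i < text.size( ); i++ ) {
        if( text[i] == ' ' ) {
            count++;
        }
    }
    return count;
}
```

A call such as count_spaces( "I am silly!" ) is accepted for the same reason silly( 100 ) is: the parameter is a reference to a const.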

Summary

  1. C++ is approximately a superset of C and thus most C programs will compile without changes as C++ programs.

  2. The entire C++ standard library is in namespace std. When you include a header without the .h extension you need to qualify library names with std:: or make them visible with a using directive.

  3. A reference is an alias for something else. When you declare a stand-alone reference you must bind it to an object. Reference parameters are bound to a function's arguments when that function is called.

  4. If a function takes a reference parameter and the function does not change the object that parameter refers to, then the parameter should be declared as a reference to const. This allows the function to be used in situations where the compiler would otherwise prohibit it, such as when the argument is a literal.

© Copyright 2016 by Peter C. Chapin.
Last Revised: January 12, 2016