Lesson #5

Abstract Data Types

Overview

In this lesson I will cover the following topics:

  1. Constructors

  2. Inline methods

  3. Introduction to overloaded functions and operators.

Body

Attachment

During this lesson I will make several references to a simple date class. That class is contained in the files Date.hpp and Date.cpp and should be available from the web page where you got this lesson.

My date class is covered by the GNU General Public License (GPL). In compliance with the requirements of that license, I have also placed a link to a copy of the GPL on the web page. For more information on the GPL see http://www.gnu.org/.

This date class is serviceable, but it lacks many useful features that a serious date class might have. It is also very inefficient. However, it serves my purposes in this lesson. I invite you to enhance the class as an exercise.

Constructors

My money example in the last lesson had a set method that you could use to give a money object a value. That was fine. Yet with the built-in types you can initialize an object when you declare it like this

int sum = 0;

You can do something similar with class objects by defining a special method called a constructor. The constructor method for a class is called automatically by the compiler when an object of that class is created. You can use the constructor method to initialize your object to some known, valid state. Instead of having your objects start off with indeterminate values, they can start off with perfectly well-defined values.

Take a look at the Date.hpp header file that I attached to this lesson. The Date class implements the concept of calendar dates in a way that is easy and natural to use in your program. The very first method declared in the public section (notice how I put the public section first) has the same name as the class. That means it is a constructor.

class Date {
public:
    Date( );

    // etc...
};

Notice how constructors do not mention a return type in their declaration. This is required. If you specify a return type with a constructor—even void—it is an error.

Normally you would expect to find the implementation of the constructor in the Date.cpp file. It might look like this:

Date::Date( )
{
    D = 1;
    M = 1;
    Y = 1970;
}

Again notice how the return type is not specified. This function sets the date object to January 1, 1970 by default. If you look at the private section of the date class you will see that it is using the "obvious" representation of a date as three integer values: one for the day of the month, one for the month of the year, and one for the year itself.

Since this function is so short, the overhead involved in calling it is a significant percentage of the time to execute it. To improve performance, C++ allows you to specify that a function should be expanded inline by the compiler. The bodies of such functions are put directly into the code where the function is called. No call or return instructions are generated by the compiler (and no stack manipulation is done). This makes executing the function faster.

There are a couple of ways to specify that a function should be inline. One is to put its body directly in the class definition. That is the method used for the date constructor.

class Date {
public:
    Date( ) { D = 1; M = 1; Y = 1970; }
    // etc...
};

Notice that I formatted this function a little differently than normal. That is acceptable in a case like this, I feel, because the function is so short. A longer format such as:

class Date {
public:
    Date( )
    {
      D = 1;
      M = 1;
      Y = 1970;
    }
    // etc...
};

is perfectly acceptable to the compiler, but just seems overly distracting in this situation. This is a case where bending the rules is desirable.

The purpose of this constructor is to initialize a date to something meaningful. In the program below, the date named start has an initial value of January 1, 1970, even though the programmer did not do anything to explicitly set this up. The constructor function is called automatically by the compiler when the object is declared.

int main( )
{
    vtsu::Date start;

    // etc...
}

C++ allows you to define many functions with the same name provided they have parameter lists that differ sufficiently for the compiler to tell them apart. This feature is called function overloading. It applies to normal functions and methods, and it also applies to constructors. The constructor taking no parameters is normally called the default constructor because it is used to initialize an object in a default way. However, my date class also contains a constructor taking three integer parameters that can be used to initialize a Date to something other than the default value.

class Date {
public:
    // etc...

    Date( int day, int month, int year )
        { set( day, month, year ); }

    // etc...
};

This constructor just makes an inline call to the class's set method to do the dirty work of setting the date accordingly. It might be used like this:

int main( )
{
    vtsu::Date my_birthday( 7, 9, 1960 );

    // etc...
}

Here the date object my_birthday is being initialized to September 7, 1960. The use of parentheses to initialize an object is new with C++, but it is necessary here because the constructor takes three parameters and there is no syntax in C for doing initialization in that situation. For consistency, the parentheses are also allowed in C++ when initializing built in objects. For example:

int sum( 0 );

declares sum to be an integer and initializes it to zero. This style of initialization is called function-like initialization since it looks similar to calling a function.

There is a potential ambiguity related to function-like initialization syntax. If you squint at the declaration of sum above, it appears almost to be a declaration of a function named sum that returns an integer. You can tell it isn't a declaration of a function because what is inside the parentheses is an expression used as a constructor argument and not a declaration of a parameter, as would be the case for a function declaration. Nevertheless, there are situations where there is an ambiguity. In those cases, the compiler will report an error, and you will need to use explicit type conversions or other trickery to disambiguate. Fortunately, those situations are rare.

If you have a constructor that takes only one parameter you can use the equal sign to initialize objects with that constructor. The Date class does not have such a constructor, but for purposes of illustration suppose the Money class from the last lesson did:

class Money {
public:
    Money( long D ) { dollars = D; cents = 0; }
    // etc...
};

Then you could say:

vtsu::Money account_balance = 100;
  // Initialize the account_balance to 100 dollars (and zero cents).

Some C++ programmers use the function-like initialization syntax all the time since it is the same for all objects (and for subtle technical reasons it might be faster in some cases). In cases where there is a choice, I tend to use whichever syntax is more natural looking.

Constructors are very powerful. Although they are intended to initialize the object they are constructing, they are completely general functions and can do anything. Some classes have constructors that read configuration files, interact with the user, allocate graphical resources, and so forth. The constructor does whatever is necessary to get the object "up and running." For complicated abstract objects that might be quite a lot.

The compiler will use the constructors you define whenever it is appropriate to do so. Here are some examples:

int main( )
{
    vtsu::Date process_dates[16];
      // An array of date objects. All are initialized with the default constructor.

    vtsu::Date birthdates[] = {
        vtsu::Date( 23,  8, 1948 ), vtsu::Date( 31, 10, 1950 ), vtsu::Date( 7,  9, 1960 )
    };

    // etc...
}

In the second example I'm initializing an array of dates using the constructor that takes three parameters. The expressions Date(...) tell the compiler to use the constructor to build a date object using the given parameters. The birthdates array has three date objects in it.

Destructors

No discussion of constructors would be complete without saying a few words about destructors. Although my Date example does not use any destructors, most classes do.

A destructor is a special method that the compiler calls when an object is about to cease to exist. For example, it is called for local objects when the local objects go out of scope. It is used to clean up the object and return any resources the object was using back to the system. Here is how it might look:

class Example {
public:
    Example() { std::cout << "I'm in Example::Example( )"  << std::endl; }
   ~Example() { std::cout << "I'm in Example::~Example( )" << std::endl; }
};

int main( )
{
    Example object;

    std::cout << "Hello, World!" << std::endl;
    return 0;
}

The destructor has the same name as the class, just like the constructor, and it has no return type specified. However, its name also has a '~' in it as shown in the example above. My class Example shows both the constructor and destructor as inline functions.

When object is initialized its constructor will be called. Just before main returns, object is destroyed and its destructor will be called. Thus, the output of the program is

I'm in Example::Example( )
Hello, World!
I'm in Example::~Example( )

Try it and see!

Destructors make it easy to ensure that things are cleaned up properly. Do you remember how I said that you don't have to worry about closing files when you use IOStreams? That's because the destructors of classes ifstream and ofstream close the files for you. When a file stream object goes out of scope, that detail is automatically handled. The Date class I provided as an example with this lesson has no clean up needs. Hence, there is no destructor. We will look at some classes that do need destructors soon.

Other Date operations

In addition to the constructors and the set operation, class Date provides a few other operations. It provides accessor methods that can be used to get at the components of a Date. They are declared in the class as:

int day( )   const { return D; }
int month( ) const { return M; }
int year( )  const { return Y; }

These are also inline functions since the overhead in calling them would otherwise be huge compared to the time to execute them. They are trivial in this implementation of Date, but they are provided anyway so that the implementation is properly hidden. As I will discuss shortly there are other ways of representing dates besides this obvious one and these functions would become much more complicated with some of those alternative representations.

The word const that appears after the function header means that these are const methods. That means they can be applied to a constant Date object. C++ allows class objects to be constants just like built-in objects. Here is how that might look:

int main( )
{
    const vtsu::Date my_birthday( 7, 9, 1960 );
      // While the constructor is executing the object is not constant.
      // When the constructor finishes, the object will be treated as
      // a constant.

    my_birthday.set( 29, 4, 1954 );
      // Error! set is not a const method and thus can't be used on a
      // constant object.

    if( my_birthday.year( ) < 1970 ) { ...
      // Fine. Year can be applied to a constant object because it is a
      // const method.

    // etc...
}

The idea is that the functions that return the state of the object do not normally change that state. Because they just read the object and don't write to it, I declared them to be const. The compiler will allow those functions to be used on constant dates and that is reasonable. The compiler enforces this concept to a certain degree. Inside a const method, all the data members of the class are taken to be constants. Any expression inside the const method that attempts to modify a data member is an error.

Note: There are issues here that are beyond the scope of this lesson and C++'s handling of those issues is not 100% satisfactory. In any case, the notion of a const method is useful and commonly applied.

The only other public operation supported on dates by this class is advance. It is used to change the value stored in a Date object. Here is an example:

int main( )
{
    vtsu::Date my_birthday( 7, 9, 1960 );

    my_birthday.advance( 1000 );
    // etc...
}

Here I'm computing the date 1000 days after my birthday and putting the result back into my_birthday. In other words, I'm advancing the date stored in my_birthday by 1000 days. Notice that this operation modifies the internal state of my_birthday. It does not return a new date object. You can define your methods to work either way. There are pros and cons to both approaches. For now, I will leave a detailed discussion of those issues for a later lesson.

Operations that are not class methods

In addition to the methods, there are several operations on dates that I implemented as ordinary functions. The first two are overloaded versions of the comparison operators == and <. C++ allows you to define the meaning of most operators as applied to class types. This is done by creating functions with odd looking names like operator== and operator<. These functions are ordinary in every way except for their names. However, the compiler will use these functions whenever you use the corresponding operators. For example:

void do_stuff( const vtsu::Date &incoming )
{
    vtc::date baseline(1, 1, 2000);

    if (incoming < baseline) {
        // The incoming date is before the baseline date.
    }
    else {
        // The incoming date is the same as or later than the baseline date.
    }
}

When I say incoming < baseline the compiler treats that as a function call like operator<(incoming, baseline). Providing overloaded operators is a nice way to make using your class very natural. I will talk about operator overloading in some detail in the next lesson.

Notice that it is only necessary to define two of the six relational operators. The other four are easily defined in terms of the first two. They are defined as inline functions using the reserved word inline. This is necessary because they are not class methods.

In addition to overloaded relational operators, I provided overloaded versions of the inserter and extractor operators to making doing I/O on dates natural as well. Here is a complete program that calculates how old you are in days.

#include <iostream>
#include "Date.hpp"

int main( )
{
    vtsu::Date birthday;
    vtsu::Date today;

    std::cout << "What is your birthday? ";
    std::cin  >> birthday;
      // This calls my overloaded extractor operator to get a date.

    std::cout << "What is today's date? ";
    std::cin  >> today;

    std::cout << "You are " << today - birthday << " days old.";
      // This calls my overloaded operator- to compute the number of days
      // between two Date objects. The result is a long integer which
      // is then printed.

    return 0;
}

Private member functions

Just as it is possible (but not recommended) to have public data members, it is also possible to have private methods. The Date class has is_leap, month_length, next, and previous methods that are private. These methods can only be called by other methods. They can't be used by users of Date in general. They are helper methods that just make the job of implementing the public methods a bit easier. In a future version of Date, with a different internal representation, these functions might no longer exist. Notice that is_leap and month_length are const methods. This is because they do not change the value stored in a Date object.

Information hiding revisited

The Date implementation I've been talking about so far uses the obvious way of representing a date. Storing the day, month, and year separately makes it very easy to do I/O operations with humans since humans expect to see and use dates in that form. However, it is very awkward to do computations with dates in that form because of the peculiarities of our calendar.

An alternative implementation for dates might look like this:

class Date {
private:
    long JD_number;

    // etc...
};

Here instead of storing the day, month, and year separately I store a single julian day number. Every day in history has a unique julian day number (starting from an obscure date around 4700 BCE). This approach makes computing with dates trivial. To find the number of days between two dates, simply subtract the julian day numbers. To advance a date by a given number of days, simply add the desired interval to the date's julian day number.

The downside of this implementation is that now I/O operations on dates are complicated. Humans don't normally want to see julian day numbers and so elaborate conversion functions are necessary to support the natural form of a date that humans expect.

The key point to realize is this: you can switch the internal representation of Date from one form to another any time you want. You will need to rewrite all the class methods to take into account the new internal structure of the class, but programs that use the class will only need to be recompiled. In this case, switching from the obvious representation of dates to the julian day number representation makes the I/O methods harder (and slower) and the computational methods easier (and faster). Some methods that were inline will probably become non-inline while some methods that were not inline will probably become inline. This is all normal and reasonable. The clients of the Date class do not need to know. This is an incredibly important capability.

Summary

  1. Classes can define a special method with the same name as the class that is used to initialize objects of that class when they are declared. This method is called the constructor. By defining suitable constructors for your classes, you can make sure that every object created is in a well-defined state before it is used. This is a very powerful facility.

  2. Although class methods are often defined in a separate .cpp file, it is possible to define them right in the class definition (in a header file). Such methods are said to be inline, and the compiler should, in many cases, be able to replace calls to those methods with their bodies directly. In other words, calling an inline method is faster because no actual CALL instruction is generated by the compiler.

    It is also possible to explicitly declare a function to be inline using the inline reserved word.

  3. It is possible to define multiple functions or methods (including constructors) with the same name as long as the parameter lists for those functions or methods are sufficiently different. For example, you can declare two different print functions like this:

    void print( int );
    void print( char * );
    

    The compiler figures out which one to call based on the argument types. Thus:

    print( 10 );       // Calls print( int )
    print( "Hello" );  // Calls print( char * )
    print( 0 );        // Ambiguous. Compile error. The literal zero might be a null pointer.
    

    Note that overloaded functions are completely independent. They have the same name as each other, but they do not need to do the same thing (to prevent confusion, all functions with the same name should probably have similar purposes).

© Copyright 2023 by Peter Chapin.
Last Revised: July 20, 2023