Review Lesson #1

Functions

Overview

In this lesson I will cover the following topics:

  1. What is a function and how functions are written.

  2. How functions are declared.

  3. How functions are called.

  4. Local vs global scope.

Body

The anatomy of a function

In C, the basic unit of code is the function. The program starts by executing function main and ends when main returns. In the course of executing main, many other functions normally get called directly or indirectly. Writing a program in C is about writing functions.

You can think of a function as a box. It takes inputs, processes them, and produces an output. The inputs are the parameters and the output is the return value. Many functions produce other side effects as they process their parameters. In some cases the side effects are the main reason the function is used. Here is a picture that illustrates the ideas:

               +----------+
x   ----->     |          |
               |          |
y   ----->     |  func()  |     -----> result
               |          |
z   ----->     |          |
               +----------+

This function, named func, is shown taking three parameters and returning a result. In addition to computing the result, func may also produce side effects such as writing to a log file or printing some text on the display. When a function is documented, the nature and meaning of the parameters and the result must be documented. Furthermore, the side effects caused by the function and any preconditions the function requires must also be documented. A precondition is something that must be true before the function can be legitimately called. Unless all of this information is available to the programmer and understood, the function can't be used correctly.

Exactly how a function does its work internally is of no interest to the programmer who uses it. Exactly when and where a function will be used is of no interest to the programmer who writes it. Both rely on the documented description of what the function requires and what it does (called the function's interface) to understand how to work with the function. Functions thus allow two separate programmers to work together without even knowing each other. This is very important.

Let me show you a few examples to make things a bit more concrete. The function sqrt in the standard library (declared in math.h) computes the square root of its parameter. It might be used like this:

#include <math.h>
#include <stdio.h>

int main( void )
{
    double result;

    result = sqrt( 2.0 );
    printf( "The square root of 2 is: %f\n", result );
    return 0;
}

Here I'm sending the value 2.0 into the function and putting what the function returns into the variable result. This function has no external side effects, but it does have the precondition that the value it is given be non-negative. You can't use sqrt to compute the square root of a negative number.

My example also uses function printf. The printf function takes several parameters and returns an integer. In particular, it returns the number of characters it actually prints. Normally people don't care about this and, in fact, I'm ignoring the value returned by printf in my example above. Instead printf is usually called because of its very useful side effect: it causes things to be printed on the terminal. The printf function's first parameter is complicated. It is a string containing special formatting characters that describe what is to be printed and how. All of this is described in printf's documentation.

In both cases above, sqrt and printf, the user of the functions does not need to worry about exactly how the functions work. All the user needs to know is what they do and how to get them to do it. The people who wrote the functions need to worry about how to make them work, but don't need to think about the program in which they will be used. This is the essential separation of concerns that I talked about earlier.

When you write a function, you must write both the header and the body. The header contains the function's name, return type, and information about the function's parameters. Here is an example:

int count_items( char *buffer, char ch )
{
    ...
}

The return type is shown first. In the example above it is int. Next comes the function's name and then, in parentheses, information about the parameters. The only information the compiler needs to know is the number of parameters, their types, and what names I want to use for them in the function's body. This particular function takes two parameters, the first of which is called buffer and is of type pointer to character, and the second of which is called ch and is of type character.

Before this function can be used or written, however, the programmers must know what it does. Thus, every function must be accompanied by appropriate documentation that describes it. A more complete header for this function is:

/*
 * Scans the null terminated string pointed at by buffer and counts the
 * number of times the character ch is found in that string. It returns
 * the resulting count.
 *
 */
int count_items( char *buffer, char ch )
{
    ...
}

It is important to understand that without documentation, a function is incomplete. The programmers who use a function need to know more about it than the compiler does. Thus, even though the compiler ignores the documentation, it is still a necessary part of the function.

The parameters to the function, buffer and ch in my example above, are generic placeholders. When the function is used, the actual data given to it will be something the caller will provide. The person who defines the function does not know what that data will be. This is normal.

The function definition contains the header as I've described so far and a body inside of braces. It is in the body that the action of the function is specified. Here is how count_items might look:

int count_items( char *buffer, char ch )
{
    int count = 0;

    while( *buffer ) {
        if( *buffer == ch ) count++;
        buffer++;
    }
    return count;
}

This function works by stepping the given pointer down the string until the null byte (which will test as false) is found. Each character in the string is compared against ch. If it is a match, the counter is incremented. When the entire string has been scanned, the final count is returned. If it happens that the given pointer is initially pointing at the null character, a count of zero will be properly returned.

I was able to write this function with no knowledge whatsoever of where it will be used. I can't emphasize too much how important that is.

Declaring functions

What I wrote for count_items above is called a definition. It resides in a .c file. Before a function can be used, however, the compiler should first see a declaration of that function. A declaration is just the header of the function without the body. The header is terminated with a semicolon (instead of moving right into an open brace). Here is a declaration for count_items:

int count_items( char *buffer, char ch );

This declaration merely informs the compiler that this function exists someplace and allows the compiler to process calls to the function. The declaration does not, by itself, cause the function to become part of your program.

Actually, in a function declaration, the names of the parameters are not significant in any way. They are ignored by the compiler and can be left out. Some programmers would write the declaration of count_items like this:

int count_items( char *, char );

Other programmers prefer to keep the names in the declaration because they help to document the function. It is a matter of style.

Function declarations are typically put in header files so that they can be easily #included into various programs. This is how the standard library is managed and most programmers do the same thing with their own functions as well. When you #include stdio.h you are only showing the compiler declarations of the various I/O functions. The actual function definitions are elsewhere.

Using functions

When a function is invoked, we say that it is called. To call a function, you must provide that function with appropriate arguments (and satisfy any preconditions that the function requires). Here is an example of how count_items might be called:

// Show the compiler the declarations of printf and strcpy.
#include <stdio.h>
#include <string.h>

// Declaration allows compiler to process call properly.
// This would normally be in a header file.
int count_items( char *, char );

int main( void )
{
    char name[128];
    int  e_count;

    strcpy( name, "Peter C. Chapin" );
    e_count = count_items( name, 'e' );

    printf( "There are %d 'e' characters in the string.\n", e_count );
    return 0;
}

When a function is called the values given to it are called the arguments to the function. In the example above, I provide name and 'e' as arguments to count_items. Recall that in C, the name of an array without an index is taken to be the address of the array's first element. Thus name is a pointer to the first character in the name array.

When a function is called, the arguments are copied into the function's parameters. The value of the pointer name is copied into the parameter buffer, and the value of 'e' is copied into the parameter ch. The function then executes. The value the function returns (the final value of count in this function) replaces the function call in whatever expression the function call is in. Here, that value is simply assigned to e_count.

It is very important to understand that functions get copies of their arguments. The function can modify its copy (the parameter) without changing the original argument. This is called pass by value and it is the only way arguments are passed in C. Other languages have other ways of doing this, but not C. (In C++ you can pass a reference and allow the function to modify the original argument.)

Notice how I distinguish between argument and parameter. When a function is called the arguments are copied into the parameters. The function then works with the parameters. This distinction is subtle but important.

Notice also how I am able to use count_items without any particular knowledge of how it works internally. This is also very important.

Scope

In order to create count_items, or any function, without any knowledge of how it will be used, it is important that variables declared inside the function be local to the function. Here is count_items again:

int count_items( char *buffer, char ch )
{
    int count = 0;

    while( *buffer ) {
        if( *buffer == ch ) count++;
        buffer++;
    }
    return count;
}

The variable count is a local variable. It only exists while count_items is running. When count_items returns that variable will cease to exist. Furthermore, any use of count inside of count_items will refer to the local variable. If a variable named count exists outside the function, it will be temporarily hidden.

In this way, the programmer who creates count_items can choose whatever variable names they like without worrying about conflicts with variable names chosen elsewhere in the program. Without this facility, constructing large programs correctly would be nearly impossible.

Summary

  1. A function is a body of code that performs a particular task. Functions take inputs, called parameters, and produce a single result. In addition, many functions have side effects. In some cases, the function exists entirely for its side effects. Function definitions consist of a header that specifies the function's return type, name, parameters, and an attached body that specifies the action of the function. Function definitions are normally placed in .c files.

  2. A function declaration is just the function header followed by a semicolon. The names of the parameters are optional in a function declaration and are often left out. Function declarations are normally placed in header files.

  3. When a function is called its arguments are copied into its parameters. The value the function returns is used to replace the call in whatever expression the call is in.

  4. Variables declared inside the function are local to that function. They hide any similarly named variables outside the function. Such variables also (normally) only last as long as the function lasts.

© Copyright 2023 by Peter Chapin.
Last Revised: August 4, 2023