Lesson #11

Pseudo-Code

Overview

The associated file print_format.pcd contains the pseudo-code for a simple print formatter program. It is a practical demonstration of pseudo-code in action.

In this lesson I will cover the following topics:

The attitude one should have when programming.
How a program is created.
The pseudo-code design technique.
Why flowcharts are bad.

Body

A few comments on writing programs

In this lesson I'm going to take a break from the nuts and bolts of the C language and talk about program design techniques. This is a very important topic. There are many computer languages out there. If you end up doing much programming the chances are you will be exposed to several different programming languages. In fact, such exposure is a good thing. Knowing why different languages are the way they are gives you insight about programming techniques in general. That insight will be useful to you no matter what programming language you use.

In the long run, specific knowledge of C is unimportant. Once you know how to program, you will be able to teach yourself any language you need to know without taking a course in it. I see ads in the paper for programmers in one language or another, but they make me laugh (or cry depending on my mood). With a good book and a bit of time any decent programmer should be able to learn any reasonable language quickly. Why an employer would insist on previous experience with a particular language is beyond me. What employers really need are people who know how to program. That is what this lesson is about.

Programming is art, not science

When you write a program, you are creating something. To do a good job you must put a bit of yourself into your work. It is your creative spark that will set your program apart from the rest. When it is finished, your work will reflect on you.

I am not an artist, but I imagine that writing a program is like making a sculpture. From nothing you use your own hands to build something beautiful. Yet programming is also about engineering since the final program must be functional as well. Programs have to "work" as well as be beautiful. In that respect, programming is perhaps more like building fine furniture.

In any case, I encourage you to invest a bit of your personal pride in your programming. Your work is your own. It is your art. Make your programs something special just as you are special. With this attitude you will be on your way to becoming a master programmer.

However, the danger of investing your ego in your programs is that it can be hard to separate yourself from the work when it becomes necessary to do so. The ability to let go of your work and to pass it on to someone else is also an essential part of programming. As I've said before, programming is a social act. Many people are involved in the construction of large programs. You need to let other people do their part. It can be hard to watch someone else ruin your masterpiece. But keep in mind that everyone has their own insights. You will never know it all. Learn from those you encounter.

Sequential vs holistic thinking

The computer is a sequential machine. It executes one instruction at a time. (Here I'm assuming that you have only a single processor in your computer.) If your computer appears to do two things at once, it is just an illusion created by the machine's fantastic speed. Thus your C programs run one statement at a time, in order, from the top of the page toward the bottom of the page. It is true that some constructs in your program cause the machine to loop back and re-execute certain statements or skip over certain statements. However, that does not change the fact that the machine does one thing at a time in a certain sequence. Statements that the machine has not yet reached have not executed and their effects have not yet happened.

This may seem obvious enough to some. Perhaps it even goes without saying. But my experience has been that some students tend to lose sight of this. Some students seem to think they can put statements into their program in almost any order and have it work properly. But that is not (normally) true. Statements must be in a very particular order.

Such non-sequential thinking is not a sign of stupidity. Some people more naturally think in terms of the totality of a problem rather than thinking in terms of the small, individual steps involved in solving it. Such holistic thinking is also very valuable when it comes to programming. To understand what must be done, you need to know what result you want! Yet it is true that in the end, to build a program, you need to write down a highly detailed sequence of steps that describe exactly what the machine must do to get the effect you desire. Converting the "big picture" into that sequence of steps is what designing software is all about. To be really good at it, you need to do both sequential and holistic thinking.

Step 1: Figuring out what you want

To write a program, you need a clear idea of what you want it to do. This might seem obvious, but it is surprising how many people jump right into the act of typing C without knowing where they are going. If you don't have a clear idea of what you want, you might want to write some experimental programs to explore the options. However, if you do that, don't fool yourself into believing that your first experimental program is going to be your final work. In Frederick Brooks's classic book The Mythical Man-Month: Essays on Software Engineering, Brooks says that you should always plan on writing a program (at least) twice. The first time you can figure out how you should have done it. The second time you can actually do it.

Students are often reluctant to spend time writing a program just to throw it away and start over. When due dates loom the temptation is great to just force a misguided effort through to completion. This happens in the real world all the time. The result is lousy software. People tend to underestimate the time involved in creating a quality work. Brooks's point is that software development teams should build enough time into their estimates to throw away at least one version. Thankfully, the second version usually goes much faster.

In some cases, there will be a written specification of what needs to be done. Often these specifications are written by someone else. For example, in this course I write the specifications for the programming problems. Your job is to translate that specification into an actual program. Sometimes written specifications are clear, concise, and practical. Often they are vague, overly detailed, and impossible to implement. The programmer's first task is to negotiate with the person who wrote the specification so that both parties understand exactly what is to be done.

If you are building your own program, it is often helpful to write your own specification. Many questions and issues will come to your mind as you try to write down exactly what the program is going to do. It is much easier to resolve those issues at that time than later. It does not feel good to have a large program half written and then discover that what you are doing is not going to work the way you want.

When you are thinking about what your program should do, you need to resist thinking too much about how you are going to do it. The first step in building a program is to answer the question "what?" Save the "how?" for later. When you start thinking about how your program is going to do things, you may discover that it is nearly impossible to get some of the effects you want. In that case, come back to this step and rethink what you want. Can you live without certain features? If yes, then change the specification. If not, then bite the bullet and make them work even though it hurts.

Step 2: Designing the program

Once you have a clear idea of what you are trying to accomplish with a program, the next step is to design it. There are many ways of going about this. For a small program, just thinking about it for a while might be good enough. However, such "seat of the pants" design is unsuitable for a medium to large program.

The design technique most often taught in introductory programming courses is called "flowcharting". This technique involves drawing boxes that represent the various actions your program will take and connecting those boxes with lines to indicate which actions follow which. I am very much against flowcharts. Flowcharting is the world's worst design technique. I am not alone in my opinion. In fact, flowcharts are not even mentioned in serious books on software design except to say that flowcharting is an old, obsolete technique that should never be used.

In that case, what design technique is good to use? There are actually several options. The technique I advocate in this course is called "pseudo coding". You have already seen me use it in my earlier examples. Let me describe it more formally here.

The problem with writing a program is that there is a huge amount of detail you must think about and get right before the program will run. You have to include the right headers, you have to declare main properly, you have to create variables of appropriate types, you have to put all your semicolons in the right places, and on and on. This detail is all necessary, but most of it does not bear directly on your problem. Thinking about all of that detail just clouds the real issues when it comes to design. That detail is just a distraction.

The idea of pseudo code, which I will call "p-code" from now on, is to write down the logic of your program without thinking about all those annoying details. You just write your program using English. However, to make it easier to eventually translate your program into a real programming language, you write your p-code using a sort of structured English that resembles the way a program could be structured.

The beauty of p-code is that if you don't know exactly how to do something it doesn't matter. You can just gloss over your uncertainty and "refine" the p-code later. Let me show you with an example.

Before I get into my example, I want to say a few words about notation. Some people use p-code very informally. Others, like me, adopt a semi-formal notation for it. The fact is that it really doesn't matter all that much if you are formal or informal about your p-code. As long as your p-code embodies the structure of your final program, you will be fine. Keep in mind that I made up the notation that I'm using below. If you want to use it too, that's fine. If you would rather use something a bit different, that's fine too. P-code is never given to the compiler so it doesn't even have to be handled consistently!

Okay... suppose I wanted to write a program that made money. I might start by writing this:

WHILE <I don't have enough money> LOOP
  <Make more money>
END

Notice how I put "keywords" like WHILE in all uppercase letters. I do that to make them stand out (the Modula-2 programming language does the same thing for the same reason). I also like to put the word LOOP afterward to make it read easier (Ada does something like that). Finally, I like to put the word END at the end of my p-code blocks to make it clear how much material is in each block. None of these conventions is essential; I use them only to make my p-code look a bit more formal. What matters is the structure: I have a loop that keeps going until I've made enough money.

The heart of p-code, however, is the English phrases. It is with those phrases that you describe what your program is doing. You don't need to describe exactly how your program is going to work to the very last detail. In fact, you should avoid doing that! The point of p-code is to organize your logic, not spell out every detail. The details will come later. I put my English phrases in angle brackets to separate them from the control logic. However, that's just the way I do it and that isn't very important.

So far my design is quite satisfactory. However, it isn't very specific. I will now refine my p-code a bit, adding some detail as I do so. How about this:

WHILE <The sum of all funds in my bank accounts is too low> LOOP
  <Make money by working, investing, and stealing>
END

Not bad. Actually, though, my worth is more than just the amount of money I have in the bank. I should really add up all my assets (house, car, etc). Also in what order should I work, invest, and steal? It seems like I should invest last since I need money before I can invest any.

WHILE <My total assets are too low> LOOP
  IF <I have a job> THEN
    <Work>
  ELSE
    <Steal>
  END
  <Invest>
END

Now the logic is starting to become revealed. I've decided that I will only steal if I must (I'm a good guy). If I have a job I will get money by working at my job and avoid stealing. In any case, I will invest what money I get.

But wait... If I invest money I stole I might get caught. Maybe I should only invest money I get from working. Perhaps this is better:

WHILE <My total assets are too low> LOOP
  IF <I have a job> THEN
    <Work>
    <Invest>
  ELSE
    <Steal>
  END
END

You see: the time to make design decisions like this is now... before I have delved into the nasty details of the program. It is much easier to rearrange the p-code than it is to rearrange C. This is the time to check over the logic and get it right. That is the whole point of software design.

Actually, I'm still not happy with my p-code. As the outer loop executes, sometimes I might have a job and sometimes I might not. Thus some of my assets will be due to my working and some due to my stealing. If I don't ever want to invest assets due to stealing, I'll need to keep track of which is which (the life of a criminal is hard). How about this version:

<Set work assets to zero>
<Set stolen assets to zero>
WHILE <My total assets are too low> LOOP
  IF <I have a job> THEN
    <Work>
    <Invest some of my work assets>
  ELSE
    <Steal>
  END
END

At this point, I don't need to put the investment of assets inside the IF anymore. Since I'll only be investing work assets I could put it outside again.

<Set work assets to zero>
<Set stolen assets to zero>
WHILE <My total assets are too low> LOOP
  IF <I have a job> THEN
    <Work>
  ELSE
    <Steal>
  END
  <Invest some of my work assets>
END

The thing about this is that I'll invest some of my work assets even after I've been stealing. Is that really what I want? Hmmm.

You see how this goes? There are lots of decisions to be made about when you want to do something. Moving an action outside of an IF or a loop can have a very significant effect on the behavior of your program. The whole point of software design is to resolve these questions now... before you've written any C. For example, I should probably try to get a job if I don't already have one. Where should I put that into this program? How about this structure:

<Set work assets to zero>
<Set stolen assets to zero>
WHILE <My total assets are too low> LOOP
  IF <I have a job> THEN
    <Work>
    IF <My job is lousy> THEN
      <Try to get a better one>
    END
  ELSE
    <Steal>
    IF <I don't have a job> THEN
      <Try to get one>
    END
  END
  <Invest some of my work assets>
END

Here I take into account my desire to improve my job as well as get one. The actions "Try to get one" and "Try to get a better one" are pretty similar. I wonder if they can be combined. Perhaps this is better:

<Set work assets to zero>
<Set stolen assets to zero>
WHILE <My total assets are too low> LOOP
  IF <I have a job> THEN
    <Work>
  ELSE
    <Steal>
  END
  IF <I don't have a job> OR <My job is lousy> THEN
    <Try to get a good job>
  END
  <Invest some of my work assets>
END

Do I like that? Hmmm.

This process of playing with the p-code and refining the design can go on for a while. Eventually I'll probably have to say something about how I will "Work" and how I will "Steal." As the p-code becomes more refined it becomes more detailed. Each action becomes smaller and more specific. Eventually it becomes pointless to refine the p-code any further. When that happens the design is "ripe" and the program can be converted to a traditional programming language—like C.

Step 3: Converting the design to C (or whatever)

It has been said that about 80% of the effort of writing a large program is in the design. Yet the design happens without one line of C being written. The final 20% of the effort is in actually writing the C. Some programmers regard that as the boring part of the job. In some organizations the most experienced programmers don't write any programs at all; they just do design. The actual "coding" step is relegated to inexperienced programmers who just "bang out" the result of the design without thinking any more about it.

In real life it's not that simple. Things come up in the final coding step that influence the design. This is particularly true if you are interested in building a very high performance program. Often the nasty real world will require that a beautiful and elegant design be changed into something ugly before it will actually work. Thus the coders often go back to the designers and say, "that's all very nice, but it can't work like that so you'll have to change it." That's life.

In any case the final step involves taking a highly refined design (p-code) and converting it into the programming language of your choice. A good design can probably be implemented using many different programming languages. The design just expresses the logic and the logic will be the same regardless. In our case, of course, we are using C.

There are several questions the coders must answer:

What variables will I need and what should their types be? What names should I give them that will make their purpose clear?
What operations do I need to perform on the variables in order to get the effects called for in the design?
What files do I need to include in order to gain access to the necessary library facilities?
Which operations will I (or should I) implement as separate functions?

We will talk about this last issue in more detail in upcoming lessons. In large designs it is necessary to create many functions. The designer has probably done so in the design already.

If the design is refined enough, the answers to these questions might be obvious. If the design is not refined enough, the coder might have to complete the final refinement while he or she is writing code. That's okay to a point. However, the whole reason for doing a design in the first place is to avoid the problems of designing while coding. The design must be reasonably well refined or the coder will have too much to think about at once and you will get lousy software that doesn't work right.

In summary:

Step 1: Figure out what the program is supposed to do.
Step 2: Design the program using p-code. Refine the design a bit at a time, ensuring that the logic is correct as you go. Don't worry too much about how you will make things work.
Step 3: Implement the design. When the design is "ripe" make the jump to the programming language of your choice (C in our case) by filling in all the nitty gritty details required to make a real, working program.

Keep in mind that 80% (or more) of your work will be in the first two steps. Actually writing code is the smallest part of the process—and when you do it right, the easiest. Avoid the temptation of writing code immediately. Although that can work for small programs, it is almost always a bad idea when building medium to large programs.

After creating the program, you must test and debug it. I have more to say about that in the next lesson.

P-code vs flowcharts

Some of you may already be familiar with flowcharts. Some of you may be forced to use them in the future (I'm sorry). Thus you might be interested to know just why I (and most other professional programmers) dislike them. Here are my reasons.

P-code is pure text and can be created and edited with a normal text editor. Flowcharts are graphical and require special tools to create and edit.
P-code is pure text and so can co-exist with the final program. It's not unusual for me to include the p-code of a program as a comment in the program itself. That makes a very handy reference for later. You can't do this with flowcharts.
With p-code you can use very long English phrases:
```
<Update the database with the results of the previous two calculations>
```
It is difficult to say such long things in the boxes of most flowcharts. As a result p-code makes it easier to be clear and specific about what you are doing.
P-code looks like a program and can be translated into a normal programming language much more easily. It's not always obvious how to translate a flowchart. In fact, you can create flowcharts that simply can't be translated into a normal programming language.
P-code encourages you to write a well structured program. Since p-code has blocks that nest it prompts you to create a program with properly nested blocks. Flowcharts don't work that way and, in fact, encourage you to create unstructured programs that are impossible to figure out. You can use flowcharts to build structured programs, but you have to add the extra discipline. With p-code you get that discipline for free.

I will say, however, that flowcharts do have some advantages. They are

Flowcharts tend to make for more attractive presentations. They are also fairly understandable to non-programmers. If you have to explain how your program works to an audience of non-programmers, you might consider making a few flowcharts. Don't use the flowcharts to design your program, however.
Since flowcharts can handle unstructured code, they can be helpful if you ever have to maintain an unstructured program. Such programs are hellish to deal with, but by creating a flowchart of the program, you can learn a lot about how it has been put together. In that respect flowcharts are a good *analysis* tool -- but not a good design tool.

Summary

Programming is more art than science. Building a program is a creative process that reflects on the creator. Take pride in your work and don't be satisfied with sloppiness.
When building a program you should first get a precise idea of what needs to be done. Read (or write) a detailed specification of the program and try to have all questions about what is supposed to happen answered before beginning. After you have studied the program's specification, design the program using a suitable design technique. Finally, translate your design into a working program.
Do not design your program while you write it. Avoid jumping right into writing C (or any other programming language). Don't code too early!
Pseudo-code allows you to focus on the structure and logic of your program without being distracted by all the details of a full programming language. It is flexible enough to gloss over details while being formal enough to make translating into C reasonably easy.
Flowcharts are bad because they are unstructured. They are difficult to translate into C and difficult to use with normal programming tools (such as text editors).