Lesson #2

Unix: Various Useful Unix Commands

Overview

In this lesson I will cover the following topics:

  1. Unix file naming conventions.

  2. Basic file manipulation commands (cat, more, less, cp, mv, rm).

  3. Wildcard characters and command line syntax.

  4. How to access the on-line manual.

Body

File names

As I mentioned in the last lesson, you are able to store files on the Unix host. Since these files are being stored on a Unix system you will want to know something about basic Unix file handling commands in order to manage them. First, however, you need to know the rules for Unix file names. Here are the important points.

  1. File names are case sensitive. The files afile.txt, AFILE.TXT, and afile.TXT are three different files. Traditionally files are given names using lowercase letters. You should follow that tradition unless you have a good reason to do otherwise.

  2. File names can be very long. For example, you can create a file with a name like:

      this-is-a-file-with-a-very-long-name.txt
    

    It is good to give files descriptive names, but you probably don't want to get too carried away!

  3. Except for the '/' character and the null character, every character is technically legal in a file name. You can create files with spaces, punctuation marks, and even control characters in their names. However, files with unusual characters in their names are awkward to handle because spaces and many punctuation marks have special meaning on the command line. I suggest that you stick to using just lowercase letters, numbers, the underscore character ('_'), the dash ('-'), and the dot ('.'). (Note that the underscore character is on the same key as the dash.)

  4. Files with names that have a leading dot are "hidden". As I mentioned in the last lesson, you can use the "ls" command to view a list of your files. Actually by default the "ls" command does not show you the hidden files. Such files are typically used by applications to store configuration information. They are not displayed by ls so that you don't have to keep looking at them. You can use the "ls -a" command to show you all your files. Try it:

      $ ls -a
    

    You will see several files you didn't see before, including alpine's configuration file, .pinerc. These "dot files" are actually perfectly ordinary files. You can manipulate them like any other file.
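The naming rules above can be tried out safely in a scratch directory. The following is only a sketch: the directory comes from mktemp, the file name .myconfig is made up for illustration, and I am assuming a case-sensitive file system (the usual situation on a Unix host). The commands are written as a script, so the "$" prompt is left out.

```shell
# Work in a throwaway directory so none of your real files are touched.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Three names that differ only in case. On a case-sensitive file system
# these are three separate files.
touch afile.txt AFILE.TXT afile.TXT

# A made-up "dot file". The leading dot hides it from plain ls.
touch .myconfig

# Count the names each form of ls reports.
visible=$(ls | wc -l)
all=$(ls -a | wc -l)
echo "ls shows $visible names; ls -a shows $all names"
```

The count from "ls -a" is larger by three: the hidden file plus the two special entries "." and "..", which are directory names that the next lesson discusses.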

Viewing files

I've already mentioned that the "ls" command gives you a list of your files. It has many options. You have already seen the "-a" option and what that does. Another useful option is the "-l" (ell) option. Try it:

  $ ls -l

This shows you the long form of your file listing. It shows not only each file's name, but also a lot of other information about each file. It shows the size, the date/time when the file was last modified, the name of the file's owner (you), and the permissions on the file, among other things. The file permissions control who can access the file and how they can access it. The Unix host has been configured so that by default only you have access to your own files.

If you actually want to look at the contents of a (text) file, there are several ways you might do it.

  1. You could load the file into a text editor like nano.

  2. You could use the "cat" command like this:

      $ cat afile.txt
    

    The cat command will type the file to the terminal as quickly as possible. If the file is large it will scroll by so fast that you won't be able to read it. The cat command is really only suitable for small files (or special purposes).

  3. You could use the "more" command like this:

      $ more afile.txt
    

    This command will show you the file a screen at a time. You can then use the space bar to see the next screen. You can type the 'b' key to back up to the previous screen.

  4. You could use the "less" command like this:

      $ less afile.txt
    

    This command is a more powerful version of more ("less is more"). I recommend that, in general, if you just want to look at a text file you should use less. Type the 'q' key when you want to quit less. Type the 'h' key while in less to get help.

Manipulating files

There are several commands that you must know to be an effective Unix user. Here they are.

"cp" is how you copy files.
  $ cp afile.txt bfile.txt

This command copies afile.txt to bfile.txt. If bfile.txt existed beforehand, it is overwritten without warning. Unix does not believe in warning you about things. It assumes you know what you are doing and that you would rather not be annoyed with warnings. Keep that in mind!

"mv" is how you rename files.
  $ mv oldname.txt newname.txt

This command changes the name of oldname.txt so that it is now newname.txt. If there was a file named newname.txt beforehand, it is lost without warning. If "mv" seems like a strange name for the rename command, keep in mind that renaming is like "moving" the file from one name to another.

"rm" is how you remove (delete) files.
  $ rm afile.txt

This command deletes the file afile.txt without warning. If the file does not exist you will be told that it can't be found.
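To see the three commands working together, here is a small sketch you can run in a scratch directory. The file contents are invented for the demonstration, and the commands are written as a script, so the "$" prompt is left out.

```shell
# Work in a throwaway directory so none of your real files are touched.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Create two small files with different contents.
echo "first draft" > afile.txt
echo "something else" > bfile.txt

# Copy afile.txt over bfile.txt. The old contents of bfile.txt are
# replaced silently.
cp afile.txt bfile.txt
cat bfile.txt

# Rename bfile.txt. Afterwards only the new name exists.
mv bfile.txt newname.txt

# Delete afile.txt. There is no "undo".
rm afile.txt

# Only newname.txt is left.
ls
```

If you would rather be asked before anything is overwritten or deleted, many versions of cp, mv, and rm accept a "-i" option that prompts for confirmation first. Check the manual pages on your system to be sure.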

Wildcards

If you have a large number of files you want to manipulate you can usually use special wildcard characters to refer to them as a set. There are several wildcard characters, but the most common one is the '*' (the asterisk, but it's usually just called "star"). It means "anything". For example:

  $ rm *.c

This command removes all files that have a ".c" at the end of their names. The leading wildcard character means that the first part of the name can be anything. You saw in Lesson #1 how the '*' by itself can be used to refer to all files. In that situation it means that the entire name can be anything. Be careful: a command such as

  $ rm *

deletes all of your (non-hidden) files without warning!

If you give your files consistent names, you can use wildcards to manage independent groups of them more easily. For example, suppose you put all of your programming problem solutions into files with names like prob01.c, prob02.c, etc. If you decided to delete all of those files you could just use a command such as:

  $ rm prob*.c

Keep in mind that this command would also delete problem.c, probability.c, and any other similarly named file if they existed. That may or may not be an issue for you. (If it is, you might be able to use the other wildcard characters that I'm not discussing here to make your wildcard match more specific.) Other uses of wildcard characters will be clear after we've discussed the Unix directory structure in the next lesson.
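One way to check what a wildcard will match before deleting anything is to hand the same pattern to echo, which just prints its arguments. The sketch below does that in a scratch directory. It also shows one of the other wildcard forms as an illustration only: a bracket expression such as [0-9] matches exactly one character from the given range. The file names are the ones from the example above plus a made-up notes.txt.

```shell
# Work in a throwaway directory so none of your real files are touched.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch prob01.c prob02.c problem.c probability.c notes.txt

# echo prints the expanded list of names without touching any files.
# This pattern matches all four .c files, including problem.c.
broad=$(echo prob*.c)
echo "$broad"

# [0-9] matches exactly one digit, so this pattern only matches
# "prob" followed by two digits and ".c".
narrow=$(echo prob[0-9][0-9].c)
echo "$narrow"
```

Once the echo output looks right, the same pattern can be given to rm with more confidence.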

The Manual

Unix comes with an on-line manual that provides detailed information about every command. To access the manual, you should use the "man" command. For example, to read the manual page on the ls command do:

  $ man ls

The manual pages are complete and authoritative. They are also hard to read. They are intended to be references, not tutorials. Even so, I encourage you to get into the habit of looking at the manual pages when you have a question about how a command works. With practice and experience you will be better able to understand them. The only way to get that practice is to keep trying. Type the 'q' key when you want to quit the man command.

Notice that most commands have a number of options. The options start with a dash and they modify the way the command works. For example:

  $ ls

With no options, ls just prints the names of your files.

  $ ls -l

With the -l option, ls prints the long form of the file listing.

  $ ls -a

With the -a option, ls prints the names of all files, including "hidden" ones.

  $ ls -a -l

With both options, ls prints the long listing for all files.

  $ ls -al

This is another way to invoke both options. In fact, this is the way it's usually done. The ls command has a large number of options. See the manual page ("man ls") for all the details.
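If you want to convince yourself that the two forms really are equivalent, a sketch like the following compares their output in a scratch directory. The file names are made up, and the "if" test is a shell feature that this lesson does not otherwise cover.

```shell
# Work in a throwaway directory so none of your real files are touched.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch notes.txt .hidden

# Capture the listing produced by each way of giving the options.
separate=$(ls -a -l)
combined=$(ls -al)

# Compare the two listings as text.
if [ "$separate" = "$combined" ]; then
    echo "ls -a -l and ls -al print the same listing"
fi
```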

Searching files for text

An operation that you will probably find yourself wanting to do fairly often is to search a file or a collection of files for a certain word or phrase. For example, when programming you might want to locate every place where you used the name "line_count" in your various programs. It happens that Unix comes with a command that does such searches easily. It is called "grep".

  $ grep line_count myprog.c

This command, for example, searches the file myprog.c for lines containing "line_count". It then prints out each matching line. If you want it to print out line numbers too, add the "-n" option.

  $ grep -n line_count myprog.c

Text editors can usually search a file for certain text as well. However, grep is nice because it can search a large collection of files just as easily.

  $ grep -n line_count *.c

Here grep prints out the name of each file before displaying the matching lines from that file. This is useful if you are looking for something but forget exactly which file it is in.

Grep is actually quite powerful. It can search for complicated patterns and it has many features. See the man page ("man grep") for the details.
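Here is a small sketch that puts the grep commands above to work. The two source files and their contents are invented for the demonstration. Only myprog.c and line_count come from the examples in this lesson.

```shell
# Work in a throwaway directory so none of your real files are touched.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Two tiny made-up source files that both mention line_count.
printf 'int line_count = 0;\nint other = 1;\nline_count++;\n' > myprog.c
printf 'int total;\nextern int line_count;\n' > helper.c

# Search one file. Each matching line is prefixed with its line number.
one_file=$(grep -n line_count myprog.c)
echo "$one_file"

# Search every .c file. Each matching line is also prefixed with the
# name of the file it came from.
many_files=$(grep -n line_count *.c)
echo "$many_files"
```

In the second search each line of output has the form name:number:text, so you can go straight to the right line of the right file in your editor.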

A word on command line syntax

You've probably noticed that Unix commands typically consist of the command name followed by one or more words with spaces between them. This is the way all commands work. The words after the command name are called the command's "arguments". You must use one or more spaces between arguments.

Most commands have several options. If you want to use an option, you typically must specify it first, before the other arguments. Options almost always start with a dash and consist of one or more letters. In some cases, options start with two dashes in a row.

Command arguments are often file names. However, depending on the command, they might be other things. For example, the first argument to grep (other than any option) is the word you are looking for. The second argument is the file you want to search. Often commands that accept a file name as an argument can accept a whole list of file names as well. In such a case the command operates on every file in the list. For example:

  $ rm afile.txt bfile.txt cfile.txt

This command removes (deletes) the three named files.

When you use a wildcard character the command processor replaces the argument containing the wildcard character with a list of matching filenames. For example, if you say

  $ rm *.c

the command processor might actually execute:

  $ rm first.c second.c third.c

if those were the only .c files you happened to have. Since the rm command can handle a list of file names, everything works as you expect. However, you must be cautious. Suppose that due to a slipped finger you accidentally type:

  $ rm * .c

Since there is a space between the "*" and the ".c" this command has two arguments. Since the first argument is "*" the command processor replaces it with a list of ALL your files!! The rm command then happily deletes everything. You won't get any warning because Unix believes you know what you are doing and that you mean what you type. Be careful! Make regular backups of your files so that if you accidentally delete more than you wanted to you can recover your work.
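You can watch the command processor do this expansion without any risk by putting echo in front of the command. The echo command just prints its arguments, so nothing is deleted. This sketch uses a scratch directory and a few made-up file names.

```shell
# Work in a throwaway directory so none of your real files are touched.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch first.c second.c notes.txt

# What you meant to type: only the .c files are listed.
intended=$(echo rm *.c)
echo "$intended"      # prints: rm first.c second.c

# The slipped finger: "*" expands to every file, and ".c" is passed
# along as a separate (nonexistent) file name.
slipped=$(echo rm * .c)
echo "$slipped"       # prints: rm first.c notes.txt second.c .c
```

Putting echo in front of a risky command is a cheap habit that lets you see exactly what would be executed before you commit to it.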

Shells

Above I mentioned the "command processor". This is the program that starts executing for you when you first log in. It types the prompt and accepts your commands. It verifies that the commands make sense and arranges to have the commands executed.

Unix allows people to run whichever command processor they like. The system regards your command processor as just another program. Several different ones exist and there are people with fondness for each. In Unix lingo, these command processors are called "shells".

I have configured each of your accounts on the Unix host to use a relatively modern shell (either bash or ksh depending on the particular machine). Unless you are already an experienced Unix user and have a specific reason to switch to another shell, I would suggest that you just continue to use the default. However, I do want you to know that Unix does not actually require any one specific user interface. The system can, and typically does, support several different user interfaces at the same time.
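If you are curious which shell your account uses, the sketch below prints it. It relies on the SHELL variable, which is conventionally set at login to the path of your default shell. That is an assumption about your environment, so the example falls back to the word "unknown" if the variable is not set.

```shell
# SHELL conventionally holds the path of your login shell, for example
# a path ending in bash or ksh. Use "unknown" if it is not set.
login_shell=${SHELL:-unknown}
echo "My login shell is: $login_shell"
```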

Summary

  1. Unix file names are case sensitive and can contain pretty much any character. To keep life simple, I suggest you only use lowercase letters, digits, and the characters '_', '-', and '.' in your file names. File names can be very long.

  2. You can use cat, more, or less to view the contents of a text file. You can use cp to copy files, mv to rename files, and rm to remove (delete) files.

  3. You can specify a large number of files on the command line at once using a wildcard character. The '*' character matches anything in a file name so, for example, '*.c' would expand to a list of all file names ending with .c. Command line arguments are separated by spaces. Most commands that can process a single file can also process an entire list of files. Command options are typically given as the first argument on the command line and usually are preceded by a dash. Thus "ls -al" invokes the ls command with both the 'a' and 'l' options.

  4. You can get more information about a command by reading the on-line manual page for that command. Use the "man" command to read a manual page. For example "man ls" shows you the manual page for the ls command. Manual pages are detailed and authoritative. They are references, not tutorials. However, you should get used to looking at them. Type the 'q' key while reading a manual page to quit from the man command.

© Copyright 2016 by Peter C. Chapin.
Last Revised: January 11, 2016