In this chapter, we'll show you how to upgrade software on your system,
including rebuilding and installing a new operating system kernel.
Although most Linux distributions provide some automated means to install,
remove, and upgrade specific software packages on your system, it is
often necessary to install software by hand. The kernel is the operating
system itself. It is a set of routines and data that is loaded by the
system at boot time and controls everything on the system: software access
to hardware devices, scheduling of user processes, memory management, and
more. Building your own kernel is often beneficial, as you can select which
features you want included in the operating system.
Installing and upgrading free software is usually more complicated than
installing commercial products. Even when you have precompiled binaries
available, you may have to uncompress them and unpack them from an archive
file. You may also have to create symbolic
links or set environment variables so that the binaries know where to
look for the resources they use. In other cases, you'll need to compile
the software yourself from sources.
Another common Linux activity is building the kernel. This is an
important task for several reasons. First of all, you may find yourself
in a position where you need to upgrade your current kernel to a newer
version, to pick up new features or hardware support. Second, building
the kernel yourself allows you to select which features you do (and do not)
want included in the compiled kernel.
Why is the ability to select features a win for you? All kernel code
and data is "locked down" in memory; that is, it cannot
be swapped out to disk.
For example, if you use a kernel image with drivers for hardware you do not
have or use, the memory consumed by those hardware drivers cannot be
reclaimed for use by user applications. Customizing the kernel allows you
to trim it down for your needs.
7.1. Archive and Compression Utilities
When installing or upgrading software on Unix systems, the first things
you need to be familiar with are the tools used for compressing and
archiving files. There are dozens of such utilities available.
Some of these (such as tar and compress)
date back
to the earliest days of Unix; others (such as gzip) are
relative newcomers. The main goal of these utilities is to
archive files (that is, to pack many files
together into a single file for easy transportation or backup) and
to compress files (to reduce the amount of disk space required
to store a particular file or set of files).
In this section, we're
going to discuss the most common file formats and utilities
you're likely to run into. For instance, a near-universal convention
in the Unix world is to transport files or software as a
tar archive, compressed using compress or gzip.
In order to create or unpack these files yourself, you'll need
to know the tools of the trade. The tools are most often used when
installing new software or creating backups--the subject of
the following two sections in this chapter.
7.1.1. Using gzip and bzip2
gzip is a fast and efficient compression program distributed by the
GNU project. The basic function of gzip is to take a file,
compress it, save the compressed version as filename.gz,
and remove the original, uncompressed file. The original file is
removed only if gzip is successful; it is very difficult to accidentally
delete a file in this manner. Of course, being GNU software, gzip has
more options than you want to think about, and many aspects of its
behavior can be modified using command-line options.
First, let's say that we have a large file named garbage.txt:
rutabaga% ls -l garbage.txt
-rw-r--r-- 1 mdw hack 312996 Nov 17 21:44 garbage.txt
To compress this file using
gzip, we simply use the command:
gzip garbage.txt
This replaces
garbage.txt with the compressed file
garbage.txt.gz. What we end up with is the following:
rutabaga% gzip garbage.txt
rutabaga% ls -l garbage.txt.gz
-rw-r--r-- 1 mdw hack 103441 Nov 17 21:44 garbage.txt.gz
Note that
garbage.txt is removed when
gzip completes.
You can give gzip a list of filenames; it compresses each file
in the list, storing each with a .gz extension. (Unlike the
zip program for Unix and MS-DOS systems, gzip will not,
by default, compress several files into a single .gz archive. That's
what tar is for; see the next section.)
How efficiently a file is compressed depends upon its format and contents.
For example, many graphics file formats (such as GIF and JPEG) are already
well compressed, and gzip will have little or no effect upon such
files. Files that compress well usually include plain-text files, and
binary files such as executables and libraries.
You can get information on a gzipped file using gzip -l.
For example:
rutabaga% gzip -l garbage.txt.gz
compressed uncompr. ratio uncompressed_name
103115 312996 67.0% garbage.txt
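To see how much the contents of a file matter, a minimal sketch along the following lines compresses a repetitive text file and a file of random bytes, which stands in here for data that is already compressed. The filenames text.txt and random.bin are made up for this illustration, and we assume that seq, head, and /dev/urandom are available, as they are on typical Linux systems.

```shell
# Work in a scratch directory so that none of your own files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

seq 1 20000 > text.txt                      # repetitive plain text
head -c 100000 /dev/urandom > random.bin    # stand-in for already-compressed data

gzip text.txt random.bin                    # one .gz file per input file
gzip -l text.txt.gz random.bin.gz           # sizes and ratios for each file

text_size=$(wc -c < text.txt.gz)
random_size=$(wc -c < random.bin.gz)
echo "text: $text_size bytes, random: $random_size bytes"
```

The text file should shrink considerably, while the random file stays at roughly its original size.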
To get our original file back from the compressed version, we use
gunzip, as in:
gunzip garbage.txt.gz
After doing this, we get:
rutabaga% gunzip garbage.txt.gz
rutabaga% ls -l garbage.txt
-rw-r--r-- 1 mdw hack 312996 Nov 17 21:44 garbage.txt
which is identical to the original file. Note that when you
gunzip a file, the compressed version is removed once the
uncompression is complete.
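If you would like to convince yourself that the round trip loses nothing, a small sketch such as the following can be run in a scratch directory. The file reference.txt is a hypothetical untouched copy kept only for the comparison, and seq is assumed to be available to generate some sample text.

```shell
# Work in a scratch directory so that none of your own files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

seq 1 20000 > garbage.txt            # sample data standing in for garbage.txt
cp garbage.txt reference.txt         # untouched copy for the comparison

gzip garbage.txt                     # leaves only garbage.txt.gz
ls -l garbage.txt.gz

gunzip garbage.txt.gz                # leaves only garbage.txt again
cmp garbage.txt reference.txt && echo "round trip is lossless"
```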
gzip stores the name of the original, uncompressed file
in the compressed version. This way, if the compressed filename
(including the .gz extension) is too long for the filesystem type
(say, you're compressing a file on an MS-DOS filesystem with 8.3 filenames),
the original filename can be restored using gunzip even if the
compressed file had a truncated name.
To uncompress a file to its original filename, use the -N option
with gunzip. To see the value of this option, consider
the following sequence of commands:
rutabaga% gzip garbage.txt
rutabaga% mv garbage.txt.gz rubbish.txt.gz
If we were to
gunzip rubbish.txt.gz at this point, the uncompressed
file would be named
rubbish.txt, after the new (compressed) filename.
However, with the
-N option, we get:
rutabaga% gunzip -N rubbish.txt.gz
rutabaga% ls -l garbage.txt
-rw-r--r-- 1 mdw hack 312996 Nov 17 21:44 garbage.txt
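The same sequence can be tried safely in a scratch directory. This is only a sketch, assuming the GNU version of gzip; the sample data generated with seq is a stand-in for a real file.

```shell
# Work in a scratch directory so that none of your own files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

seq 1 1000 > garbage.txt
gzip garbage.txt                     # the name garbage.txt is stored in the .gz file
mv garbage.txt.gz rubbish.txt.gz     # rename only the compressed file

gunzip -N rubbish.txt.gz             # restores the stored name, not rubbish.txt
ls -l
```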
gzip and gunzip can also compress or uncompress data
from standard input and output. If gzip is given no filenames to
compress, it attempts to compress data read from standard input. Likewise,
if you use the -c option with gunzip, it writes uncompressed
data to standard output. For example, you could pipe the output of a
command to gzip to compress the output stream and save it to a file
in one step, as in:
rutabaga% ls -laR $HOME | gzip > filelist.gz
This will produce a recursive directory listing of your home directory
and save it in the compressed file
filelist.gz. You can display
the contents of this file with the command:
rutabaga% gunzip -c filelist.gz | more
This will uncompress
filelist.gz and pipe the output to the
more command. When you use
gunzip -c, the file on disk
remains compressed.
The zcat command is identical to gunzip -c.
You can think of this as a version of cat for
compressed files. Linux even has a version of
the pager less for compressed files,
called zless.
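The following sketch puts these pieces together on a tiny directory tree, so that the result is easy to check. The directory name tree and the file filelist.txt are made up for the illustration, and we assume the zcat that comes with GNU gzip.

```shell
# Work in a scratch directory so that none of your own files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p tree/sub && touch tree/a tree/sub/b    # a small tree to list

ls -R tree > filelist.txt                 # plain listing, kept for comparison
ls -R tree | gzip > filelist.gz           # same listing, compressed on the fly

gunzip -c filelist.gz | head -n 5         # view it; the file on disk is untouched
zcat filelist.gz | cmp - filelist.txt && echo "zcat output matches"
ls -l filelist.gz                         # still compressed
```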
When compressing files, you can use one of the options -1,
-2, through -9 to specify the speed and quality
of the compression used. -1 (also --fast) specifies the fastest
method, which compresses the files less compactly, while -9
(also --best) uses the slowest, but best compression method.
If you don't specify one of these options, the default is
-6. None of these options has any bearing on how you use
gunzip; gunzip will be able to uncompress the file no matter
what speed option you use.
gzip is relatively new in the Unix world. The compression programs
used on most Unix systems are compress
and uncompress, which were
included in the original Berkeley versions of Unix. compress and
uncompress are very much like gzip and gunzip,
respectively; compress saves compressed files as
filename.Z as opposed to filename.gz,
and uses a slightly less efficient compression algorithm.
However,
the free software community has been moving to gzip for several
reasons. First of all, gzip works better. Second, there has
been a patent dispute over the compression algorithm used by
compress--the results of which could prevent third parties
from implementing the compress algorithm on their own. Because
of this, the Free Software Foundation urged a move to gzip, which
at least the Linux community has embraced. gzip has been ported to
many architectures, and many others are following suit. Happily, gunzip
is able to uncompress the .Z format files
produced by compress.
Another compression/decompression
program has also emerged to take the lead from
gzip. bzip2 is the new
kid on the block and sports even better compression (on the
average about 10-20 percent better than
gzip), at the expense of longer compression
times. You cannot use bunzip2 to uncompress
files compressed with gzip and vice versa,
and since you cannot expect everybody to have
bunzip2 installed on their machine, you
might want to confine yourself to gzip for
the time being if you want to send the compressed file to
somebody else. However, it pays to have
bzip2 installed, because more and more
FTP servers now provide
bzip2-compressed packages in order to
conserve disk space and bandwidth. You can recognize
bzip2-compressed files from their typical
.bz2 file name extension.
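While the command-line options of
bzip2 are not exactly the same as those of
gzip, most of those that have been described in
this section are, including -c and -1 through -9. The exceptions are
-l and -N, which bzip2 does not provide. For more information, see the
bzip2 manual page.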
The bottom line is that you should use
gzip/gunzip or bzip2/bunzip2 for your
compression needs. If you encounter a file with the extension
.Z,
it was probably produced by compress, and
gunzip can
uncompress it for you.
Earlier versions of gzip used
.z (lowercase) instead of .gz as the compressed-filename
extension. Because of the potential confusion with .Z, this was
changed. At any rate, gunzip retains backwards-compatibility with
a number of filename extensions and file types.
7.1.2. Using tar
tar is a general-purpose archiving utility capable of packing many
files into a single archive file, retaining information such as file permissions
and ownership. The name tar stands for tape archive, because
the tool was originally used to archive files as backups on tape. However,
use of tar is not at all restricted to making tape backups, as we'll see.
The format of the tar command is:
tar functionoptions files
where
function is a single letter indicating the operation
to perform,
options is a list of (single-letter) options
to that function, and
files is the list of files to pack
or unpack in an archive. (Note that
function is not
separated from
options by any space.)
function can be one of:
- c
To create a new archive
- x
To extract files from an archive
- t
To list the contents of an archive
- r
To append files to the end of an archive
- u
To update files that are newer than those in the archive
- d
To compare files in the archive to those in the filesystem
You'll rarely use most of these functions; the more commonly used are
c, x, and t.
The most common options are:
- v
To print verbose information when packing or unpacking archives, showing
each file as it is archived or restored. It is good practice to use this so that
you can see what actually happens (unless, of course, you are writing
shell scripts)
- k
To keep any existing files when extracting--that
is, to not overwrite any existing files which are contained within the
tar file
- f filename
To specify that the tar file to
be read or written is filename
- z
To specify that the data to be written to the
tar file should be compressed or that the data in the tar file is
compressed with gzip
There are others, which we will cover later in this section.
Although the tar syntax might appear complex at first, in practice
it's quite simple. For example, say we have a directory named
mt, containing these files:
rutabaga% ls -l mt
total 37
-rw-r--r-- 1 root root 24 Sep 21 1993 Makefile
-rw-r--r-- 1 root root 847 Sep 21 1993 README
-rwxr-xr-x 1 root root 9220 Nov 16 19:03 mt
-rw-r--r-- 1 root root 2775 Aug 7 1993 mt.1
-rw-r--r-- 1 root root 6421 Aug 7 1993 mt.c
-rw-r--r-- 1 root root 3948 Nov 16 19:02 mt.o
-rw-r--r-- 1 root root 11204 Sep 5 1993 st_info.txt
We wish to pack the contents of this directory into a single
tar
archive. To do this, we use the command:
tar cf mt.tar mt
The first argument to
tar is the
function (here,
c,
for create) followed by any
options. Here, we use the one
option
f mt.tar, to specify that the resulting tar archive
be named
mt.tar. The last argument is the name of the
file or files to
archive; in this case, we give the name of a directory, so
tar packs all files in that directory into the archive.
Note that the first argument to tar must be a function letter
followed by a list of options. Because of this, there's no reason
to use a hyphen (-) to precede the options as many Unix commands
require. tar allows you to use a hyphen, as in:
tar -cf mt.tar mt
but it's really not necessary. In some versions of
tar, the
first letter must be the
function,
as in
c,
t, or
x. In other versions, the order of letters does not matter.
The function letters as described here follow the so-called "old
option style." There is also a newer "short option
style" where you precede the function options with a hyphen, and
a "long option style," where you use long option names
with two hyphens. See the Info page for tar for
more details if you are interested.
It is often a good idea to use the v option with tar;
this lists each file as it is archived. For example:
rutabaga% tar cvf mt.tar mt
mt/
mt/st_info.txt
mt/README
mt/mt.1
mt/Makefile
mt/mt.c
mt/mt.o
mt/mt
If you use
v multiple times, additional information will
be printed, as in:
rutabaga% tar cvvf mt.tar mt
drwxr-xr-x root/root 0 Nov 16 19:03 1994 mt/
-rw-r--r-- root/root 11204 Sep 5 13:10 1993 mt/st_info.txt
-rw-r--r-- root/root 847 Sep 21 16:37 1993 mt/README
-rw-r--r-- root/root 2775 Aug 7 09:50 1993 mt/mt.1
-rw-r--r-- root/root 24 Sep 21 16:03 1993 mt/Makefile
-rw-r--r-- root/root 6421 Aug 7 09:50 1993 mt/mt.c
-rw-r--r-- root/root 3948 Nov 16 19:02 1994 mt/mt.o
-rwxr-xr-x root/root 9220 Nov 16 19:03 1994 mt/mt
This is especially useful as it lets you verify that
tar is doing the
right thing.
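If you don't have a suitable directory at hand, the following sketch builds a small stand-in for mt and packs it. The file contents are invented for the example; only the tar invocations mirror the ones shown above.

```shell
# Work in a scratch directory so that none of your own files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A small stand-in for the mt directory used in the text.
mkdir mt
echo "all: mt" > mt/Makefile
echo "read me" > mt/README
echo "int main(void) { return 0; }" > mt/mt.c

tar cvf mt.tar mt        # c = create, v = list each name, f = archive filename
tar tf mt.tar            # t = list the names stored in the archive
```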
In some versions of tar, f must be the last letter in the list of options. This is because tar expects the f option to
be followed by a filename--the name of the tar file to read from
or write to. If you don't specify f filename at all,
tar assumes for historical reasons that it should use the
device /dev/rmt0 (that is, the first tape drive). In
Section 8.1, "Making Backups," in Chapter 8, "Other Administrative Tasks,"
we'll talk about using
tar in conjunction with a tape drive to make backups.
Now, we can give the file mt.tar to other people,
and they can extract it on their own system. To do this, they would
use the command:
tar xvf mt.tar
This creates the subdirectory
mt and places all the
original files into it, with the same permissions
as found on the original system.
The new files will be owned by the user running the
tar xvf (you) unless you are running as root, in
which case the original owner is preserved.
The
x option stands for
"extract."
The
v option is used again here
to list each file as it is extracted. This produces:
courgette% tar xvf mt.tar
mt/
mt/st_info.txt
mt/README
mt/mt.1
mt/Makefile
mt/mt.c
mt/mt.o
mt/mt
We can see that tar saves the pathname of each file relative to
the location where the tar file was originally created. That is,
when we created the archive using tar cf mt.tar mt, the only
input filename we specified was mt, the name of the
directory containing the files. Therefore, tar stores the directory
itself and all of the files below that directory in the tar file.
When we extract the tar file, the directory mt is created and
the files placed into it, which is the exact inverse of what was done to create
the archive.
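To watch this happen without a second machine, you can unpack the archive in a different directory and compare the result with the original. In this sketch the directory name elsewhere is arbitrary, and the small mt tree is again a made-up stand-in.

```shell
# Work in a scratch directory so that none of your own files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir mt && echo "read me" > mt/README && echo "int x;" > mt/mt.c
tar cf mt.tar mt

mkdir elsewhere
cd elsewhere || exit 1
tar xvf ../mt.tar                    # recreates mt/ below the current directory
diff -r ../mt mt && echo "extracted tree matches the original"
```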
By default, tar extracts all tar files relative to
the current directory where you execute tar. For example,
if you were to pack up the contents of your /bin directory
with the command:
tar cvf bin.tar /bin
tar would give the warning:
tar: Removing leading / from absolute path names in the archive.
What this means is that the files are stored in the archive within the
subdirectory
bin. When this tar file is extracted, the directory
bin is created in the working
directory of
tar--not as
/bin on the system where the extraction is being done.
This is very important and is meant to prevent terrible mistakes
when extracting tar files. Otherwise, extracting a tar file packed as,
say,
/bin, would trash the contents of your
/bin directory when
you extracted it. If you really wanted to extract such a tar file into
/bin, you would extract it from the root directory,
/.
You can override this behavior using the
P option when packing
tar files, but it's not recommended you do so.
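You can observe the stripping of the leading slash without touching your real /bin. The sketch below packs a throwaway directory by its absolute pathname; the names bin and hello are placeholders, and the exact wording of the warning differs between tar versions.

```shell
workdir=$(mktemp -d)                 # an absolute path, such as /tmp/tmp.XXXXXX
mkdir "$workdir/bin"
echo "#!/bin/sh" > "$workdir/bin/hello"

cd "$workdir" || exit 1
tar cf bin.tar "$workdir/bin" 2> warning.txt   # absolute pathname given on purpose
cat warning.txt                      # the notice about the leading /
tar tf bin.tar                       # stored names do not start with /
```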
Another way to create the tar file mt.tar would have been to cd
into the mt directory itself, and use a command such as:
tar cvf mt.tar *
This way the
mt subdirectory would not be stored in the
tar file; when extracted, the files would be placed directly in your
current working directory. One fine point of
tar etiquette is
to always pack tar files so that they contain a subdirectory, as we
did in the first example with
tar cvf mt.tar mt. That way,
when the archive is extracted, the subdirectory is also created and
the files are placed there. This ensures that the files
won't be placed directly in your current working directory; they
will be tucked out of the way, which prevents confusion. This also
saves the person doing the extraction the trouble of having to
create a separate directory (should they wish to do so) to unpack
the tar file. Of course, there are plenty of situations where
you wouldn't want to do this. So much for etiquette.
When creating archives, you can, of course, give tar a list of
files or directories to pack into the archive. In the first
example, we have given tar the single directory mt,
but in the previous paragraph we used the wildcard *, which
the shell expands into the list of filenames in the current
directory.
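The difference between the two packing styles shows up clearly in the table of contents. The following sketch creates both kinds of archive from the same small directory; the names nested.tar and flat.tar are made up, and the flat archive is written outside the directory so that tar does not try to pack its own output.

```shell
# Work in a scratch directory so that none of your own files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir mt && echo "read me" > mt/README && echo "int x;" > mt/mt.c

tar cf nested.tar mt                 # names stored as mt/README and mt/mt.c
(cd mt && tar cf ../flat.tar *)      # the shell expands * to README mt.c

tar tf nested.tar
tar tf flat.tar
```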
Before extracting a tar file, it's usually a good idea to take a look
at its table of contents to determine how it was packed. This
way you can determine whether you do need to create a subdirectory yourself
where you can unpack the archive. A command such as:
tar tvf tarfile
lists the table of contents for the named
tarfile.
Note that when using the
t function, only one
v is
required to get the long file listing, as in this example:
courgette% tar tvf mt.tar
drwxr-xr-x root/root 0 Nov 16 19:03 1994 mt/
-rw-r--r-- root/root 11204 Sep 5 13:10 1993 mt/st_info.txt
-rw-r--r-- root/root 847 Sep 21 16:37 1993 mt/README
-rw-r--r-- root/root 2775 Aug 7 09:50 1993 mt/mt.1
-rw-r--r-- root/root 24 Sep 21 16:03 1993 mt/Makefile
-rw-r--r-- root/root 6421 Aug 7 09:50 1993 mt/mt.c
-rw-r--r-- root/root 3948 Nov 16 19:02 1994 mt/mt.o
-rwxr-xr-x root/root 9220 Nov 16 19:03 1994 mt/mt
No extraction is being done here; we're just displaying the archive's
table of contents. We can see from the filenames that this file
was packed with all files in the subdirectory
mt, so that when
we extract the tar file, the directory
mt will be created,
and the files placed there.
You can also extract individual files from a tar archive.
To do this, use the command:
tar xvf tarfile files
where
files is the list of files to extract. As we've
seen, if you don't specify any
files,
tar
extracts the entire archive.
When specifying individual files to extract, you must give the
full pathname as it is stored in the tar file. For example, if
we wanted to grab just the file mt.c from the previous
archive mt.tar, we'd use the command:
tar xvf mt.tar mt/mt.c
This would create the subdirectory
mt and place the file
mt.c within it.
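A short sketch of this, again using a made-up stand-in for the mt directory and an arbitrary output directory named out:

```shell
# Work in a scratch directory so that none of your own files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir mt && echo "read me" > mt/README && echo "int x;" > mt/mt.c
tar cf mt.tar mt

mkdir out
cd out || exit 1
tar tf ../mt.tar                     # check the stored pathnames first
tar xvf ../mt.tar mt/mt.c            # must match the stored pathname exactly
ls mt                                # only mt.c is present
```

Listing the archive first is a convenient way to find the exact pathname to give on the command line.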
tar has many more options than those mentioned here.
These are the features that you're likely to use most of the
time, but GNU tar, in particular,
has extensions that make
it ideal for creating backups and the like. See the tar
manual page and the following section
for more information.
7.1.3. Using tar with gzip
tar does not compress the data stored in its archives in
any way. If you are creating a tar file from three 200K files,
you'll end up with an archive of about 600K. It is common practice
to compress tar archives with gzip (or the older compress
program). You could create a gzipped tar file using the
commands:
tar cvf tarfile files
gzip -9 tarfile
But that's so cumbersome, and requires you to have enough space to
store the uncompressed
tar file before you
gzip it.
A much trickier way to accomplish the same task is to use an interesting
feature of tar that allows you to write an archive to standard
output. If you specify - as the tar file to read or write, the
data will be read from or written to standard input or output. For
example, we can create a gzipped tar file using the command:
tar cvf - files | gzip -9 > tarfile.tar.gz
Here,
tar creates an archive from the named
files and
writes it to standard output; next,
gzip reads the data from
standard input, compresses it, and writes the result to its own standard
output; finally, we redirect the gzipped tar file to
tarfile.tar.gz.
We could extract such a tar file using the command:
gunzip -c tarfile.tar.gz | tar xvf -
gunzip uncompresses the named archive file and writes the result to
standard output, which is read by
tar on standard input and
extracted. Isn't Unix fun?
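Both pipelines can be tried together in a scratch directory. In this sketch the directory unpacked and the file numbers.txt are invented names; the subshell in parentheses merely changes directory before extracting, so that the copy does not land on top of the original.

```shell
# Work in a scratch directory so that none of your own files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir mt && seq 1 5000 > mt/numbers.txt

tar cf - mt | gzip -9 > mt.tar.gz    # no uncompressed tar file is written to disk

mkdir unpacked
gunzip -c mt.tar.gz | (cd unpacked && tar xf -)
cmp mt/numbers.txt unpacked/mt/numbers.txt && echo "pipeline round trip ok"
```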
Of course, both of these commands are rather cumbersome to type.
Luckily, the GNU version of tar provides the z option which
automatically creates or extracts gzipped archives. (We saved
the discussion of this option until now, so you'd truly appreciate its
convenience.) For example, we could use the commands:
tar cvzf tarfile.tar.gz files
and:
tar xvzf tarfile.tar.gz
to create and extract gzipped tar files. Note that you should
name the files created in this way with the
.tar.gz filename
extensions (or the equally often used
.tgz, which
also works on systems with limited filename capabilities), to
make their format obvious. The
z option works
just as well with other tar functions such as
t.
Only the GNU version of tar supports the z option; if you
are using tar on another Unix system, you may have to use one
of the longer commands to accomplish the same tasks.
Nearly all Linux systems use GNU tar.
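As a quick check that the z option and the longer pipeline really do the same job, the sketch below lists one gzipped archive both ways. It assumes a tar that supports z, and the sample directory is again made up.

```shell
# Work in a scratch directory so that none of your own files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir mt && seq 1 5000 > mt/numbers.txt

tar czf mt.tar.gz mt                 # create and compress in one step
tar tzvf mt.tar.gz                   # list the contents without unpacking
gunzip -c mt.tar.gz | tar tf -       # the long form reports the same members
```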
When you want to use tar in
conjunction with bzip2, you need to tell
tar about your compression program preferences like
this:
tar cvf tarfile.tar.bz2 --use-compress-program=bzip2 files...
or, shorter:
tar cvf tarfile.tar.bz2 --use=bzip2 files...
or, shorter still:
tar cvIf tarfile.tar.bz2 files
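The latter version works only with versions of GNU
tar that support the I
option for bzip2. Later GNU tar releases use the j option for this purpose
instead, as in tar cvjf tarfile.tar.bz2 files.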
Keeping this in mind, you could write short shell scripts or
aliases to handle cookbook tar file creation and extraction for you.
Under bash, you could include the following functions in
your .bashrc:
tarc () { tar czvf "$1.tar.gz" "$1"; }
tarx () { tar xzvf "$1"; }
tart () { tar tzvf "$1"; }
With these functions, to create a gzipped tar file from a
single directory, you could use the command:
tarc directory
The resulting archive file would be named
directory.tar.gz.
(Be sure that there's no trailing slash on the directory name; otherwise
the archive will be created as
.tar.gz within the given
directory.)
To list the table of contents of a gzipped tar file, just use:
tart file.tar.gz
Or, to extract such an archive, use:
tarx file.tar.gz
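If the trailing slash bothers you (it tends to appear when the shell completes directory names for you), one possible variation of tarc strips it before building the archive name. This is only a sketch; the variable name dir and the sample directory project are our own choices.

```shell
# Variant of tarc that tolerates a trailing slash on the directory name.
tarc () {
    dir=${1%/}                       # remove one trailing slash, if present
    tar czf "$dir.tar.gz" "$dir"
}

# Try it in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir project && echo "hello" > project/file.txt

tarc project/                        # creates project.tar.gz, not project/.tar.gz
tar tzf project.tar.gz
```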
7.1.4. tar Tricks
Because tar saves the ownership and permissions of files
in the archive and retains the full directory structure, as well
as symbolic and hard links, using tar is an excellent way to
copy or move an entire directory tree from one place to another
on the same system (or even between different systems, as we'll see).
Using the - syntax described earlier, you can write a tar file
to standard output, which is read and extracted on standard input
elsewhere.
For example, say that we have a directory containing two subdirectories:
from-stuff and to-stuff. from-stuff contains an entire
tree of files, symbolic links, and so forth--something that is
difficult to mirror precisely using a recursive
cp. In order
to mirror the entire tree beneath from-stuff to to-stuff,
we could use the commands:
cd from-stuff
tar cf - . | (cd ../to-stuff; tar xvf -)
Simple and elegant, right? We start in the directory
from-stuff
and create a tar file of the current directory, which is written to
standard output. This archive is read by a subshell (the commands
contained within parentheses); the subshell does a
cd to the target directory,
../to-stuff (relative to
from-stuff, that is), and then runs
tar xvf, reading
from standard input. No tar file is ever written to disk; the
data is sent entirely via pipe from one
tar process to another.
The second
tar process has the
v
option that prints
each file as it's extracted; in this way, we can verify that the command
is working as expected.
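Here is a self-contained sketch of the same trick on a tiny tree that includes a symbolic link. The names docs, notes.txt, and latest are invented for the example. It uses && in place of the semicolon so that the second tar does not run in the wrong directory should the cd fail.

```shell
# Work in a scratch directory so that none of your own files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p from-stuff/docs to-stuff
echo "notes" > from-stuff/docs/notes.txt
ln -s docs/notes.txt from-stuff/latest    # a relative symbolic link

# Pack in one subshell, unpack in another; nothing is written in between.
(cd from-stuff && tar cf - .) | (cd to-stuff && tar xvf -)

ls -l to-stuff                            # latest is still a symbolic link
diff -r from-stuff to-stuff && echo "trees match"
```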
In fact, you could transfer directory trees from one machine to
another (via the network) using this trick; just include an appropriate
rsh command within the subshell on the right side of the pipe.
The remote shell would execute tar to read the archive on its
standard input. (Actually, GNU tar has facilities to read or write tar files automatically
from other machines over the network; see the
tar manual page for details.)