Chapter 7. Indexing
As we saw back on
site.first (see Chapter 3, "Toward a Real Web Site"),
if there is no index.html file in ...
/htdocs, Apache concocts one called "Index of
/", where "/" means the
DocumentRoot directory. For many purposes this
will, no doubt, be enough. But since this jury-rigged index is the
first thing a client sees, you may want to do more.
7.1. Making Better Indexes in Apache
There is a wide range of possibilities; some are demonstrated at
... /site.fancyindex :
User webuser
Group webgroup
ServerName www.butterthlies.com
DocumentRoot /usr/www/site.fancyindex/htdocs
<Directory /usr/www/site.fancyindex/htdocs>
FancyIndexing on
AddDescription "One of our wonderful catalogs" catalog_summer.html catalog_autumn.html
IndexIgnore *.jpg
IndexIgnore ..
IndexIgnore icons HEADER README
AddIconByType (CAT,icons/bomb.gif) text/*
DefaultIcon icons/burst.gif
#AddIcon (DIR,icons/burst.gif) ^^DIRECTORY^^
HeaderName HEADER
ReadMeName README
</Directory>
When you type go on the server and
access http://www.butterthlies.com/ on the
browser, you should see a rather fancy display:
Welcome to BUTTERTHLIES INC
        Name                  Last Modified      Size  Description
--------------------------------------------------------------------
<bomb>  catalog_autumn.html   23-Jul-1998 09:11  1k    One of our wonderful catalogs
<bomb>  catalog_summer.html   25-Jul-1998 10:31  1k    One of our wonderful catalogs
<burst> index.html.ok         23-Jul-1998 09:11  1k
--------------------------------------------------------------------
Butterthlies Inc, Hopeful City, Nevada 99999
(This output is from Apache 1.3; the year is displayed in four-digit
format to cope with the Year 2000 problem.) How does all this work?
As you can see from the httpd.conf file, this
smart formatting is configured directory by directory. The key
directive is IndexOptions.
7.1.1. IndexOptions
IndexOptions option option ...
Server config, virtual host, directory, .htaccess
This directive was altered by the
Apache Group as we went to press with this edition of the book;
therefore, its behavior is different before and after Apache version
1.3.2. The options are as follows:
- FancyIndexing
Turns on fancy indexing of directories (see Section 7.1.2, "FancyIndexing", later in this chapter).
Note that in versions of Apache prior to 1.3.2, the
FancyIndexing and IndexOptions
directives will override each other. You should use
IndexOptions FancyIndexing in preference to the
standalone FancyIndexing directive. As of Apache
1.3.2, a standalone FancyIndexing directive is
combined with any IndexOptions directive already
specified for the current scope.
- IconHeight[=pixels] (Apache 1.3 and later)
The presence of this option, when used with
IconWidth, will cause the server to include HEIGHT
and WIDTH attributes in the <IMG> tag for the file icon. This
allows browsers to precalculate the page layout without having to
wait until all the images have been loaded. If no value is given for
the option, it defaults to the standard height of the icons supplied
with the Apache software.
- IconsAreLinks
This option makes the icons part of the anchor for the filename, for
fancy indexing.
- IconWidth[=pixels] (Apache 1.3 and later)
The presence of this option, when used with
IconHeight, will cause the server to include
HEIGHT and WIDTH attributes in the <IMG> tag for the file icon.
This allows browsers to precalculate the page layout without having
to wait until all the images have been loaded. If no value is given
for the option, it defaults to the standard width of the icons
supplied with the Apache software.
- NameWidth=[n | *] (Apache 1.3.2 and later)
The NameWidth keyword allows you to specify the
width of the filename column in bytes. If the keyword value is
"*", then the column is automatically sized to the
length of the longest filename in the display.
- ScanHTMLTitles
Enables the extraction of the title from HTML documents for fancy
indexing. If the file does not have a description given by
AddDescription, then httpd
will read the document for the value of the <TITLE> tag. This
process is CPU- and disk-intensive.
- SuppressColumnSorting
If specified, Apache will not make the column headings in a fancy
indexed directory listing into links for sorting. The default
behavior is for them to be links; selecting the column heading will
sort the directory listing by the values in that column. Only
available in Apache 1.3 and later.
- SuppressDescription
This option will suppress the file description in fancy indexing
listings.
- SuppressHTMLPreamble (Apache 1.3 and later)
If the directory actually contains a file specified by the
HeaderName directive, the module usually includes
the contents of the file after a standard HTML preamble
(<HTML>, <HEAD>, etc.). The
SuppressHTMLPreamble option disables this
behavior, causing the module to start the display with the header
file contents. The header file must contain appropriate HTML
instructions in this case. If there is no header file, the preamble
is generated as usual.
- SuppressLastModified
This option will suppress the display of the last modification date
in fancy indexing listings.
- SuppressSize
This option will suppress the file size in fancy indexing listings.
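As a rough sketch of how several of these keywords combine, the fragment below turns on fancy indexing, asks for HEIGHT and WIDTH attributes at the default icon size, sizes the name column automatically, and drops the description column. The directory path is only a placeholder, and NameWidth assumes Apache 1.3.2 or later.

```apache
# Illustrative only: /web/docs is a placeholder path
<Directory /web/docs>
    # IconHeight and IconWidth without values use the standard icon size;
    # NameWidth=* fits the column to the longest filename
    IndexOptions FancyIndexing IconHeight IconWidth NameWidth=* SuppressDescription
</Directory>
```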
There are some noticeable differences in the behavior of the
IndexOptions directive in recent (post-1.3.0)
versions of Apache. In Apache 1.3.2 and earlier, the default is that
no options are enabled. If multiple IndexOptions
could apply to a directory, then the most specific one is used in
its entirety; the options are not merged. For example, if the specified
directives are:
<Directory /web/docs>
IndexOptions FancyIndexing
</Directory>
<Directory /web/docs/spec>
IndexOptions ScanHTMLTitles
</Directory>
then only ScanHTMLTitles will be set for the
/web/docs/spec directory.
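On these older versions, one way to keep fancy indexing in the subdirectory as well is simply to repeat the keyword in the more specific block. This is a sketch based on the behavior just described, not a tested recipe:

```apache
<Directory /web/docs/spec>
    # The most specific IndexOptions replaces the others outright,
    # so every option wanted here has to be listed again
    IndexOptions FancyIndexing ScanHTMLTitles
</Directory>
```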
Apache 1.3.3 introduced some significant changes in the handling of
IndexOptions directives. In particular:
- Multiple IndexOptions directives for a single
directory are now merged together. The result of the previous example
will now be the equivalent of IndexOptions
FancyIndexing ScanHTMLTitles.
- Incremental syntax (i.e., prefixing keywords with "+" or
"-") has been added.
Whenever a "+" or "-" prefixed keyword is
encountered, it is applied to the current
IndexOptions settings (which may have been
inherited from an upper-level directory). However, whenever an
unprefixed keyword is processed, it clears all inherited options and
any incremental settings encountered so far. Consider the following
example:
IndexOptions +ScanHTMLTitles -IconsAreLinks FancyIndexing
IndexOptions +SuppressSize
The net effect is equivalent to IndexOptions
FancyIndexing +SuppressSize,
because the unprefixed FancyIndexing discarded the
incremental keywords before it but allowed them to start accumulating
again afterward.
To unconditionally set the IndexOptions for a
particular directory, clearing the inherited settings, specify
keywords without either "+" or "-"
prefixes.
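The following sketch, which assumes Apache 1.3.3 or later, contrasts the two styles across nested directories. The /web/docs/plain directory is hypothetical, and the comments state the result expected from the rules described above.

```apache
<Directory /web/docs>
    IndexOptions FancyIndexing ScanHTMLTitles
</Directory>

<Directory /web/docs/spec>
    # Prefixed keywords adjust the inherited settings:
    # FancyIndexing stays, ScanHTMLTitles goes, SuppressSize is added
    IndexOptions -ScanHTMLTitles +SuppressSize
</Directory>

<Directory /web/docs/plain>
    # An unprefixed keyword clears what was inherited,
    # leaving only FancyIndexing for this directory
    IndexOptions FancyIndexing
</Directory>
```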
7.1.2. FancyIndexing
FancyIndexing on_or_off
Server config, virtual host, directory, .htaccess
FancyIndexing
turns fancy indexing on or off. When it is on, the user can
click on a column title to sort the entries by value. Clicking again
will reverse the sort. Sorting can be turned off with the
SuppressColumnSorting keyword for
IndexOptions (see earlier in this chapter).
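For instance, a listing with fancy indexing but without clickable column headings might be requested like this (a minimal sketch, assuming Apache 1.3 or later, where SuppressColumnSorting is available):

```apache
# Fancy listing whose column headings are plain text rather than sort links
IndexOptions FancyIndexing SuppressColumnSorting
```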
We can specify a description for individual files or for a list of
them. We can exclude files from the listing with
IndexIgnore.
7.1.3. IndexIgnore
IndexIgnore file1 file2 ...
Server config, virtual host, directory, .htaccess
IndexIgnore
is followed by a list of files or
wildcards to describe files. As we see in the following example,
multiple IndexIgnores add to the list rather than
replacing each other. By default, the list includes ".".
Here we want to ignore the *.jpg files (which
are, after all, no use without the .html files
that display them) and the parent directory, known to Unix and to
Win32 as "..":
...
<Directory /usr/www/site.fancyindex/htdocs>
FancyIndexing on
AddDescription "One of our wonderful catalogs" catalog_autumn.html catalog_summer.html
IndexIgnore *.jpg ..
</Directory>
You might want to use
IndexIgnore for security reasons as well: what the
eye doesn't see, the mouse finger can't steal.[51]
[51] Well, OK, you should never rely on this, but it doesn't
hurt, right?
You can put in extra
IndexIgnore lines, and the effects are cumulative,
so we could just as well write:
<Directory /usr/www/site.fancyindex/htdocs>
FancyIndexing on
AddDescription "One of our wonderful catalogs" catalog_autumn.html catalog_summer.html
IndexIgnore *.jpg
IndexIgnore ..
</Directory>
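Along the same lines, editor backups and per-directory configuration files are common candidates for hiding. The patterns below are only illustrative and should be adjusted to whatever actually sits in the directory; as the footnote says, hiding a file from the listing does not stop anyone from requesting it by name.

```apache
# Hypothetical patterns: editor backups and saved copies
IndexIgnore *~ *.bak
# A further line adds to the list rather than replacing it
IndexIgnore .htaccess
```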
We can add visual sparkle to our page, without which success on the
Web is most unlikely, by giving icons to the files with the
AddIcon directive. Apache has more icons than you
can shake a stick at in its ... /icons
directory. Without spending some time exploring, one doesn't
know precisely what each one looks like, but
bomb.gif sounds promising. The
icons directory needs to be specified relative
to the DocumentRoot directory, so we have made a
subdirectory ... /htdocs/icons and copied
bomb.gif into it. We can attach the bomb icon to
all displayed .html files with:
...
AddIcon icons/bomb.gif .html
7.1.4. AddIcon
AddIcon icon_name name
Server config, virtual host, directory, .htaccess
AddIcon
expects the URL of an icon, followed
by a file extension, a wildcard expression, a partial filename, or a
complete filename to describe the files to which the icon will be
added. We can iconify subdirectories off the
DocumentRoot with
^^DIRECTORY^^, or make blank lines format properly
with ^^BLANKICON^^. Since we have the convenient
icons directory to practice with, we can iconify
it with:
AddIcon /icons/burst.gif ^^DIRECTORY^^
Or we can make it disappear with:
...
IndexIgnore icons
...
Not all
browsers can display icons. We can cater to those that cannot by
providing a text alternative alongside the icon URL:
AddIcon (DIR,/icons/burst.gif) ^^DIRECTORY^^
This line will print the word DIR where the
burst icon would have appeared to mark a
directory (that is, the text becomes the ALT
text of the icon's <IMG> tag). You could, if you wanted, print
the word "Directory" or "This is a
directory." The choice is yours.
Examples:
AddIcon (IMG,/icons/image.xbm) .gif .jpg .xbm
AddIcon /icons/dir.xbm ^^DIRECTORY^^
AddIcon /icons/backup.xbm *~
AddIconByType should be used in preference to
AddIcon, when possible.
7.1.5. AddAlt
AddAlt string file file ...
Server config, virtual host, directory, .htaccess
AddAlt
sets alternate text to display for
the file if the client's browser can't display an icon.
The string must be
enclosed in double quotes.
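A short sketch of how this might look; the strings and file patterns are made up for illustration:

```apache
# Alternate text for the two catalog pages
AddAlt "Catalog page" catalog_autumn.html catalog_summer.html
# Alternate text matched by wildcard
AddAlt "Compressed file" *.gz *.Z
```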
7.1.6. AddDescription
AddDescription string file1 file2 ...
Server config, virtual host, directory, .htaccess
AddDescription
expects a description string in
double quotes, followed by a file extension, partial filename,
wildcards, or full filename:
<Directory /usr/www/site.fancyindex/htdocs>
FancyIndexing on
AddDescription "One of our wonderful catalogs" catalog_autumn.html catalog_summer.html
IndexIgnore *.jpg
IndexIgnore ..
AddIcon (CAT,icons/bomb.gif) .html
AddIcon (DIR,icons/burst.gif) ^^DIRECTORY^^
AddIcon icons/blank.gif ^^BLANKICON^^
DefaultIcon icons/blank.gif
</Directory>
Having achieved these wonders, we might now want to be a bit more
sensible and choose our icons by MIME type using the
AddIconByType directive.
7.1.7. DefaultIcon
DefaultIcon url
Server config, virtual host, directory, .htaccess
DefaultIcon
sets a default icon to display for
unknown file types. url
points to the icon.
7.1.8. AddIconByType
AddIconByType icon mime_type1 mime_type2 ...
Server config, virtual host, directory, .htaccess
AddIconByType
takes as an argument an icon URL,
followed by a list of
MIME types. Apache looks for the type
entry in mime.types, either with or without a
wildcard. We have the following MIME types:
...
text/html                  html htm
text/plain                 txt
text/richtext              rtx
text/tab-separated-values  tsv
text/x-setext              etx
...
So, we could have one icon for all text files by including the line:
AddIconByType (TXT,icons/bomb.gif) text/*
Or we could be more specific, using four icons,
a.gif, b.gif,
c.gif, and d.gif :
AddIconByType (TXT,/icons/a.gif) text/html
AddIconByType (TXT,/icons/b.gif) text/plain
AddIconByType (TXT,/icons/c.gif) text/tab-separated-values
AddIconByType (TXT,/icons/d.gif) text/x-setext
Let's try out the simpler case:
<Directory /usr/www/site.fancyindex/htdocs>
FancyIndexing on
AddDescription "One of our wonderful catalogs" catalog_autumn.html catalog_summer.html
IndexIgnore *.jpg
IndexIgnore ..
AddIconByType (CAT,icons/bomb.gif) text/*
AddIcon (DIR,icons/burst.gif) ^^DIRECTORY^^
</Directory>
For a further refinement, we can use
AddIconByEncoding to give a special icon to
encoded files.
7.1.9. AddAltByType
AddAltByType string mime_type1 mime_type2 ...
Server config, virtual host, directory, .htaccess
AddAltByType
provides a text string for the
browser to display if it cannot show an icon. The string must be
enclosed in double quotes.
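For example, something along these lines should label files by MIME type when no icon can be shown (the strings are arbitrary choices for illustration):

```apache
# Alternate text chosen by MIME type rather than by filename
AddAltByType "Web page" text/html
AddAltByType "Plain text" text/plain
```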
7.1.10. AddIconByEncoding
AddIconByEncoding icon mime_encoding1 mime_encoding2 ...
Server config, virtual host, directory, .htaccess
AddIconByEncoding
takes an icon name followed by a list
of MIME encodings. For instance, x-compress files
can be iconified with:
...
AddIconByEncoding (COMP,/icons/d.gif) x-compress
...
7.1.11. AddAltByEncoding
AddAltByEncoding string mime_encoding1 mime_encoding2 ...
Server config, virtual host, directory, .htaccess
AddAltByEncoding
provides a text
string for the browser to display if it can't put up an icon.
The string must be enclosed in double
quotes.
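The encoding directives only match files whose encoding Apache knows about, which is normally declared with AddEncoding. The sketch below assumes the usual extension mappings and an icon called compressed.gif under /icons; both are assumptions about the local setup rather than anything this site requires.

```apache
# Assumed mappings from file extension to MIME encoding
AddEncoding x-compress Z
AddEncoding x-gzip gz

# Icon (with inline alternate text) for both encodings
AddIconByEncoding (CMP,/icons/compressed.gif) x-compress x-gzip

# Separate alternate text per encoding
AddAltByEncoding "Compressed" x-compress
AddAltByEncoding "Gzipped" x-gzip
```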
Next, in our relentless drive for perfection, we can print standard
headers and footers to our menus with the
HeaderName and ReadmeName
directives.
7.1.12. HeaderName
HeaderName filename
Server config, virtual host, directory, .htaccess
This
directive inserts a header, read from
filename, at the top of the index. The
name of the file is taken to be relative to the directory being
indexed. Apache will look first for
filename.html
and, if that is not found, then
filename.
7.1.13. ReadmeName
ReadmeName filename
Server config, virtual host, directory, .htaccess
filename
is taken to be the name of the file
to be included, relative to the directory being indexed. Apache tries
to include
filename.html as an
HTML document and, if that fails, as text.
If we simply call the file HEADER, Apache will
look first for HEADER.html and display it if
found. If not, it will look for HEADER and
display that. The HEADER file can be:
Welcome to BUTTERTHLIES, Inc.
and the README file:
Butterthlies Inc., Hopeful City, Nevada 99999
to correspond with our index.html. We
don't want HEADER and
README to appear in the menu themselves, so we
add them to the IndexIgnore directive:
<Directory /usr/www/site.fancyindex/htdocs>
FancyIndexing on
AddDescription "One of our wonderful catalogs" catalog_autumn.html catalog_summer.html
IndexIgnore *.jpg
IndexIgnore .. icons HEADER README
AddIconByType (CAT,icons/bomb.gif) text/*
AddIcon (DIR,icons/burst.gif) ^^DIRECTORY^^
HeaderName HEADER
ReadMeName README
</Directory>
Since HEADER and README can
be HTML documents, you can wrap the directory listing up in a whole lot
of fancy interactive stuff if you want.
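If the header is to supply the whole top of the page itself, the SuppressHTMLPreamble option described earlier comes into play. A sketch, assuming Apache 1.3 or later and a HEADER.html that opens the document properly with its own <HTML>, <HEAD>, and <BODY> tags:

```apache
<Directory /usr/www/site.fancyindex/htdocs>
    # Apache omits its standard preamble, so HEADER.html must start the page
    IndexOptions FancyIndexing SuppressHTMLPreamble
    HeaderName HEADER
    ReadmeName README
</Directory>
```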
But, on the whole, FancyIndexing is just a cheap
and cheerful way of getting something up on the Web. For an elegant
Net solution, study the next section.
Copyright © 2001 O'Reilly & Associates. All rights reserved.