11.5. Logging the Action
Apache
offers a wide range of options for controlling the format of the log
files. In line with current thinking, older methods
(RefererLog, AgentLog, and
CookieLog) have now been replaced by the
config_log_module. To illustrate this, we have
taken ... /site.authent and copied it to
... /site.logging so that we can play with the
logs:
User webuser
Group webgroup
ServerName www.butterthlies.com
IdentityCheck on
NameVirtualHost 192.168.123.2
<VirtualHost www.butterthlies.com>
LogFormat "customers: host %h, logname %l, user %u, time %t, request %r, status %s,bytes %b,"
CookieLog logs/cookies
ServerAdmin sales@butterthlies.com
DocumentRoot /usr/www/site.logging/htdocs/customers
ServerName www.butterthlies.com
ErrorLog /usr/www/site.logging/logs/customers/error_log
TransferLog /usr/www/site.logging/logs/customers/access_log
ScriptAlias /cgi_bin /usr/www/cgi_bin
</VirtualHost>
<VirtualHost sales.butterthlies.com>
LogFormat "sales: agent %{User-Agent}i, cookie: %{Cookie}i, referer: %{Referer}i, host %!200h, logname %!200l, user %u, time %t, request %r, status %s,bytes %b,"
CookieLog logs/cookies
ServerAdmin sales_mgr@butterthlies.com
DocumentRoot /usr/www/site.logging/htdocs/salesmen
ServerName sales.butterthlies.com
ErrorLog /usr/www/site.logging/logs/salesmen/error_log
TransferLog /usr/www/site.logging/logs/salesmen/access_log
ScriptAlias /cgi_bin /usr/www/cgi_bin
<Directory /usr/www/site.logging/htdocs/salesmen>
AuthType Basic
AuthName darkness
AuthUserFile /usr/www/ok_users/sales
AuthGroupFile /usr/www/ok_users/groups
require valid-user
</Directory>
<Directory /usr/www/cgi_bin>
AuthType Basic
AuthName darkness
AuthUserFile /usr/www/ok_users/sales
AuthGroupFile /usr/www/ok_users/groups
#AuthDBMUserFile /usr/www/ok_dbm/sales
#AuthDBMGroupFile /usr/www/ok_dbm/groups
require valid-user
</Directory>
</VirtualHost>
The directives that control logging are as follows.
11.5.1. ErrorLog
ErrorLog filename|syslog[:facility]
Default: ErrorLog logs/error_log
Server config, virtual host
The ErrorLog
directive sets the name of the file to which the server will log any
errors it encounters. If the filename does not begin with a slash
("/"), it is assumed to be relative to the server root.
If the filename begins with a pipe ("|"), it is assumed
to be a command to spawn to handle the error log.
Apache 1.3 and above: Using syslog instead of a
filename enables logging via syslogd(8) if the
system supports it. The default is to use syslog
facility local7, but you can override this by
using the
syslog:facility syntax,
where facility can be one of the names
usually documented in syslog(1).
Your security could be compromised if the directory where log files
are stored is writable by anyone other than the user who starts the
server.
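As a minimal sketch of the three forms described above, the following fragment shows a plain file, a syslog facility, and a piped command. The facility local6 and the program path are assumptions made for illustration and are not part of the example site; only one ErrorLog line would be active in a given context.

```apache
# Plain file; no leading slash, so it is taken relative to the server root
ErrorLog logs/error_log

# Alternative: log through syslogd, using facility local6 rather than the default local7
#ErrorLog syslog:local6

# Alternative: spawn a program to handle the error log (hypothetical path)
#ErrorLog "|/usr/local/bin/errorlogger"
```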
11.5.2. TransferLog
TransferLog [ file | '|' command ]
Default: none
Server config, virtual host
TransferLog
specifies the file in which to store the log of accesses to the site.
If it is not explicitly included in the Config file, no log will be
generated.
- file
A filename relative to the server root (if it doesn't start
with a slash), or an absolute path (if it does).
- command
A program to receive the access log information on its standard input.
Note that a new program is not started for a virtual host if it
inherits the TransferLog from the main server. If
a program is used, it runs using the permissions of the user who
started httpd. This is root if the server was
started by root, so be sure the program is
secure. A useful Unix program to send to is
rotatelogs,[56] which can be found
in the Apache support subdirectory. It closes
the log periodically and starts a new one, and is useful for
long-term archiving and log processing. Traditionally, this is done
by shutting Apache down, moving the logs elsewhere, and then
restarting Apache, which is obviously no fun for the clients
connected at the time!
[56] Written by one of the authors of this book (BL).
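A piped TransferLog using rotatelogs might look like the following sketch. The location of the rotatelogs binary is an assumption and depends on where Apache was installed. The second argument is the rotation interval in seconds.

```apache
# Hypothetical install path for rotatelogs; 86400 seconds means a new log each day
TransferLog "|/usr/local/apache/bin/rotatelogs /usr/www/site.logging/logs/customers/access_log 86400"
```

Because the program is started by the server and inherits its permissions, the caution above about who can write to the log directory applies here too.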
11.5.3. LogFormat
LogFormat format_string [nickname]
Default: "%h %l %u %t \"%r\" %s %b"
Server config, virtual host
LogFormat
sets the information to be included
in the log file and the way in which it is written. The default
format is the Common Log Format (CLF), which is expected by
off-the-shelf log analyzers such as wusage
(http://www.boutell.com/ ) or
ANALOG, so if you want to use one of them, leave
this directive alone.[57]
The CLF format is:
host ident authuser date request status bytes
[57] Actually, some log analyzers
support some extra information in the log file, but you need to read
the analyzer's documentation for details.
- host
Domain name of the client or its IP number.
- ident
If IdentityCheck is enabled and the client machine
runs identd, then this is the identity
information reported by the client.
- authuser
If the request was for a password-protected document, then this is
the user ID.
- date
The date and time of the request, in the following format:
[day/month/year:hour:minute:second
tzoffset].
- request
Request line from client, in double quotes.
- status
Three-digit status code returned to the client.
- bytes
The number of bytes returned, excluding headers.
The log format can be customized using a
format_string. The commands in it have the
format %[condition]key_letter; the condition need not be
present. If it is, and the specified condition is not met, the output
will be a "-". The key_letters are as follows:
- b
Bytes sent.
- {env_name}e
The value of the environment variable
env_name.
- f
The filename being served.
- a
Remote IP address.
- h
Remote host.
- {header_name}i
Contents of header_name: header line(s)
in the request sent from the client.
- l
Remote log name (from identd, if supplied).
- {note_name}n
The value of a note. A note is a named entry
in a table used internally in Apache for passing information between
modules.
- {header_name}o
The contents of the header_name header
line(s) in the reply.
- P
The PID of the child Apache handling the request.
- p
The server port.
- r
First line of request.
- s
Status: for requests that were internally redirected, this is the
status of the original request.
- >s
Status of the last request.
- t
Time, in common log time format.
- U
The URL requested.
- u
Remote user (from auth; this may be bogus if
return status [%s] is 401).
- v
The server virtual host.
The format string can have ordinary text of your choice in it in
addition to the %
directives.
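To show how the key letters combine, here is a sketch of a format that extends the CLF with two request headers and stores it under a nickname. The nickname combined is just a label chosen for this example. Double quotes inside the format string are escaped with backslashes, as in the default shown earlier.

```apache
# CLF fields, then the Referer and User-Agent request headers, each in quotes
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
```

When a nickname is given, the directive only defines the nickname for later use; it should not change the format applied by TransferLog.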
11.5.4. CustomLog
CustomLog file|pipe format|nickname
Server config, virtual host
The first argument is the filename to which log records should be
written. This is used exactly like the argument to
TransferLog; that is, it is either a full path, a path
relative to the current server root, or a pipe to a program.
The format argument specifies a format for each line of the log file.
The options available for the format are exactly the same as for the
argument of the LogFormat directive. If the format
includes any spaces (which it will do in almost all cases), it should
be enclosed in double quotes.
Instead of an actual format string, you can use a format nickname
defined with the LogFormat directive.
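Both ways of supplying the format can be sketched as follows. The filenames and the nickname common are assumptions for illustration.

```apache
# Define a nickname with LogFormat, then refer to it from CustomLog
LogFormat "%h %l %u %t \"%r\" %>s %b" common
CustomLog logs/access_log common

# Or give the format string directly; it contains spaces, so it is quoted
CustomLog logs/status_log "%h %>s %U"
```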
11.5.5. site.authent -- Another Example
site.authent
is set up with two
virtual hosts, one for customers and one for salespeople, and each
has its own logs in ... /logs/customers and
... /logs/salesmen. We can follow that scheme
and apply one LogFormat to both, or each can have
its own logs with its own LogFormats inside the
<VirtualHost> directives. They can also have
common log files, set up by moving ErrorLog and
TransferLog outside the
<VirtualHost> sections, with different
LogFormats within the sections to distinguish the
entries. In this last case, the LogFormat directives
could look like this:
<VirtualHost www.butterthlies.com>
LogFormat "Customer:..."
...
</VirtualHost>
<VirtualHost sales.butterthlies.com>
LogFormat "Sales:..."
...
</VirtualHost>
Let's experiment with a format for customers, leaving
everything else the same:
<VirtualHost www.butterthlies.com>
LogFormat "customers: host %h, logname %l, user %u, time %t, request %r, status %s,bytes %b,"
...
We have inserted the words host,
logname, and so
on, to make it clear in the file what is doing what. In real life you
probably wouldn't want to clutter the file up in this way
because you would look at it regularly and remember what was what,
or, more likely, process the logs with a program that would know the
format. Logging on to www.butterthlies.com and
going to summer catalog
produces this log file:
customers: host 192.168.123.1, logname unknown, user -, time [07/Nov/1996:14:28:46 +0000], request GET / HTTP/1.0, status 200,bytes -
customers: host 192.168.123.1, logname unknown, user -, time [07/Nov/1996:14:28:49 +0000], request GET /hen.jpg HTTP/1.0, status 200,bytes 12291,
customers: host 192.168.123.1, logname unknown, user -, time [07/Nov/1996:14:29:04 +0000], request GET /tree.jpg HTTP/1.0, status 200,bytes 11532,
customers: host 192.168.123.1, logname unknown, user -, time [07/Nov/1996:14:29:19 +0000], request GET /bath.jpg HTTP/1.0, status 200,bytes 5880,
This is not too difficult to follow. Notice that while we have
logname unknown, the user is
"-", the usual report for an unknown value. This is
because customers do not have to give an ID; the same log for
salespeople, who do, would have a value here.
We can improve things by inserting lists of conditions based on the
status codes after the % and before the key
letter. The status codes are defined in the HTTP/1.0
specification:
200 OK
302 Found
304 Not Modified
400 Bad Request
401 Unauthorized
403 Forbidden
404 Not found
500 Server error
501 Not Implemented
502 Bad Gateway
503 Out of resources
The list from HTTP/1.1 is as follows:
100 Continue
101 Switching Protocols
200 OK
201 Created
202 Accepted
203 Non-Authoritative Information
204 No Content
205 Reset Content
206 Partial Content
300 Multiple Choices
301 Moved Permanently
302 Moved Temporarily
303 See Other
304 Not Modified
305 Use Proxy
400 Bad Request
401 Unauthorized
402 Payment Required
403 Forbidden
404 Not Found
405 Method Not Allowed
406 Not Acceptable
407 Proxy Authentication Required
408 Request Time-out
409 Conflict
410 Gone
411 Length Required
412 Precondition Failed
413 Request Entity Too Large
414 Request-URI Too Large
415 Unsupported Media Type
500 Internal Server Error
501 Not Implemented
502 Bad Gateway
503 Service Unavailable
504 Gateway Time-out
505 HTTP Version not supported
You can use "!" before a code to mean
"if not." !200 means "log this
if the response was not OK." Let's
put this in salesmen:
<VirtualHost sales.butterthlies.com>
LogFormat "sales: host %!200h, logname %!200l, user %u, time %t, request %r, status %s,bytes %b,"
...
An attempt to log in as fred with the password
don't know produces the
following entry:
sales: host 192.168.123.1, logname unknown, user fred, time [19/Aug/1996:07:58:04 +0000], request GET HTTP/1.0, status 401, bytes -
However, if it had been the infamous Bill with the password
theft, we would see:
host -, logname -, user bill, ...
because we asked for host and logname to be logged only if the
request was not OK. We can combine more than one condition, so that
if we only want to know about security problems on sales, we could
log usernames only if they failed to authenticate:
LogFormat "sales: bad user: %400,401,403u"
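Condition lists work with any key letter, including the header forms. The following sketch writes two small logs using a positive and a negated list; the filenames are hypothetical.

```apache
# Record the browser only when the response was 400 or 501
CustomLog logs/bad_agents "%400,501{User-Agent}i"

# Record the referring page for every response except 200, 302, and 304
CustomLog logs/odd_referers "%!200,302,304{Referer}i"
```

For requests that do not meet the condition, the field is written as "-".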
We can also extract data from the HTTP headers in both directions:
%[condition]{user-agent}i
prints the user agent (i.e., the software the client is running) if
condition is met. The old way of doing
this was AgentLog
logfile and RefererLog
logfile.
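Under the newer scheme, the effect of those two older directives can be approximated with one CustomLog line each. This is a sketch; the filenames are assumptions.

```apache
# Roughly what AgentLog used to record: the client's User-Agent header
CustomLog logs/agent_log "%{User-Agent}i"

# Roughly what RefererLog used to record: the referring page and the URL it led to
CustomLog logs/referer_log "%{Referer}i -> %U"
```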
Copyright © 2001 O'Reilly & Associates. All rights reserved.