Chapter 9. Proxy Server
An important concern on the Web is
keeping the Bad Guys out of your network (see Chapter 13, "Security"). One established technique is to keep the
network hidden behind a firewall; this works well, but as soon as you
do it, it also means that everyone on the same network suddenly finds
that their view of the Net has disappeared (rather like people living
near Miami Beach before and after the building boom). This becomes an
urgent issue at Butterthlies, Inc., as competition heats up and
naughty-minded Bad Guys keep trying to break our security and get in.
We install a firewall and, anticipating the instant outcries from the
marketing animals who need to get out on the Web and surf for prey,
we also install a proxy server to get them out there.
So, in addition to the Apache that serves clients visiting our sites
and is protected by the firewall, we need a copy of Apache to act as
a proxy server to let us, in our turn, access other sites out on the
Web. Without the proxy server, those inside are safe but blind.
9.1. Proxy Directives
We are not concerned here with firewalls, so we take them for granted.
The interesting thing is how we configure the proxy Apache to make
life with a firewall tolerable to those behind it.
site.proxy has three subdirectories:
cache, proxy, and real. The Config file from
.../site.proxy/proxy is as follows:
User webuser
Group webgroup
ServerName www.butterthlies.com
Port 8000
ProxyRequests on
CacheRoot /usr/www/site.proxy/cache
CacheSize 100000
The points to notice are:
On this site we use ServerName www.butterthlies.com.
The Port number is set to 8000 so that we can change proxies without having to change users' Configs.
We turn ProxyRequests on and provide a directory for the cache, which we will discuss later in this chapter.
CacheRoot is set up in a special directory.
CacheSize is set to 100000 kilobytes.
9.1.1. ProxyRequests
ProxyRequests [on|off]
Default: off
Server config
This directive turns proxy serving on. Even
if ProxyRequests is off,
ProxyPass directives are still honored.
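As a minimal sketch of that last point, the fragment below leaves general proxy serving off while one explicit mapping stays in force. It is not part of the site.proxy Config; the path and host are simply borrowed from the ProxyPass example in this chapter for illustration.

```apache
# Assumed fragment, not from site.proxy: ordinary proxy requests are refused...
ProxyRequests off
# ...but this explicit mapping is still honored.
ProxyPass /secrets http://darkstar.com
```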
9.1.2. ProxyRemote
ProxyRemote match remote-server
Server config
This directive defines remote proxies to this proxy.
match is either the name of a URL
scheme that the remote server supports, a partial URL for which the
remote server should be used, or " * " to indicate that
the server should be contacted for all requests.
remote-server is a partial URL for the remote server, written as
protocol://hostname[:port], where protocol is the
protocol that should be used to communicate with the remote server.
Currently, only HTTP is supported by this module. For example:
ProxyRemote ftp http://ftpproxy.mydomain.com:8080
ProxyRemote http://goodguys.com/ http://mirrorguys.com:8000
ProxyRemote * http://cleversite.com
9.1.3. ProxyPass
ProxyPass path url
Server config
This directive runs on an ordinary server and translates requests for a
named directory and below into proxied requests to another server. So, on our
ordinary Butterthlies site, we might want to pass requests for
/secrets on to the server
darkstar.com:
ProxyPass /secrets http://darkstar.com
Unfortunately, this is less useful than it might appear, since the
proxy does not modify the HTML returned by
darkstar.com. This means that URLs embedded in the
HTML will refer to documents on the main server unless they have been
written carefully. For example, suppose a document
one.html is stored on
darkstar.com with the URL
http://darkstar.com/one.html, and we want it to
refer to another document in the same directory. Then the following
links will work, when accessed as
http://www.butterthlies.com/secrets/one.html:
<A HREF="two.html">Two</A>
<A HREF="/secrets/two.html">Two</A>
<A HREF="http://darkstar.com/two.html">Two</A>
But this example will not work:
<A HREF="/two.html">Not two</A>
When accessed directly, through
http://darkstar.com/one.html, these links work:
<A HREF="two.html">Two</A>
<A HREF="/two.html">Two</A>
<A HREF="http://darkstar.com/two.html">Two</A>
But the following doesn't:
<A HREF="/secrets/two.html">Two</A>
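One way to sidestep the mismatch, sketched here on the assumption that we control the layout of darkstar.com, is to keep the same path on both servers. The /secrets directory on darkstar.com is hypothetical; nothing in the example above says it exists.

```apache
# Assumption: the documents also live under /secrets on darkstar.com,
# so a server-relative link means the same thing on both hosts.
ProxyPass /secrets http://darkstar.com/secrets
```

With a mapping like this, a link written as /secrets/two.html should resolve whether one.html is fetched through www.butterthlies.com or directly from darkstar.com, while a link to /two.html would fail in both cases.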
9.1.4. ProxyDomain
ProxyDomain Domain
Server config
This
directive is only useful for Apache proxy servers within intranets.
The ProxyDomain directive specifies the default
domain to which the Apache proxy server will belong. If a request to
a host without a domain name is encountered, a redirection response
to the same host with the configured
Domain appended will be generated.
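For illustration only, a sketch that assumes the intranet's domain is butterthlies.com (this line is not in the site.proxy Config shown earlier):

```apache
ProxyRequests on
# Assumed intranet domain. A request for http://www/ would be answered
# with a redirect to http://www.butterthlies.com/
ProxyDomain .butterthlies.com
```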
9.1.5. NoProxy
NoProxy { Domain | SubNet | IpAddr | Hostname }
Server config
This
directive is only useful for Apache proxy servers within intranets.
The NoProxy directive specifies a list of subnets,
IP addresses, hosts, and/or domains, separated by spaces. A request
to a host that matches one or more of these is always served
directly, without forwarding to the configured
ProxyRemote proxy server(s).
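A sketch of how this might look alongside ProxyRemote. The firewall hostname, its port, and the subnet are made-up values chosen for the example, not settings from our site:

```apache
# Hypothetical upstream proxy running on the firewall host
ProxyRemote * http://firewall.butterthlies.com:81
# Requests for intranet hosts and for this subnet are served directly
# rather than being forwarded to the proxy above
NoProxy .butterthlies.com 192.168.112.0/21
```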
9.1.6. ProxyPassReverse
ProxyPassReverse path url
Server config, virtual host
A
reverse proxy is a way to share load between several
servers -- the frontend server simply accepts requests and
forwards them to one of several backend servers. The optional module
mod_rewrite has some special stuff in it to
support this. This directive lets Apache adjust the URL in the
Location response header. If a
ProxyPass (or mod_rewrite)
has been used to do reverse proxying, then this directive will
rewrite Location headers coming back from the
reverse proxied server so that they look as if they came from
somewhere else (normally this server, of course).
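A minimal sketch of the pairing, assuming a frontend whose ServerName is www.butterthlies.com and a hypothetical backend host and path:

```apache
# Hypothetical backend host; /mirror/ is an arbitrary local path
ProxyPass        /mirror/ http://backend.butterthlies.com/
ProxyPassReverse /mirror/ http://backend.butterthlies.com/
```

Under these assumptions, a redirect from the backend carrying Location: http://backend.butterthlies.com/new/ would be passed to the client as http://www.butterthlies.com/mirror/new/, so the client keeps talking to the frontend.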
Copyright © 2001 O'Reilly & Associates. All rights reserved.