On the other hand, tunneling can also allow inherently insecure protocols to cross your firewall. For this reason, it may be advantageous to use a firewall solution that does content-based checking of HTTP connections, so that you can disallow connections that are actually tunneling other protocols. This can be quite difficult to do.
Different programs use different methods of "tunneling". These range from simply running their normal protocol on port 80, to including support for HTTP proxying using the "CONNECT" method (discussed later in the section about HTTP proxying), to actually using HTTP with a data type that the client handles specially.
Some of these are much easier to filter out than others. For instance, almost any content checking, whether it's an intelligent packet filter or an HTTP-aware proxy, will get rid of people running protocols other than HTTP on port 80. Similarly, most HTTP proxies will let you control what destinations can be used with CONNECT, and you should restrict them carefully to just the destinations that you need.
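As a rough illustration of the simplest kind of content checking described above, the following Python sketch tests whether the first bytes seen on a port 80 connection look like an HTTP request line at all. The function name, the regular expression, and the set of permitted methods are assumptions made for this example, not features of any particular firewall product.

```python
import re

# Methods this sketch is willing to pass; a real filter would follow site policy.
# CONNECT is deliberately left out so it can be handled by a separate rule.
ALLOWED_METHODS = {"GET", "HEAD", "POST", "PUT", "DELETE", "OPTIONS"}

# METHOD <target> HTTP/<digit>.<digit> followed by a line ending.
REQUEST_LINE = re.compile(rb"^([A-Z]+) (\S+) HTTP/(\d)\.(\d)\r?\n")


def looks_like_http_request(first_bytes: bytes) -> bool:
    """Return True if the start of a connection resembles an HTTP request."""
    match = REQUEST_LINE.match(first_bytes)
    return bool(match) and match.group(1).decode("ascii") in ALLOWED_METHODS
```

A check like this only catches programs that run a different protocol on port 80; it does nothing about tunnels that are built out of well-formed HTTP requests.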
Tunneling that actually uses HTTP, on the other hand, is very difficult to filter out successfully. In order to get rid of it, you need to do content filtering on the HTTP stream and remove the relevant data types. Relatively few firewalls support this functionality, and it's very difficult to do successfully in any case. The problem is that if you remove only the data types that you know are being used for tunneling, you are setting up a policy that allows connections by default, which is guaranteed to leave you with a continuous stream of new problems. On the other hand, if you accept only data types that you believe to be safe, you are going to have a continuous stream of new complaints from users, because many data types are in use on the web, and they change rapidly.
Fortunately, the uses for tunneling that actually uses HTTP are fairly limited. The HTTP protocol is set up to support interactions that look like normal web browsing; the client sends a query, and the server sends an answer. The client can't send any information except the initial query, which is of limited size. This model works well for tunneling some other protocols (for instance, it's fine for tunneling RealAudio) but poorly for tunneling protocols that need prolonged interaction between the client and the server. This doesn't prevent people from tunneling any protocol they like over HTTP, but it does at least make it more difficult and less efficient.
There is unfortunately no good solution to the general problem of tunneled protocols. Using proxying to make sure that connections are using HTTP, and controlling the use of CONNECT, will at least limit your exposure.
These days, other programs and even hardware devices may provide HTTP interfaces. For instance, you can buy a power strip with a built-in web server, allowing you to turn its outlets on and off from a web browser. These servers do not behave like the servers we have been discussing, and the fact that they speak the HTTP protocol doesn't give you any particularly good idea of what their security vulnerabilities may be.
You will have to assess the security of each of these servers separately. Some of the questions you should ask are:
Many information access services (notably HTTP, WAIS, and Gopher) were designed so that the servers don't have to run on a fixed well-known port on all machines. A standard well-known port was established for each of these services, but the clients and servers are all capable of using alternate ports as well. When you reference one of these servers, you can include the port number it's running on (assuming that it's not the standard port for that service) in addition to the name of the machine it's running on. For example, an HTTP URL of the form http://host.domain.example/file.html is assumed to refer to a server on the standard HTTP port (port 80); if the server were on an alternate port (port 8000, for example), the URL would be written http://host.domain.example:8000/file.html.
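The port rule for URLs can be sketched in a few lines of Python using the standard library's `urlsplit`. The helper name `effective_port` and the small table of default ports are assumptions made for this example.

```python
from urllib.parse import urlsplit

# Standard well-known ports for the services named in the text.
DEFAULT_PORTS = {"http": 80, "gopher": 70, "wais": 210}


def effective_port(url: str) -> int:
    """Return the port a client would connect to for the given URL."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port  # an explicit port in the URL wins
    return DEFAULT_PORTS[parts.scheme]  # otherwise use the service default


print(effective_port("http://host.domain.example/file.html"))
print(effective_port("http://host.domain.example:8000/file.html"))
```

The first call yields the standard port and the second yields the alternate port written in the URL.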
The protocol designers had two valid reasons for designing these services this way:
Some servers also use nonstandard ports to run secondary servers. Traditionally, HTTP proxies use port 8080, and administrative servers use a port number one higher than the server they're controlling (81 for administering a standard web server and 8081 for administering a proxy server).
Your firewall will probably prevent people on your internal network from setting up their own servers at nonstandard ports (you're not going to want to allow inbound connections to arbitrary ports above 1023). You could set up such servers on a bastion host, but wherever possible, it's kinder to other sites to leave your servers on the standard port.
Direction | Source Addr. | Dest. Addr. | Protocol | Source Port | Dest. Port | ACK Set | Notes |
---|---|---|---|---|---|---|---|
In | Ext | Int | TCP | >1023 | 80[41] | [42] | Request, external client to internal server |
Out | Int | Ext | TCP | 80[41] | >1023 | Yes | Response, internal server to external client |
Out | Int | Ext | TCP | >1023 | 80[41] | [42] | Request, internal client to external server |
In | Ext | Int | TCP | 80[41] | >1023 | Yes | Response, external server to internal client |
[41] 80 is the standard port number for HTTP servers, but some servers run on different port numbers.

[42] ACK is not set on the first packet of this type (establishing connection) but will be set on the rest.
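The port and ACK conditions in the table above can be expressed as a small default-deny check. This is only a sketch in Python: the `Packet` class and the function name are invented for the example, and the address columns (Ext and Int) are left out because the port and ACK conditions are the same in both directions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src_port: int
    dst_port: int
    ack: bool  # True if the TCP ACK bit is set


def http_packet_allowed(pkt: Packet, server_port: int = 80) -> bool:
    """Apply the port and ACK conditions from the HTTP packet filtering table."""
    is_request = pkt.src_port > 1023 and pkt.dst_port == server_port
    is_response = pkt.src_port == server_port and pkt.dst_port > 1023
    if is_request:
        return True  # ACK is clear on the first packet and set on the rest
    if is_response:
        return pkt.ack  # a response never opens a connection
    return False  # anything not explicitly allowed is denied
```

Requiring ACK on packets coming from the server port is what stops an outside host from using source port 80 to open new connections to high-numbered internal ports.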
HTTP proxies of various kinds are extremely common, and many incorporate caching, which can provide significant performance advantages for most sites. (A caching proxy is one that makes a copy of the requested data, so that if somebody else requests the same data, the proxy can fulfill the request with the copy instead of going back to the original server to request the data again.) In addition, many sites are worried about the content that people access via HTTP and use proxies to control accessibility (for instance, to prevent access to sites containing pornography, stock prices, or sports scores, all of which are common nonbusiness uses of the web by employees).
Clients that are speaking to HTTP proxy servers use HTTP, but they use slightly different commands from the ones they'd normally use. A client that wants to get the document known as "http://amusinginformation.example/foodle" without using a proxy will connect to the host amusinginformation.example and send a command much like "GET /foodle HTTP/1.1". In order to use an HTTP proxy, the client will connect to the proxy instead and issue the command as "GET http://amusinginformation.example/foodle HTTP/1.1". The proxy will then connect to amusinginformation.example and send "GET /foodle HTTP/1.1" and return the resulting page to the client.
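The translation a proxy performs on the request line can be sketched as follows. The function name `to_origin_request` is an assumption for this example, and the sketch handles only plain http URLs; a real proxy must also deal with headers, errors, and other schemes.

```python
from urllib.parse import urlsplit


def to_origin_request(proxy_request_line: str):
    """Turn a proxy-style request line into (host, port, origin request line)."""
    method, target, version = proxy_request_line.split(" ")
    parts = urlsplit(target)
    if parts.scheme != "http" or not parts.hostname:
        raise ValueError("expected an absolute http:// URL")
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # The proxy connects to host:port and sends the shortened request line.
    return parts.hostname, parts.port or 80, f"{method} {path} {version}"


print(to_origin_request("GET http://amusinginformation.example/foodle HTTP/1.1"))
```

Because the proxy sees the full URL before it connects anywhere, this is also the natural place to apply destination rules.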
Some HTTP proxy servers support commands that normal HTTP servers don't support. For instance, they may allow a client to issue commands like "FTP ftp://amusinginformation.example/foodle" (to have the proxy server transfer the named file via FTP and return it to the client) or "CONNECT amusinginformation.example:873" (to have the proxy server make a TCP connection to the named port and relay information between it and the client). There is no standard for these additional commands, although FTP and CONNECT are two of the most common. Most web browsers will support using an HTTP proxy server for FTP and Gopher connections, and common web proxies (for instance, Microsoft Proxy Server) will support FTP and Gopher.
Some clients that are not web browsers will allow you to use an HTTP proxy server for protocols other than HTTP, and most of them depend on using CONNECT, which makes the HTTP proxy server into a generic proxy. For instance, Lotus Notes and rsync clients both are able to use HTTP proxies to get to their servers via CONNECT.
Using an HTTP proxy server as a generic proxy in this way is convenient but not particularly secure. Few HTTP proxy servers provide any interesting control or logging on the protocols used with CONNECT. You will want to be very restrictive about what protocols you allow this way.
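One way to be restrictive is to accept CONNECT only for an explicit list of host and port pairs and refuse everything else. The Python sketch below assumes a hypothetical allow list and function name; the single entry reuses the example destination from the text and is not a recommendation.

```python
# Hypothetical allow list: only these (host, port) pairs may be reached via CONNECT.
ALLOWED_CONNECT = {("amusinginformation.example", 873)}


def connect_allowed(request_line: str) -> bool:
    """Return True only for CONNECT requests to an explicitly listed destination."""
    parts = request_line.split()
    if len(parts) < 2 or parts[0] != "CONNECT":
        return False
    host, sep, port = parts[1].rpartition(":")
    if not sep or not host or not port.isdecimal():
        return False  # malformed target, refuse it
    return (host.lower(), int(port)) in ALLOWED_CONNECT
```

Listing destinations rather than blocking known-bad ones keeps the policy deny-by-default, so a new service does not become reachable until somebody decides that it should be.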
It's extremely important to prevent external users from connecting to your HTTP proxy servers. If your HTTP proxy server can make inbound connections, external users can use it as a platform to attack internal servers they would not otherwise be able to get to (this is particularly dangerous if they can use CONNECT to get to arbitrary services). Even if the proxy server can't be used this way, it can be used to attack third parties.
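A proxy can enforce this itself, in addition to whatever the packet filters do, by checking each client's address before serving it. The sketch below uses Python's standard `ipaddress` module; the network ranges and the function name are placeholders chosen for the example and would need to match your own internal addressing.

```python
import ipaddress

# Placeholder internal ranges; substitute the networks your site actually uses.
INTERNAL_NETS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def client_may_use_proxy(client_ip: str) -> bool:
    """Return True only if the client address falls inside an internal network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in INTERNAL_NETS)
```

A check like this relies on the source address being trustworthy, so it complements rather than replaces filtering at the network boundary.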
People often search actively for open HTTP proxy servers. Some of these people are hostile and want to use the proxy servers as attack platforms, but some of them just want to use the proxy servers to access web sites that would otherwise be unavailable to them because of filtering rules at their site (or in a few cases, filtering imposed by national governments). Either way, it's probably not to your advantage to let them use your site. Being nice to people behind restrictive filters is tempting, but in the long run, it will merely use up your bandwidth and get you added to the list of filtered sites.
In addition, HTTP clients may provide name and/or IP address information to servers, leaking information about your internal numbering and naming schemes. HTTP clients may provide "From:" headers, telling the server the user's email address (as the user told it to the browser), and proxies may add "Via:" headers indicating the IP addresses of proxies that a request (or response) has passed through.
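If your proxy lets you rewrite outgoing requests, one way to limit this leakage is to drop the revealing headers before the request leaves your site. The following Python sketch assumes headers are held as a list of (name, value) pairs; the function name and the set of headers to drop are choices made for this example. Removing "Via:" outright may not suit every site, and substituting a neutral name for the internal address is another option.

```python
# Header names (lowercase) that this sketch removes from outgoing requests.
SENSITIVE_HEADERS = {"from", "via"}


def scrub_headers(headers):
    """Return the headers with the sensitive ones removed, preserving order."""
    return [
        (name, value)
        for name, value in headers
        if name.lower() not in SENSITIVE_HEADERS  # header names are case-insensitive
    ]
```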
Two defined protocols actually provide privacy using encryption and strong authentication for HTTP. The one that everyone knows is usually called HTTPS and is denoted by using https in the URL. The other, almost unknown, protocol is called Secure HTTP and is denoted by using shttp in the URL.
The goal of HTTPS is to protect your communication channel when retrieving or sending data. HTTPS currently uses TLS and SSL to achieve this. Chapter 14, "Intermediary Protocols", contains more technical information on TLS and SSL.
The goal of Secure HTTP is to protect individual objects rather than the communications channel. This allows, for example, individual pages on a web server to be digitally signed -- a web client can check the signature when the page is downloaded. If someone replaces the page without re-signing, then the signature check will fail, causing an alert to be displayed. Similarly, a secure form that is submitted to a web server can be a self-contained digitally signed object. This means that the object can be stored and used later to prove or dispute the transaction.
The use of Secure HTTP could have significant advantages for the consumer in the world of electronic commerce. If a company claims that it has a digitally signed object indicating your desire to purchase 2,000 rubber chickens but the digital signature doesn't match, then you can argue that you did not make the request. If the signature does match, then it can mean only one of two things: either you requested the chickens, or your private key has been stolen. In contrast, when you use HTTPS, your identity is not bound to the transaction but to the communication channel. This means that HTTPS cannot protect you from someone switching your order for rubber chickens to live ones once it has been made, or from someone simply ordering chickens on your behalf.
Direction | Source Addr. | Dest. Addr. | Protocol | Source Port | Dest. Port | ACK Set | Notes |
---|---|---|---|---|---|---|---|
In | Ext | Int | TCP | >1023 | 443 | [43] | Request, external client to internal server |
Out | Int | Ext | TCP | 443 | >1023 | Yes | Response, internal server to external client |
Out | Int | Ext | TCP | >1023 | 443 | [43] | Request, internal client to external server |
In | Ext | Int | TCP | 443 | >1023 | Yes | Response, external server to internal client |
[43] ACK is not set on the first packet of this type (establishing connection) but will be set on the rest.
Proxying for HTTPS is normally done using the CONNECT primitive (discussed earlier in the section on proxying HTTP). This allows the real client to exchange certificate information with the server, but it also serves as a generic proxy for any protocol running on the ports that the proxy allows for HTTPS. Since HTTPS is encrypted, the proxy can't do any verification on the contents of the connection. You should be cautious about the ports that you allow for HTTPS.