Friday, July 24, 2009

Setting Up Transparent Proxy

This is an excellent article on setting up a transparent web proxy with antivirus checking and URL blacklisting. Republished with permission from Fulvio Ricci. Many thanks.
-Jd

The purpose of this document is to describe the creation of a Web Proxy with antivirus check of web pages and site blacklisting/whitelisting. The document is divided into the following sections:
  • Why use a web proxy with antivirus check?
  • Transparent Proxy Mode
  • Configuration and activation of the proxy service

Why use a web proxy with antivirus?

Web pages are more and more frequently the means by which worms and viruses spread on the Internet. Websites sometimes contain references to executable code that can infect users' computers, either intentionally or because the sites are vulnerable and have been modified without the knowledge of their legitimate authors. Moreover, the situation has worsened since a number of vulnerabilities in image display systems have allowed viruses to be carried in JPEG files. Lastly, the growing use of Java applets is increasing the number of multiplatform viruses that spread via http and operate regardless of the platform (PC, palmtop, mobile phone) or operating system on which they run.

The best solution to this type of problem is to equip every client device that connects to the Internet with a good antivirus program with real-time protection that checks every incoming file. However, this may not be enough, for two reasons. First, no antivirus program, even one with self-updating signatures, can guarantee 100% protection against every virus. Second, real-time checking of incoming content is computationally expensive; on low-performance devices in particular, it can slow the system down to the point where users disable real-time protection.

For these reasons, virus checking is increasingly done upstream, before potential viruses can reach the user's client. In other words, centralized antivirus systems are deployed on the servers that offer a particular service. The most widespread example is e-mail servers, which run a system that analyzes incoming and outgoing SMTP messages and scans attachments for viruses. In this case, applying the antivirus check on an SMTP gateway is quite natural, since e-mails must pass through it before reaching the user's inbox.
For the http service, this is not so straightforward, since a LAN client may potentially connect directly to any web server on the Internet. The solution is to introduce an application-level gateway on the LAN that collects client http requests and forwards them to the relevant web servers. This application gateway is called a Web Proxy. Because it can interpret the http protocol, it not only filters on the basis of URLs, but also inspects the content being carried (HTML, JavaScript, Java applets, images, ...) and scans it for viruses.

One of the most common functions of proxies so far has been web caching, that is, storing on disk the web pages that have already been visited in order to speed up later requests for the same URLs. Caching also reduces Internet bandwidth consumption. One of the best-known proxy systems capable of web caching is Squid, which is distributed under an open source license.

Transparent Proxy Mode

One of the biggest problems when using a proxy server is that of configuring all web browsers to use it. It is therefore necessary to specify its IP address or host name and the TCP port on which it responds (usually port 8080). This could be burdensome in the case of LANs with numerous users, but even worse, it might not guarantee against users removing this configuration to gain direct access to the web, thus avoiding antivirus check, access logging and blacklists.
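To make the per-client burden concrete, here is a minimal sketch of what explicit (non-transparent) proxy configuration looks like for a single client program, using Python's standard urllib. The proxy address 192.168.0.1 is a hypothetical example, not a value from this article; only port 8080 comes from the text above.

```python
import urllib.request

# Hypothetical proxy address; 8080 is the usual port mentioned above.
PROXY_URL = "http://192.168.0.1:8080"


def build_proxied_opener(proxy_url=PROXY_URL):
    """Return an opener that sends plain http requests through the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


# Usage (needs a reachable proxy, so it is left commented out):
# opener = build_proxied_opener()
# html = opener.open("http://www.example.com/").read()
```

Every browser or program on every client needs an equivalent setting, and any user can remove it again, which is exactly the weakness described above.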

To solve this problem, you can use Transparent Proxy mode, which automatically captures client requests on TCP port 80. Obviously, to be able to capture these web requests, the proxy box must be configured as a network gateway, so that client Internet traffic goes through it. It will automatically capture http requests whether it is a layer 2 gateway (a bridge between Ethernet, WiFi or VPN interfaces) or a layer 3 gateway (a router). It is nevertheless important to specify the network interfaces or IP subnets from which these requests are to be redirected. This is done by adding so-called HTTP Capturing Rules.

There may be several reasons to exclude some clients and some web servers from the transparent proxy. For example, a web server may use ACLs to restrict access to clients with certain IP addresses. If the proxy captured requests to that server, the server would see them as coming from the proxy's IP address and would deny access. On the other hand, authorizing the proxy's IP address in the web server's ACLs is not an option, since it would grant indiscriminate access to all clients using the proxy. The only solution, then, is to prevent the transparent proxy from capturing those requests.
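The capture decision described in this section can be sketched roughly as follows. This is only an illustration of the logic, not the way the rules are actually implemented on the box; the function name, its parameters and the sample addresses are all hypothetical.

```python
from ipaddress import ip_address, ip_network


def should_capture(src_ip, dst_ip, dst_port,
                   capture_nets, excluded_clients=(), excluded_servers=()):
    """Decide whether a connection should be redirected to the proxy.

    capture_nets, excluded_clients and excluded_servers are lists of
    subnets or single addresses written as strings (hypothetical format).
    """
    # Only plain http traffic (TCP port 80) is captured.
    if dst_port != 80:
        return False

    src = ip_address(src_ip)
    dst = ip_address(dst_ip)

    # Exclusions win: these clients and servers bypass the proxy.
    if any(src in ip_network(net) for net in excluded_clients):
        return False
    if any(dst in ip_network(net) for net in excluded_servers):
        return False

    # Otherwise capture only traffic coming from the configured subnets.
    return any(src in ip_network(net) for net in capture_nets)


# Example: capture the whole LAN except requests to an ACL-protected server.
print(should_capture("192.168.0.5", "203.0.113.10", 80,
                     capture_nets=["192.168.0.0/24"],
                     excluded_servers=["203.0.113.10"]))
```

Evaluating the exclusions before the capture subnets means that a single excluded server or client always bypasses the proxy, however broad the capture rule is.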

Configuration and activation of the proxy service


Configuring the proxy service with antivirus check is very simple. First, either configure the box as a router and set it as the default gateway on the clients, or configure it as a bridge and place it at a point of the LAN through which traffic flows to and from the Internet. Then simply enable the [Enabled] flag so that the proxy starts working. As mentioned in the previous section, the web requests that are actually intercepted and submitted to the proxy are those specified in the [HTTP Capturing Rules].


Proxy configuration web interface

Note that start-up of the proxy service is very slow compared to other services; on slower hardware it can take up to 30-40 seconds. This is because the ClamAV antivirus libraries need to load and decrypt a large number of virus signatures into memory. To prevent this from blocking the web configuration interface and the start-up scripts for long intervals, the service is started asynchronously.

Access log and privacy
Being an application gateway that interprets http requests, a web proxy necessarily sees the URLs visited by users in clear text. By default, this information is not sent to the system logs. If it were logged and associated with the IP addresses of the clients requesting web pages, it would make it possible to trace the content visited by users.

Moreover, it is important to be aware that, just as every external request from a client appears to be made by the router itself when NAT is enabled on an Internet access router, http requests passing through a proxy appear to come from the IP address of the proxy server. This may make it difficult to trace the identity of a user who has performed illicit actions on remote servers. A possible solution to this problem, which is less invasive in privacy terms, is to activate Connection Tracking logging. In this way, every TCP/UDP connection is recorded in the logs with its source IP, source port, destination IP and destination port. The content of user activity cannot then be tracked, but a trace is kept of the connections made. Again, in this case it is necessary to consult local legislation before enabling connection tracking.

Antivirus check of images

For a long time it was thought that a file containing a JPEG or GIF image could not contain a virus, because it is simply made up of data formatted in a preset format, interpretable by the viewing system of the operating system. Recently, however, some image rendering components have shown that they are vulnerable if they are not updated with patches. A suitably constructed image could create a Buffer Overrun and execute arbitrary code on the system. It is easy to understand the seriousness of this, given that most hypertext content on the WWW is in image form.

Website blacklisting and whitelisting

It is often necessary to block the display of certain websites because their content is considered unsuitable for the users of the web service. An example is adult-only material, which should not be displayed on computers to which children have access. One very effective solution to this problem is to force web clients to access the Internet through a proxy which, using Content Filtering software such as DansGuardian, examines the content of html pages and blocks those thought to belong to an undesired category. The mechanisms of these filters can be compared to those of antispam systems. Unfortunately, however, it is not clear whether the DansGuardian release licence is compatible with integration within a system and, hence, it was not used, in order to avoid the risk of licence violation.

Configuration of the web proxy blacklist


Blacklists and whitelists consist of a sequence of URLs arranged on distinct lines. Each line may correspond to several web pages when the * character is used. To block the site http://www.example.com place www.example.com/* on the blacklist, whereas the line www.example.com, without *, would only block the home page of that site.
The whitelist has priority over the blacklist. In other words, if a web page corresponds to a blacklist item and, at the same time, is found on the whitelist, access is allowed to the page.
Moreover, note that the purpose of the whitelist is not only to allow access to pages that would otherwise be prohibited by the blacklist, but also to bypass antivirus check. Please take careful note of this.
If the LAN administrator wants to adopt the policy of providing access to a limited number of sites, s/he can specify the */* line in the blacklist, which will prevent access to all pages except those included on the whitelist.
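The rules above can be illustrated with a short sketch. It is not the proxy's actual matching code: it assumes that * behaves like a shell-style wildcard, that the scheme is ignored, and that a list entry without a path refers to the home page only. The function names and return values are hypothetical.

```python
from fnmatch import fnmatchcase


def _normalize(entry):
    """Strip the scheme and make sure a bare host name ends with '/'."""
    for scheme in ("http://", "https://"):
        if entry.startswith(scheme):
            entry = entry[len(scheme):]
            break
    if "/" not in entry:
        entry += "/"  # "www.example.com" means the home page only
    return entry


def _matches(url, patterns):
    target = _normalize(url)
    return any(fnmatchcase(target, _normalize(p)) for p in patterns)


def decide(url, blacklist, whitelist):
    """Return 'allow-unscanned', 'block' or 'scan' for a requested URL."""
    # The whitelist has priority and also bypasses the antivirus check.
    if _matches(url, whitelist):
        return "allow-unscanned"
    if _matches(url, blacklist):
        return "block"
    return "scan"


# Restrictive policy: block everything except one whitelisted site.
blacklist = ["*/*"]
whitelist = ["www.example.com/*"]
print(decide("http://www.example.com/index.html", blacklist, whitelist))
print(decide("http://www.example.org/", blacklist, whitelist))
```

Note that in this sketch a whitelisted page is returned as 'allow-unscanned' rather than simply allowed, mirroring the warning above that whitelisting also skips the antivirus check.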

Testing proxy and antivirus function

There are basically two reasons why the proxy might not work correctly. First, you must make sure that the box is configured as a router or a bridge, and that traffic to and from the Internet actually goes through it. Secondly, you must be certain that the [HTTP Capturing Rules], which determine which http requests are actually redirected to the proxy process (havp listens on 127.0.0.1:8080), are configured correctly. In particular, if http request capture is imposed on a network interface that is part of a bridge, you must be sure that at least one IP address has been defined on the bridge.
The easiest way to check whether the proxy is working correctly is to temporarily enable logging of all accesses and display the proxy log after requesting some web pages from a client.
Once you are certain that the proxy captures the web requests as expected, check that the ClamAV antivirus software is working correctly. To do this, first check in the freshclam logs that the signatures are updated regularly. Then, go to the URL http://www.eicar.org/anti_virus_test_file.htm to check whether the EICAR-AV-Test test virus (said to be harmless by its authors) is captured and blocked.
Lastly, note that the proxy cannot serve https requests (http encrypted with SSL/TLS): since it does not have the private key of the web server, it cannot decrypt the content and the URLs of requests encapsulated in encrypted tunnels.
