As you probably know, Squid is a caching web proxy. It has a variety of uses,
from speeding up access to web servers by caching repeated requests to
blocking pages with commercial or pornographic content. Squid is very
robust and is developed under the GPL.
23.6.2004 08:00 | Petr Houštěk
Caching is a way to store requested Internet objects on a server closer
to the client. A local Squid cache can reduce both access time and
bandwidth consumption. Squid also provides a degree of security and
anonymity, because clients reach the web through the proxy.
Thanks to its licence, you can run Squid on almost every Unix-like operating
system. Let's suppose you run it on Linux. Installation is quite
simple – for details, see the installation instructions in the Squid
documentation.
You can also use a precompiled package from your favourite distribution.
Now that Squid is installed, we just have to configure it. The Squid
configuration files are kept in the directory /usr/local/squid/etc by default,
but your distribution may move them to another location (for example, on
Debian squid.conf is in /etc). Squid uses a lot of default settings, so it can
run even with a zero-length configuration file. That isn't very useful,
though, because by default Squid denies access to all browsers.
Let's create a basic configuration to get the server working. The first
option in the squid.conf file is http_port, which sets the port that Squid
will listen on – there can be more than one (for example 3128 and 8080).
The cache_dir option sets where the cache is stored. It starts with the
storage type (ufs is the usual choice), followed by the cache directory, its
size in megabytes and the number of subdirectories in the first and second
tier (it is recommended to leave the defaults here). Another important option
is cache_mem, which tells Squid how much memory can be used for in-transit
objects, hot objects and negative-cached objects.
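The options described above could be put together as in the following sketch.
The values are illustrative assumptions, not recommendations: the cache path
and the 100 16 256 figures are the usual defaults of a source install, ufs is
the standard storage type, and 8 MB is only a starting point for cache_mem.

```conf
# Listen on two ports (one http_port line per port)
http_port 3128
http_port 8080

# Storage type, cache directory, size in MB, first- and second-tier subdirectories
cache_dir ufs /usr/local/squid/var/cache 100 16 256

# Memory for in-transit, hot and negative-cached objects
cache_mem 8 MB
```

If you change cache_dir, the cache directories have to be created (squid -z)
before Squid is started for the first time.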
Now we have to allow users to use the proxy. We can use this temporary solution
to allow all users (for details see the section about acl).
acl all src 0.0.0.0/0.0.0.0
http_access allow all
Now the very basic configuration is done and we can start Squid for the
first time. Then you can configure your web browser to use it.
Access control lists – ACL
ACLs (access control lists) are a very important part of the Squid
configuration. Basic access control should always be in place. The primary use
of access control is to stop unauthorised users from using your cache. There
are two elements to access control – classes and operators. A class refers to
a set of users; it can also be defined by IP address, HTTP request, filename
extension, etc. The classes are then passed to the operators – for example to
allow HTTP access, to redirect it somewhere else, etc.
ACL classes
The syntax of acl is
acl name type string1 string2 string3 ...
The types are source or destination IP address, source or destination domain,
regular expression match of the requested domain, words in the requested URL,
words in the source or destination domain, current time, destination port,
protocol (HTTP, FTP), method, browser type, user name, autonomous system
number, username/password pair and SNMP community. The strings that follow the
type are the decision strings, used to check whether the acl matches a given
connection. Squid first checks the type field and decides according to it how
to interpret the decision strings. A decision string can be an IP address, a
network, a regular expression, etc. Now let's take a closer look at some types.
Source/Destination IP address
The most common example looks like this
acl myNet src 192.168.0.0/255.255.0.0
http_access allow myNet
This acl matches all addresses from 192.168.0.0 to 192.168.255.255 and allows
them to use your cache. All other connections will be denied, because Squid
applies an implicit final rule that is the opposite of the last http_access
line: it behaves as if http_access deny all followed a last line that allows,
and as if http_access allow all followed a last line that denies. For example,
if you have this acl set
acl myIP src 192.168.5.13
acl badNet src 192.168.5.0/255.255.255.0
http_access allow myIP
http_access deny badNet
Squid will deny all connections from the network 192.168.5.0/255.255.255.0
(except 192.168.5.13). If you connect from another network (for example from
192.168.1.13), Squid will accept the connection, because the last line is a
deny and the implicit default is therefore allow.
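Relying on the implicit last rule is easy to get wrong, so it is often clearer
to spell the default out. The following sketch is a variant of the example
above (not the same policy) that ends with an explicit deny, so a client at
192.168.1.13 would be refused instead of accepted. The all acl is the one
defined earlier in this article.

```conf
acl all src 0.0.0.0/0.0.0.0
acl myIP src 192.168.5.13
acl badNet src 192.168.5.0/255.255.255.0

# Rules are checked top to bottom; the first matching line decides
http_access allow myIP
http_access deny badNet
# Explicit default, so nothing depends on the implicit last rule
# (this also makes the badNet line redundant, but harmless)
http_access deny all
```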
To decide according to the destination IP address, Squid uses the type dst
(its use is quite similar).
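A minimal sketch of a dst acl, assuming you want to keep proxy users away from
one particular server. The name blockedServer is hypothetical and 10.11.12.13
is only a placeholder address.

```conf
# Match requests by the IP address of the destination server
acl blockedServer dst 10.11.12.13
http_access deny blockedServer
```

Keep in mind that Squid has to resolve the requested host name before it can
compare it with a dst acl, which costs a DNS lookup.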
Source/Destination domain
These acls match requests by source or destination domain. The types are
srcdomain and dstdomain. The srcdomain type is not recommended, because an
attacker who controls the reverse DNS entries for the attacking host can
manipulate them to bypass the srcdomain acl. The dstdomain type matches the
destination domain. It can be used, for example, to block some well-known porn
sites. You should also block the site's IP address, because otherwise someone
can reach the site by typing the address into their browser. Here is an
example – you want to block the site www.example.com, whose IP address is
10.11.12.13. The leading dot in .example.com makes the acl match the domain
together with all its subdomains, including www.example.com. The entry is
acl badDomain dstdomain .example.com
acl badIp dst 10.11.12.13
acl myNet src 192.168.0.0/255.255.0.0
acl all src 0.0.0.0/0.0.0.0
http_access deny badDomain
http_access deny badIp
http_access allow myNet
http_access deny all
Regular expression
With the types described so far you can filter sites only by address or domain
name. Matching based on regular expressions allows much more precise
filtering. Regular expressions in Squid are case-sensitive by default; to make
them case-insensitive, use the -i prefix. For example, suppose you want to deny
access to all requests whose URL contains the word sex (or SEX, SEx, etc.).
Then the proper acl is
acl badUrl url_regex -i sex
To block files with video content you can make an acl like
acl badUrl url_regex -i \.avi
You can also combine these two rules, so that only URLs containing both
patterns in this order match
acl badUrl url_regex -i sex.*\.avi
Regular expressions can also be used for checking the source and destination
domains. The types are srcdom_regex and dstdom_regex.
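As an illustration of the domain variants, the sketch below denies requests
whose destination host name contains a given word. The acl name badDomainRe is
hypothetical, and the pattern is deliberately crude: it also matches harmless
names such as essex.example.com, so real patterns usually need to be more
specific.

```conf
# Case-insensitive match against the destination host name only
acl badDomainRe dstdom_regex -i sex
http_access deny badDomainRe
```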
Current time/date
This type matches requests according to the current time. A common wish
is to filter unwanted sites during working hours. This can be done by
combining the time and dstdomain (or dstdom_regex) acls. The syntax of the
time acl is
acl name time [day-list] [start_hour:minute-end_hour:minute]
The day-list is a list of single characters indicating the days that the acl
applies to. Here is the list: S – Sunday, M – Monday,
T – Tuesday, W – Wednesday, H – Thursday,
F – Friday, A – Saturday. For weekends you can use
acl weekends time SA
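Combining a time acl with a domain acl, as described above, might look like
the following sketch. The working hours (08:00-17:00) and the acl names are
assumptions made for illustration, and example.com stands in for the unwanted
site. Several acls on one http_access line must all match for the rule to
apply.

```conf
# Monday to Friday, 08:00 to 17:00
acl workHours time MTWHF 08:00-17:00
acl badDomain dstdomain .example.com

# Denied only when both acls match: bad domain AND working hours
http_access deny badDomain workHours
```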
Destination port
The most commonly used port is 80 (the port web servers almost always
listen on). Some servers also listen on other ports, such as 8080. SSL
connections use port 443. The default squid.conf defines a list of
Safe_ports. This is the line from squid.conf
acl Safe_ports port 80 21 443 563 70 210 1025-65535
which means that destination ports 80, 21, 443, 563, 70, 210, 1025, 1026, ...
65534, 65535 are matched. To deny all other ports you can use
http_access deny !Safe_ports
Method
There are several methods – GET, POST and CONNECT. The GET method is
used for downloading, the POST method for uploading and the CONNECT method for
SSL connections. A typical rule is to block CONNECT requests to non-SSL
ports. The CONNECT method allows traffic in any direction at any time, so
with an improperly configured proxy anyone could connect from the cache server
to a telnet server on a distant machine and thereby bypass the packet filters.
So we can use this example
acl connect_method method CONNECT
acl SSL_PORTS port 443 563
http_access deny connect_method !SSL_PORTS
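Order matters when these rules are combined with the allow rules from the
earlier sections: the port and method checks should come before any allow
line, otherwise an allowed client would never reach them. A possible
arrangement, reusing the acl names from this article (myNet and all are
defined as in the earlier examples):

```conf
acl all src 0.0.0.0/0.0.0.0
acl myNet src 192.168.0.0/255.255.0.0
acl Safe_ports port 80 21 443 563 70 210 1025-65535
acl SSL_PORTS port 443 563
acl connect_method method CONNECT

# Safety checks first
http_access deny !Safe_ports
http_access deny connect_method !SSL_PORTS
# Then decide who may use the cache
http_access allow myNet
http_access deny all
```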
The acl operators will be described in the next part.