May 22, 2012 by Kulbir Saini on Administration, Fedora, Git, HowTo, Installation, Linux, Tips - Tricks

Insanely Awesome Web Interface for Your Git Repos

Almost 80-90 people visit How To: Install and Configure GitWeb every day in search of a way to set up a web interface for their git repositories. Though gitweb is nice, it’s a bit painful to set up and the web interface is not that appealing. The other day I received this email from Klaus Silveira:

Hello Kulbir,
I saw your article about installing Gitweb and i decided to send this shameless self-promotion. Maybe you could try my open-source project, GitList https://github.com/klaussilveira/gitlist

I’m looking for beta testers and supporters. FLOSS. 🙂

So, I thought I’d give it a try. Today, I got it working and was blown away by the amazing interface! It’s almost like a super simplified version of GitHub. I was so impressed that I immediately set up a demo website at git.gofedora.com for others to look at and fall in love :-) Another good thing about GitList is that it’s very simple to set up. Below is a step-by-step process to install and configure GitList to expose your public Git repositories to the internet.

What You Need?

You need the following packages before you can set up GitList.

Apache (with mod_rewrite)
Git
PHP

Installing Required Packages

Most modern operating systems have the above mentioned packages installed by default. Even if you don’t have them already, you can use your OS package manager to install them quickly. To install on Fedora/RedHat/CentOS using yum, use the following command

[root@whitemagnet.com ~]$ yum install php git httpd

For Ubuntu/Debian, use the following command

[root@whitemagnet.com ~]$ apt-get install php git apache2

Assumptions

For setting up GitList, I am assuming the following directory paths and other variables.

Path to public Git repositories : /home/saini/code/public/
Path to Apache document root : /var/www/html/
Path to Git executable : /usr/bin/git (Use “which git” to find out for your OS)
Web URL for browsing git repos : mygit.example.com/gitlist/

Installing and Configuring GitList

Follow these simple steps to install and configure GitList.

Step 1 : Clone GitList repository from GitHub to /var/www/html/gitlist/

[root@whitemagnet.com ~]$ cd /var/www/html/
[root@whitemagnet.com html]$ git clone git://github.com/klaussilveira/gitlist.git gitlist

Step 2 : Create cache directory and make it globally writable

[root@whitemagnet.com html]$ cd gitlist
[root@whitemagnet.com gitlist]$ mkdir cache
[root@whitemagnet.com gitlist]$ chmod 777 cache

Step 3 : Configure GitList using config.ini

Open config.ini (in the gitlist directory) and set the options properly. Refer to the sample shown below.

[git]
client = '/usr/bin/git' ; Your git executable path
repositories = '/home/saini/code/public/' ; Path to your repositories (with ending slash)
 
[app]
baseurl = 'http://mygit.example.com/gitlist' ; Base URL of the application (without ending slash)
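
The slash rules in the sample above are easy to get wrong, so here is a minimal Python sketch that checks them. It is only an illustration: the function names load_gitlist_config and check_gitlist_config are made up for this example and are not part of GitList, and the checks cover nothing beyond what the comments in the sample say.

```python
import configparser

# The sample config.ini from this howto, used as test input.
SAMPLE = """
[git]
client = '/usr/bin/git' ; Your git executable path
repositories = '/home/saini/code/public/' ; Path to your repositories (with ending slash)

[app]
baseurl = 'http://mygit.example.com/gitlist' ; Base URL of the application (without ending slash)
"""

def load_gitlist_config(text):
    """Parse config.ini text and return the three settings with quotes stripped."""
    parser = configparser.ConfigParser(inline_comment_prefixes=(";",))
    parser.read_string(text)

    def unquote(value):
        return value.strip().strip("'\"")

    return {
        "client": unquote(parser["git"]["client"]),
        "repositories": unquote(parser["git"]["repositories"]),
        "baseurl": unquote(parser["app"]["baseurl"]),
    }

def check_gitlist_config(settings):
    """Return a list of problems; an empty list means the basic checks passed."""
    problems = []
    if not settings["client"].startswith("/"):
        problems.append("client should be an absolute path to the git executable")
    if not settings["repositories"].endswith("/"):
        problems.append("repositories should end with a slash")
    if settings["baseurl"].endswith("/"):
        problems.append("baseurl should not end with a slash")
    return problems

settings = load_gitlist_config(SAMPLE)
print(check_gitlist_config(settings) or "config looks fine")
```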

Step 4 : Make sure your Apache can read your .htaccess file in gitlist directory

GitList utilizes Apache’s mod_rewrite module to provide nice URLs. Make sure your Apache is configured to read .htaccess from the gitlist directory. Open your Apache config file (generally located at /etc/httpd/conf/httpd.conf on Fedora/RedHat/CentOS, or /etc/apache2/apache2.conf and the files under /etc/apache2/sites-available/ on Ubuntu/Debian) and look for the following

<Directory "/var/www/html">

In this segment, make sure you have AllowOverride All as below.

<Directory "/var/www/html">
# Other lines omitted
AllowOverride All
# Other lines omitted
</Directory>

Step 5 : Reload or restart Apache daemon if needed

[root@whitemagnet.com ~]$ apachectl -k restart (or apache2ctl -k restart for Ubuntu/Debian)

Step 6 : Get some sample repositories in your public repo directory

Get some sample repositories in your public repo directory from GitHub.

[root@whitemagnet.com ~]$ cd /home/saini/code/public/
[root@whitemagnet.com public]$ git clone git://github.com/kulbirsaini/intelligentmirror.git
[root@whitemagnet.com public]$ git clone git://github.com/kulbirsaini/Railscasts-Sync.git
[root@whitemagnet.com public]$ git clone git://github.com/zilkey/active_hash.git

That’s all! Now, go to http://mygit.example.com/gitlist to discover your public git repos via a cool web interface! Leave a comment if you face any issues.

November 19, 2008 by Kulbir Saini on Administration, Configuration, Hacks, HowTo, Installation, Internet, Linux, Nameserver, Server

How To: Configure Caching Nameserver (named)

Mission

To configure a caching nameserver on a local machine which will cascade to another previously configured and functional nameserver (which may or may not be caching; it’ll generally be your ISP’s nameserver or the one provided by your organization).

Advantage

Reduces the delay in domain name resolution drastically as the requests for frequently accessed websites are served from cache.

Working

named gets a request for domain resolution.
It checks whether the request can be satisfied from cache. If the answer is in cache and not stale, the request is satisfied from cache itself saving a lot of time 🙂
If request can’t be satisfied from cache, named queries the first parent. If it replies with the answer, then named will cache the response and subsequent requests for the same domain name will be satisfied from the cache.
In case first parent fails to reply, named will query the second parent and so on.

(This description is my understanding of caching-nameserver, based on traffic analysis with Wireshark; caching-nameserver may not behave exactly as explained above.)
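
The lookup order described above can be sketched as a toy model in Python. This is not how named is implemented; the CachingResolver class and the two fake parent functions are hypothetical names invented for the illustration, and the parents are plain functions rather than real DNS queries.

```python
import time

class CachingResolver:
    """Toy model of the lookup order described above (cache first, then parents)."""

    def __init__(self, forwarders, clock=time.monotonic):
        # forwarders: callables that take a name and return (address, ttl_seconds),
        # or raise an exception when that parent does not answer.
        self.forwarders = forwarders
        self.clock = clock
        self.cache = {}  # name -> (address, expiry time)

    def resolve(self, name):
        entry = self.cache.get(name)
        if entry is not None and entry[1] > self.clock():
            return entry[0], "cache"  # answer is cached and not stale
        for index, forwarder in enumerate(self.forwarders):
            try:
                address, ttl = forwarder(name)
            except Exception:
                continue  # this parent failed to reply, try the next one
            self.cache[name] = (address, self.clock() + ttl)
            return address, f"forwarder {index}"
        raise LookupError(f"no forwarder answered for {name}")

def dead_parent(name):
    raise TimeoutError("no reply")

def live_parent(name):
    return "72.249.126.241", 3600

resolver = CachingResolver([dead_parent, live_parent])
print(resolver.resolve("fedora.co.in"))  # answered by the second parent
print(resolver.resolve("fedora.co.in"))  # answered from the cache
```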

How to install

named is installed by default on most systems under the package name ‘caching-nameserver‘. If it’s not present on your system, install it using

[root@localhost ~]# yum install caching-nameserver [ENTER]
# If that doesn't work try this
[root@localhost ~]# yum install bind [ENTER]

How to configure

The main configuration file for named resides in /var/named/chroot/etc/named.caching-nameserver.conf which is also soft linked from /etc/named.caching-nameserver.conf . named configuration file supports C/C++ style comments.

For a caching nameserver which will cascade to another nameserver, there is not much to configure. You only need to configure the “options” block. Below is a configuration file for a machine with IP address 172.17.8.64 cascading to two nameservers, 192.168.36.204 and 192.168.36.210. The inline comments explain what each option does.

options {
  // Set the port to 53 which is standard port for DNS.
  // Add the IP address on which named will listen separated by semi-colons.
  // It'll be your own IP address.
  listen-on port 53 {127.0.0.1; 172.17.8.64;};
  // These are defaults. Leave them as they are.
  directory   "/var/named";
  dump-file   "/var/named/data/cache_dump.db";
  statistics-file "/var/named/data/named_stats.txt";
  memstatistics-file "/var/named/data/named_mem_stats.txt";
  // The machines which are allowed to query this nameserver.
  // Normally you'll allow only your machine. But you can allow other machines also.
  // The address should be separated by semi-colons. To allow a network 172.16.31.0/24,
  // the line would be
  // allow-query {localhost; 172.16.31.0/24; };
  // Don't forget the semi-colons.
  allow-query     { localhost; 172.17.8.64; };
  recursion yes;
  // The parent nameservers. List all the nameservers which you can query.
  forwarders { 192.168.36.204; 192.168.36.210; };
  forward first;
};
logging {
        channel default_debug {
                file "data/named.run";
                severity dynamic;
        };
};
zone "." IN {
  type hint;
  file "named.ca";
};
include "/etc/named.rfc1912.zones";
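
Before restarting named, it can help to confirm which parents the file actually lists. The following is a minimal Python sketch that pulls the addresses out of the forwarders statement with a regular expression. The function name forwarders is an assumption made for this example, and the comment stripping is deliberately naive (it only handles // comments), so treat it as an illustration rather than a real named.conf parser.

```python
import re

def forwarders(named_conf):
    """Return the addresses listed in the first 'forwarders { ... };' statement."""
    # Drop // comments so commented-out examples are not picked up.
    text = re.sub(r"//[^\n]*", "", named_conf)
    match = re.search(r"\bforwarders\s*\{([^}]*)\}", text)
    if match is None:
        return []
    return [item.strip() for item in match.group(1).split(";") if item.strip()]

sample = """
options {
  // allow-query {localhost; 172.16.31.0/24; };
  forwarders { 192.168.36.204; 192.168.36.210; };
  forward first;
};
"""
print(forwarders(sample))
```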

Start caching-nameserver

Now start the caching-nameserver using the following command

[root@localhost ~]# service named start [ENTER]
# or
[root@localhost ~]# /etc/init.d/named start [ENTER]

To make named start every time you reboot your machine, use the following command

[root@localhost ~]# chkconfig named on [ENTER]

Using caching-nameserver

To use your caching-nameserver, open /etc/resolv.conf file and add the following line

nameserver 127.0.0.1

Comment out all other lines in the file, so that finally the file looks like

; generated by /sbin/dhclient-script
#search wlan.iiit.ac.in
#nameserver 192.168.36.204
#nameserver 192.168.36.210
nameserver 127.0.0.1

Now your system will use your own nameserver (in caching mode) for resolving all domain names. To test your nameserver, use the following command

[root@localhost ~]# dig fedora.co.in [ENTER]

If you run that command a second time, the resolution time will be around 2-3 milliseconds, while the first time it would be around 400-700 milliseconds.

Example

Below are two subsequent runs of dig for fedora.co.in. Notice the Query time.

[root@bordeaux SPECS]# dig fedora.co.in
; <<>> DiG 9.4.2rc1 <<>> fedora.co.in
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 7839
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1
;; QUESTION SECTION:
;fedora.co.in.                  IN      A
;; ANSWER SECTION:
fedora.co.in.           83629   IN      A       72.249.126.241
;; AUTHORITY SECTION:
fedora.co.in.           79709   IN      NS      ns.fedora.co.in.
;; ADDITIONAL SECTION:
ns.fedora.co.in.        79709   IN      A       72.249.126.241
;; Query time: 531 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Wed Nov 19 18:04:47 2008
;; MSG SIZE  rcvd: 79
[root@bordeaux SPECS]# dig fedora.co.in
; <<>> DiG 9.4.2rc1 <<>> fedora.co.in
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 64233
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1
;; QUESTION SECTION:
;fedora.co.in.                  IN      A
;; ANSWER SECTION:
fedora.co.in.           83625   IN      A       72.249.126.241
;; AUTHORITY SECTION:
fedora.co.in.           79705   IN      NS      ns.fedora.co.in.
;; ADDITIONAL SECTION:
ns.fedora.co.in.        79705   IN      A       72.249.126.241
;; Query time: 1 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Wed Nov 19 18:04:51 2008
;; MSG SIZE  rcvd: 79
[root@bordeaux SPECS]#
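
To compare the two runs without reading the whole output, the "Query time" lines can be extracted with a few lines of Python. This is a small sketch; the function name query_times is made up for the example, and the sample text is trimmed from the output above.

```python
import re

QUERY_TIME = re.compile(r"^;; Query time: (\d+) msec", re.MULTILINE)

def query_times(dig_output):
    """Return every 'Query time' value (in milliseconds) found in dig output."""
    return [int(ms) for ms in QUERY_TIME.findall(dig_output)]

# Lines trimmed from the two dig runs shown above.
sample = """\
;; Query time: 531 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; Query time: 1 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
"""
first, second = query_times(sample)
print(f"uncached: {first} msec, cached: {second} msec")
```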

November 8, 2008 by Kulbir Saini on Administration, Hacks, Linux, Programming

Hack: A Fast Network Scanning Program

I was searching for a simple tool which can scan a port across a huge network quickly without making me wait for ages. I first thought of using nmap, but it was a bit too complex and it took a lot of time to discover the machines even after optimizing the parameters. After searching a lot, I wrote to one of my seniors, Sandeep Kumar, asking for the details of his program which maintains a list of active FTP servers in the network. He replied with a reference to his own findings about network scanning tools. He is using an enhanced version of a program originally written by Troy Robinson. I tried the program out of curiosity and found that it’s damn fast compared to nmap (no literal comparison) 🙂 The program can be downloaded from here.

How to use

Compile the program using gcc as

[root@localhost ~]# gcc NetworkScanner.c [ENTER]

Now create a file IPRange.txt containing the IP address ranges for your network. The contents of the file may be

172.16.*.* Meaning all the IP address with first two parts as 172.16 and rest of the address will be generated by permutations.

172.16.1-16.* Meaning the first two parts are fixed. Third part will vary from 1 to 16. And the fourth part will be permuted from 0 to 255.

So an IPRange.txt may look like

172.16.1-16.*
192.168.36.*
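
To see how many addresses a given IPRange.txt covers, the range notation can be expanded with a short Python sketch. It follows the description above ("*" is taken as 0 to 255, "a-b" as an inclusive range); whether NetworkScanner.c parses the file in exactly this way is an assumption, and the function names are made up for the example.

```python
from itertools import product

def expand_part(part):
    """Expand one dotted-quad part: '*', 'a-b' or a plain number."""
    if part == "*":
        return range(0, 256)
    if "-" in part:
        low, high = (int(x) for x in part.split("-"))
        return range(low, high + 1)
    return range(int(part), int(part) + 1)

def expand_range(pattern):
    """Yield every IP address matched by a pattern such as '172.16.1-16.*'."""
    parts = [expand_part(p) for p in pattern.strip().split(".")]
    if len(parts) != 4:
        raise ValueError(f"expected four parts: {pattern!r}")
    for octets in product(*parts):
        yield ".".join(str(o) for o in octets)

patterns = ["172.16.1-16.*", "192.168.36.*"]
addresses = [ip for p in patterns for ip in expand_range(p)]
print(len(addresses))  # 16*256 + 256 = 4352
```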

Now run the program as

[root@localhost ~]# ./a.out port_to_be_scanned Parallel_attempts IP_list_file output.txt [ENTER]

Parallel_attempts is the number of processes that’ll be forked for scanning the network port. It is safe to have its value as 255. A very high value may hog the network or may even slow down your machine. So an example run would be

[root@localhost ~]# ./a.out 21 255 IPRange.txt Output.txt [ENTER]
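
For readers who want to see the idea without the C source, here is a rough Python sketch of a parallel port check. Note that it swaps in a thread pool where the program above forks processes, so it is not a port of NetworkScanner.c; the function names and the one-second timeout are assumptions made for the example.

```python
import socket
from concurrent.futures import ThreadPoolExecutor

def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False

def scan(hosts, port, parallel_attempts=255, timeout=1.0):
    """Probe every host on one port, at most parallel_attempts at a time."""
    hosts = list(hosts)
    with ThreadPoolExecutor(max_workers=parallel_attempts) as pool:
        results = pool.map(lambda h: port_is_open(h, port, timeout), hosts)
        return [host for host, is_open in zip(hosts, results) if is_open]
```

A call such as scan(["192.168.36.1", "192.168.36.2"], 21) would return the hosts from that list that accept connections on the FTP port.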

Benchmarks

I carried out a lot of tests on my network using the following setup and parameters

Machine : AMD X2 5600+ (2.6GHz Dual Core), 4GB 800MHz DDR2 RAM, Gigabit Ethernet Card (on 100mbps network).

Port : 21 (FTP)

IPRange.txt : Total 16896 IP Addresses

Machines on wired (100mbps) network
172.16.1-48.* 
192.168.36.*
Machines on wireless (54mbps) network
172.17.0-16.*

Network Scanner Benchmarks
Parallel Attempts	Scanning Time (seconds)	Upload Bandwidth (kbps)
255	180	13
512	90	25
1024	47	55
2048	25	100
4096	14	205
6144	11	307
8192	9	374

The interval between two scans was about 30-40 seconds. I think parallelism beyond 8192 will crash my machine, so I didn’t try. You can try it at your own risk 🙂 I hope this program helps you scan your network.

May 2, 2008 by Kulbir Saini on Administration, Configuration, HowTo, Installation, Proxy Server, Server, Squid

How To: Configure Squid Proxy Server

Mission

To configure squid for simple proxying without caching anything.

Use Cases

When you want to have control over what people browse on your LAN.
When the number of machines is more than the number of IP addresses you can afford to buy.
When you want to help this holy world in saving some IPV4 addresses 😛

Assumptions

You have a machine connected directly to internet that you are going to use as a proxy server for other machines on your network.
The machines on your network are using 192.168.0.0/16 as private address space. You can use any one or more of the available private address spaces, but for this howto we assume 192.168.0.0/16 as the local network.
The local IP address of the machine which will run squid proxy server is 192.168.36.204. You can have any IP, but for this howto we assume this.

How to proceed

First of all, ensure that you have squid installed. After installing squid, you need to set access control in the squid configuration file, which resides in /etc/squid by default. Open /etc/squid/squid.conf and add/edit the following lines according to your preferences. A few of these lines already exist in the configuration file; you can add the rest.

# The port on which squid will listen for requests
http_port 8080
# If 'cgi-bin' or '?' is in query, squid should not check with neighbours'/parents' cache
# and should go to target web-server.
hierarchy_stoplist cgi-bin ?
# If url contains 'cgi-bin' or '?', then it must not be cached
acl QUERY urlpath_regex cgi-bin \?
cache deny QUERY
acl apache rep_header Server ^Apache
broken_vary_encoding allow apache
# Absolute path to squid access log.
access_log /var/log/squid/access.log squid
refresh_pattern ^ftp:           1440    20%     10080
refresh_pattern ^gopher:        1440    0%      1440
refresh_pattern .               0       20%     4320
# Access control list to control every IP address
acl all src 0.0.0.0/0.0.0.0
# Access control list for source machine in LAN
acl lan_src src 192.168.0.0/16
# Access control list for destination machine in LAN
acl lan_dst dst 192.168.0.0/16
# Access control list to manage squid cache
acl manager proto cache_object
# Access control list to define IP address allowed for source localhost
acl localhost src 127.0.0.1/255.255.255.255
# Access control list to define IP addresses allowed for localhost as destination
acl to_localhost dst 127.0.0.0/8
# Access control list to define Safe ports that should be allowed by default
acl SSL_ports port 443 563 1863 5190 5222 5050 6667
acl Safe_ports port 80          # http
acl Safe_ports port 21          # ftp
acl Safe_ports port 443         # https
acl Safe_ports port 70          # gopher
acl Safe_ports port 210         # wais
acl Safe_ports port 1025-65535  # unregistered ports
acl Safe_ports port 280         # http-mgmt
acl Safe_ports port 488         # gss-http
acl Safe_ports port 591         # filemaker
acl Safe_ports port 777         # multiling http
acl CONNECT method CONNECT
# Allow cache management only from localhost
http_access allow manager localhost
# Deny cache management from remote hosts
http_access deny manager
# Deny http access via all the ports which are not listed as safe
http_access deny !Safe_ports
# Deny all connections via all ports which are not listed as safe
http_access deny CONNECT !SSL_ports
# Allow http access from localhost
http_access allow localhost
# Allow http access from machines on LAN
http_access allow lan_src
http_access deny all
http_reply_access allow all
icp_access allow all
# Deny caching for everyone so that there is no caching at all
cache deny all
coredump_dir /var/spool/squid
# Never allow direct connection to machines on the internet
prefer_direct off
never_direct allow all
# Allow direct connection if the destination machine is on LAN
always_direct allow lan_dst
# Delete this line if you don't have /etc/hosts file
hosts_file /etc/hosts
# Allow AIM connections
# Delete the following 9 lines if you don't want people to connect to AIM
acl AIM_ports port 5190 9898 6667
acl AIM_domains dstdomain .oscar.aol.com .blue.aol.com .freenode.net
acl AIM_domains dstdomain .messaging.aol.com .aim.com
acl AIM_hosts dstdomain login.oscar.aol.com login.glogin.messaging.aol.com toc.oscar.aol.com irc.freenode.net
acl AIM_nets dst 64.12.0.0/255.255.0.0
acl AIM_methods method CONNECT
http_access allow AIM_methods AIM_ports AIM_nets
http_access allow AIM_methods AIM_ports AIM_hosts
http_access allow AIM_methods AIM_ports AIM_domains
# Allow connections to Yahoo Messenger
# Delete the following 6 lines if you don't want people to connect to Yahoo Messenger
acl YIM_ports port 5050
acl YIM_domains dstdomain .yahoo.com .yahoo.co.jp
acl YIM_hosts dstdomain scs.msg.yahoo.com cs.yahoo.co.jp
acl YIM_methods method CONNECT
http_access allow YIM_methods YIM_ports YIM_hosts
http_access allow YIM_methods YIM_ports YIM_domains
# Allow connections to Google Talk
# Delete the following 6 lines if you don't want people to connect to Google Talk
acl GTALK_ports port 5222 5050
acl GTALK_domains dstdomain .google.com
acl GTALK_hosts dstdomain talk.google.com
acl GTALK_methods method CONNECT
http_access allow GTALK_methods GTALK_ports GTALK_hosts
http_access allow GTALK_methods GTALK_ports GTALK_domains
# Allow connections to MSN
# Delete the following 6 lines if you don't want people to connect to MSN
acl MSN_ports port 1863 443 1503
acl MSN_domains dstdomain .microsoft.com .hotmail.com .live.com .msft.net .msn.com .passport.com
acl MSN_hosts dstdomain messenger.hotmail.com
acl MSN_nets dst 207.46.111.0/255.255.255.0
acl MSN_methods method CONNECT
http_access allow MSN_methods MSN_ports MSN_hosts
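
Squid checks the http_access lines from top to bottom and stops at the first line whose ACLs all match. The Python sketch below models that first-match behaviour for a subset of the rules above (the manager and messenger rules are left out). It is a simplified illustration, not squid's code: the request dictionary, the ACLS table and the function names are all invented for the example.

```python
import ipaddress

SAFE_PORTS = {80, 21, 443, 70, 210, 280, 488, 591, 777}
SSL_PORTS = {443, 563, 1863, 5190, 5222, 5050, 6667}
LAN = ipaddress.ip_network("192.168.0.0/16")

# ACL name -> predicate over a request dict with keys src, port, method.
ACLS = {
    "all": lambda r: True,
    "localhost": lambda r: r["src"] == "127.0.0.1",
    "lan_src": lambda r: ipaddress.ip_address(r["src"]) in LAN,
    "Safe_ports": lambda r: r["port"] in SAFE_PORTS or 1025 <= r["port"] <= 65535,
    "SSL_ports": lambda r: r["port"] in SSL_PORTS,
    "CONNECT": lambda r: r["method"] == "CONNECT",
}

# A subset of the http_access lines, in the same order as the configuration.
RULES = [
    ("deny", ["!Safe_ports"]),
    ("deny", ["CONNECT", "!SSL_ports"]),
    ("allow", ["localhost"]),
    ("allow", ["lan_src"]),
    ("deny", ["all"]),
]

def acl_matches(name, request):
    """Evaluate one ACL name; a leading '!' negates it."""
    if name.startswith("!"):
        return not ACLS[name[1:]](request)
    return ACLS[name](request)

def http_access(request):
    """Return the action of the first rule whose ACLs all match."""
    for action, acl_names in RULES:
        if all(acl_matches(n, request) for n in acl_names):
            return action
    return "deny"  # not reached here, because 'deny all' always matches

print(http_access({"src": "192.168.1.10", "port": 80, "method": "GET"}))
print(http_access({"src": "10.0.0.5", "port": 80, "method": "GET"}))
```

One consequence of first-match evaluation is that any http_access line placed after "http_access deny all" is never reached, so the position of the messenger rules in your own file may be worth checking.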

Now, start the squid proxy server as

service squid start

Also, if you want squid to be started every time you boot the machine, execute the following command

chkconfig --level 345 squid on

You have a squid proxy server running now. You can ask clients to configure their browsers to use 192.168.36.204 as the proxy server with 8080 as the proxy port. Command line utilities like elinks, lynx, yum, wget etc. can be told to use the proxy by exporting the http_proxy variable as below. Users can also add these lines to their ~/.bashrc file to avoid exporting every time.

export http_proxy='http://192.168.36.204:8080'
export ftp_proxy='http://192.168.36.204:8080'
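
Scripts can use the same proxy. Python's urllib picks up http_proxy and ftp_proxy from the environment on its own, but the proxy can also be set explicitly, as in this small sketch. The helper name make_proxy_opener is made up for the example; the address is the one assumed throughout this howto.

```python
import urllib.request

PROXY = "http://192.168.36.204:8080"  # proxy address assumed in this howto

def make_proxy_opener(proxy_url=PROXY):
    """Build a urllib opener that sends http and ftp requests through the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "ftp": proxy_url})
    return urllib.request.build_opener(handler)

opener = make_proxy_opener()
# opener.open("http://example.com/") would now be sent via the proxy.
```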

I highly recommend the book “Squid Proxy Server 3.1: Beginner’s Guide (Paperback)” for further reading.

March 5, 2008 by Kulbir Saini on Administration, Configuration, HowTo, Internet, Proxy Server, Server, Squid

How To: Configure Hierarchy of Proxy Servers (Squid)

Yesterday I came across the idea of caching all the data that I browse on my hard disk so that the average load time of a website decreases. The idea is to cache all the static data that I browse, like images, static HTML pages, CSS files and similar things which do not change frequently and can be served from the cache. But while setting up the proxy server on my machine, I faced a problem: the machine which is going to act as a proxy server is behind my institute’s proxy. So, a simple caching proxy server can’t serve my needs and I had to figure out how to set up a hierarchical proxy server. Below we’ll see how to do that.

Approach

When I thought of setting up a caching proxy server, squid immediately struck my mind. Actually, I don’t know about any other proxy servers. I had never set up a proxy server before this (I tried many times, but in vain). So, I started googling about squid setup. There were a lot of tutorials, but they were either too short to get things going or so verbose that I couldn’t manage to read them. So, I jumped directly into the squid configuration file, squid.conf. And with references from here and there, I managed to set up the proxy server successfully.

Note: The configurations below worked on Fedora 7 with squid 2.6STABLE16. The same configurations may work with other squid versions and on other operating systems as well, but try them at your own risk.

Part 1 : Setting up simple proxy server with squid

Setting up a very simple and usable proxy server is really easy. You need to add/edit only 2-3 lines in /etc/squid/squid.conf to get started.

Add your ip to the access list.

# IP address of the machine that will use the proxy server (e.g. 172.17.8.175)
acl myip src 172.17.8.175
http_access allow myip
# Proxy port. This is 3128 by default; you can set it to anything you like, e.g. 8080
http_port 8080

Save the squid.conf file. Then issue these commands.

[root@localhost squid]# squid -z [Enter] (as root) (This needs to be executed only once.)
[root@localhost squid]# service squid start [Enter] (as root)

If you want to start the squid server on boot, issue this command.

[root@localhost squid]# chkconfig --level 345 squid on [Enter] (as root)

Now, your machine is a proxy server. You can set up your browser to use the machine as a proxy server.

Conditions

The proxy server will work only if your machine has a public IP and is directly connected to internet.

Part 2: Setting up a hierarchical caching proxy server with squid

The above setup works fine if a machine is directly connected to the internet. But my machine itself is behind a proxy, so setting up a proxy on my machine is of no use unless the proxy on my machine uses the institute proxy for connecting to the internet. So, here we jump into squid.conf again, and this time we really have to do some brainstorming. If you are a newbie to Linux and don’t know how to make a system work when nothing seems to help, you will probably be better off using the institute’s proxy.

Here is the scenario.

1. Your browser sends a content request to proxy on your machine.
2. Check: if a cache HIT from institute proxy cache (HIT means content was found in cache)
	2a. Check: if content is older than the original upstream content
		2aa. Fetch content from upstream and serve the client
	2b. else
		2ba. Serve the content from the cache
3. Check: if cache HIT from proxy on your machine
	3a. Check: if content is older than the original upstream content
		3aa. Fetch content from upstream and serve the client
	3b. else
		3ba. Serve the content from the cache
4. Cache MISS from both the proxies
	4a. Fetch the content from upstream and serve the client

The above method of operation is very basic and is my understanding of squid. It may not be the exact squid behavior.

Now, let’s see the configuration needed for setting up the hierarchical caching proxy server with squid.

Assumptions

I assume that we already have squid set up at the institute’s proxy, whether in caching mode or not. The best way to add/edit the following lines in your squid.conf is to search for the particular parameter and then edit the value as given.

I also assume that you have a simple proxy server set up on your machine and now we want to make it act as a child proxy of the institute’s proxy.

Configuration

# Your local machine will act as a sibling proxy
cache_peer 172.17.8.175 sibling 3128 3130 no-query weight=10
# The institute's proxy server will act as a parent proxy
# 'default' means the last resort
cache_peer 192.168.36.204 parent 8080 3130 no-query proxy-only no-digest default
# allow accessing peer cache for access list 'myip'
cache_peer_access 172.17.8.175 allow myip
# Don't cache dynamic content
hierarchy_stoplist cgi-bin ?
acl QUERY urlpath_regex cgi-bin \?
cache deny QUERY
# Size of main memory to be used for caching
cache_mem 200 MB
# max size of content to be stored in main memory
maximum_object_size_in_memory 7000 KB
# policy for cache replacement if memory is full
cache_replacement_policy heap LFUDA
# the directory to be used for storing cache on your hdd
cache_dir aufs /var/spool/squid 200 16 256
# max file descriptor open at a time .. 0(unlimited)
max_open_disk_fds 0
# min object size to cache on hdd
minimum_object_size 0 KB
# max object size to cache on hdd
maximum_object_size 16384 KB
# access log
access_log /var/log/squid/access.log squid
refresh_pattern ^ftp:           1440    20%     10080
refresh_pattern ^gopher:        1440    0%      1440
refresh_pattern .               0       20%     4320
store_avg_object_size 20 KB
acl apache rep_header Server ^Apache
broken_vary_encoding allow apache
refresh_stale_hit 5 seconds
acl SSL_ports port 443 563 1863 5190 5222 5050 6667
# Allow AIM protocols
acl AIM_ports port 5190 9898 6667
acl AIM_domains dstdomain .oscar.aol.com .blue.aol.com .freenode.net
acl AIM_domains dstdomain .messaging.aol.com .aim.com
acl AIM_hosts dstdomain login.oscar.aol.com login.glogin.messaging.aol.com toc.oscar.aol.com irc.freenode.net
acl AIM_nets dst 64.12.0.0/255.255.0.0
acl AIM_methods method CONNECT
http_access allow AIM_methods AIM_ports AIM_nets
http_access allow AIM_methods AIM_ports AIM_hosts
http_access allow AIM_methods AIM_ports AIM_domains
# Allow Yahoo Messenger
acl YIM_ports port 5050
acl YIM_domains dstdomain .yahoo.com .yahoo.co.jp
acl YIM_hosts dstdomain scs.msg.yahoo.com cs.yahoo.co.jp
acl YIM_methods method CONNECT
http_access allow YIM_methods YIM_ports YIM_hosts
http_access allow YIM_methods YIM_ports YIM_domains
# Allow GTalk
acl GTALK_ports port 5222 5050
acl GTALK_domains dstdomain .google.com
acl GTALK_hosts dstdomain talk.google.com
acl GTALK_methods method CONNECT
http_access allow GTALK_methods GTALK_ports GTALK_hosts
http_access allow GTALK_methods GTALK_ports GTALK_domains
# Allow MSN
acl MSN_ports port 1863 443 1503
acl MSN_domains dstdomain .microsoft.com .hotmail.com .live.com .msft.net .msn.com .passport.com
acl MSN_hosts dstdomain messenger.hotmail.com
acl MSN_nets dst 207.46.111.0/255.255.255.0
acl MSN_methods method CONNECT
http_access allow MSN_methods MSN_ports MSN_hosts
# Turn this off if hierarchical behavior is needed
nonhierarchical_direct off
never_direct deny myip
hosts_file /etc/hosts
coredump_dir /var/spool/squid
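
The three numbers on the cache_dir line are easy to mix up: they are the cache size in megabytes, then the number of first-level directories, then the number of second-level directories under each of them. A small Python sketch that names the fields (the function parse_cache_dir is made up for this example):

```python
def parse_cache_dir(line):
    """Split a 'cache_dir' line into its named fields."""
    fields = line.split()
    if len(fields) < 6 or fields[0] != "cache_dir":
        raise ValueError(f"not a complete cache_dir line: {line!r}")
    _, store_type, path, mbytes, level1, level2 = fields[:6]
    return {
        "type": store_type,
        "path": path,
        "size_mb": int(mbytes),
        "level1_dirs": int(level1),
        "level2_dirs": int(level2),
    }

info = parse_cache_dir("cache_dir aufs /var/spool/squid 200 16 256")
# 200 MB on disk, spread over 16 * 256 = 4096 subdirectories
print(info["size_mb"], info["level1_dirs"] * info["level2_dirs"])
```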

That’s the minimal configuration you need for running squid in a hierarchical way. Save the squid.conf file and start/restart/reload the squid service. Set up your browser to use your machine as a proxy; as you browse, it’ll cache all the static content. You should experience some reduction in average page load time.

Advantages

I am currently using squid in the above configuration, and it’s turning out to be nice for me. I am browsing websites faster and saving a chunk of bandwidth for my institute.

Disadvantages

Introduction of another proxy server increases the latency for dynamic content.

Notice

The above configurations and views are a result of my understanding of squid. If you feel this may break your system or it may have adverse effects, don’t use them. At least don’t use these on a production system.
