IntelligentMirror: GSOC Project Update

Brief Introduction

IntelligentMirror can be used to create a mirror of static HTTP content on your local network. When you download something (say, a software package) from the Internet, it is cached on a local machine on your network, and subsequent downloads of that package are served from that machine's cache. This makes efficient use of bandwidth and reduces the average download time. IntelligentMirror can also pre-fetch RPM packages from Fedora repositories spread all over the world and pre-populate the local repo with popular packages such as mplayer, vlc and gstreamer, which are normally installed immediately after a fresh install.

Definition for a layman

Think of the Internet as a hard disk, your proxy server as a cache and your Intranet as a CPU. Whenever the CPU needs to process something, it asks the cache for the data. If the data is not in the cache, it is fetched from RAM and/or the hard disk. IntelligentMirror sits on your proxy server and keeps caching packages in a browsable manner so that they can be served via HTTP for subsequent requests.

For further details about IntelligentMirror, go here.

Update

After getting the hosting space on fedorahosted.org, I pushed the code I have written. You can check the source tree here.

We are building IntelligentMirror as a plugin to squid which taps requests from clients, checks them against a cache and acts accordingly. Check out how to write a custom redirector or how to tap requests to squid. We are working on live streaming the partially downloaded package to the end user while caching it.
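
As a rough illustration of the "check against a cache" step, here is a minimal Python sketch. It is not the actual IntelligentMirror code: the function name cached_url, the cache layout (one file per package, keyed by file name) and the local base URL are all assumptions made for this example.

```python
import os
from urllib.parse import urlparse


def cached_url(url, cache_dir, cache_base_url):
    """Return the local URL of a cached package, or None on a cache miss.

    Assumption: packages are stored in cache_dir under their file name
    and the same directory is served over HTTP at cache_base_url.
    """
    name = os.path.basename(urlparse(url).path)
    if name and os.path.isfile(os.path.join(cache_dir, name)):
        return cache_base_url.rstrip('/') + '/' + name
    return None
```

On a hit the client could be pointed at the returned local URL; on a miss (None) the request would go out to the Internet as usual while the package is cached.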

If you have any suggestions, feel free to leave them as a comment here or edit the wiki page 🙂

 

Info: Yum Bug Day on May 30

Hey all yum lovers,

We (the yum people) are organising Yum Bug Day on May 30, 2008. We will try to kill as many bugs as possible. For a list of bugs, check Yum Bugs and Yum-Utils Bugs. If you think you can fix or provide info about one or more of them, or you have any suggestions, just drop in to #yum on freenode.net or shoot a mail to yum-devel.

In Seth’s words,

Yum Bug Day!
It’s exciting! It’s Fun! You get to tell people things like: Already
fixed, WONTFIX, CANTFIX, NOTABUG, and ‘ooo yah, that’s pretty broken’.

So, what are you waiting for?? 😛

 

Review: Spicebird – A Collaboration Platform

Well, I happened to attend a workshop on “How to build business around open source tools” organized by Twincling Society and IIIT Hyderabad, where I came to know about Spicebird. Spicebird is a single platform for many collaboration needs. It provides e-mail, calendaring and instant messaging with intuitive integration and unlimited extensibility. Spicebird is being developed by a Hyderabad-based Indian start-up named Synovel (all four founders are alumni of IIIT Hyderabad). Below we look at some features that Spicebird provides.

1. Tabbed Interface

The tabbed interface for different utilities like mail, calendar, contacts, tasks etc. looks pretty clean. The interface is not cluttered at all and navigation between utilities is straightforward. You don’t have to rack your brain before getting something done.

2. Familiar Interface & Crisp Icon Set

Spicebird has an interface similar to loads of Mozilla-based applications out there. The settings, preferences and the way things are organized are familiar, so people switching from other open source email clients should not face any problems. Spicebird uses icons from the Tango Project, and they are really good looking.

3. Nice Home Tab

The way the Home tab has been organized is really appealing. You can add applets, which include RSS feeds from your favourite blogs, mail folder views, calendar, upcoming events and Date & Time. Geeks love RSS feeds, and what can be better than having them on your home tab all the time along with your mail? The event applet comes in handy to remind you of upcoming meetings and deadlines. And it’s on the home tab all the time 🙂 Date & Time is especially helpful when you collaborate with people in different timezones: add their timezone to the home tab and you know when the right time to ping them is.

Spicebird Home Tab

4. Email

The email experience is more or less like any other open source email client. But Spicebird provides some intuitive features: for example, if it finds that the content of a mail is about a meeting, it’ll offer to create a calendar event for it. This is a really good feature and this is just the beginning. Spicebird is still in beta.

Spicebird Intuitive Mail

5. Instant Messaging

This is a really cool feature from a collaboration point of view, and it sets Spicebird apart from the masses. Spicebird supports IM via any Jabber server. So if you are a startup, set up your own Jabber server on your Intranet and use it for collaboration. Mind blowing!! This also includes Gmail/GTalk, so you can say bye bye to your messenger and start using Spicebird right away with GTalk. Plus, this will import all your contacts into your local address book. Another really good feature which is missing from a lot of other email clients.

SpiceBird Instant Message using Jabber, GTalk

6. Calendar & Task Management

Another good feature: integrated calendar and task management. You can quickly add tasks and events. And you need not check your calendar for upcoming events: add the upcoming events applet to the home tab and you will have them in front of your eyes all the time 🙂

Spicebird Calendar and Task Manager

Conclusion

Whether you are a startup which is looking for tools to collaborate or a user who is excited about using open source tools, just go and download Spicebird from here and explore a new way of managing things at a single place 🙂

You can look at the Spicebird Roadmap here and check out the video demo of Spicebird here.

 

How To: Configure Squid Proxy Server

Mission

To configure squid for simple proxying without caching anything.

Use Cases

  1. When you want to have control over what people browse on your LAN.
  2. When the number of machines is more than the number of IP addresses you can afford to buy.
  3. When you want to help this holy world save some IPv4 addresses 😛

Assumptions

  1. You have a machine connected directly to internet that you are going to use as a proxy server for other machines on your network.
  2. The machines on your network are using 192.168.0.0/16 as private address space. You can use any one (or several) of the available private address spaces, but for this howto we assume 192.168.0.0/16 as the local network.
  3. The local IP address of the machine which will run squid proxy server is 192.168.36.204. You can have any IP, but for this howto we assume this.
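
To make the address assumptions concrete, Python's standard ipaddress module can confirm which machines fall inside the assumed LAN range. This is only a sanity-check sketch; the helper name in_lan is made up for this example.

```python
import ipaddress

# The private address space assumed in this howto.
LAN = ipaddress.ip_network('192.168.0.0/16')


def in_lan(address):
    """Return True if the given IPv4 address belongs to the assumed LAN."""
    return ipaddress.ip_address(address) in LAN
```

Here in_lan('192.168.36.204') is true for the proxy machine itself, while an address from another private range, such as 172.17.8.175, falls outside 192.168.0.0/16.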

How to proceed

First of all, ensure that you have squid installed. After installing squid, you need to set up access control in the squid configuration file, which resides in /etc/squid by default. Open /etc/squid/squid.conf and add/edit the following lines according to your preferences. A few of these lines already exist in the configuration file; you can add the rest.

# The port on which squid will listen for requests
http_port 8080
# If 'cgi-bin' or '?' is in query, squid should not check with neighbours'/parents' cache
# and should go to target web-server.
hierarchy_stoplist cgi-bin ?
# If url contains 'cgi-bin' or '?', then it must not be cached
acl QUERY urlpath_regex cgi-bin \?
cache deny QUERY
acl apache rep_header Server ^Apache
broken_vary_encoding allow apache
# Absolute path to squid access log.
access_log /var/log/squid/access.log squid
refresh_pattern ^ftp:           1440    20%     10080
refresh_pattern ^gopher:        1440    0%      1440
refresh_pattern .               0       20%     4320
# Access control list to control every IP address
acl all src 0.0.0.0/0.0.0.0
# Access control list for source machine in LAN
acl lan_src src 192.168.0.0/16
# Access control list for destination machine in LAN
acl lan_dst dst 192.168.0.0/16
# Access control list to manage squid cache
acl manager proto cache_object
# Access control list to define IP address allowed for source localhost
acl localhost src 127.0.0.1/255.255.255.255
# Access control list to define IP addresses allowed for localhost as destination
acl to_localhost dst 127.0.0.0/8
# Access control list to define Safe ports that should be allowed by default
acl SSL_ports port 443 563 1863 5190 5222 5050 6667
acl Safe_ports port 80          # http
acl Safe_ports port 21          # ftp
acl Safe_ports port 443         # https
acl Safe_ports port 70          # gopher
acl Safe_ports port 210         # wais
acl Safe_ports port 1025-65535  # unregistered ports
acl Safe_ports port 280         # http-mgmt
acl Safe_ports port 488         # gss-http
acl Safe_ports port 591         # filemaker
acl Safe_ports port 777         # multiling http
acl CONNECT method CONNECT
# Allow cache management only from localhost
http_access allow manager localhost
# Deny cache management from remote hosts
http_access deny manager
# Deny http access via all the ports which are not listed as safe
http_access deny !Safe_ports
# Deny CONNECT requests to all ports which are not listed as SSL ports
http_access deny CONNECT !SSL_ports
# Allow http access from localhost
http_access allow localhost
# Allow http access from machines on LAN
http_access allow lan_src
http_access deny all
http_reply_access allow all
icp_access allow all
# Deny caching for everyone so that there is no caching at all
cache deny all
coredump_dir /var/spool/squid
# Never allow direct connection to machines on the internet
prefer_direct off
never_direct allow all
# Allow direct connection if the destination machine is on LAN
always_direct allow lan_dst
# Delete this line if you don't have /etc/hosts file
hosts_file /etc/hosts
# Allow AIM connections
# Delete the following 9 lines if you don't want people to connect to AIM
acl AIM_ports port 5190 9898 6667
acl AIM_domains dstdomain .oscar.aol.com .blue.aol.com .freenode.net
acl AIM_domains dstdomain .messaging.aol.com .aim.com
acl AIM_hosts dstdomain login.oscar.aol.com login.glogin.messaging.aol.com toc.oscar.aol.com irc.freenode.net
acl AIM_nets dst 64.12.0.0/255.255.0.0
acl AIM_methods method CONNECT
http_access allow AIM_methods AIM_ports AIM_nets
http_access allow AIM_methods AIM_ports AIM_hosts
http_access allow AIM_methods AIM_ports AIM_domains
# Allow connections to Yahoo Messenger
# Delete the following 6 lines if you don't want people to connect to Yahoo Messenger
acl YIM_ports port 5050
acl YIM_domains dstdomain .yahoo.com .yahoo.co.jp
acl YIM_hosts dstdomain scs.msg.yahoo.com cs.yahoo.co.jp
acl YIM_methods method CONNECT
http_access allow YIM_methods YIM_ports YIM_hosts
http_access allow YIM_methods YIM_ports YIM_domains
# Allow connections to Google Talk
# Delete the following 6 lines if you don't want people to connect to Google Talk
acl GTALK_ports port 5222 5050
acl GTALK_domains dstdomain .google.com
acl GTALK_hosts dstdomain talk.google.com
acl GTALK_methods method CONNECT
http_access allow GTALK_methods GTALK_ports GTALK_hosts
http_access allow GTALK_methods GTALK_ports GTALK_domains
# Allow connections to MSN
# Delete the following 6 lines if you don't want people to connect to MSN
acl MSN_ports port 1863 443 1503
acl MSN_domains dstdomain .microsoft.com .hotmail.com .live.com .msft.net .msn.com .passport.com
acl MSN_hosts dstdomain messenger.hotmail.com
acl MSN_nets dst 207.46.111.0/255.255.255.0
acl MSN_methods method CONNECT
http_access allow MSN_methods MSN_ports MSN_hosts
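
Squid checks http_access lines from top to bottom and stops at the first line whose ACLs all match the request. The sketch below models that behaviour in Python for a subset of the rules above. It is a simplified illustration, not Squid's implementation: the function first_match and the idea of passing in a ready-made set of matching ACL names are assumptions made for this example.

```python
def first_match(rules, matched):
    """Return the action of the first rule whose ACLs all match.

    rules   -- list of (action, [acl, ...]) in configuration order
    matched -- set of ACL names that are true for the current request
    A name written as '!acl' matches when 'acl' is NOT in the set.
    """
    for action, acls in rules:
        if all((a[1:] not in matched) if a.startswith('!') else (a in matched)
               for a in acls):
            return action
    return None  # no rule matched; Squid's own fallback is not modelled here


# A subset of the http_access lines above, in the same order.
RULES = [
    ('allow', ['manager', 'localhost']),
    ('deny', ['manager']),
    ('deny', ['!Safe_ports']),
    ('deny', ['CONNECT', '!SSL_ports']),
    ('allow', ['localhost']),
    ('allow', ['lan_src']),
    ('deny', ['all']),
]

# A LAN client fetching a page on port 80 is allowed by 'allow lan_src'.
lan_request = first_match(RULES, {'all', 'lan_src', 'Safe_ports'})
# A client from outside the LAN falls through to 'deny all'.
outside_request = first_match(RULES, {'all', 'Safe_ports'})
```

One consequence of first-match ordering is that any http_access line placed after "http_access deny all" is never consulted, so rules that are meant to take effect should appear before it.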

Now, start the squid proxy server with

service squid start

Also, if you want squid to be started every time you boot the machine, execute the following command

chkconfig --level 345 squid on

You have a squid proxy server running now. You can ask clients to configure their browsers to use 192.168.36.204 as the proxy server with 8080 as the proxy port. Command line utilities like elinks, lynx, yum, wget etc. can be told to use the proxy by exporting the http_proxy variable as below. Users can also add these lines to their ~/.bashrc file to avoid exporting them every time.

export http_proxy='http://192.168.36.204:8080'
export ftp_proxy='http://192.168.36.204:8080'
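
Scripts can use the proxy too. The following is a small sketch using Python's standard urllib.request module with the same address and port as above; the helper names proxy_settings and make_opener are made up for this example.

```python
import urllib.request

# Address and port from this howto; adjust to your own proxy.
PROXY = 'http://192.168.36.204:8080'


def proxy_settings(proxy=PROXY):
    """Map URL schemes to the proxy that should handle them."""
    return {'http': proxy, 'ftp': proxy}


def make_opener(proxy=PROXY):
    """Build a urllib opener that sends http and ftp requests via the proxy."""
    handler = urllib.request.ProxyHandler(proxy_settings(proxy))
    return urllib.request.build_opener(handler)
```

Calling make_opener().open(url) should then send the request through squid instead of connecting directly. Note that urllib also picks up the http_proxy and ftp_proxy environment variables on its own when no explicit handler is given.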

I highly recommend the book “Squid Proxy Server 3.1: Beginner’s Guide (Paperback)” for further reading.

 

How To: Write Custom Redirector or Rewriter Plugin For Squid in Python

Mission

To write a custom Python program which can act as a plugin for Squid to redirect a given URL to another URL. This is useful when the existing redirector plugins for Squid don’t suit your needs or you want to build everything on your own.

Use Cases

  1. When you want to redirect URLs using a database like MySQL or PostgreSQL.
  2. When you want to redirect based on mappings stored in simple text files.
  3. When you want to build a redirector which can learn by itself using AI techniques 😛
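
The text-file use case could look roughly like the sketch below. The file format (one "old-URL new-URL" pair per line, with # for comments) and the names load_mappings and redirect_for are assumptions made for this example, not something Squid prescribes.

```python
def load_mappings(lines):
    """Build a dict of old URL -> new URL from an iterable of text lines.

    Blank lines, comment lines and lines that do not contain exactly
    two whitespace-separated fields are skipped.
    """
    mappings = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        parts = line.split()
        if len(parts) != 2:
            continue
        mappings[parts[0]] = parts[1]
    return mappings


def redirect_for(url, mappings):
    """Return the new URL for a mapped URL, or '' to leave it untouched."""
    return mappings.get(url, '')
```

An open file object can be passed straight to load_mappings, since iterating over a file yields its lines.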

How to proceed

From Squid FAQ,

The redirector program must read URLs (one per line) on standard input, and write rewritten URLs or blank lines on standard output. Note that the redirector program can not use buffered I/O. Squid writes additional information after the URL which a redirector can use to make a decision.

The format of the line read from the standard input by the program is as follows.

URL ip-address/fqdn ident method
# for example
http://saini.co.in 172.17.8.175/saini.co.in - GET -
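
A redirector that wants to decide on more than the URL can split the line into its fields first. Here is a minimal sketch based on the format shown above; the Request tuple and the parse_request helper are names made up for this example, and any extra fields after the method are simply ignored.

```python
from collections import namedtuple

# One field per item in the line format: URL ip-address/fqdn ident method
Request = namedtuple('Request', 'url client_ip fqdn ident method')


def parse_request(line):
    """Split one input line into its fields; return None if it is too short."""
    parts = line.split()
    if len(parts) < 4:
        return None
    url, client, ident, method = parts[:4]
    # the second field looks like "ip-address/fqdn"
    client_ip, _, fqdn = client.partition('/')
    return Request(url, client_ip, fqdn, ident, method)
```

For the example line above, the result has url 'http://saini.co.in', client_ip '172.17.8.175', fqdn 'saini.co.in', ident '-' and method 'GET'.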

The implementation sounds very simple, and it is indeed very simple. The only thing to take care of is the unbuffered I/O: you should flush the output to standard output immediately once the decision is taken.
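
The flush-after-every-answer requirement can be tried out without Squid by running the read/answer loop over in-memory streams. This is a sketch under assumptions: serve and the decide callback are hypothetical names, and in real use source and sink would be sys.stdin and sys.stdout.

```python
import io


def serve(source, sink, decide):
    """Answer each request line from source with exactly one line on sink."""
    while True:
        line = source.readline()
        if not line:          # empty string means end of input
            break
        sink.write(decide(line.strip()) + '\n')
        sink.flush()          # push the answer out before reading again


# Two fake requests: the first should be redirected, the second left alone.
requests = io.StringIO('http://example.org/a.avi 10.0.0.1/- - GET\n'
                       'http://example.org/ 10.0.0.1/- - GET\n')
answers = io.StringIO()
serve(requests, answers,
      lambda line: 'http://example.org/denied'
      if line.split()[0].endswith('.avi') else '')
```

After the call, answers holds one output line per request: the replacement URL for the first, and a blank line for the second.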

For this howto, we assume we have a function called ‘modify_url()’ which returns either a blank line or a modified URL to which the client should be redirected.

#!/usr/bin/env python

import sys


def modify_url(line):
    # fields are separated by whitespace; the first one is the URL
    fields = line.split()
    old_url = fields[0] if fields else ''
    new_url = '\n'
    # take the decision and modify the url if needed
    # do remember that the new_url should contain a '\n' at the end.
    if old_url.endswith('.avi'):
        new_url = 'http://fedora.co.in/errors/accessDenied.html' + new_url
    return new_url


while True:
    # the format of the line read from stdin is
    # URL ip-address/fqdn ident method
    # for example
    # http://saini.co.in 172.17.8.175/saini.co.in - GET -
    line = sys.stdin.readline()
    # an empty string means squid closed the pipe (EOF), so exit the loop
    if not line:
        break
    # new_url is a simple URL only
    # for example
    # http://fedora.co.in
    new_url = modify_url(line.strip())
    sys.stdout.write(new_url)
    sys.stdout.flush()

Save the above file somewhere. We save this example file in /etc/squid/custom_redirect.py. Now, we have the function for redirecting clients. We need to configure squid to use custom_redirect.py . Below is the squid configuration for telling squid to use the above program as redirector.

# Add these lines to /etc/squid/squid.conf file.
# /usr/bin/python should be replaced by the path to python executable if you installed it somewhere else.
redirect_program /usr/bin/python /etc/squid/custom_redirect.py
# Number of instances of the above program that should run concurrently.
# 5 is a reasonable starting point; consider 10 or more on a busy proxy. Fewer than 5 may not keep up with requests.
redirect_children 5

Now, start/reload/restart squid. That’s all we need to write and use a custom redirector plugin for squid.