IntelligentMirror: RPM and DEB Caching Improved (0.5)

After spending a lot of time with youtube cache, now I am trying to devote some time to update intelligentmirror with required features and enhancements that youtube cache already enjoys. In the same direction here is version 0.5 of intelligentmirror.

Improvements

  • Added max_parallel_downloads options to controll the maximum threading fetching from upstream to cache the packages.
  • Fine grained control on logging via max_logfile_size and max_logfile_backups option.
  • Added setup script to help you install intelligentmirror. No need to execute commands one by one for installation. Just run
 [root@localhost]# python setup.py install [ENTER]
  • Added update script (update-im). So in case you decide to change the locations for caching rpm/deb packages, just run
 [root@localhost]# update-im [ENTER]

OR

 [root@localhost]# /usr/sbin/update-im [ENTER]
  • Download scheduler similar to youtube cache is added to facilitate the download queing in case of large number of requests.
  • More informative logging.
  • cache.log is not flooding anymore with XMLRPC logs and python tracebacks.
  • Added extensive exception handling thoughout the program.

Availability

  1. RPMs for Fedora/Red Hat/Cent OS
  2. Source RPMs for Fedora/Red Hat/Cent OS
  3. Source Tar balls

Installation and Configuration

INSTALL and README files should help you throughout the installation and configuration process.

In case you have questions, ask them here in comments. Suggestions for improvement are welcome 🙂

 

IntelligentMirror Gets Even More Intelligent (1.0.1)

Warning : This version of IntelligentMirror is compatible with only squid-2.7 as of now. It is NOT compatible even with squid-3.0.

IntelligentMirror Version 1.0.1

I have been following squid development regularly (at least the part in which I am interested) and they have introduced a new directive in squid-2.7 known as StoreUrlRewrite (storeurl_rewrite_program). Using this directive you can instruct squid to cache url A (http://abc.com/foo/bar/version/crap.rpm) as url B (http://proxy.fedora.co.in/intelligentmirror/crap.rpm). In simple words you can direct squid to cache any url as any other url without any extra efforts.

So keeping the above directive in mind, I have worked out a different version of intelligentmirror especially for squid-2.7.

IntelligentMirror : Old method of operation

  1. IntelligentMirror gets a client request for a URL.
  2. Check: if URL is not in (RPM, metadata file)
    • Then its none of our business.
    • Let proxy handle it the normal way.
    • Done and exit.
  3. Check: if RPM/metadata is available in cache
    • Stream the RPM/metadata from cache.
    • Done and exit.
  4. Check: if RPM/metadata is not available in cache
    • Download in parallel for caching in some dir and stream.
    • Done and exit.

IntelligentMirror : New method of operation

  1. IntelligentMirror gets a client request for a URL.
  2. Check: if request for rpm
    1. Direct squid to cache the request as http://<same_host_all_the_time>/intelligentmirror/<rpmname>.rpm
  3. Check: if request for deb
    1. Direct squid to cache the request as http://<same_host_all_the_time>/intelligentmirror/<debname>.deb
  4. Done and exit.

So your squid will see every request for an rpm package as a request http://<same_host_all_the_time>/intelligentmirror/<rpmname>.rpm. So, if you happen to request the same rpm from a different mirror, it’ll still be served from cache 🙂

Improvements

  1. No need to check if the url supplied by squid is for rpm or not because storeurl_rewrite_program has an acl controller attached which will invoke intelligentmirror for urls ending in .rpm .
  2. No need to check if the url is already cached or not. No need to worry about the directory where you are going to store the packages. No human intervention is needed in maintaining the cache. Almighty squid is doing everything for us.
  3. No need to worry if the target package has changed because of the resigning or whatever because squid will do that for you.
  4. No need to actually download the package in parallel for caching because squid is already doing that.
  5. No need to worry about the hashing algorithms and storage optimizations for the cached content.

Availability

  1. RPM for Fedora/Red Hat
  2. Source RPM for Fedora/Red Hat
  3. Source Tarball

Install and Configure

The install and configure files should be enough to guide you through the installation if you choose the tar ball way. Otherwise you can always install from rpm from the above link.

Note1: You have to configure your squid to use intelligentmirror as a plugin even if you install via rpm. Check the configure file at the above link.

Note2: StoreUrlRewrite will probably be available in squid-3.1.

 

IntelligentMirror: RPM and DEB Caching Improved (0.4)

IntelligentMirror version 0.4 is available now. There have been significant improvements in intelligent mirror since last release.

Improvements

  1. Fixed defunct process problem. You will not see defunct python processes hanging around anymore. Previously every forked daemon used to got defucnt because parent never waited for the forked child to finish.
  2. IntelligentMirror now supports caching of Debian packages just like rpms. So now IntelligentMirror is best suited shared environments where people have different tastes.
  3. Intelligent Mirror now uses url_rewrite_program instead of redirect_program. This boosts the efficiency of IntelligentMirror by a significant factor as url_rewrite_program has an acl controller url_rewrite_access. And using url_rewrite_access only requests for rpm/deb packages will be passed to Intelligent Mirror. So, IM now need not process each and every incoming request. Also, it has redirector_bypass directive which will bypass IM in case all the instances of IM are busy serving requests. So, squid will not die with a fatal error in case of huge requests.
  4. Options to enable/disable caching for rpm and Debian packages have been added.
  5. Options to control the total size of caching directories and the size of individual package to be cached have also been introduced.
  6. Proxy authentication is also supported now just the way it is supported in yum.
  7. Packages are not checked for last-modified time anymore. Because in principle two rpms A and B can only have same name iff they have the same contents. So, the delay in response time in case of hits has reduced.

Availability

  1. RPMs for Fedora/Red Hat
  2. Source RPMs for Fedora/Red Hat
  3. Source Tar balls

Installation and configuration is easy and the INSTALL and README files should serve the purpose.

In case you have any suggestions or problems, leave a comment here or file a ticket on project page.