MSN Crawlers Pawned

After seeing the way MSN crawled my last post, I just realized why Microsoft could never do well in the search engine market :)

I wonder why MSN would crawl the same page from two different machines. I wonder if a single page can be divided further for crawling. Check out the screenshots below :)

Why Google Wins

A few minutes later, three Microsoft machines were crawling the same page 😛

MSN Crawler Pawned

I have confirmed using ARIN that all these IPs belong to Microsoft 😀

 

Google Translator Pawned

Well, I was getting bored and just googled my domain name. As I was browsing through the search results, I clicked on a Spanish site containing the word gofedora. I clicked on Google’s “Translate this page” and was surprised by the result. See for yourself below.

PS : Direct link for translated website.

PS2 : The actual website.

 

How To: Wireless LAN with Broadcom BCM4312 in Fedora 11

Fedora 11 does have support for Broadcom wireless drivers, but it didn’t really work out of the box on a friend’s laptop. We finally got it working, so I thought I’d note the steps down. Below are the three easy steps you need to take to make it work properly.

Step 1

Install needed packages

[root@fedora ~]$ yum install broadcom-wl wl-kmod

Step 2

Once the packages are installed successfully, reboot your laptop.

Step 3

Use the following command

[root@fedora ~]$ system-config-network

Add a new wireless device, wlan0 or whatever name you want, and fill in the required fields. If you want the device to be managed by NetworkManager, you can set that while editing the device you just added.

Activate the device, and you are on wifi :)
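If the device does not show up after the reboot, it can help to check whether the Broadcom kernel module was actually loaded. Below is a minimal Python sketch, assuming the driver installed by the packages above registers itself under the module name wl and that /proc/modules lists one module per line with the name in the first column. The function names are made up for this illustration.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            # The first column of each line is the module name.
            names.add(fields[0])
    return names


def wl_is_loaded(proc_modules_text):
    """True when a module named exactly 'wl' is listed (assumed driver name)."""
    return "wl" in loaded_modules(proc_modules_text)


if __name__ == "__main__":
    try:
        with open("/proc/modules") as handle:
            text = handle.read()
    except OSError:
        # Not on Linux, or /proc is unavailable.
        text = ""
    print("wl module loaded:", wl_is_loaded(text))
```

If it reports False, the packages from Step 1 may not match the running kernel.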

 

Info: ATI Drivers 9.7 does not work in Fedora 11 (2.6.29+)

Yesterday, AMD released ATI Catalyst™ 9.7 Proprietary Linux x86/x86_64 Display Drivers. I happened to check out the website today. Initially I was very excited, hoping that these drivers would work with 2.6.29+ and that I’d be able to use my ATI Radeon HD 3200, which has been lying dead for a fortnight or so. I downloaded the drivers immediately and switched to the default Fedora 11 kernel. I installed the drivers and checked the install log located at /usr/share/ati/fglrx-install.log. And I saw a failure. AMD disappointed me yet again :(

In case you happen to break your graphics display while trying to install the ATI drivers, use the following command to uninstall fglrx.

[root@fedora ~]$ /usr/share/ati/fglrx-uninstall.sh

Well, I am back to square one. I have to wait another month and hope the next release will support kernel 2.6.29+.

Update : Drivers are working now. Move on to How To: Install ATI Catalyst (fglrx) 9.8 Drivers on Fedora 11.

 

How To: Install/Configure GNUMP3d – Streaming Audio Server

Mission

GNUMP3d is the GNU Streaming MP3/Media Server written in Perl. Our mission is to set up GNUMP3d and stream audio over a LAN or over the Internet. Below are the essential steps to install and configure GNUMP3d.

Download

Download the latest version of GNUMP3d from the GNUMP3d Website.

Extract

[kulbirsaini@fedora ~]$ tar -xjf gnump3d-x.x.tar.bz2

Install

[root@fedora ~]$ cd gnump3d-x.x
[root@fedora gnump3d-x.x]$ make install

Now gnump3d is installed on your system. Next you need to configure it to your taste.

Configure

The configuration file is located at /etc/gnump3d/gnump3d.conf. For casual use, you just need to configure port, binding_host and root.

# Port to which gnump3d will be accessible via web interface or via a media player like xmms or winamp.
port = 1111
# The IP Address where gnump3d will bind itself.
binding_host = 172.17.8.64
# If you want the stream to be accessible via a fully qualified domain name, set hostname variable.
# You don't need to set this in most cases e.g. while setting up gnump3d on LAN.
hostname = gofedora.com
# The directory where your music files reside.
root = /stuff/Music/

Though you can skip the rest of the configuration, you may want to explore the other options. My gnump3d.conf file can be downloaded from here.

That’s all you need to do to configure gnump3d.
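Before starting the server, it is easy to sanity-check that the three settings mentioned above are present. The following is only a rough Python sketch: it assumes the file uses plain key = value lines with # comments, as in the excerpt above, and the helper names (parse_conf, missing_keys, stream_url) are invented for this example. The real gnump3d.conf may use syntax this sketch does not handle.

```python
REQUIRED_KEYS = ("port", "binding_host", "root")


def parse_conf(text):
    """Parse simple 'key = value' lines, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def missing_keys(settings):
    """List the required settings that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not settings.get(key)]


def stream_url(settings):
    """Build the URL to open in a browser or media player."""
    host = settings.get("hostname") or settings["binding_host"]
    return "http://%s:%s/" % (host, settings["port"])


if __name__ == "__main__":
    example = """
    port = 1111
    binding_host = 172.17.8.64
    root = /stuff/Music/
    """
    conf = parse_conf(example)
    print("missing:", missing_keys(conf))
    print("url:", stream_url(conf))
```

To check a real installation, read /etc/gnump3d/gnump3d.conf and pass its contents to parse_conf instead of the example string.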

Indexing

Now you need to index your whole music collection (the audio files in the gnump3d root). Run the following command to index it.

[root@fedora ~]$ gnump3d-index --verbose

Run gnump3d

Once the indexing is done, you are all set to run gnump3d. By default gnump3d tries to index all files whenever you start it. To avoid this, use the --fast option.

[root@fedora ~]$ gnump3d --fast

By default gnump3d runs in the foreground. If you want it to go into the background and run quietly, run it as follows.

[root@fedora ~]$ gnump3d --fast --background

Accessing Media Server

To access your gnump3d streaming media server, visit the URL http://ip_address:port/ .

Run at startup

If you want gnump3d to start when your computer starts, add the following line to the /etc/rc.local file.

gnump3d --fast --background

Feel free to comment in case you have a problem.

 

How To: Install and Configure GitWeb

UPDATE : I recommend using GitList instead of GitWeb. GitList is much easier to set up and has a better web interface. Continue reading this post if you are looking for GitWeb setup instructions specifically.

Goal

Setting up gitweb (web interface for SCM software git) for your project’s git repository for public access and developer commits via ssh.

Assumptions

  1. You already have your project’s git repository.
  2. You have hosting space somewhere to host gitweb.
  3. You have root access.
  4. You are using Apache as webserver.

Example for this howto

Project : VideoCache
Domain for gitweb : git.cachevideos.com
URL for git access for videocache : http://git.cachevideos.com/videocache.git
Actual path on server : /home/saini/domains/cachevideos.com/git
Git repository : /home/saini/projects/videocache/

Installation

Installation is pretty easy. A single command, run as root, does everything.

[root@localhost ~]# yum install gitweb

This will create the directory /var/www/git, which is the default location for gitweb.

Copy the directory /var/www/git/ to /home/saini/domains/cachevideos.com/git

[root@localhost ~]# cp -r /var/www/git /home/saini/domains/cachevideos.com/git

Configuration

1. GitWeb

Open the file /etc/gitweb.conf (it may or may not be there) and add the following lines to it.

# Change This
$projectroot = '/home/saini/domains/cachevideos.com/git';
# Change This
$site_name = "Kulbir Saini's git trees.";
# Don't Change the variables below
$my_uri = "/";
$home_link = '/';
@stylesheets = ("/gitweb.css");
$favicon = "/git-favicon.png";
$logo = "/git-logo.png";

2. Apache

Open the file /etc/httpd/conf.d/git.conf, clear all the lines that are already there and add the following lines to it.

<VirtualHost *:80>
  DocumentRoot /home/saini/domains/cachevideos.com/git
  ServerName git.cachevideos.com
  ErrorLog "/home/saini/domains/cachevideos.com/logs/error_log"
  CustomLog "/home/saini/domains/cachevideos.com/logs/access_log" combined
  SetEnv  GITWEB_CONFIG  /etc/gitweb.conf
  DirectoryIndex gitweb.cgi
  <Directory "/home/saini/domains/cachevideos.com/git">
    Allow from all
    AllowOverride all
    Order allow,deny
    Options +ExecCGI
    AddHandler cgi-script .cgi
    <Files gitweb.cgi>
      SetHandler cgi-script
    </Files>
    RewriteEngine on
    RewriteRule ^[a-zA-Z0-9_\-]+\.git/?(\?.*)?$ /gitweb.cgi%{REQUEST_URI} [L,PT]
  </Directory>
</VirtualHost>

3. Git repository configuration

Go to your git repository (/home/saini/projects/videocache/) and make the following changes.

(a). Open file .git/description and add a short nice description for your project.

videocache is a squid url rewriter plugin written in Python to facilitate youtube, metacafe, dailymotion, google, vimeo, msn soapbox, tvuol.uol.com.br, blip.tv, break.com videos and wrzuta.pl audio caching.

(b). Open file .git/config and append the following lines

[gitweb]
  owner = "Kulbir Saini"

Copy project’s git repository for gitweb

Copy the /home/saini/projects/videocache/.git directory to /home/saini/domains/cachevideos.com/git/videocache.git

[root@localhost ~]# cp -r /home/saini/projects/videocache/.git /home/saini/domains/cachevideos.com/git/videocache.git

Finishing Step

Restart Apache webserver.

[root@localhost ~]# service httpd restart

Now you can browse a list of your projects’ git repositories at http://git.cachevideos.com/ .

Adding another project repository

Just copy the project repository’s .git directory to /home/saini/domains/cachevideos.com/git/project_name.git and it’ll be shown in the list.

Committing (pushing) to the repository

To push your commits to the repository via ssh, use the following command.

# Pushing everything (Please see the username)
[root@localhost videocache]# git push --all ssh://saini@git.cachevideos.com/~saini/domains/cachevideos.com/git/videocache.git

To update tags on the remote repository use this command.

# Pushing all tags
[root@localhost videocache]# git push --tags ssh://saini@git.cachevideos.com/~saini/domains/cachevideos.com/git/videocache.git

Well, if you only need the web interface and pushing for your project, that’s all. But things can be fine-tuned further. Below are a few hacks!

1. Enabling nice urls.

By default the urls for browsing a repository via gitweb are pretty crappy and difficult to remember. The RewriteEngine and RewriteRule lines in your Apache configuration file (/etc/httpd/conf.d/git.conf) take care of that and produce nice, clean urls.

So you can browse the repository via http://git.cachevideos.com/videocache.git instead of http://git.cachevideos.com/?p=videocache.git;a=summary.
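To see which request paths the RewriteRule pattern actually catches, the same regular expression can be tried out in Python. This is only an illustration of the pattern, not of Apache itself: it assumes the rule is evaluated against the path relative to the document root (no leading slash), and the function name rewrite is made up for this sketch. Note also that Apache matches RewriteRule patterns against the URL path only, so the optional (\?.*)? part should never see a real query string.

```python
import re

# The pattern from the RewriteRule line, copied verbatim.
PRETTY_URL = re.compile(r"^[a-zA-Z0-9_\-]+\.git/?(\?.*)?$")


def rewrite(path):
    """Return the rewritten target for a matching path, else None.

    'path' is the request path relative to the document root,
    e.g. 'videocache.git' for http://git.cachevideos.com/videocache.git
    """
    if PRETTY_URL.match(path):
        # Mirrors '/gitweb.cgi%{REQUEST_URI}': the original path is
        # appended to gitweb.cgi as extra path information.
        return "/gitweb.cgi/" + path
    return None


if __name__ == "__main__":
    for candidate in ("videocache.git", "videocache.git/", "gitweb.css"):
        print(candidate, "->", rewrite(candidate))
```

Only a bare repository name ending in .git (with an optional trailing slash) is rewritten. Static files such as gitweb.css and deeper paths are left alone.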

2. Enabling remote ls (git-ls-remote or git ls-remote)

This is the trickiest part. If you try the command below, it won’t produce any output.

[root@localhost ~]# git ls-remote http://git.cachevideos.com/videocache.git

You need to go to the project’s repository in gitweb and run the following command to update the server info for git.

[root@localhost ~]# cd /home/saini/domains/cachevideos.com/git/videocache.git/
[root@localhost ~]# git update-server-info

Try the ls-remote command now and it should succeed, listing all the branches and tags in the remote repository.

But there is a problem: you have to run the above command after every push to the remote repository. To solve this, you can enable the post-update hook for the project’s repository in gitweb. Use the following commands to enable it.

[root@localhost ~]# cd /home/saini/domains/cachevideos.com/git/videocache.git/
[root@localhost ~]# chmod +x hooks/post-update

The hook will then update the server info automatically every time someone pushes. On newer git versions the hook ships as hooks/post-update.sample and has to be renamed to hooks/post-update first.

That’s all you need to do to set up gitweb. I hope this is helpful.

 

How To: Boot Fedora Faster

Note: These tricks should apply to any Linux-based OS. But I have tested them only on Fedora, so I can’t say whether they’ll work on other Linux distributions.

My current Fedora installation is now almost one and a half years old. Yes, I am still using Fedora 7 😀 I have Fedora 10 on my other machine. Coming to the point, my Fedora installation has grown beyond control and I have services like named, squid, drbl, privoxy, vsftpd, vbox*, smb and what not on a personal desktop. These services slow my system startup down to more than two minutes. When shutting down, it’s very easy to just cut the power supply, but when booting up I can’t do anything about it, and it frustrates me. What frustrates me further is that I have 4GB DDR2 RAM and an AMD64 X2 5600+ (2.8GHz x 2) and the boot time is still more than two minutes.

Agenda

  • Boot Fedora faster using whatever techniques possible.

Hack #1

Remove the services from the normal boot order and delay their start to a later stage. Services like network, squid, privoxy, named, vsftpd, smb etc. don’t make sense until I am logged in and using them. Let us start them after we have the login screen.

Step 1.1: Turn off all the services by using the command

[root@bordeaux ~]# chkconfig service_name off

where service_name is the service you want to turn off.

Step 1.2: Now create a file /etc/startup.sh. Enter a line like this

service service_name start

for every service that you turned off in Step 1.1 and want running after your machine starts up. Now, your startup.sh file should look like this

service network start &
service sshd start &
modprobe it87 &
modprobe k8temp &
/usr/bin/iptraf -s eth0 -B &
/usr/bin/iptraf -s lo -B &
service squid start &
service privoxy start &
service httpd start &
service mysqld start &
service named start &
service smb start &
service vboxdrv start &
service vboxnet start &
service vsftpd start &

Step 1.3: Add the following line to the /etc/rc.local file

/bin/bash /etc/startup.sh &

Done!!! Notice the &s in both files. They run each command in the background so that no single process can block the boot process. You’ll observe a drop of 10-20 seconds in system startup time.
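What the trailing & does, launching each command without waiting for the previous one to finish, can also be sketched in Python. This is a hedged illustration and not part of the hack itself: the function name start_all is made up, and the demo uses harmless stand-in commands rather than real service calls.

```python
import subprocess
import sys


def start_all(commands):
    """Start every command without waiting, then collect the exit codes.

    'commands' is a list of argument lists, e.g. ['service', 'squid', 'start'].
    Exit codes are returned in the same order as the commands.
    """
    # Popen returns immediately, so all commands run side by side.
    running = [subprocess.Popen(command) for command in commands]
    # Only now do we wait for each of them to finish.
    return [process.wait() for process in running]


if __name__ == "__main__":
    # Stand-ins for lines such as 'service squid start'.
    demo = [
        [sys.executable, "-c", "import time; time.sleep(0.2)"],
        [sys.executable, "-c", "raise SystemExit(3)"],
    ]
    print(start_all(demo))
```

Unlike the shell version, this sketch waits for all commands at the end, which makes failures visible through the exit codes.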

Problem with Hack #1 : The execution is not really parallel. It executes like a process in the background. So we can’t get the real advantage of parallel execution.

Hack #2

Hack #2 solves this problem. Now we don’t put the processes in the background. Instead, we fork a separate daemon process that starts all the services for us while the boot process carries on. Here we get the real advantage and startup time decreases further.

Step 2.1: This step is the same as Step 1.1, so I am skipping it.

Step 2.2: This step is similar to Step 1.2. The /etc/startup.sh file should look like this.

service network start
service xinetd start
service crond start
service anacron start
service atd start
service sshd start
service rpcbind start
service rpcgssd start
service rpcimapd start
modprobe it87
modprobe k8temp
/usr/bin/iptraf -s eth0 -B
/usr/bin/iptraf -s lo -B
service nasd start
service squid start
service privoxy start
service httpd start
service iptables start
service lm_sensors start
service mysqld start
service named start
service nfs start
service nfslock start
service smb start
service vboxdrv start
service vboxnet start
service vsftpd start
service autofs start
service smartd start

Notice the absence of &s in the file.

Step 2.3: Download the startup.py file attached at the end of this post, or copy and paste the following code into the /etc/startup.py file.

#!/usr/bin/env python
# (C) Copyright 2008 Kulbir Saini
# License : GPL
import os
import sys
def fork_daemon(f):
    """This function forks a daemon."""
    # Perform double fork
    r = ''
    if os.fork(): # Parent
        # Wait for the child so that it doesn't defunct
        os.wait()
        # Return a function
        return  lambda *x, **kw: r
    # Otherwise, we are the child
    # Perform second fork
    os.setsid()
    os.umask(077)
    os.chdir('/')
    if os.fork():
        os._exit(0)
    def wrapper(*args, **kwargs):
        """Wrapper function to be returned from generator.
        Executes the function bound to the generator and then
        exits the process"""
        f(*args, **kwargs)
        os._exit(0)
    return wrapper
 
def start_services(startup_file):
    command = '/bin/bash ' + startup_file + ' > /dev/null 2> /dev/null '
    os.system(command)
    return
 
if __name__ == '__main__':
    forkd = fork_daemon(start_services)
    forkd(sys.argv[1])
    print 'Executing ', sys.argv[1], '[  OK  ]'

Step 2.4: Add the following line to your /etc/rc.local file.

/usr/bin/python /etc/startup.py /etc/startup.sh

That’s it. Done!!! Now you’ll see boot time decrease by about 25-30 seconds.

Stats of my machine

With all services started in normal order : 2 minutes.
With Hack #1 : 1 minute 42 seconds.
With Hack #2 : 1 minute.
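The savings behind these numbers are simple arithmetic. Here is a small Python sketch using only the three timings listed above. The function name saving is made up for this example.

```python
def saving(before_seconds, after_seconds):
    """Return (seconds saved, percentage saved) relative to 'before'."""
    saved = before_seconds - after_seconds
    percent = 100.0 * saved / before_seconds
    return saved, percent


if __name__ == "__main__":
    normal, hack1, hack2 = 120, 102, 60  # timings from the stats above
    for label, timing in (("Hack #1", hack1), ("Hack #2", hack2)):
        seconds, percent = saving(normal, timing)
        print("%s saves %d seconds (%.0f%% of normal boot time)" % (label, seconds, percent))
```

On my machine that works out to 18 seconds for Hack #1 and a full minute, or half the boot time, for Hack #2.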

Warning : These hacks may break your system and can make it unusable. Use at your own risk.

 

Crack: Google Authentication Services are Vulnerable

There is a vulnerability in the way the Google authentication service works. Whenever you log in to any of Google’s online services like GMail, Orkut, Groups, Docs, Youtube, Calendar etc., you are redirected to an authentication server, which checks the entered username and password and then redirects you back to the requested service (GMail, Youtube etc.) through a url that sets the session variables.

Now, if you are able to grab the url used to set the session variables, you can log in as the user to whom that url belongs from any machine on the Internet (it need not be on the same subnet) without entering the user’s username and password.

Proxy servers in organizations can be used to exploit this vulnerability. Squid is the most popular proxy server in use. In the default configuration, squid strips the query terms of a url before logging, so this vulnerability can’t be exploited. But if you turn off the stripping mechanism by adding the line shown below, squid will log the complete url.

strip_query_terms off

So, after turning stripping mechanism off, the log will contain urls which will look like this

http://www.google.co.in/accounts/SetSID?ssdc=1&sidt=Q5UrfB0BAAA%3D.oHVGErODzffQ%2Bms%2FOKfk53g5naReDKehRNHOBsmJlBu3VTNXjF03SbgX%2FVEEhmImhR4mlu5IAAjM%2BdbuXvMMSIb0oU8IGCYpnLcSNkbCIrG%2BQnm81YmX5%2Brcrq7U6Qx65%2F1yaQ2NzgmKD94jg0Iw13iXDen3qD5qn6L%2FhmmYWwTrcOeuTzGbO%2BAehpjEU3mrWapRafaq3b4kxyigJ68s8QrGQqZTINNE%2Bs%2BoIkZWmGt5kNzoT8fkVAsWJeu3CKFkxj4oVMngeDvpwb1nyFpsJCltOzmAr46fTxVJSpvQdx0%3D.BMLtjUdIDCcuszktZSvYzA%3D%3D&continue=http%3A%2F%2Fwww.orkut.com%2FRedirLogin.aspx%3Fmsg%3D0%26ts%3D1226148773097%3A1226148773386%3A1226148774868%26auth%3DDQAAAIcAAAC1pPE1QT4chKgrU4B3oyKZrQRkEVPtYlclpESQoXV_d9x9gdoe75Z0hfJ_22Pn5tVMR7j-uV5YCps3NB48L0bFlDeX-4PGHVT6Loztp_ru3tAy_gxDa9_YAEbz4d9CO4wD2VTKtzax9zvpGgrnJVZQfoWPkkIomUmxDtVGoH7g3fA3UjS0vdBJ2PJtgFMElso

Replace .co.in with the tld for your country. If you paste this url in any browser, it’ll directly log you in and you can do whatever you want with that account. Remember that such urls remain valid for only two minutes. So, if you use a url after two minutes, it’ll lead nowhere.

At the time of writing this post Orkut, Google Docs, Google Calendar, Google Books and Youtube are vulnerable.

So, make sure your squid has the stripping mechanism turned on and your squid server is properly firewalled.
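If you keep or share proxy logs from a setup where stripping was off, you can remove the query terms yourself before storing them. The following Python sketch approximates the effect of squid’s stripping. It is an assumption-laden illustration, not squid’s own code: the function name strip_query_terms simply mirrors the directive, and squid’s exact log output may differ in detail.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query_terms(url):
    """Drop the query string and fragment, keeping scheme, host and path."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


if __name__ == "__main__":
    # Shortened, made-up example in the shape of the url shown above.
    logged = "http://www.google.co.in/accounts/SetSID?ssdc=1&sidt=EXAMPLE"
    print(strip_query_terms(logged))
```

With the query terms gone, the session tokens never reach the log, which is exactly why the default configuration is safe.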

You can watch the Video proof for Orkut on Blip.tv, Youtube.

 

Humour: Funny Apache Logs

The other day I was debugging my Drupal installation and had a look at the Apache error logs. And this is what I found 😀

[root@gofedora html]# tail -f /var/log/httpd/error_log
[Fri Nov 28 21:00:16 2008] [warn] long lost child came home! (pid 23229)
[Fri Nov 28 21:00:16 2008] [warn] long lost child came home! (pid 23230)
[Fri Nov 28 21:00:16 2008] [warn] long lost child came home! (pid 23231)
[Fri Nov 28 21:00:16 2008] [warn] long lost child came home! (pid 23232)
[Fri Nov 28 21:00:16 2008] [warn] long lost child came home! (pid 23233)
[Fri Nov 28 21:00:16 2008] [warn] long lost child came home! (pid 23234)
[Fri Nov 28 21:00:16 2008] [warn] long lost child came home! (pid 23235)
[Fri Nov 28 21:00:16 2008] [warn] long lost child came home! (pid 23236)
[Fri Nov 28 21:00:16 2008] [warn] long lost child came home! (pid 23237)
[Fri Nov 28 21:00:16 2008] [warn] long lost child came home! (pid 23238)

It reminded me of a famous Indian saying, “Kumbh ke mele mein khoya wapis aa gaya” (roughly, “the one who got lost at the Kumbh fair has come back”).

And Apache logs it as a warning. Your long lost child has come home. You gotta run 😛

 

Info: Fedora 10 – Cambridge Released

Fedora 10 aka Cambridge is available now.

Fedora 10 - Cambridge Released

Get your copy of Fedora 10 now.

Well, if normal Fedora doesn’t suit you, check out the Fedora 10 spins and get whichever suits you :)

Also for KDE fans, there is a special corner.

After you have finished downloading, burn the ISO and check the detailed Fedora 10 Installation Guide. Or if you want to upgrade from an existing Fedora version to Fedora 10, check the upgrade guide.

Want to read what’s new and what is inside Fedora 10? Check out the release notes in your language.

In case your CD/DVD drive is burnt 😛 , or you don’t have immediate access to optical media, check How to install Fedora 10 without CD/DVD.

Still stuck somewhere? There are a lot of ways to get help. Proceed with whichever way suits you :)