Nutch Search Engine Finally Working!

If you recall, a few weeks back I posted about building your own Google. Well, I finally did it: my very own Google-style search engine is up and running.

It was not an easy job at all, and it was very frustrating at times, but very rewarding indeed!

There are still some issues that need to be ironed out. For instance, the cache links, among a few others, give an error message when you click on them. All in all, though, these are minor issues compared to the hurdles I jumped over to get this engine going.

I managed to spider the cnn.com website (only a few pages, as an experiment) and feed the results of the crawl into my search engine.  Try searching for “weather” on CNN using my search engine and check out the results.

I will be experimenting further with Nutch, including deeper and multiple crawls, as well as fixing the odd bug or two that currently exists.

I will also start reading up on the Nutch technology to better understand it and to get a better feel for its potential.

Hopefully in the next few months I will begin creating new websites that will cater to vertical search and see where that takes me.

I’ll keep you all posted.  In the meantime, if you have any ideas or questions, please feel free to post a comment or two.

1 Comment »

  1. omar said,

    September 5, 2007 @ 11:14 am

    an update on this…

    I had to take the engine down as it was using up too many resources and crashing my server!
