Concurrent Thread-based Web Crawler

Michael J Graham


Supervised by David W Walker; Moderated by P L Rosin

In this project you will develop a multi-threaded Java program for crawling the Web. Each web page encountered will be processed in a way that depends on its content. The software will offer options for constraining the search; for example, to a single web site.

To do this project you must be familiar with Java and programming with threads.
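
As a rough illustration of the kind of program described above, the following is a minimal sketch rather than a prescribed design. It assumes a fixed thread pool, a shared concurrent set of visited URLs, and a same-host check as one possible way of constraining the search to one web site. The names CrawlerSketch, crawl, sameSiteOnly and fetchLinks are hypothetical; fetchLinks stands in for whatever code downloads a page, processes its content, and returns the links it found. The example uses an in-memory map in place of the real Web so that it runs without network access.

```java
import java.net.URI;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch: one possible shape for a thread-based crawler.
class CrawlerSketch {
    private final Set<String> visited = ConcurrentHashMap.newKeySet(); // thread-safe set
    private final AtomicInteger pending = new AtomicInteger();         // pages queued or running
    private final CountDownLatch done = new CountDownLatch(1);         // released when pending hits 0
    private final ExecutorService pool;
    private final String seedHost;
    private final boolean sameSiteOnly;
    private final Function<String, List<String>> fetchLinks; // assumed hook: URL -> outgoing links

    private CrawlerSketch(String seed, int nThreads, boolean sameSiteOnly,
                          Function<String, List<String>> fetchLinks) {
        this.pool = Executors.newFixedThreadPool(nThreads);
        this.seedHost = hostOf(seed);
        this.sameSiteOnly = sameSiteOnly;
        this.fetchLinks = fetchLinks;
    }

    // Crawls from the seed URL and returns every URL that was visited.
    static Set<String> crawl(String seed, int nThreads, boolean sameSiteOnly,
                             Function<String, List<String>> fetchLinks) {
        CrawlerSketch c = new CrawlerSketch(seed, nThreads, sameSiteOnly, fetchLinks);
        c.submit(seed);
        try {
            c.done.await(); // block until no page is queued or being processed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            c.pool.shutdownNow();
        }
        return c.visited;
    }

    // Queues a URL for processing unless it is off-site or already seen.
    private void submit(String url) {
        if (sameSiteOnly && !Objects.equals(seedHost, hostOf(url))) return;
        if (!visited.add(url)) return; // add() is atomic, so each URL is queued once
        pending.incrementAndGet();     // counted before the parent task finishes
        pool.execute(() -> {
            try {
                for (String link : fetchLinks.apply(url)) submit(link);
            } finally {
                if (pending.decrementAndGet() == 0) done.countDown();
            }
        });
    }

    // Returns the host part of a URL, or null if the URL cannot be parsed.
    private static String hostOf(String url) {
        try {
            return URI.create(url).getHost();
        } catch (IllegalArgumentException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // In-memory stand-in for the Web: each URL maps to the links on that page.
        Map<String, List<String>> fakeWeb = Map.of(
            "http://a.example/", List.of("http://a.example/x", "http://b.example/"),
            "http://a.example/x", List.of("http://a.example/"),
            "http://b.example/", List.of("http://b.example/y"));
        Function<String, List<String>> fetch = u -> fakeWeb.getOrDefault(u, List.of());
        System.out.println(crawl("http://a.example/", 4, true, fetch));  // one site only
        System.out.println(crawl("http://a.example/", 4, false, fetch)); // unconstrained
    }
}
```

Passing the page-handling code in as a function keeps the threading logic separate from the content-dependent processing, so a real implementation could swap in an HTTP fetch and HTML link extractor without changing the coordination code. A real crawler would also need politeness delays, robots.txt handling, and depth or page limits, none of which are shown here.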

Initial Plan (19/10/2012) [Zip Archive]

Interim Report (14/12/2012) [Zip Archive]

Final Report (03/05/2013) [Zip Archive]

Publication Form