Recognising place names in text documents


Craig D Harris

06/05/2016

Supervised by Chris B Jones; Moderated by Dave Marshall

To index documents with regard to geographic space, it is necessary to recognise and geocode (i.e. attach coordinates to) the place names in their textual content. This information retrieval project is concerned with developing machine learning methods to perform this task. A gazetteer consisting of a list of place names will be used to assist in the process. The task is challenging for two reasons: genuine place names are difficult to distinguish from other terms, such as the names of people and organisations, and some place names (such as Newport) are ambiguous because different places share the same name. The machine learning process will employ various types of evidence that a name is a place name, such as whether it occurs in a gazetteer, whether it is preceded by a spatial preposition such as "near" or "towards", and whether it is associated with a place type term such as "town" or "river".
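
The three types of evidence named above could be turned into features for a classifier roughly as follows. This is only a minimal sketch, not the project's actual method: the tiny `GAZETTEER` set, the function names, the two-token context window and the use of capitalisation to pick candidate names are all assumptions made for illustration.

```python
import re

# Hypothetical toy resources (assumptions for illustration only).
# A real gazetteer would hold many thousands of names.
GAZETTEER = {"newport", "cardiff", "severn"}
SPATIAL_PREPOSITIONS = {"near", "towards"}
PLACE_TYPE_TERMS = {"town", "river"}


def tokenize(text):
    """Split text into alphabetic tokens, dropping punctuation."""
    return re.findall(r"[A-Za-z]+", text)


def candidate_features(tokens, i, window=2):
    """Return the evidence features for the token at index i."""
    token = tokens[i]
    previous = tokens[i - 1].lower() if i > 0 else ""
    # Tokens within `window` positions on either side of the candidate.
    start = max(0, i - window)
    neighbours = tokens[start:i] + tokens[i + 1:i + 1 + window]
    return {
        "in_gazetteer": token.lower() in GAZETTEER,
        "after_spatial_preposition": previous in SPATIAL_PREPOSITIONS,
        "near_place_type_term": any(
            n.lower() in PLACE_TYPE_TERMS for n in neighbours
        ),
    }


def extract_candidates(text):
    """Pair each capitalised token (assumed candidate name) with its features."""
    tokens = tokenize(text)
    return [
        (token, candidate_features(tokens, i))
        for i, token in enumerate(tokens)
        if token[:1].isupper()
    ]


if __name__ == "__main__":
    sentence = "We walked towards Newport, a town near the river Severn, with Craig."
    for name, features in extract_candidates(sentence):
        print(name, features)
```

Feature dictionaries of this kind could then be passed to any standard supervised classifier trained on labelled examples. Note that features like these only help separate place names from other names; resolving which Newport is meant is a separate disambiguation step that this sketch does not attempt.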


Initial Plan (31/01/2016) [Zip Archive]

Final Report (06/05/2016) [Zip Archive]