
Idiom search engine


Callum Hughes

15/05/2020

Supervised by Irena Spasic; Moderated by Martin Caminada

Idioms are phrases (i.e. groups of words) whose meaning may not be deducible from the meanings of the individual words (e.g. "bury the hatchet" = "end a quarrel or conflict and become friendly"). One difficulty in finding idioms in text is that their wording can vary (words may be inflected or reordered), so a simple string search is not sufficient. For example, searching for the exact string "bury the hatchet" would miss this idiom in the following two sentences:

Christmas looks to be a time for burying the hatchet or exhuming it for re-examination.

From the look of things, the hatchet has been long buried.

The aim of this project is to implement an idiom search engine, which would take an idiom as input and find all of its occurrences in text.
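To make the difficulty concrete, the following is a minimal sketch of one possible matching strategy, not the method used in this project. It reduces words to a base form and then checks whether all content words of the idiom occur close together, in any order. The lemma table, the stopword list, the window size and the function names are all assumptions made for illustration; a real system would likely rely on a proper lemmatiser.

```python
import re

# Hypothetical, hand-written lemma table covering only the words in this
# example. A real system would use a proper lemmatiser or stemmer instead.
LEMMAS = {"burying": "bury", "buried": "bury", "buries": "bury"}

# Function words ignored when matching (an illustrative choice).
STOPWORDS = {"the", "a", "an", "of", "to", "has", "been"}


def content_lemmas(text):
    """Lower-case, tokenise, map tokens to lemmas, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [LEMMAS.get(t, t) for t in tokens if t not in STOPWORDS]


def contains_idiom(idiom, sentence, window=5):
    """Return True if all content lemmas of the idiom occur, in any order,
    within `window` consecutive content lemmas of the sentence."""
    wanted = set(content_lemmas(idiom))
    words = content_lemmas(sentence)
    for start in range(len(words)):
        if wanted <= set(words[start:start + window]):
            return True
    return False


if __name__ == "__main__":
    idiom = "bury the hatchet"
    sentences = [
        "Christmas looks to be a time for burying the hatchet "
        "or exhuming it for re-examination.",
        "From the look of things, the hatchet has been long buried.",
    ]
    for s in sentences:
        exact = idiom in s.lower()          # plain string search
        flexible = contains_idiom(idiom, s)  # lemma-and-window search
        print(exact, flexible, "|", s)
```

A plain substring test fails on both example sentences, whereas the lemma-and-window check accepts them. The approach is deliberately crude: it would also accept sentences in which the words merely co-occur without idiomatic meaning, so it only illustrates the problem rather than solving it.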

The skills required for this project include knowledge of web development technologies and basic text processing.

All software developed in this project will be released under the GNU General Public License, version 3: https://www.gnu.org/licenses/gpl-3.0.html


Initial Plan (03/02/2020) [Zip Archive]

Final Report (15/05/2020) [Zip Archive]

Publication Form