
Predicting habitat type from archived biological records


Linglin Yu

02/10/2025

Supervised by Chris B Jones; Moderated by Oktay Karakus

Museums and herbaria hold millions of records of biological samples, many of which include a natural language description of the habitat where the sample was collected. These descriptions have the potential to be used to reconstruct historical habitats. The project will experiment with machine learning methods for predicting habitat type from this text. Classifiers will be trained using digital land cover maps from recent decades as ground truth for habitat, with a view to measuring how well the textual data in biological collections can predict standard habitat types.
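As an illustration only, the kind of text classification described above can be sketched with a small multinomial naive Bayes model written in plain Python. This is not the project's method: the choice of naive Bayes, the function names (`tokenize`, `train`, `predict`), the habitat labels and the six toy descriptions are all assumptions made up for this sketch. In the project itself, labels would come from land cover maps rather than being written by hand.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case the description and keep runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(records):
    """Count words per habitat label from (description, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in records:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the label with the highest naive Bayes log-score."""
    word_counts, label_counts, vocab = model
    total_records = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        # Log prior: share of training records carrying this label.
        score = math.log(label_counts[label] / total_records)
        words_in_label = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][token] + 1) / (words_in_label + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy records; real labels would be read from a land cover map
# at each sample's collection location.
TOY_RECORDS = [
    ("damp oak woodland beside stream", "woodland"),
    ("shaded beech woodland on chalk", "woodland"),
    ("wet heath with sphagnum bog", "heath"),
    ("dry heath on sandy soil", "heath"),
    ("roadside verge in rough grassland", "grassland"),
    ("grazed grassland on limestone", "grassland"),
]

model = train(TOY_RECORDS)
```

Calling `predict(model, "oak woodland near stream")` scores each habitat label against the description and returns the most probable one. A real experiment would also hold out part of the data to measure how well the predicted labels agree with the land cover map.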


Final Report (02/10/2025) [Zip Archive]
