This project will develop a program that compares two sets of source code and identifies significant similarities that may indicate collusion. To cope with code obfuscation techniques, some differences will need to be ignored (e.g. comments, variations in white space, variable names, shifted blocks of text), while other small but idiosyncratic differences will be highlighted. Thus the code needs to be analysed at both a structural level and a surface level.
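
As a rough illustration of the structural side of the comparison, the sketch below normalises two pieces of source code into token streams before comparing them. It assumes the submissions are written in Python and uses the standard-library `tokenize` and `difflib` modules; the function names `normalise` and `similarity` are hypothetical and chosen only for this example. Comments and white space are dropped and every identifier is replaced by a placeholder, so renaming variables or reformatting does not change the result. It does not handle shifted blocks of text or highlight idiosyncratic surface details, both of which would need further techniques.

```python
import io
import keyword
import tokenize
from difflib import SequenceMatcher

# Token types treated as noise in this sketch: comments and layout-only tokens.
_IGNORED = {
    tokenize.COMMENT,
    tokenize.NL,
    tokenize.NEWLINE,
    tokenize.INDENT,
    tokenize.DEDENT,
    tokenize.ENDMARKER,
}


def normalise(source: str) -> list[str]:
    """Return a token list with comments, layout and identifier names removed."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _IGNORED:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            # Any identifier (variable, function, built-in) becomes a placeholder.
            tokens.append("ID")
        else:
            # Keywords, operators and literals are kept verbatim.
            tokens.append(tok.string)
    return tokens


def similarity(source_a: str, source_b: str) -> float:
    """Similarity ratio in [0, 1] between the two normalised token streams."""
    matcher = SequenceMatcher(None, normalise(source_a), normalise(source_b),
                              autojunk=False)
    return matcher.ratio()


if __name__ == "__main__":
    original = (
        "def total(xs):\n"
        "    # sum the list\n"
        "    s = 0\n"
        "    for x in xs:\n"
        "        s += x\n"
        "    return s\n"
    )
    disguised = (
        "def  add_all(values):\n"
        "    acc = 0\n"
        "    for v in values:   # loop\n"
        "        acc += v\n"
        "    return acc\n"
    )
    print(similarity(original, disguised))
```

Here the two functions differ only in names, comments and spacing, so their normalised token streams are identical. A real tool would pair a structural measure like this with a surface-level pass over the raw text, since the normalisation deliberately discards the small idiosyncrasies that the surface analysis is meant to catch.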