Towards a Best Practices Approach to Dataset Image Augmentation in Deep Learning Image Classifier Training

Daniel Harborne


Supervised by Dave Marshall; Moderated by Kirill Sidorov

It is well established that one weakness of the widely adopted deep learning approach to image classification is the large volume of labelled training data required to produce a robust model. This can be particularly challenging if instances of some classes within the classification task are rare within the task domain, making example data to train with difficult to find. Furthermore, even when example data is available for all classes, annotating large volumes of data usually requires considerable human resource. In an attempt to mitigate both of these issues, image augmentation techniques (such as flipping or rotating) can be applied to labelled examples, generating additional labelled images with minimal commitment of human time. The general effectiveness of dataset augmentation has been shown in existing literature; however, there is a lack of a concrete study evaluating the performance gains for different permutations of image augmentation techniques. Concretely, there are no established best practices as to which augmentation techniques are likely to provide the best performance increase. When considering permutations of augmentation techniques, two factors are adjustable: the specific techniques used to produce new images (e.g. whether any level of rotation is applied during augmentation) and the intensity ranges for the techniques used. For example, augmenting an image purely by rotation, from 0 degrees up to 180 degrees in steps of 10, produces 19 images in the augmented dataset (including the original). In practice, it is not feasible to apply all available augmentation techniques at all possible intensities, because the number of permutations this would create makes the required processing time and storage prohibitive.
In this work, permutations are generated across a range of popular image augmentation techniques and a range of intensities at which they can be applied. Datasets are generated for these permutations and, ultimately, the performance increase of models trained on these augmented datasets is evaluated, with the aim of establishing a baseline best practice for incorporating image augmentation into the image classification training pipeline.
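
The growth in permutations described above can be illustrated with a minimal sketch. The technique names and intensity values below are hypothetical placeholders chosen for illustration, not the ones used in the project; only the rotation range (0 to 180 degrees in steps of 10) comes from the abstract. The sketch enumerates which techniques are switched on and counts how many images each choice would produce per original image.

```python
from itertools import combinations, product

# Hypothetical techniques and intensity ranges (illustrative only).
# Each list includes the "no change" value, so the original image is counted.
INTENSITIES = {
    "rotate_deg": list(range(0, 181, 10)),  # 0..180 in steps of 10
    "flip": [False, True],
    "brightness": [0.8, 1.0, 1.2],
}


def technique_subsets(names):
    """Yield every non-empty subset of the technique names."""
    for size in range(1, len(names) + 1):
        yield from combinations(names, size)


def settings_for(subset, intensities):
    """Yield one settings dict per combination of intensity values."""
    value_lists = [intensities[name] for name in subset]
    for values in product(*value_lists):
        yield dict(zip(subset, values))


def images_per_original(subset, intensities):
    """Count the images produced from one original image for a subset."""
    count = 1
    for name in subset:
        count *= len(intensities[name])
    return count


if __name__ == "__main__":
    for subset in technique_subsets(list(INTENSITIES)):
        print(subset, images_per_original(subset, INTENSITIES))
```

With rotation alone the count is 19, matching the example in the abstract. Because the counts multiply as techniques are combined, even three techniques with modest intensity ranges already yield over a hundred images per original, which is why exhaustive augmentation quickly becomes impractical.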

Initial Plan (04/02/2019) [Zip Archive]

Final Report (10/05/2019) [Zip Archive]

Publication Form