Hearing the future: Predicting the next piece of audio

Max Green


Supervised by Dave Marshall; Moderated by Bailin Deng

The basic idea is this: given a sequence of audio, can you predict the next few seconds?

Deep learning networks (e.g. LSTMs and other recurrent neural networks) can be used. Training data is abundant: any audio of a few seconds can be utilised. Take a segment of a few seconds and use it to build a model that predicts the next segment.
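As a minimal sketch of how such training data might be prepared, the snippet below cuts a mono signal into (input segment, next segment) pairs. The function name `make_segment_pairs`, its parameters, and the chosen segment lengths are illustrative assumptions, not part of the project specification; the synthetic sine wave only stands in for real audio.

```python
import numpy as np


def make_segment_pairs(signal, sample_rate, input_seconds, target_seconds, hop_seconds):
    """Slice a 1-D signal into (input, target) pairs.

    Hypothetical helper: each input is `input_seconds` of audio and each
    target is the `target_seconds` of audio that immediately follows it.
    """
    n_in = int(round(input_seconds * sample_rate))
    n_out = int(round(target_seconds * sample_rate))
    hop = int(round(hop_seconds * sample_rate))
    if min(n_in, n_out, hop) <= 0:
        raise ValueError("segment lengths and hop must be positive")

    signal = np.asarray(signal)
    window = n_in + n_out
    starts = range(0, len(signal) - window + 1, hop)

    # Signal too short for even one pair: return empty arrays of the right width.
    if len(starts) == 0:
        return (np.empty((0, n_in), dtype=signal.dtype),
                np.empty((0, n_out), dtype=signal.dtype))

    inputs = np.stack([signal[s:s + n_in] for s in starts])
    targets = np.stack([signal[s + n_in:s + window] for s in starts])
    return inputs, targets


# Synthetic stand-in for real audio: 10 seconds of a 440 Hz tone at 8 kHz.
sample_rate = 8000
t = np.arange(10 * sample_rate) / sample_rate
audio = np.sin(2 * np.pi * 440.0 * t).astype(np.float32)

# Assumed settings: 2 s of input predicts the following 1 s, sliding by 1 s.
inputs, targets = make_segment_pairs(audio, sample_rate, 2.0, 1.0, 1.0)
print(inputs.shape, targets.shape)
```

The resulting arrays could then be fed to a sequence model such as an LSTM. The input length, target length, and hop are left as parameters because choosing them is one of the open questions of the project.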

A variety of interesting questions need to be researched:
* What is the best format for the input audio?
* What type and configuration of network is best?
* Format of training data: how long does the input segment need to be, and how long a segment can be predicted reliably?

Initial Plan (06/02/2023) [Zip Archive]

Final Report (19/05/2023) [Zip Archive]

Publication Form