Hearing the future: Predicting the next piece of audio

Yiwei Su


Supervised by Dave Marshall; Moderated by Nervo Verdezoto Dias

The basic idea is: given a sequence of audio, can you predict the next few seconds?

Deep learning networks (e.g. LSTMs and other recurrent neural networks) can be used. Training data is abundant: any audio of a few seconds can be utilised. Take a segment a few seconds long and use it to build a model that predicts the next segment.
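As an illustration of how such training data could be prepared, the sketch below slices a sequence of audio samples into (input segment, next segment) pairs. It is a minimal sketch under stated assumptions, not the project's actual method: the function name make_training_pairs, the 8 kHz sample rate, the 3-second input, the 1-second target, and the synthetic 440 Hz tone are all hypothetical choices made for the example.

```python
import math


def make_training_pairs(samples, input_len, target_len, hop):
    """Slice a 1-D sample sequence into (input, target) pairs.

    Each pair is an input window of `input_len` samples followed
    immediately by the `target_len` samples a model would be asked
    to predict. Windows start every `hop` samples.
    """
    pairs = []
    last_start = len(samples) - (input_len + target_len)
    for start in range(0, last_start + 1, hop):
        split = start + input_len
        pairs.append((samples[start:split], samples[split:split + target_len]))
    return pairs


# Hypothetical data: 10 seconds of a 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
audio = [math.sin(2 * math.pi * 440 * n / sample_rate)
         for n in range(10 * sample_rate)]

# Assumed segment lengths: 3 s of input, 1 s to predict, 1 s hop.
pairs = make_training_pairs(
    audio,
    input_len=3 * sample_rate,
    target_len=1 * sample_rate,
    hop=sample_rate,
)
print(len(pairs), len(pairs[0][0]), len(pairs[0][1]))
```

Each pair could then be fed to a sequence model, with the first element as input and the second as the prediction target. Varying input_len and target_len is one way to explore how long the input needs to be and how far ahead prediction stays reliable.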

A variety of interesting questions need to be researched:
* What is the best format for the input audio?
* What type and configuration of network is best?
* Format of training data: how long does the input segment need to be, and how long a segment can be predicted reliably?

Final Report (09/10/2023) [Zip Archive]

Publication Form