Recent advances in AI have made it possible to generate images and videos from text descriptions (see https://stability.ai/blog/stable-diffusion-public-release for an example). The aim of this project is to develop a pipeline that automatically generates a music video for a song. The video can consist of a sequence of images or video segments generated from the lyrics, potentially with the help of other user inputs.
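One possible shape for such a pipeline is sketched below. It is only an illustration under stated assumptions, not a proposed design: it assumes the lyrics are already available as (start time in seconds, line) pairs, that each non-empty line becomes one shot lasting until the next line begins, and that an optional user-supplied style string is appended to every prompt. The names `Shot`, `plan_shots`, `render`, and `generate_image` are hypothetical; `generate_image` stands in for whatever text-to-image or text-to-video model the project ends up using.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class Shot:
    start: float  # seconds from the beginning of the song
    end: float    # seconds; the shot is shown until this time
    prompt: str   # text prompt passed to the generative model


def plan_shots(
    timed_lyrics: List[Tuple[float, str]],
    song_length: float,
    style: str = "",
) -> List[Shot]:
    """Turn (start_time, lyric_line) pairs into one shot per non-empty line."""
    # Drop blank lines and order the rest by start time.
    lines = sorted((t, text.strip()) for t, text in timed_lyrics if text.strip())
    shots = []
    for i, (start, text) in enumerate(lines):
        # A shot lasts until the next lyric line starts, or until the song ends.
        end = lines[i + 1][0] if i + 1 < len(lines) else song_length
        # The optional style string is the assumed form of "other user inputs".
        prompt = f"{text}, {style}" if style else text
        shots.append(Shot(start, end, prompt))
    return shots


def render(
    shots: List[Shot],
    generate_image: Callable[[str], Any],
) -> List[Tuple[float, float, Any]]:
    """Call the (hypothetical) generator once per shot, keeping the timing."""
    return [(s.start, s.end, generate_image(s.prompt)) for s in shots]
```

For example, `plan_shots([(0.0, "City lights at night"), (8.0, "Rain on the window")], 12.0, style="oil painting")` yields two shots covering 0 to 8 seconds and 8 to 12 seconds. Keeping the planning step separate from the generator call means the model can be swapped without touching the timing logic.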