Image to Speech System

Yanqiao Huang


Supervised by Bailin Deng; Moderated by Yuhua Li

In this project, you will build a prototype system that converts the text in an image to speech. Such a system can be useful for people with visual impairments. The hardware consists of a computer attached to a camera. The system will use computer vision libraries to extract text from an image captured by the camera, and then use text-to-speech APIs to convert the text to audio. The system is expected to be deployed on a PC with a webcam, or on a Raspberry Pi with a camera module.
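
The capture, extract, and speak stages described above could be wired together roughly as follows. This is only a minimal sketch: the brief does not name specific libraries, so the use of OpenCV (`cv2`), `pytesseract`, and `pyttsx3` is an assumption, and the function names (`image_to_speech`, `capture_frame`, `ocr_text`, `speak_text`, `clean_text`) are hypothetical. The backends are imported lazily so the pipeline itself can be exercised with stand-in stages.

```python
from typing import Any, Callable


def clean_text(raw: str) -> str:
    """Collapse line breaks and repeated spaces in OCR output into single spaces."""
    return " ".join(raw.split())


def image_to_speech(
    capture: Callable[[], Any],
    extract_text: Callable[[Any], str],
    speak: Callable[[str], None],
) -> str:
    """Run one capture -> OCR -> speech cycle and return the text that was found."""
    frame = capture()
    text = clean_text(extract_text(frame))
    if text:  # stay silent when no text was recognised
        speak(text)
    return text


# --- Possible backends (assumed libraries, not specified by the project brief) ---

def capture_frame(device_index: int = 0) -> Any:
    """Grab a single frame from a webcam using OpenCV."""
    import cv2  # assumption: opencv-python is installed

    camera = cv2.VideoCapture(device_index)
    try:
        ok, frame = camera.read()
    finally:
        camera.release()
    if not ok:
        raise RuntimeError("could not read a frame from the camera")
    return frame


def ocr_text(frame: Any) -> str:
    """Extract text from a frame with Tesseract via pytesseract."""
    import cv2
    import pytesseract  # assumption: the Tesseract binary is also installed

    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # OpenCV frames are BGR
    return pytesseract.image_to_string(rgb)


def speak_text(text: str) -> None:
    """Read text aloud with an offline text-to-speech engine."""
    import pyttsx3  # assumption: pyttsx3 is installed

    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()
```

With those libraries installed, calling `image_to_speech(capture_frame, ocr_text, speak_text)` would run one cycle. Passing the stages in as arguments keeps the pipeline independent of the hardware, so a Raspberry Pi camera module or a different text-to-speech API could be substituted without changing `image_to_speech`.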

Initial Plan (05/02/2023) [Zip Archive]

Final Report (07/05/2023) [Zip Archive]

Publication Form