Towards Human-Level Game AI with Deep Reinforcement Learning


Connor Jones

07/05/2025

Supervised by Frank C Langbein; Moderated by Matthew J W Morgan

We aim to develop reinforcement learning algorithms capable of training game AI bots that exhibit human-like performance, leveraging the same (or at least similar) information available to human players. Given the difficulty of this task, performance generally below human level is expected. For this project a specific game should be chosen, based on its suitability for reinforcement learning and the availability of the necessary resources. A suitable reinforcement learning algorithm has to be selected, implemented, trained and tested (e.g. Q-learning, policy gradients, or actor-critic methods), considering the algorithm's strengths and weaknesses in relation to the specific challenges of the chosen game. Given the time available, it is likely that an existing game implementation should be used rather than also coding the game itself (e.g. Doom/ViZDoom, OpenAI/DeepMind examples, TrackMania, etc., or perhaps a more universal agent playing retro games).
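
To make concrete the kind of algorithm meant above, here is a minimal tabular Q-learning sketch using the Gymnasium library; the FrozenLake-v1 environment and all hyperparameters are illustrative assumptions only, since a pixel-based game such as Doom would instead need a deep function approximator (e.g. a DQN) in place of the table.

    # Minimal tabular Q-learning sketch. Environment choice (FrozenLake-v1)
    # and hyperparameters are illustrative assumptions, not project decisions.
    import numpy as np
    import gymnasium as gym

    env = gym.make("FrozenLake-v1", is_slippery=True)
    Q = np.zeros((env.observation_space.n, env.action_space.n))
    alpha, gamma, epsilon = 0.1, 0.99, 0.1  # step size, discount, exploration rate

    for episode in range(5000):
        state, _ = env.reset()
        done = False
        while not done:
            # epsilon-greedy action selection
            if np.random.rand() < epsilon:
                action = env.action_space.sample()
            else:
                action = int(np.argmax(Q[state]))
            next_state, reward, terminated, truncated, _ = env.step(action)
            done = terminated or truncated
            # Q-learning update: bootstrap from the greedy value of the next state
            Q[state, action] += alpha * (reward + gamma * np.max(Q[next_state]) - Q[state, action])
            state = next_state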

Potentially, multi-agent reinforcement learning approaches coordinating multiple collaborating AI agents may be explored; or how the knowledge learned in one game could be transferred to another game (e.g. starting with existing solutions); or instead how AI bots can interact with human players in a meaningful and engaging way (NPCs in RPGs and LLMs?).
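
Should the multi-agent direction be chosen, a library such as PettingZoo offers a standard multi-agent interface; the following random-policy interaction loop is a sketch only, and the Pistonball environment is merely a placeholder for whatever game is actually selected.

    # Sketch of PettingZoo's parallel multi-agent API with random actions;
    # pistonball_v6 is a placeholder environment, not a project choice.
    from pettingzoo.butterfly import pistonball_v6

    env = pistonball_v6.parallel_env()
    observations, infos = env.reset()
    while env.agents:  # loop until all agents terminate or truncate
        # each agent acts independently here; a learned, coordinated
        # policy would replace the random sampling
        actions = {agent: env.action_space(agent).sample() for agent in env.agents}
        observations, rewards, terminations, truncations, infos = env.step(actions)
    env.close()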

This is a difficult project, requiring advanced maths and programming skills and a strong understanding of reinforcement learning with deep learning (see Sutton & Barto, Reinforcement Learning: An Introduction, for the theory; there are many game-related examples, papers and videos on the internet). Also ensure you have access to sufficient computational resources, including GPUs, for training and evaluation. You may consider cloud-based solutions or university computing facilities.

Ideally, the code would be made available under the AGPL v3 or a compatible license so that it can be integrated with our other code.


Initial Plan (03/02/2025) [Zip Archive]

Final Report (07/05/2025) [Zip Archive]

Publication Form