Training a Dota 2 Bot issuing High-level Strategic Commands using Deep Reinforcement Learning


Trapsilo Bumi

05/09/2024

Supervised by Federico Liberatore; Moderated by Kirill Sidorov

Dota 2 is a popular multiplayer online battle arena (MOBA) game that has served as a testbed for a range of artificial intelligence research. One strand of that research builds intelligent agents with deep reinforcement learning, an approach popularized by OpenAI Five [1], which defeated the reigning world champions. However, building a bot of that skill requires large amounts of time and computing power, neither of which is within reach of the average researcher.

This study proposes a novel approach to training a Dota 2 bot: the agent issues high-level strategic commands instead of performing low-level micromanagement. The approach produced a working bot with a fraction of the training time and compute that traditional micromanagement approaches require. Two algorithms for training the neural network with deep reinforcement learning are used and compared: Proximal Policy Optimization (PPO) and Deep Q-Network (DQN).
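As a rough illustration of what a high-level command interface could look like, the sketch below defines a small discrete action set and the two usual ways of choosing from it: epsilon-greedy selection over Q-values (as in DQN) and sampling from a softmax policy (as in PPO). The command names, the `dqn_select` and `ppo_select` helpers, and the example scores are all hypothetical; the report's actual command set and network outputs are not given in this abstract.

```python
import math
import random
from enum import Enum


class Command(Enum):
    # Hypothetical high-level commands, chosen only for illustration.
    PUSH_TOP = 0
    PUSH_MID = 1
    PUSH_BOT = 2
    DEFEND_BASE = 3
    FARM = 4
    RETREAT = 5


def dqn_select(q_values, epsilon, rng):
    """DQN-style epsilon-greedy choice over per-command Q-value estimates."""
    if rng.random() < epsilon:
        # Explore: pick a uniformly random command.
        return Command(rng.randrange(len(q_values)))
    # Exploit: pick the command with the highest estimated value.
    best = max(range(len(q_values)), key=lambda i: q_values[i])
    return Command(best)


def ppo_select(logits, rng):
    """PPO-style choice: sample a command from the softmax of the policy logits."""
    m = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(x - m) for x in logits]
    index = rng.choices(range(len(logits)), weights=weights, k=1)[0]
    return Command(index)


rng = random.Random(0)
scores = [0.1, 0.7, 0.2, -0.3, 0.4, -1.0]  # made-up network outputs
greedy = dqn_select(scores, epsilon=0.0, rng=rng)
sampled = ppo_select(scores, rng=rng)
print(greedy.name, sampled.name)
```

With only a handful of commands to choose from, each decision is a single pick from a small discrete set, in contrast to the much larger per-frame action space that micromanagement requires.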

The results show that DQN has a slight advantage in efficiency, as measured by win rate and match duration, but PPO shows promise if trained for more than 70 matches.

Keywords: Dota 2, deep reinforcement learning, PPO, DQN

[1] OpenAI, C. Berner, G. Brockman, B. Chan, V. Cheung, P. Dębiak, C. Dennison, D. Farhi, Q. Fischer, S. Hashme, C. Hesse, R. Józefowicz, S. Gray, C. Olsson, J. Pachocki, M. Petrov, H. P. d. O. Pinto, J. Raiman, T. Salimans, J. Schlatter, J. Schneider, S. Sidor, I. Sutskever, J. Tang, F. Wolski and S. Zhang, “Dota 2 with Large Scale Deep Reinforcement Learning,” December 2019.
