Learning Effective Security Strategies through Reinforcement Learning

Yizhou Shen


Supervised by Tingting Li; Moderated by Ian M Cooper

Cyber defence is a markedly asymmetric game between defenders and attackers: defenders must stay vigilant at all times to detect and react to every single attack, whilst attackers need to succeed only once, at a time of their choosing. Rapid developments in Artificial Intelligence (AI) offer the potential for distributed, adaptive defensive measures at machine speed and scale. It is now possible to train the defender as an intelligent agent that develops strategies for responding to an attacker across an entire network automatically and effectively.

The project will use the gym-idsgame experimental platform (https://github.com/Limmen/gym-idsgame), which provides a Markov game environment behind the OpenAI Gym interface, to investigate techniques for training defensive agents, e.g. Deep Reinforcement Learning and Game Theory. The simulation is a Markov game between a simulated attacker and a simulated defender. The platform also provides an interface to a partially observed Markov decision process (POMDP) model of the Markov game, together with baseline results for various reinforcement learning algorithms.
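To make the idea of training a defender in an attacker-defender Markov game concrete, the following is a minimal, self-contained sketch. It does not use gym-idsgame and does not reproduce its rules or API: the class ToyAttackDefenseGame, its breach rule, the uniformly random attacker and all parameter values are assumptions made purely for illustration, and tabular Q-learning stands in for the deep reinforcement learning methods the project would actually investigate.

```python
import random
from collections import defaultdict


class ToyAttackDefenseGame:
    """Hypothetical toy Markov game; NOT the gym-idsgame rules or API."""

    def __init__(self, n_types=2, horizon=4, initial_defense=1):
        self.n_types = n_types
        self.horizon = horizon
        self.initial_defense = initial_defense
        self.reset()

    def reset(self):
        self.attack = [0] * self.n_types
        self.defense = [self.initial_defense] * self.n_types
        self.t = 0
        return self.state()

    def state(self):
        return (tuple(self.attack), tuple(self.defense))

    def step(self, attack_type, defense_type):
        """Both players move at once; returns (state, defender_reward, done)."""
        self.defense[defense_type] += 1
        self.attack[attack_type] += 1
        self.t += 1
        if self.attack[attack_type] > self.defense[attack_type]:
            return self.state(), -1.0, True   # breach: attacker wins
        if self.t >= self.horizon:
            return self.state(), 1.0, True    # defender survived the episode
        return self.state(), 0.0, False


def greedy(q, state, n_actions):
    """Pick the action with the highest Q-value (ties go to the lowest index)."""
    return max(range(n_actions), key=lambda a: q[(state, a)])


def train_defender(env, rng, episodes=5000, alpha=0.1, gamma=0.95, epsilon=0.2):
    """Tabular Q-learning for the defender against a uniformly random attacker."""
    q = defaultdict(float)
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            if rng.random() < epsilon:
                action = rng.randrange(env.n_types)      # explore
            else:
                action = greedy(q, state, env.n_types)   # exploit
            attack = rng.randrange(env.n_types)          # random attacker move
            next_state, reward, done = env.step(attack, action)
            best_next = 0.0 if done else max(
                q[(next_state, a)] for a in range(env.n_types))
            q[(state, action)] += alpha * (
                reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


def win_rate(env, rng, policy, episodes=1000):
    """Fraction of episodes the defender survives under the given policy."""
    wins = 0
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            attack = rng.randrange(env.n_types)
            state, reward, done = env.step(attack, policy(state))
        wins += reward > 0
    return wins / episodes


rng = random.Random(0)
env = ToyAttackDefenseGame()
q_table = train_defender(env, rng)
trained = win_rate(env, rng, lambda s: greedy(q_table, s, env.n_types))
baseline = win_rate(env, rng, lambda s: rng.randrange(env.n_types))
print(f"trained defender win rate: {trained:.2f}, random defender: {baseline:.2f}")
```

Comparing the learned defender against a random defender mirrors, in miniature, the comparison with baseline results that the project calls for. In the real platform the state and action spaces are far larger, which is why function approximation would be needed in place of a Q-table.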

The student is expected to develop an in-depth understanding of cyber security simulation and of the relevant techniques for training and evaluating defensive agents. Some novel techniques are expected to be applied to the simulation and compared against the baseline results.

References:
[1] https://github.com/Limmen/gym-idsgame
[2] Hammar, Kim, and Rolf Stadler. "Finding effective security strategies through reinforcement learning and Self-Play." 2020 16th International Conference on Network and Service Management (CNSM). IEEE, 2020.
[3] Hammar, Kim, and Rolf Stadler. "Learning Security Strategies through Game Play and Optimal Stopping." arXiv preprint arXiv:2205.14694 (2022).

Initial Plan (02/02/2023) [Zip Archive]

Final Report (12/05/2023) [Zip Archive]

Publication Form