Using Machine Learning to determine the likelihood of a shot resulting in a goal in football.

Joel Harris


Supervised by Hiroyuki Kido; Moderated by Yazmin Ibanez Garcia

The term expected goals (xG) is a statistical measurement in football to help determine the likelihood of a shot resulting in a goal. An xG model uses historical data of shots to estimate the likelihood of a goal on a scale between 0 and 1. The xG is a crucial measurement in the sport as it is the most accurate predictor of future team and player performance that is available. Expected goals has previously been calculated using equation modeling. However, more accurate results can now be determined using machine learning. The first issue to be addressed in the project is the sorting of data. A script must be written to extract and sort relevant data so that the model can be built. The next issue is creating an accurate model that considers many parameters. Traditional xG models may only take into consideration the shot distance and angle. However, different types of shots don’t carry the same xG. For example, a header is less likely to go in than a regular shot. I will attempt to achieve this with machine learning.

Initial Plan (06/02/2023) [Zip Archive]

Final Report (19/05/2023) [Zip Archive]

Publication Form