
Comparative Analysis of Deep Learning Vs Traditional Machine Learning Processes in Stock Price Prediction


Daniel Goldney

10/05/2024

Supervised by Xianfang Sun; Moderated by Alexia Zoumpoulaki

This project conducts a comparative analysis of deep learning and traditional machine learning models for short-term stock price prediction. In particular, a long short-term memory (LSTM) network is evaluated against support vector regression (SVR) and random forest (RF) models. Each model is evaluated on a dataset of NASDAQ-listed stocks combined with macroeconomic indicators, technical metrics, public interest data, and recent movements of correlated stocks, in order to obtain a comprehensive and representative dataset. The data is cleaned and preprocessed, and a novel feature selection approach is applied to the LSTM model: feature permutation is combined with k-fold cross-validation to provide an indicator of relative feature importance.
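The idea of combining feature permutation with k-fold cross-validation can be sketched as follows. This is a minimal illustration, not the project's implementation: the function names (`kfold_indices`, `permutation_importance_cv`), the use of mean absolute error as the score, the contiguous fold layout, the synthetic data, and the least-squares stand-in model (used here in place of an LSTM to keep the example small) are all assumptions.

```python
import numpy as np


def kfold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    folds = np.array_split(np.arange(n_samples), k)
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx


def permutation_importance_cv(fit, predict, X, y, k=5, seed=0):
    """Average, over k folds, the rise in validation MAE when one feature is shuffled."""
    rng = np.random.default_rng(seed)
    increases = np.zeros((k, X.shape[1]))
    for fold, (train_idx, val_idx) in enumerate(kfold_indices(len(X), k)):
        model = fit(X[train_idx], y[train_idx])
        X_val, y_val = X[val_idx], y[val_idx]
        baseline = np.mean(np.abs(predict(model, X_val) - y_val))
        for col in range(X.shape[1]):
            X_perm = X_val.copy()
            # Shuffling one column breaks its link to the target
            # while leaving every other feature untouched.
            X_perm[:, col] = rng.permutation(X_perm[:, col])
            permuted = np.mean(np.abs(predict(model, X_perm) - y_val))
            increases[fold, col] = permuted - baseline
    return increases.mean(axis=0)


# Stand-in model (assumption): ordinary least squares with an intercept.
def fit_linear(X, y):
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    return np.column_stack([X, np.ones(len(X))]) @ coef


# Synthetic data (assumption): feature 0 matters most, feature 2 not at all.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(200, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + data_rng.normal(scale=0.1, size=200)

importances = permutation_importance_cv(fit_linear, predict_linear, X, y)
```

A larger value in `importances` means the validation error rose more when that feature was shuffled, so the model relied on it more heavily. Averaging across folds makes the ranking less sensitive to any single train/validation split.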

The models are compared on three points: first, their predictive capability from one day to the next; second, their performance over increasing prediction intervals, ranging from 1 day ahead to 30 days ahead; and third, their ability to generate a profit in a day trading scenario. Performance is evaluated using mean absolute error, the directional accuracy of the predictions, and the cumulative profit measured in US dollars (USD).
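The first two evaluation measures can be illustrated with a short sketch. The definition of directional accuracy used here (a prediction counts as correct when it lies on the same side of the previous price as the actual price) is an assumption, as are the function names and the example prices; the project may define the measure differently.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted prices."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def directional_accuracy(previous, actual, predicted):
    """Fraction of predictions that move the same way as the actual price.

    Direction is measured relative to the previous price (assumed definition).
    """
    hits = 0
    for prev, act, pred in zip(previous, actual, predicted):
        if (act > prev) == (pred > prev):
            hits += 1
    return hits / len(actual)


# Hypothetical prices for four consecutive days.
previous = [100.0, 102.0, 101.0, 105.0]
actual = [102.0, 101.0, 105.0, 104.0]
predicted = [101.0, 103.0, 104.0, 103.0]

mae = mean_absolute_error(actual, predicted)                 # 1.25
da = directional_accuracy(previous, actual, predicted)       # 0.75
```

In this example the second prediction is only 2.0 away from the actual price but points the wrong way, which shows why a low error and a high directional accuracy are separate questions.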

Random forest is found to predict price movements with the highest accuracy and lowest error, achieving a directional accuracy of 75.68%, closely followed by LSTM at 72.38%. SVR performs worst of the three models, at 66.34%. Moreover, RF is the only model that generates a profit using the bespoke trading algorithm devised for this project, yielding a net profit of $148 over 180 days from a starting capital of $1000. LSTM and SVR both make a net loss, of $12 and $466 respectively. All models show relatively stable accuracy across the intervals from 1 to 30 days, which is attributed to the generally rising nature of the stock market: over longer intervals, stocks tend to go up more often than they go down. However, prediction error increases consistently as the interval becomes longer, with no model significantly outperforming the others.


Initial Plan (04/02/2024) [Zip Archive]

Final Report (10/05/2024) [Zip Archive]
