Public transportation systems play an important role in the quality of life of citizens
in any metropolitan city. However, public transportation authorities face
criticism from commuters due to irregularities in bus arrival times. For example,
transit bus users often complain when they miss the bus because it arrived too
early or too late at the bus stop. Due to these irregularities, commuters may miss
important appointments, wait too long at the bus stop, or arrive late for work.
This thesis seeks to predict the occurrence of irregularities in bus arrival times by
developing machine learning models that use GPS locations of transit buses provided
by the Toronto Transit Commission (TTC) and hourly weather data. We
found that nearly 37% of the time, buses arrive either early or late by more than
5 minutes, suggesting room for improvement in the current strategies employed by
transit authorities. We compared the performance of three machine learning models:
Long Short-Term Memory (LSTM) [13], Artificial Neural Network (ANN), and Support
Vector Regression (SVR). The LSTM model outperformed the other two in terms of
accuracy, with the lowest error rate of the three. The
improved accuracy achieved by LSTM is due to its ability to adjust and update the
weights of neurons while maintaining long-term dependencies when encountering
new streams of data.
Author Keywords: ANN, LSTM, Machine Learning