Artificial intelligence
SPAF-network with Saturating Pretraining Neurons
In this work, various aspects of neural networks pre-trained with denoising autoencoders (DAEs) are explored. To saturate neurons more quickly during feature learning in the DAE, an activation function that offers higher gradients is introduced. In addition, the effect of sparsity functions applied to the hidden-layer representations is studied. Most importantly, a technique that swaps the activation functions of a fully trained DAE to logistic functions is investigated; networks trained with this technique are referred to as SPAF-networks (a minimal code sketch of this swap is given after the keywords). For evaluation, the popular MNIST dataset as well as all 3 sub-datasets of the Chars74k dataset are used for classification. The SPAF-network is also analyzed for the features it learns with a logistic, a ReLU, and a custom activation function. Lastly, a roadmap of future enhancements to the SPAF-network is proposed.
Author Keywords: Artificial Neural Network, AutoEncoder, Machine Learning, Neural Networks, SPAF-network, Unsupervised Learning
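The activation-swap technique lends itself to a short illustration. Below is a minimal sketch, assuming PyTorch and a stand-in high-gradient activation; the abstract does not give the exact activation function, layer sizes, or training schedule used in the thesis, so those details here are assumptions:

```python
import torch
import torch.nn as nn

class SteepTanh(nn.Module):
    """Stand-in for the higher-gradient pretraining activation (assumed form)."""
    def __init__(self, slope=4.0):
        super().__init__()
        self.slope = slope

    def forward(self, x):
        # A steeper slope gives larger gradients near zero, pushing
        # units toward saturation faster during pretraining.
        return torch.tanh(self.slope * x)

# One-hidden-layer denoising autoencoder (784 inputs matches MNIST pixels).
dae = nn.Sequential(nn.Linear(784, 256), SteepTanh(), nn.Linear(256, 784))

# Denoising pretraining step: reconstruct clean inputs from corrupted ones.
x = torch.rand(64, 784)
x_noisy = x + 0.3 * torch.randn_like(x)
loss = nn.functional.mse_loss(dae(x_noisy), x)
loss.backward()

# After pretraining, swap every pretraining activation for a logistic unit,
# giving the network that would then be fine-tuned for classification.
for i, module in enumerate(dae):
    if isinstance(module, SteepTanh):
        dae[i] = nn.Sigmoid()
```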
Time Series Algorithms in Machine Learning - A Graph Approach to Multivariate Forecasting
Forecasting future values of time series has long been a field with many and varied applications, from climate and weather forecasting to stock prediction, economic planning, and the control of industrial processes. Many of these problems involve not a single time series but many simultaneous series that may influence each other. This thesis provides machine-learning-based methods for handling such problems.
We first consider single time series with both single and multiple features. We review the algorithms and the unique challenges involved in applying machine learning to time series. Many machine learning algorithms, when used for regression, produce a single output value for each timestamp of interest with no measure of confidence; however, evaluating the uncertainty of the predictions is an important component of practical forecasting. We therefore discuss methods of constructing uncertainty estimates in the form of prediction intervals for each prediction. Stability over long time horizons is also a concern, since recursion, in which each prediction is fed back as an input, is a common way to generate predictions over long time intervals. To address this, we present methods of maintaining stability in the forecast even over large time horizons. These methods are applied to an electricity forecasting problem, where we demonstrate their effectiveness for support vector machines, neural networks, and gradient boosted trees.
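As a concrete illustration of both ideas, here is a minimal sketch, assuming NumPy and scikit-learn, of recursive multi-step forecasting with a gradient boosted tree, plus a simple residual-quantile prediction interval. The interval construction below is a generic placeholder, not necessarily the method developed in the thesis:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Toy series; in practice this would be, e.g., electricity load.
rng = np.random.default_rng(0)
y = np.sin(np.arange(500) * 0.1) + 0.1 * rng.standard_normal(500)

# Lagged-window regression setup: predict y[t] from the previous LAGS values.
LAGS = 10
X = np.stack([y[i:i + LAGS] for i in range(len(y) - LAGS)])
t = y[LAGS:]
model = GradientBoostingRegressor().fit(X, t)

# Recursive multi-step forecast: each prediction is fed back as an input,
# which is why errors can compound and stability matters over long horizons.
window = list(y[-LAGS:])
forecast = []
for _ in range(50):
    pred = model.predict(np.array(window[-LAGS:])[None, :])[0]
    forecast.append(pred)
    window.append(pred)

# Assumed interval: empirical quantiles of one-step training residuals,
# naively widened with the horizon.
resid = t - model.predict(X)
lo, hi = np.quantile(resid, [0.05, 0.95])
h = np.arange(1, len(forecast) + 1)
lower = np.array(forecast) + lo * np.sqrt(h)
upper = np.array(forecast) + hi * np.sqrt(h)
```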
We next consider spatiotemporal problems, which consist of multiple interlinked time series, each of which may contain multiple features. We represent these problems using graphs, allowing us to learn relationships using graph neural networks. Existing methods generally use separate temporal and spatial (graph) layers, or simply replace operations in temporal layers with graph operations. We show that these approaches have difficulty learning relationships that contain time lags of several time steps. To address this, we propose a new layer, inspired by the long short-term memory (LSTM) recurrent neural network, which adds a distinct memory state dedicated to learning graph relationships while keeping the original memory state (an illustrative sketch of such a cell is given after the keywords). This allows the model to consider temporally distant events at other nodes without affecting its ability to model long-term relationships at a single node. We show that this model is capable of learning the long-term patterns that existing models struggle with. We then apply this model to a number of real-world bike-share and traffic datasets, where we observe improved performance compared to other models with similar numbers of parameters.
Author Keywords: forecasting, graph neural network, LSTM, machine learning, neural network, time series
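To make the dual-memory idea concrete, here is an illustrative PyTorch sketch of an LSTM-style cell with a second memory state updated from graph-aggregated neighbor information. The thesis's exact gating and aggregation are not specified in the abstract, so the formulation below is an assumption:

```python
import torch
import torch.nn as nn

class GraphMemoryLSTMCell(nn.Module):
    """LSTM cell with an extra memory state for graph relationships (sketch)."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.temporal = nn.LSTMCell(in_dim, hid_dim)            # original memory
        self.graph_gates = nn.Linear(2 * hid_dim, 3 * hid_dim)  # graph-memory gates

    def forward(self, x, h, c, c_graph, adj):
        # x: (nodes, in_dim); h, c, c_graph: (nodes, hid_dim); adj: (nodes, nodes)
        h, c = self.temporal(x, (h, c))   # per-node temporal update, unchanged
        neigh = adj @ h                   # aggregate hidden states of neighbors
        f, i, g = self.graph_gates(torch.cat([h, neigh], dim=-1)).chunk(3, dim=-1)
        # Distinct graph memory: retains temporally distant neighbor events
        # without overwriting the node's own long-term memory c.
        c_graph = torch.sigmoid(f) * c_graph + torch.sigmoid(i) * torch.tanh(g)
        return h + torch.tanh(c_graph), c, c_graph

# One step on a toy 5-node graph with a placeholder adjacency matrix.
cell = GraphMemoryLSTMCell(in_dim=3, hid_dim=8)
adj = torch.eye(5)
h = c = c_graph = torch.zeros(5, 8)
out, c, c_graph = cell(torch.randn(5, 3), h, c, c_graph, adj)
```

Keeping `c` and `c_graph` separate is the key design point: neighbor information gated into `c_graph` cannot erase what the node has stored about its own history in `c`.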