Mathematics

SPAF-network with Saturating Pretraining Neurons

Type:
Names:
Creator (cre): Burhani, Hasham, Thesis advisor (ths): Feng, Wenying, Degree committee member (dgc): Hurley, Richard, Degree committee member (dgc): Abdella, Kenzu, Degree granting institution (dgg): Trent University
Abstract:

In this work, various aspects of neural networks pre-trained with denoising autoencoders (DAE) are explored. To saturate neurons more quickly for feature learning in DAE, an activation function that offers higher gradients is introduced. Moreover, the introduction of sparsity functions applied to the hidden layer representations is studied. More importantly, a technique that swaps the activation functions of fully trained DAE to logistic functions is studied; networks trained using this technique are referred to as SPAF-networks. For evaluation, the popular MNIST dataset as well as all three sub-datasets of the Chars74k dataset are used for classification purposes. The SPAF-network is also analyzed for the features it learns with a logistic, a ReLU and a custom activation function. Lastly, a future roadmap is proposed for enhancements to the SPAF-network.

Author Keywords: Artificial Neural Network, AutoEncoder, Machine Learning, Neural Networks, SPAF network, Unsupervised Learning

2016

Time Series Algorithms in Machine Learning - A Graph Approach to Multivariate Forecasting

Type:
Names:
Creator (cre): Zhou, Ryan, Thesis advisor (ths): Feng, Wenying, Degree committee member (dgc): Alam, Omar, Degree granting institution (dgg): Trent University
Abstract:

Forecasting future values of time series has long been a field with many and varied applications, from climate and weather forecasting to stock prediction and economic planning to the control of industrial processes. Many of these problems involve not only a single time series but many simultaneous series which may influence each other. This thesis provides machine learning methods for handling such problems.

We first consider single time series with both single and multiple features. We review the algorithms and unique challenges involved in applying machine learning to time series. Many machine learning algorithms when used for regression are designed to produce a single output value for each timestamp of interest with no measure of confidence; however, evaluating the uncertainty of the predictions is an important component for practical forecasting. We therefore discuss methods of constructing uncertainty estimates in the form of prediction intervals for each prediction. Stability over long time horizons is also a concern for these algorithms as recursion is a common method used to generate predictions over long time intervals. To address this, we present methods of maintaining stability in the forecast even over large time horizons. These methods are applied to an electricity forecasting problem where we demonstrate the effectiveness for support vector machines, neural networks and gradient boosted trees.
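As a rough illustration of the two ideas in the paragraph above (recursive multi-step prediction and prediction intervals), here is a minimal Python sketch. It is not the thesis's method: the function names `recursive_forecast` and `empirical_interval`, the lag-window representation, and the use of held-out residual quantiles are all assumptions made for illustration.

```python
def recursive_forecast(one_step_model, history, horizon, n_lags):
    """Roll a one-step-ahead model forward by feeding its own predictions back in.

    one_step_model: hypothetical callable mapping a list of the last n_lags
                    values to a prediction of the next value.
    """
    window = list(history[-n_lags:])
    forecasts = []
    for _ in range(horizon):
        y_next = one_step_model(window)
        forecasts.append(y_next)
        # Slide the window: drop the oldest value, append the new prediction.
        window = window[1:] + [y_next]
    return forecasts


def empirical_interval(point, residuals, coverage=0.9):
    """Build an interval around a point forecast from held-out residuals.

    residuals: (actual - predicted) values from a validation set (assumed given).
    Uses simple nearest-rank empirical quantiles of those residuals.
    """
    ordered = sorted(residuals)
    alpha = (1.0 - coverage) / 2.0
    lo_idx = round(alpha * (len(ordered) - 1))
    hi_idx = round((1.0 - alpha) * (len(ordered) - 1))
    return point + ordered[lo_idx], point + ordered[hi_idx]


if __name__ == "__main__":
    # Toy one-step model (an assumption): next value is half the latest value.
    halve = lambda window: 0.5 * window[-1]
    print(recursive_forecast(halve, [8.0], horizon=3, n_lags=1))
    print(empirical_interval(10, list(range(-5, 6)), coverage=0.8))
```

Because each prediction is fed back as an input, errors can compound over the horizon, which is the stability concern the abstract raises.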

We next consider spatiotemporal problems, which consist of multiple interlinked time series, each of which may contain multiple features. We represent these problems using graphs, allowing us to learn relationships using graph neural networks. Existing methods of doing this generally make use of separate time and spatial (graph) layers, or simply replace operations in temporal layers with graph operations. We show that these approaches have difficulty learning relationships that contain time lags of several time steps. To address this, we propose a new layer inspired by the long short-term memory (LSTM) recurrent neural network which adds a distinct memory state dedicated to learning graph relationships while keeping the original memory state. This allows the model to consider temporally distant events at other nodes without affecting its ability to model long-term relationships at a single node. We show that this model is capable of learning the long-term patterns that existing models struggle with. We then apply this model to a number of real-world bike-share and traffic datasets where we observe improved performance when compared to other models with similar numbers of parameters.

Author Keywords: forecasting, graph neural network, LSTM, machine learning, neural network, time series

2020

Solving Differential and Integro-Differential Boundary Value Problems using a Numerical Sinc-Collocation Method Based on Derivative Interpolation

Type:
Names:
Creator (cre): Ross, Glen Charles, Thesis advisor (ths): Abdella, Kenzu, Degree committee member (dgc): Pollanen, Marco, Degree granting institution (dgg): Trent University
Abstract:

In this thesis, a new sinc-collocation method based upon derivative interpolation is developed for solving linear and nonlinear boundary value problems involving differential as well as integro-differential equations. The sinc-collocation method is chosen for its ease of implementation, exponential convergence of error, and ability to handle singularities in the BVP. We present a unique method of treating boundary conditions and introduce the concept of the stretch factor into the conformal mappings of domains. The result is a method that achieves great accuracy while reducing computational cost. In most cases, the results from the method greatly exceed the published results of comparable methods in both accuracy and efficiency. The method is tested on the Blasius problem, the Lane-Emden problem and generalised to cover Fredholm-Volterra integro-differential problems. The results show that the sinc-collocation method with derivative interpolation is a viable and preferable method for solving nonlinear BVPs.
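For orientation, the standard building blocks of sinc-collocation can be written as follows. This is the textbook form only; the thesis's derivative-interpolation variant, its treatment of boundary conditions and its stretch factor are not reproduced here. The symbols \(h\) (step size), \(N\) (truncation index), \(c_k\) (unknown coefficients) and the particular map \(\phi\) for a finite interval \((a,b)\) are assumptions chosen for illustration.

\[
\operatorname{sinc}(t) =
\begin{cases}
\dfrac{\sin(\pi t)}{\pi t}, & t \neq 0,\\[4pt]
1, & t = 0,
\end{cases}
\qquad
S(k,h)(x) = \operatorname{sinc}\!\left(\frac{x - kh}{h}\right),
\]

\[
\phi(x) = \ln\!\left(\frac{x - a}{b - x}\right),
\qquad
u(x) \approx \sum_{k=-N}^{N} c_k \, S(k,h)\bigl(\phi(x)\bigr).
\]

Requiring the approximation to satisfy the equation at the points \(x_j = \phi^{-1}(jh)\), \(j = -N, \dots, N\), gives a system for the coefficients \(c_k\).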

Author Keywords: Blasius, Boundary Value Problem, Exponential convergence, Integro-differential, Nonlinear, Sinc

2020

Problem Solving as a Path to Understanding Mathematics Representations: An Eye-Tracking Study

Type:
Names:
Creator (cre): Kim, Seyeon, Thesis advisor (ths): Burr, Wesley, Thesis advisor (ths): Pollanen, Marco, Degree committee member (dgc): Chan-Reynolds, Michael, Degree granting institution (dgg): Trent University
Abstract:

Little is actually known about how people cognitively process and integrate information when solving complex mathematical problems. In this thesis, eye-tracking was used to examine how people read and integrate information from mathematical symbols and complex formulae, with eye fixations being used as a measure of their current focus of attention. Each participant in the studies was presented with a series of stimuli in the form of mathematical problems and their eyes were tracked as they worked through the problem mentally. From these examinations, we were able to demonstrate differences in both comprehension and problem-solving, with the results suggesting that what information is selected, and how, is responsible for a large portion of success in solving such problems. We were also able to examine how different mathematical representations of the same mathematical object are attended to by students.

Author Keywords: eye-tracking, mathematical notation, mathematical representations, problem identification, problem-solving, symbolism

2020

Combinatorial Collisions in Database Matching: With Examples from DNA

Type:
Names:
Creator (cre): Johnson, Stephanie, Thesis advisor (ths): Pollanen, Marco, Thesis advisor (ths): Burr, Wesley, Degree granting institution (dgg): Trent University
Abstract:

Databases containing information such as location points, web searches and financial transactions are becoming the new normal as technology advances. Consequently, searches and cross-referencing in big data are becoming a common problem as computing and statistical analysis increasingly allow for the contents of such databases to be analyzed and dredged for data. Searches through big data are frequently done without a hypothesis formulated beforehand, and as these databases grow and become more complex, the room for error also increases. Regardless of how these searches are framed, the data they collect may lead to false convictions. DNA databases may be of particular interest, since DNA is often viewed as significant evidence; however, such evidence is sometimes not interpreted in a proper manner in the courtroom. In this thesis, we present and validate a framework for investigating various collisions within databases using Monte Carlo simulations, with examples from DNA. We also discuss how DNA evidence may be wrongly portrayed in the courtroom, and the explanation behind this. We then outline the problems which may occur when numerous types of databases are searched for suspects, and a framework to address these problems.
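To make the idea of estimating collisions by Monte Carlo simulation concrete, here is a minimal Python sketch. It is not the thesis's framework: it assumes, purely for illustration, that every record's profile is drawn uniformly and independently from `n_profiles` possibilities (real DNA profile frequencies are not uniform), and the function names are hypothetical. Under that assumption the problem reduces to the classical birthday problem, so the simulation can be checked against the exact formula.

```python
import random


def collision_probability(db_size, n_profiles, trials=5000, seed=0):
    """Monte Carlo estimate of P(at least two of db_size records share a profile).

    Assumes uniform, independent profiles (an illustrative simplification).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        seen = set()
        for _ in range(db_size):
            profile = rng.randrange(n_profiles)
            if profile in seen:
                hits += 1  # this simulated database contains a collision
                break
            seen.add(profile)
    return hits / trials


def exact_collision_probability(db_size, n_profiles):
    """Exact birthday-problem probability under the same uniform assumption."""
    p_unique = 1.0
    for i in range(db_size):
        p_unique *= (n_profiles - i) / n_profiles
    return 1.0 - p_unique


if __name__ == "__main__":
    print(collision_probability(23, 365))
    print(exact_collision_probability(23, 365))
```

Even in this simplified setting, the chance of some pair matching grows much faster with database size than the chance of any one specified pair matching, which is the kind of effect a database search can obscure.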

Author Keywords: big data analysis, collisions, database searches, DNA databases, monte carlo simulation

2020

Educational Data Mining and Modelling on Trent University Students' Academic Performance

Type:
Names:
Creator (cre): Kheiri, Amir, Thesis advisor (ths): Cater, Bruce, Degree committee member (dgc): Pollanen, Marco, Degree granting institution (dgg): Trent University
Abstract:

Higher education is important. It enhances both individual and social welfare by improving productivity, life satisfaction, and health outcomes, and by reducing rates of crime. Universities play a critical role in providing that education. Because academic institutions face resource constraints, it is thus important that they deploy resources in support of student success in the most efficient ways possible. To inform that efficient deployment, this research analyzes institutional data reflecting undergraduate student performance to identify predictors of student success measured by GPA, rates of credit accumulation, and graduation rates. Using methods of cluster analysis and machine learning, the analysis yields predictions for the probabilities of individual success.

Author Keywords: Educational data mining, Students' academic performance modelling

2021

Positive Solutions for Boundary Value Problems of Second Order Ordinary Differential Equations

Type:
Names:
Creator (cre): Zhang, Yanlei, Thesis advisor (ths): Feng, Wenying, Thesis advisor (ths): Abdella, Kenzu, Degree granting institution (dgg): Trent University
Abstract:

In this thesis, we study modelling with non-linear ordinary differential equations, and the existence of positive solutions for Boundary Value Problems (BVPs). These problems have wide applications in many areas. The focus is on extensions of previous work done on non-linear second-order differential equations with boundary conditions involving the first-order derivative. The contribution of this thesis is fourfold. First, using a fixed point theorem on order intervals, the existence of a positive solution on an interval for a non-local boundary value problem is obtained. Second, considering a different boundary value problem that includes the first-order derivative in the non-linear term, an increasing solution is obtained by applying the Krasnoselskii-Guo fixed point theorem. Third, the existence of two solutions, one solution and no solution for a BVP is proved by using fixed point index and iteration methods. Last, the results for Green's function unify some methods in studying the existence of positive solutions for BVPs of nonlinear differential equations. Examples are presented to illustrate the applications of our results.

Author Keywords: Banach Space, Boundary Value Problems, Differential Equations, Fixed Point, Norm, Positive Solutions

2017

The Compression Cone Method on Existence of Solutions for Semi-linear Equations

Type:
Names:
Creator (cre): Liu, Ankai, Thesis advisor (ths): Feng, Wenying, Thesis advisor (ths): Abdella, Kenzu, Degree committee member (dgc): Pollanen, Marco, Degree granting institution (dgg): Trent University
Abstract:

With wide applications in many fields such as engineering, physics, chemistry, biology and social sciences, semi-linear equations have attracted great interest from researchers in various areas. In the study of the existence of solutions for this class of equations, a general and commonly applied method is the compression cone method for fixed-point index. The main idea is to construct a cone in an ordered Banach space based on the linear part so that the nonlinear part can be examined in a relatively smaller region.

In this thesis, a new class of cones is proposed as a generalization of previous work. The construction of the cone is based on properties of both the linear and nonlinear parts of the equation. As a result, the method is shown to be more adaptable in applications. We prove new results for both semi-linear integral equations and algebraic systems.

Applications are illustrated by examples. Limitations of the new method are also discussed.

Keywords: Algebraic systems; compression cone method; differential equations; existence of solutions; fixed point index; integral equations; semi-linear equations.

Author Keywords: algebraic systems, differential equations, existence of solutions, fixed point index, integral equations, semi-linear equations

2018

Population-Level Ambient Pollution Exposure Proxies

Type:
Names:
Creator (cre): Scott, Carlone Livingston, Thesis advisor (ths): Burr, Wesley S, Degree granting institution (dgg): Trent University
Abstract:

The Air Health Trend Indicator (AHTI) is a joint Health Canada / Environment and Climate Change Canada initiative that seeks to model the Canadian national population health risk due to acute exposure to ambient air pollution. The common model in the field uses averages of local ambient air pollution monitors to produce a population-level exposure proxy variable. This method is applied to ozone, nitrogen dioxide, particulate matter, and other similar air pollutants.

We examine the representative nature of these proxy averages on a large-scale Canadian data set, representing hundreds of monitors and dozens of city-level populations. The careful determination of temporal and spatial correlations between the disparate monitors allows for more precise estimation of population-level exposure, taking inspiration from the land-use regression models commonly used in geography. We conclude this work with an examination of the risk estimation differences between the original, simplistic population exposure metric and our new, revised metric.

Author Keywords: Air Pollution, Population Health Risk, Spatial Process, Spatio-Temporal, Temporal Process, Time Series

2019

The Long-term Financial Sustainability of China's Urban Basic Pension System

Type:
Names:
Creator (cre): Song, Lin, Thesis advisor (ths): Cater, Bruce, Thesis advisor (ths): Pollanen, Marco, Degree committee member (dgc): Patrick, Brian, Degree granting institution (dgg): Trent University
Abstract:

Population aging has become a worldwide concern since the nineteenth century. The decrease in birth rate and the increase in life expectancy will make China's population age rapidly. If the growth rate of the number of workers is less than that of the number of retirees, in the long run, there will be fewer workers per retiree. This will place great pressure on China's public pension system in the next several decades. This is a global problem known as the "pension crisis". In this thesis, a long-term vision for China's urban pension system is presented. Based on mathematical models and projections for demographic variables, economic variables and pension scheme variables, we test how changes in key variables affect the balances of the pension fund in the next 27 years. This thesis applies methods of deterministic and stochastic modeling as well as sensitivity analysis to the problem. Using sensitivity analysis, we find that the pension fund balance is highly sensitive to changes in retirement age compared with other key variables. Monte Carlo simulations are also used to find the possible distributions of the pension fund balance by the end of the projection period. Finally, according to my analysis, several changes in retirement age are recommended in order to maintain the sustainability of China's urban basic pension scheme.

Author Keywords: China, demographic changes, Monte Carlo simulation, pension fund, sensitivity tests, sustainability

2015