Pollanen, Marco
A Framework for Testing Time Series Interpolators
The spectrum of a given time series is a characteristic function describing its frequency properties. Spectrum estimation methods require time series data to be contiguous in order for robust estimators to retain their performance. This poses a fundamental challenge, especially for real-world scientific data, which is often plagued by missing values and/or irregularly recorded measurements. One area of research devoted to this problem seeks to repair the original time series through interpolation. There are several algorithms that have proven successful for the interpolation of considerably large gaps of missing data, but most are only valid for use on stationary time series: processes whose statistical properties are time-invariant, which is not a common property of real-world data. The Hybrid Wiener Interpolator is a method that was designed for repairing nonstationary data, rendering it suitable for spectrum estimation. This thesis presents a computational framework designed for conducting systematic testing of the statistical performance of this method in light of changes to gap structure and departures from the stationarity assumption. A comprehensive audit of the Hybrid Wiener Interpolator against other state-of-the-art algorithms is also explored.
Author Keywords: applied statistics, hybrid wiener interpolator, imputation, interpolation, R statistical software, time series
Combinatorial Collisions in Database Matching: With Examples from DNA
Databases containing information such as location points, web searches and financial transactions are becoming the new normal as technology advances. Consequently, searches and cross-referencing in big data are becoming a common problem as computing and statistical analysis increasingly allow for the contents of such databases to be analyzed and dredged for data. Searches through big data are frequently done without a hypothesis formulated beforehand, and as these databases grow and become more complex, the room for error also increases. Regardless of how these searches are framed, the data they collect may lead to false convictions. DNA databases may be of particular interest, since DNA is often viewed as significant evidence; however, such evidence is sometimes not interpreted properly in the courtroom. In this thesis, we present and validate a framework for investigating various collisions within databases using Monte Carlo simulations, with examples from DNA. We also discuss how DNA evidence may be wrongly portrayed in the courtroom, and the explanation behind this. We then outline the problems that may occur when numerous types of databases are searched for suspects, and a framework to address these problems.
Author Keywords: big data analysis, collisions, database searches, DNA databases, monte carlo simulation
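As an illustration of the kind of Monte Carlo collision experiment described in the abstract above, the following minimal sketch estimates the chance that at least two entries in a database share the same profile and compares it with the closed-form value. It assumes, purely for illustration, that profiles are drawn uniformly from a fixed number of equally likely types (real DNA profile frequencies are not uniform). The function names and parameters are hypothetical and are not taken from the thesis.

```python
import random


def collision_probability_mc(n_profiles, n_types, trials=20000, seed=0):
    """Monte Carlo estimate of P(at least two of n_profiles coincide).

    Assumption (illustrative only): each profile is drawn independently
    and uniformly from n_types equally likely types.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        seen = set()
        for _ in range(n_profiles):
            profile = rng.randrange(n_types)
            if profile in seen:
                hits += 1  # first repeated profile: this trial has a collision
                break
            seen.add(profile)
    return hits / trials


def collision_probability_exact(n_profiles, n_types):
    """Closed-form value under the same uniform assumption (birthday problem)."""
    p_no_collision = 1.0
    for i in range(n_profiles):
        p_no_collision *= (n_types - i) / n_types
    return 1.0 - p_no_collision


if __name__ == "__main__":
    print(collision_probability_mc(23, 365))
    print(collision_probability_exact(23, 365))
```

Comparing the simulated estimate with the closed-form value is one simple way to validate a simulation before applying it to settings where no closed form is available.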
Pathways to Innovation: Modelling University-to-Firm Research Development
Research and development activities conducted at universities and firms fuel economic growth and play a key role in the process of innovation. Specifically, prior research has investigated the widespread university-to-firm research development path and concluded that universities are better suited for the early stages of research while firms are better positioned for the later stages. This thesis aims to present a novel explanation for the pervasive university-to-firm research development path. The model developed uses game theory to visualize and analyze interactions between a firm and a university under different strategies. The results reveal that because academic research signals knowledge, it helps attract tuition-paying students. Generating these tuition revenues is facilitated by university research discoveries, which, once published, a firm can build upon to make new innovative products. In an environment of weak intellectual property rights, moreover, the university-to-firm research development path enables firms to bypass the hefty costs that are involved in basic research activities. The model also provides a range of solution scenarios where a university and a firm may find it viable to initiate a research line.
Author Keywords: Game theory, Intellectual property rights, Nash equilibrium, Research and development, University-to-firm research path
Sinc-Collocation Difference Methods for Solving the Gross-Pitaevskii Equation
The time-dependent Gross-Pitaevskii Equation, describing the movement of particles in quantum mechanics, cannot be solved analytically due to its inherent nonlinearity. Hence numerical methods are important for approximating the solution. This study develops a scheme that is discrete in time and space to simulate the solution defined in a finite domain, using the Crank-Nicolson difference method in time and Sinc-Collocation Methods (SCM) in space. In theory and practice, the time discretization is second-order accurate, while the errors of the SCMs decay exponentially. A new SCM with a unique boundary treatment is proposed and compared with the original SCM and other similar numerical techniques in terms of time costs and numerical errors. The errors of the new SCM decay faster than those of the original one. Also, to attain the same accuracy, the new SCM interpolates fewer nodes than the original SCM, which saves computational costs. The new SCM is capable of approximating partial differential equations under different boundary conditions and can be applied extensively in fitting theory.
Author Keywords: Crank-Nicolson difference method, Gross-Pitaevskii Equation, Sinc-Collocation methods
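The second-order accuracy in time mentioned in the abstract above can be illustrated on a much simpler problem. The sketch below applies the Crank-Nicolson update to the scalar linear test equation i u'(t) = omega u(t), which is not the Gross-Pitaevskii Equation and involves no spatial discretization. The function names are hypothetical. Halving the time step should reduce the error by roughly a factor of four, and the update preserves |u|.

```python
import cmath


def crank_nicolson_phase(omega, t_end, n_steps):
    """Integrate i*u'(t) = omega*u(t), u(0) = 1, with Crank-Nicolson.

    Illustrative test equation only; not the scheme from the thesis.
    """
    dt = t_end / n_steps
    # Crank-Nicolson update:
    # (1 + i*omega*dt/2) * u_new = (1 - i*omega*dt/2) * u_old
    factor = (1 - 0.5j * omega * dt) / (1 + 0.5j * omega * dt)
    u = 1 + 0j
    for _ in range(n_steps):
        u *= factor
    return u


def error_at(omega, t_end, n_steps):
    """Absolute error against the exact solution exp(-i*omega*t_end)."""
    exact = cmath.exp(-1j * omega * t_end)
    return abs(crank_nicolson_phase(omega, t_end, n_steps) - exact)


if __name__ == "__main__":
    coarse = error_at(1.0, 1.0, 10)
    fine = error_at(1.0, 1.0, 20)
    print(coarse, fine, coarse / fine)
```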
Educational Data Mining and Modelling on Trent University Students' Academic Performance
Higher education is important. It enhances both individual and social welfare by improving productivity, life satisfaction, and health outcomes, and by reducing rates of crime. Universities play a critical role in providing that education. Because academic institutions face resource constraints, it is thus important that they deploy resources in support of student success in the most efficient ways possible. To inform that efficient deployment, this research analyzes institutional data reflecting undergraduate student performance to identify predictors of student success measured by GPA, rates of credit accumulation, and graduation rates. Using methods of cluster analysis and machine learning, the analysis yields predictions for the probabilities of individual success.
Author Keywords: Educational data mining, Students' academic performance modelling
Range-Based Component Models for Conditional Volatility and Dynamic Correlations
Volatility modelling is an important task in financial markets. This paper first evaluates the range-based DCC-CARR model of Chou et al. (2009) in modelling larger systems of assets, vis-à-vis the traditional return-based DCC-GARCH. Extending Colacito, Engle and Ghysels (2011), range-based volatility specifications are then employed in the first stage of DCC-MIDAS conditional covariance estimation, including the CARR model of Chou et al. (2005). A range-based analog to the GARCH-MIDAS model of Engle, Ghysels and Sohn (2013) is also proposed and tested; it decomposes volatility into short- and long-run components and corrects for microstructure biases inherent to high-frequency price-range data. Estimator forecasts are evaluated and compared in a minimum-variance portfolio allocation experiment following the methodology of Engle and Colacito (2006). Some consistent inferences are drawn from the results, supporting the models proposed here as empirically relevant alternatives. Range-based DCC-MIDAS estimates produce efficiency gains over DCC-CARR which increase with portfolio size.
Author Keywords: asset allocation, DCC MIDAS, dynamic correlations, forecasting, portfolio risk management, volatility
The Disability-Mitigating Effects of Education on Post-Injury Employment Dynamics
Using data drawn from the Workplace Safety and Insurance Board's (WSIB) Survey of Workers with Permanent Impairments, this thesis explores if and how the human capital associated with education mitigates the realized work-disabling effects of permanent physical injury. Using Cater's (2000) model of post-injury adaptive behaviour and employment dynamics as the structural, theoretical, and interpretative framework, this thesis jointly studies, by injury type, the effects of education on both the post-injury probability of transitioning from non-employment into employment and the post-injury probability of remaining in employment once employed. The results generally show that, for a given injury type, other things being equal, higher levels of education are associated with higher probabilities of both obtaining and sustaining employment.
Author Keywords: permanent impairment, permanent injury, post-injury employment
The Long-term Financial Sustainability of China's Urban Basic Pension System
Population aging has become a worldwide concern since the nineteenth century. The decrease in birth rate and the increase in life expectancy will make China's population age rapidly. If the growth rate of the number of workers is less than that of the number of retirees, in the long run, there will be fewer workers per retiree. This will place great pressure on China's public pension system in the next several decades. This is a global problem known as the "pension crisis". In this thesis, a long-term vision for China's urban pension system is presented. Based on mathematical models and projections for demographic variables, economic variables and pension scheme variables, we test how changes in key variables affect the balances of the pension fund in the next 27 years. This thesis applies methods of deterministic and stochastic modeling as well as sensitivity analysis to the problem. Using sensitivity analysis, we find that the pension fund balance is highly sensitive to changes in retirement age compared with other key variables. Monte Carlo simulations are also used to find the possible distributions of the pension fund balance by the end of the projection period. Finally, based on this analysis, several changes in retirement age are recommended in order to maintain the sustainability of China's urban basic pension scheme.
Author Keywords: China, demographic changes, Monte Carlo simulation, pension fund, sensitivity tests, sustainability
The Application of One-factor Models for Prices of Crops and Option Pricing Process
This thesis is intended to help crop-dependent farmers hedge the price risks of their crops. First, we applied a one-factor model, which incorporates a deterministic function and a stochastic process, to predict the future prices of crops (soybean). A discrete form was employed for one-month-ahead prediction. For general prediction, de-trending and the removal of cyclicality were used to remove the deterministic function. Three candidate stochastic differential equations (SDEs) were chosen to simulate the stochastic process: the mean-reverting Ornstein-Uhlenbeck (OU) process, the OU process with zero mean, and Brownian motion with drift. Least squares and maximum likelihood methods were used to estimate the parameters. Results indicated that the one-factor model worked well for soybean prices. We also provided a two-factor model as an alternative, and it also performed well in this case. In the second main part, a zero-cost option package was introduced and we theoretically analyzed the process of hedging. In the last part, option premiums obtained from the one-factor model were compared to those obtained from the Black-Scholes model. The differences and similarities suggested that the deterministic function, especially the cyclicality, played an essential role for the soybean price, and thus that the one-factor model was more suitable than the Black-Scholes model for the underlying asset in this case.
Author Keywords: Brownian motion, Least Squares Method, Maximum Likelihood Method, One-factor Model, Option Pricing, Ornstein-Uhlenbeck Process
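To make the stochastic component of the abstract above more concrete, here is a minimal sketch of simulating a mean-reverting Ornstein-Uhlenbeck process, dX = theta (mu - X) dt + sigma dW, and recovering its parameters by least squares. It relies on the fact that an exactly discretized OU process is an AR(1) regression of X at one step on X at the previous step. The parameter values, function names, and the choice of a monthly step are assumptions for illustration, not the estimation procedure or data used in the thesis.

```python
import math
import random


def simulate_ou(theta, mu, sigma, x0, dt, n_steps, seed=0):
    """Simulate an OU path using its exact one-step transition."""
    rng = random.Random(seed)
    decay = math.exp(-theta * dt)
    noise_sd = sigma * math.sqrt((1 - decay ** 2) / (2 * theta))
    path = [x0]
    for _ in range(n_steps):
        path.append(mu + (path[-1] - mu) * decay + noise_sd * rng.gauss(0.0, 1.0))
    return path


def fit_ou_least_squares(path, dt):
    """Estimate (theta, mu, sigma) from the regression x[t+1] = a + b*x[t] + eps."""
    x, y = path[:-1], path[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx              # slope, equals exp(-theta * dt) in the model
    a = mean_y - b * mean_x    # intercept, equals mu * (1 - b) in the model
    if not 0 < b < 1:
        raise ValueError("slope outside (0, 1): no mean-reverting OU fit")
    resid_var = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    theta = -math.log(b) / dt
    mu = a / (1 - b)
    sigma = math.sqrt(resid_var * 2 * theta / (1 - b ** 2))
    return theta, mu, sigma


if __name__ == "__main__":
    # Hypothetical parameters; dt = 1/12 mimics monthly observations.
    path = simulate_ou(theta=2.0, mu=10.0, sigma=0.5, x0=10.0, dt=1 / 12, n_steps=20000)
    print(fit_ou_least_squares(path, dt=1 / 12))
```

Setting mu to zero in the regression (no intercept) would give the zero-mean OU variant mentioned in the abstract.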
Modeling drought derivatives in arid regions: a case study in Qatar
We propose a stochastic weather model based on temperature, precipitation, humidity and wind speed for Qatar, as a representative arid region, in order to obtain simulated values for a drought index. As a drought index, the Reconnaissance Drought Index (RDI) is commonly accepted in agriculture and is used to measure drought severity. It can be used to price weather derivatives to help farmers reduce financial losses from drought. RDI, which is the ratio of precipitation to evapotranspiration, is calculated by considering crop growth stages. The use of different crop coefficient values depending on the growth stage to calculate evapotranspiration can provide improved values for RDI. Additionally, six calculation methods for evapotranspiration using weather data are investigated to obtain accurate values for RDI.
Author Keywords: Evapotranspiration, Markov chains, Mean reversion processes, Reconnaissance Drought Index, Stochastic differential equations, Stochastic weather models
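The ratio form of the RDI described in the abstract above can be sketched in a few lines. The example computes total precipitation divided by total evapotranspiration over a period, with an optional per-period crop coefficient applied to a reference evapotranspiration (crop evapotranspiration = crop coefficient times reference evapotranspiration). The function name, argument names, and the sample figures are hypothetical; the thesis may define the growth-stage adjustment and any standardization of the index differently.

```python
def rdi_ratio(precip_mm, et0_mm, crop_coeffs=None):
    """Ratio of total precipitation to total (crop-adjusted) evapotranspiration.

    precip_mm   : precipitation per period (e.g. monthly), in mm
    et0_mm      : reference evapotranspiration per period, in mm
    crop_coeffs : optional crop coefficient per period (assumed input);
                  defaults to 1.0 for every period
    """
    if len(precip_mm) != len(et0_mm):
        raise ValueError("precip_mm and et0_mm must have the same length")
    if crop_coeffs is None:
        crop_coeffs = [1.0] * len(et0_mm)
    if len(crop_coeffs) != len(et0_mm):
        raise ValueError("crop_coeffs must match et0_mm in length")
    total_et = sum(kc * et0 for kc, et0 in zip(crop_coeffs, et0_mm))
    if total_et <= 0:
        raise ValueError("total evapotranspiration must be positive")
    return sum(precip_mm) / total_et


if __name__ == "__main__":
    # Made-up figures for two periods, for illustration only.
    print(rdi_ratio([10.0, 20.0], [50.0, 100.0]))
    print(rdi_ratio([10.0, 20.0], [50.0, 100.0], crop_coeffs=[0.5, 1.0]))
```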