Applied Modeling and Quantitative Methods
Influence of geodemographic factors on electricity consumption and forecasting models
The residential sector is a major consumer of electricity, and its demand will rise by 65 percent by the end of 2050. The electricity consumption of a household is determined by various factors, e.g. house size, socio-economic status of the family, and size of the family. Previous studies have identified only a limited number of socio-economic and dwelling factors. In this thesis, we study the significance of 826 geodemographic factors on electricity consumption for 4917 homes in the City of London. Geodemographic factors cover a wide array of categories, e.g. social, economic, dwelling, family structure, health, education, finance, occupation, and transport. Using Spearman correlation, we have identified 354 factors that are strongly correlated with electricity consumption. We also examine the impact of using geodemographic factors in designing forecasting models. In particular, we develop an encoder-decoder LSTM model which shows improved accuracy with geodemographic factors. We believe that our study will help energy companies design better energy management strategies.
Author Keywords: Electricity forecasting, Encoder-decoder model, Geodemographic factors, Socio-economic factors
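The Spearman screening step described in the abstract above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the abstract does not state the cutoff used for "strongly correlated", so the `threshold=0.5` default is hypothetical, and the factor names and sample values are invented placeholders rather than data from the thesis.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def strongly_correlated(factors, consumption, threshold=0.5):
    """Names of factors with |rho| >= threshold (the cutoff is an assumed value)."""
    return [
        name
        for name, values in factors.items()
        if abs(spearman(values, consumption)) >= threshold
    ]


# Hypothetical toy data: five homes, two invented factors.
sample_consumption = [3.1, 4.0, 5.2, 6.8, 9.5]
sample_factors = {
    "household_size": [1, 2, 2, 3, 5],
    "distance_to_park": [4, 1, 5, 2, 3],
}
selected = strongly_correlated(sample_factors, sample_consumption)
```

Because Spearman correlation works on ranks, it captures monotone relationships that need not be linear, which is one plausible reason to prefer it for screening a large and heterogeneous set of factors.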
Academic Efficiency: The University-Firm Innovation Market, Intellectual Property Rights and Teaching
Universities produce a significant and increasing share of basic research that is later commercialized by firms. We argue that the university's prominence as a producer of basic research is the result of a differential efficiency in research production that cannot be replicated by firms or individual agents: teaching. By using research accomplishments to signal knowledge and attract tuition-paying students, universities are uniquely positioned to undertake certain types of research projects. However, in a market for innovation without patent rights, a significant and increasing number of basic research projects that would improve social welfare cannot be initiated by firms or universities. The extension of patent rights to university-generated research elegantly redresses this issue and leaves us to ponder important questions about the future of our innovation-driven economies.
Author Keywords: Innovation, Intellectual Property Rights, Research, Science Technology and Innovation Policy
Modelling Submerged Coastal Environments: Remote Sensing Technologies, Techniques, and Comparative Analysis
Building on remote sensing and GIS littoral zone characterization methodologies of the past decade, this study used a series of loosely coupled models to test, compare and synthesize multi-beam SONAR (MBES), Airborne LiDAR Bathymetry (ALB), and satellite-based optical data sets in the Gulf of St. Lawrence, Canada, eco-region. Bathymetry and relative intensity metrics for the MBES and ALB data sets were run through a quantitative and qualitative comparison, which included outputs from the Benthic Terrain Modeller (BTM) tool. Substrate classification based on relative intensities of the respective data sets and textural indices generated using grey level co-occurrence matrices (GLCM) was investigated. A spatial modelling framework built in ArcGIS™ for the derivation of bathymetric data sets from optical satellite imagery was also tested for proof of concept and validation. Where possible, efficiencies and semi-automation for repeatable testing were achieved using ArcGIS™ ModelBuilder. The findings from this study could assist future decision makers in the field of coastal management and hydrographic studies.
Keywords: Seafloor terrain characterization, Benthic Terrain Modeller (BTM), Multi-beam SONAR, Airborne LiDAR Bathymetry, Satellite Derived Bathymetry, ArcGIS™ ModelBuilder, Textural analysis, Substrate classification
SPAF-network with Saturating Pretraining Neurons
In this work, various aspects of neural networks pre-trained with denoising autoencoders (DAE) are explored. To saturate neurons more quickly for feature learning in DAE, an activation function that offers higher gradients is introduced. Moreover, the introduction of sparsity functions applied to the hidden layer representations is studied. More importantly, a technique that swaps the activation functions of fully trained DAE to logistic functions is studied; networks trained using this technique are referred to as SPAF-networks. For evaluation, the popular MNIST dataset as well as all three sub-datasets of the Chars74k dataset are used for classification purposes. The SPAF-network is also analyzed for the features it learns with a logistic, a ReLU and a custom activation function. Lastly, a future roadmap is proposed for enhancements to the SPAF-network.
Author Keywords: Artificial Neural Network, AutoEncoder, Machine Learning, Neural Networks, SPAF network, Unsupervised Learning
SMOTE and Performance Measures for Machine Learning Applied to Real-Time Bidding
In the context of Real-Time Bidding (RTB), the machine learning problems of imbalanced classes and model selection are investigated. Synthetic Minority Oversampling Technique (SMOTE) is commonly used to combat imbalanced classes, but a shortcoming is identified. Use of a distance threshold is identified as a solution, and testing in a live RTB environment shows significant improvement. For model selection, the statistical measure Critical Success Index (CSI) is modified to add emphasis on recall. This new measure (CSI-R) is empirically compared with other measures such as accuracy, lift, efficiency, true skill score, Heidke's skill score and Gilbert's skill score. In all cases CSI-R is shown to provide better application to the RTB industry.
Author Keywords: imbalanced classes, machine learning, online advertising, performance measures, real-time bidding, SMOTE
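To make the model-selection discussion in the abstract above concrete, the following minimal sketch computes the standard Critical Success Index alongside recall and accuracy from confusion-matrix counts. The abstract does not give the formula for the modified CSI-R measure, so it is deliberately not reproduced here; the counts in the example are hypothetical and only illustrate why accuracy can be misleading when classes are heavily imbalanced.

```python
def critical_success_index(tp, fp, fn):
    """Standard CSI (threat score): TP / (TP + FP + FN). True negatives are ignored."""
    denominator = tp + fp + fn
    return tp / denominator if denominator else 0.0


def recall(tp, fn):
    """Fraction of actual positives that the model found: TP / (TP + FN)."""
    denominator = tp + fn
    return tp / denominator if denominator else 0.0


def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that were correct."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


# Hypothetical counts for a rare positive class (40 positives in 10,000 cases).
tp, fp, fn, tn = 30, 60, 10, 9900
csi_value = critical_success_index(tp, fp, fn)
recall_value = recall(tp, fn)
accuracy_value = accuracy(tp, fp, fn, tn)
```

With these assumed counts, accuracy is dominated by the large number of true negatives, while CSI and recall depend only on how the rare positive class is handled.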
The Effect of Listing a Stock on the S&P 500 Index on the Stock's Volatility
This paper investigates the effect of listing a stock on the S&P 500 Index on the stock's volatility, using two econometric models: GARCH and EGARCH. The study mainly addresses three issues: first, it analyzes stock volatility in two sub-periods; second, it determines whether the announcement can account for the fluctuations in the price of the stock; and finally, it investigates the change in the stock's variance. After isolating the effects of external and industry shocks by using the returns on the S&P 500 Index as a proxy, the author finds evidence of structural change in the volatility of stocks after they are added to the index. Additionally, the existence of a dominant symmetric effect, which captures the response of volatility to news, indicates that after a stock is included in the index, information flowing into the market increases. However, the rate at which old news is captured in price falls. The empirical evidence also suggests that on average a stock's variance falls and that the announcement to list a stock on the index has little effect on the stock's price.
Author Keywords: EGARCH, GARCH, S&P 500 Index, Symmetric Effect, Volatility
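As a rough illustration of how a GARCH model tracks volatility, the sketch below computes the conditional variance path of a GARCH(1,1) process. The abstract above does not state the model order or any parameter values, so the (1,1) specification, the parameter names `omega`, `alpha` and `beta`, the initialisation with the sample variance, and the sample returns are all assumptions made for illustration.

```python
def garch11_variance(returns, omega, alpha, beta):
    """Conditional variances under an assumed GARCH(1,1) recursion.

    sigma2[t] = omega + alpha * residual[t-1]**2 + beta * sigma2[t-1]
    """
    if not returns:
        return []
    mean_return = sum(returns) / len(returns)
    residuals = [r - mean_return for r in returns]
    # Start from the sample variance of the residuals (a common but assumed choice).
    sigma2 = [sum(e * e for e in residuals) / len(residuals)]
    for previous_residual in residuals[:-1]:
        sigma2.append(
            omega + alpha * previous_residual ** 2 + beta * sigma2[-1]
        )
    return sigma2


# Hypothetical daily returns and parameters, not estimates from the paper.
example_returns = [0.01, -0.02, 0.015, -0.005, 0.03]
example_path = garch11_variance(example_returns, omega=1e-5, alpha=0.1, beta=0.85)
```

In this parameterisation, `alpha` weights the most recent squared shock and `beta` carries forward the previous variance, so comparing estimates before and after index inclusion is one way such a study could quantify a change in volatility dynamics.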
Assessing the Cost of Reproduction between Male and Female Sex Functions in Hermaphroditic Plants
The cost of reproduction refers to the use of resources for the production of offspring that decreases the availability of resources for future reproductive events and other biological processes. Models of sex-allocation provide insights into optimal patterns of resource investment in male and female sex functions and have been extended to include other components of the life history, enabling assessment of the costs of reproduction. These models have shown that costs of reproduction through female function should usually exceed costs through male function. However, those previous models only considered allocations from a single pool of shared resources. Recent studies have indicated that the type of resource currency can differ for female and male sex functions, and that this might affect costs of reproduction via effects on other components of the life history. Using multiple invasibility analysis, this study examined resource allocation to male and female sex functions, while simultaneously considering allocations to survival and growth. Allocation patterns were modelled using both shared and separate resource pools. Under shared resources, allocation patterns to male and female sex function followed the results of earlier models. When resource pools were separate, however, allocations to male function often exceeded allocations to female function, even if fitness gains increased less strongly with investment in male function than with investment in female function. These results demonstrate that the costs of reproduction are affected by (1) the types of resources needed for reproduction via female or male function and (2) trade-offs with other components of the life history. Future studies of the costs of reproduction should examine whether allocations to reproduction via female versus male function usually entail the use of different types of resources.
Author Keywords: Cost of Reproduction, Gain Curve, Life History, Resource Allocation Patterns, Resource Currencies
Time Series Algorithms in Machine Learning - A Graph Approach to Multivariate Forecasting
Forecasting future values of time series has long been a field with many and varied applications, from climate and weather forecasting to stock prediction and economic planning to the control of industrial processes. Many of these problems involve not only a single time series but many simultaneous series which may influence each other. This thesis provides machine learning methods for handling such problems.
We first consider single time series with both single and multiple features. We review the algorithms and unique challenges involved in applying machine learning to time series. Many machine learning algorithms, when used for regression, are designed to produce a single output value for each timestamp of interest with no measure of confidence; however, evaluating the uncertainty of the predictions is an important component of practical forecasting. We therefore discuss methods of constructing uncertainty estimates in the form of prediction intervals for each prediction. Stability over long time horizons is also a concern for these algorithms, as recursion is a common method used to generate predictions over long time intervals. To address this, we present methods of maintaining stability in the forecast even over large time horizons. These methods are applied to an electricity forecasting problem where we demonstrate their effectiveness for support vector machines, neural networks and gradient boosted trees.
We next consider spatiotemporal problems, which consist of multiple interlinked time series, each of which may contain multiple features. We represent these problems using graphs, allowing us to learn relationships using graph neural networks. Existing methods of doing this generally make use of separate time and spatial (graph) layers, or simply replace operations in temporal layers with graph operations. We show that these approaches have difficulty learning relationships that contain time lags of several time steps. To address this, we propose a new layer inspired by the long short-term memory (LSTM) recurrent neural network which adds a distinct memory state dedicated to learning graph relationships while keeping the original memory state. This allows the model to consider temporally distant events at other nodes without affecting its ability to model long-term relationships at a single node. We show that this model is capable of learning the long-term patterns that existing models struggle with. We then apply this model to a number of real-world bike-share and traffic datasets where we observe improved performance when compared to other models with similar numbers of parameters.
Author Keywords: forecasting, graph neural network, LSTM, machine learning, neural network, time series
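The recursive forecasting strategy mentioned in the abstract above, where a one-step model is applied repeatedly and its own predictions are fed back as inputs, can be sketched as follows. The function names, the sliding-window interface, and the averaging stand-in model are hypothetical; the thesis's actual models (support vector machines, neural networks, gradient boosted trees) and its stabilisation methods are not reproduced here.

```python
from typing import Callable, List, Sequence


def recursive_forecast(
    history: Sequence[float],
    one_step_model: Callable[[List[float]], float],
    window: int,
    horizon: int,
) -> List[float]:
    """Forecast `horizon` steps ahead by feeding each prediction back as input."""
    if window < 1 or window > len(history):
        raise ValueError("window must be between 1 and len(history)")
    buffer = list(history[-window:])  # most recent observations
    forecasts: List[float] = []
    for _ in range(horizon):
        next_value = one_step_model(buffer)
        forecasts.append(next_value)
        # Slide the window: drop the oldest value, append the new prediction.
        buffer = buffer[1:] + [next_value]
    return forecasts


def mean_of_window(window_values: List[float]) -> float:
    """Hypothetical stand-in for a trained one-step regression model."""
    return sum(window_values) / len(window_values)


example_forecast = recursive_forecast(
    [1.0, 2.0, 3.0], mean_of_window, window=2, horizon=3
)
```

After the first `window` steps the model sees only its own outputs, so any one-step error is propagated forward; this is the reason stability over long horizons is a concern for recursive methods.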
Capital Ratios and Liquidity Creation: Evidence from Canadian Big Six Banks
Using quarterly data from the six largest Canadian banks, we investigate the relationship between the regulatory capital ratio and the on-balance-sheet liquidity created in the Canadian economy by the "Big Six". We find a significant positive relationship between the Tier 1 capital ratio and on-balance-sheet liquidity creation for the Canadian Big Six banks, implying that large banks in Canada favor risks and rely on capital to fund illiquid assets. In contrast, for smaller banks, the relationship is significantly negative. Our results are robust to dynamic panel regression using 2-Step GMM, two exogenous shocks (the COVID-19 crisis and the Global Financial Crisis of 2007-2009), mergers and acquisitions activities in the banking industry, and core deposits financing. The COVID-19 pandemic and core deposits adversely impact the Tier 1 capital ratio's relationship with on-balance-sheet liquidity creation, while the effect of the Global Financial Crisis (2007-2009) on the association is insignificant.
Author Keywords: Big Six, COVID-19, Deposits, Liquidity Creation, Tier 1 Capital Ratio
Characteristics of Models for Representation of Mathematical Structure in Typesetting Applications and the Cognition of Digitally Transcribing Mathematics
The digital typesetting of mathematics can present many challenges to users, especially those of novice to intermediate experience levels. Through a series of experiments, we show that two models used to represent mathematical structure in these typesetting applications, the 1-dimensional structure-based model and the 2-dimensional freeform model, cause interference with users' working memory during the process of transcribing mathematical content. This is a notable finding, as a connection between working memory and mathematical performance has been established in the literature. Furthermore, we find that elements of these models allow them to handle various types of mathematical notation with different degrees of success. Notably, the 2-dimensional freeform model allows users to insert and manipulate exponents with increased efficiency and reduced cognitive load and working memory interference, while the 1-dimensional structure-based model allows for handling of the fraction structure with greater efficiency and decreased cognitive load.
Author Keywords: mathematical cognition, mathematical software, user experience, working memory