McConnell, Sabine

An Investigation of a Hybrid Computational System for Cloud Gaming

Names:
Creator (cre): Baxter, Sean Andrew, Thesis advisor (ths): Hurley, Richard, Degree committee member (dgc): Srivastava, Brian, Degree committee member (dgc): McConnell, Sabine, Degree committee member (dgc): Pazzi, Richard, Degree committee member (dgc): Parker, James, Degree granting institution (dgg): Trent University
Abstract:

Video games have always been intrinsically linked with the technology available to the medium. While improvements in technology have historically translated directly into improvements in video games, this has recently not been the case. One recent technology that video games have not fully leveraged is Cloud technology. This thesis investigates a potential way for video games to leverage the Cloud. The methodology compares the relative performance of a Local Model, a Cloud Model, and a proposed Hybrid Model of video game execution. Comparing these results, we find that a Hybrid approach has the potential to increase performance in Cloud gaming as well as to improve the stability of overall game play.

Author Keywords: cloud, cloud gaming, streaming, video game

2023

Modelling Request Access Patterns for Information on the World Wide Web

Names:
Creator (cre): Sturgeon, Robert Carl, Thesis advisor (ths): Hurley, Richard T, Degree committee member (dgc): De Grande, Robson, Degree committee member (dgc): Parker, James DA, Degree committee member (dgc): McConnell, Sabine, Degree granting institution (dgg): Trent University
Abstract:

In this thesis, we present a framework to model user object-level request patterns in the World Wide Web. This framework consists of three sub-models: one for file access, one for Web pages, and one for storage sites. Web pages are modelled as collections of objects of different types and sizes, which are characterized by way of categories.

We developed a discrete event simulation to investigate the performance of systems that utilize our model. Using this simulation, we established parameters that produce a wide range of conditions and serve as a basis for generating a variety of user request patterns. We demonstrated that, within our framework, we can affect the mean response time (our performance metric of choice) by varying the composition of Web pages using our categories. To further test our framework, we applied it to a Web caching system, where our results showed improved mean response time and reduced server load.
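The sub-models themselves are beyond an abstract's scope, but the core of a discrete event simulation for request response times can be sketched with a minimal single-server queue. This is a generic M/M/1-style sketch, not the thesis's multi-category page model:

```python
import random

def simulate_queue(n_requests=10_000, arrival_rate=0.8, service_rate=1.0, seed=42):
    """Single-server FIFO queue: returns the mean response time
    (queueing wait + service) over all simulated requests."""
    rng = random.Random(seed)
    t = 0.0
    arrivals = []
    for _ in range(n_requests):
        t += rng.expovariate(arrival_rate)   # Poisson arrivals
        arrivals.append(t)
    server_free = 0.0
    total_response = 0.0
    for a in arrivals:
        start = max(a, server_free)          # wait if the server is busy
        service = rng.expovariate(service_rate)
        server_free = start + service
        total_response += server_free - a    # response = wait + service
    return total_response / n_requests
```

For these rates, queueing theory predicts a mean response time near 1/(μ − λ) = 5, so the simulation output can be sanity-checked against the analytic value.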

Author Keywords: discrete event simulation (DES), Internet, performance modelling, Web caching, World Wide Web

2022

An Investigation of the Impact of Big Data on Bioinformatics Software

Names:
Creator (cre): Dobosz, Rafal, Thesis advisor (ths): McConnell, Sabine, Thesis advisor (ths): Hurley, Richard, Degree committee member (dgc): McConnell, Sabine, Degree committee member (dgc): Hurley, Richard, Degree committee member (dgc): Hajibabaei, Mehrdad, Degree committee member (dgc): Cater, Bruce, Degree granting institution (dgg): Trent University
Abstract:

As the generation of genetic data accelerates, Big Data has an increasing impact on the way bioinformatics software is used. The experiments become larger and more complex than originally envisioned by software designers. One way to deal with this problem is to use parallel computing.

Using the program Structure as a case study, we investigate ways in which to counteract the challenges created by the growing datasets. We propose an OpenMP and an OpenMP-MPI hybrid parallelization of the MCMC steps, and analyse the performance in various scenarios.

The results indicate that the parallelizations produce significant speedups over the serial version in all scenarios tested. This allows for using the available hardware more efficiently, by adapting the program to the parallel architecture. This is important because not only does it reduce the time required to perform existing analyses, but it also opens the door to new analyses, which were previously impractical.
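The speedups reported for the parallelized MCMC steps are ultimately bounded by the serial fraction of the program; Amdahl's law gives the classic upper bound (a textbook relation, not a result from the thesis):

```python
def amdahl_speedup(parallel_fraction, n_cores):
    """Upper bound on speedup when only a fraction of the program
    can be parallelized across n_cores."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_cores)
```

For example, if 95% of the runtime parallelizes, eight cores yield at most about a 5.9x speedup, which is why hybrid OpenMP-MPI designs focus on keeping the serial portion small.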

Author Keywords: Big Data, HPC, MCMC, parallelization, speedup, Structure

2014

An Investigation of Load Balancing in a Distributed Web Caching System

Names:
Creator (cre): Plumley, Brandon Marcus, Thesis advisor (ths): Hurley, Richard, Degree committee member (dgc): McConnell, Sabine, Degree granting institution (dgg): Trent University
Abstract:

With the exponential growth of the Internet, performance is an issue as bandwidth is often limited. A scalable solution to reduce the amount of bandwidth required is Web caching. Web caching (especially at the proxy level) has been shown to be quite successful at addressing this issue. However, as the number and needs of the clients grow, it becomes infeasible and inefficient to have just a single Web cache. To address this concern, the Web caching system can be set up in a distributed manner, allowing multiple machines to work together to meet the needs of the clients. Furthermore, further efficiency could be achieved by balancing the workload across all the Web caches in the system. This thesis investigates the benefits of load balancing in a distributed Web caching environment in order to improve response times and help reduce bandwidth.
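A minimal illustration of adaptive load sharing (a hypothetical policy sketch, not the thesis's exact algorithm): route each request to the least-loaded cache, and report saturation when every cache is above a utilization threshold.

```python
def pick_cache(loads, threshold=0.8):
    """Adaptive load sharing sketch: given per-cache utilizations in [0, 1],
    return the index of the least-loaded cache, or None if all are saturated."""
    best = min(range(len(loads)), key=lambda i: loads[i])
    return best if loads[best] < threshold else None
```

A real distributed policy would also weigh the cost of collecting load information against the benefit of better placement, a tradeoff central to load-balancing studies.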

Author Keywords: adaptive load sharing, Distributed systems, Load Balancing, Simulation, Web Caching

2015

Historic Magnetogram Digitization

Names:
Creator (cre): Weygang, Mark, Thesis advisor (ths): Burr, Wesley S, Thesis advisor (ths): McConnell, Sabine, Degree granting institution (dgg): Trent University
Abstract:

The conversion of historical analog images to time series data was performed using deconvolution for pre-processing, followed by custom-built digitization algorithms. These algorithms were developed to be user-friendly, with the objective of aiding in the creation of a data set from decades of mechanical observations collected at the Agincourt and Toronto geomagnetic observatories beginning in the 1840s. The algorithms follow a structure that begins with pre-processing, followed by tracing and pattern detection. Each digitized magnetogram was then visually inspected and the algorithm's performance verified, both to ensure accuracy and to allow the data to later be connected into a long-running time series.
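The tracing stage can be illustrated with a toy sketch (hypothetical, not the thesis's actual algorithm): after pre-processing, treat each scan column as one time step and take the darkest pixel in that column as the pen position.

```python
def trace_curve(image):
    """Recover a 1-D trace from a grayscale scan stored as a list of rows
    (0 = black ink, 255 = paper): for each column, return the row index
    of the darkest pixel."""
    n_rows, n_cols = len(image), len(image[0])
    return [min(range(n_rows), key=lambda r: image[r][c]) for c in range(n_cols)]
```

Real magnetograms need the pattern-detection step on top of this, since multiple traces, timing marks, and scratches all compete for the darkest pixel in a column.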

Author Keywords: Magnetograms

2019

Augmented Reality Sandbox (Aeolian Box): A Teaching and Presentation Tool for Atmospheric Boundary Layer Airflows over a Deformable Surface

Names:
Creator (cre): Singh, Pradyumn, Thesis advisor (ths): McConnell, Sabine, Thesis advisor (ths): McKenna-Neuman, Cheryl, Degree committee member (dgc): Tang, Vincent, Degree granting institution (dgg): Trent University
Abstract:

The AeolianBox is an educational and presentation tool extended in this thesis to represent atmospheric boundary layer (ABL) flow over a deformable surface in a sandbox. It is a hybrid hardware-and-mathematical model which helps users to visually, interactively, and spatially grasp the natural laws governing ABL airflow. The AeolianBox uses a Kinect V1 camera and a short-focal-length projector to capture a Digital Elevation Model (DEM) of the topography within the sandbox. The captured DEM is used to generate a Computational Fluid Dynamics (CFD) model and to project the ABL flow back onto the surface topography within the sandbox.

The AeolianBox is designed to be used in a classroom setting, which requires a low time cost for the ABL flow simulation to keep students engaged. Thus, the DEM capture and CFD modelling processes were investigated to lower the time cost while maintaining the key features of the ABL flow structure. A mesh-time sensitivity analysis was also conducted to investigate the tradeoff between the number of cells in the mesh and the time cost of both the meshing process and the CFD modelling. This allows the user to make an informed decision regarding the level of detail desired in the ABL flow structure by changing the number of cells in the mesh.

There are infinitely many surface topographies that can be created by molding the sand inside the sandbox. Therefore, in addition to keeping the time cost low while maintaining the key features of the ABL flow structure, the meshing process and CFD modelling are required to be robust to a variety of surface topographies. To achieve these research objectives, this thesis parametrizes both the meshing process and the CFD modelling.

The accuracy of the CFD model for ABL flow used in the AeolianBox was qualitatively validated against airflow profiles captured in the Trent Environmental Wind Tunnel (TEWT) at Trent University using a Laser Doppler Anemometer (LDA). Three simple geometries, namely a hemisphere, a cube, and a ridge, were selected since they are well studied in the literature. The CFD model was scaled to the dimensions of the grid where the airflow was captured in the TEWT, and the boundary conditions were kept the same as in the model used in the AeolianBox.

The ABL flow is simulated using software such as OpenFoam and Paraview to build and visualize the CFD model. The AeolianBox is interactive and capable of detecting hands using the Kinect camera, which allows a user to interact with and change the topography of the sandbox in real time. The AeolianBox software built for this thesis uses only open-source tools and is accessible to anyone with an existing hardware model of its predecessors.
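As a point of reference for the ABL flows being visualized, the neutral boundary layer is commonly described by the log-law wind profile, u(z) = (u*/κ) ln(z/z0). This is a standard boundary-layer relation, not a detail taken from the thesis's CFD setup:

```python
import math

def log_law_velocity(z, u_star, z0, kappa=0.41):
    """Neutral-ABL log-law wind speed at height z (m), given friction
    velocity u_star (m/s) and aerodynamic roughness length z0 (m)."""
    return (u_star / kappa) * math.log(z / z0)
```

Profiles like this are what a wind-tunnel LDA traverse measures above a flat floor, which is why they serve as a natural qualitative check on CFD inflow conditions.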

Author Keywords: Augmented Reality, Computational Fluid Dynamics, Kinect Projector Calibration, OpenFoam, Paraview

2019

Support Vector Machines for Automated Galaxy Classification

Names:
Creator (cre): Chambers, Cameron Darrin, Thesis advisor (ths): McConnell, Sabine, Thesis advisor (ths): Hurley, Richard, Degree granting institution (dgg): Trent University
Abstract:

Support Vector Machines (SVMs) are deterministic, supervised machine learning algorithms that have been successfully applied to many areas of research. They are heavily grounded in mathematical theory and are effective at processing high-dimensional data. This thesis models a variety of galaxy classification tasks using SVMs and data from the Galaxy Zoo 2 project. SVM parameters were tuned in parallel using resources from Compute Canada, and a total of four experiments were completed to determine whether invariance training and ensembles can be utilized to improve classification performance. It was found that SVMs performed well at many of the galaxy classification tasks examined, and the additional techniques explored did not provide a considerable improvement.
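The maximum-margin idea behind SVMs can be sketched from scratch with Pegasos-style sub-gradient training of a linear SVM. This is a hypothetical illustration only; the thesis itself used kernel SVMs tuned in parallel on Compute Canada resources:

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=500, seed=0):
    """Pegasos-style sub-gradient descent for a linear SVM.
    X: list of feature vectors (append a constant 1.0 for the bias),
    y: labels in {-1, +1}. Returns the weight vector."""
    rng = random.Random(seed)
    d = len(X[0])
    w = [0.0] * d
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)                 # decaying step size
            margin = y[i] * sum(w[j] * X[i][j] for j in range(d))
            decay = 1.0 - eta * lam               # regularization shrinkage
            if margin < 1.0:                      # hinge loss is active
                w = [decay * w[j] + eta * y[i] * X[i][j] for j in range(d)]
            else:
                w = [decay * w[j] for j in range(d)]
    return w

def predict(w, x):
    """Classify by the sign of the decision function."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1
```

Galaxy classification adds kernels (to handle non-linear morphology boundaries) and large-scale parameter tuning on top of this core.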

Author Keywords: Compute Canada, Kernel, SDSS, SHARCNET, Support Vector Machine, SVM

2019

Fraud Detection in Financial Businesses Using Data Mining Approaches

Names:
Creator (cre): Moudarres, Anissa Nour, Thesis advisor (ths): McConnell, Sabine, Thesis advisor (ths): Hurley, Richard, Degree granting institution (dgg): Trent University
Abstract:

The purpose of this research is to apply four methods to two data sets, a Synthetic dataset and a Real-World dataset, and to compare the results with the intention of arriving at methods to prevent fraud. The methods used are Logistic Regression, Isolation Forest, an Ensemble Method, and Generative Adversarial Networks (GANs).

Results show that all four models achieve accuracies between 91% and 99%, except that Isolation Forest gave 69% accuracy on the Synthetic dataset.

The four models detect fraud well when built on a training set and tested with a test set. Logistic Regression achieves good results with less computational effort. Isolation Forest achieves lower accuracies when the data is sparse and not preprocessed correctly. The Ensemble Method achieves the highest accuracy on both datasets. The GAN achieves good results but overfits if a large number of epochs is used. Future work could incorporate other classifiers.
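A caveat worth illustrating for fraud detection (a generic sketch, not drawn from the thesis's experiments): because fraud is rare, high accuracy can coexist with zero recall on the fraud class, so precision and recall should be reported alongside accuracy.

```python
def fraud_metrics(y_true, y_pred):
    """Accuracy, precision, and recall for the fraud (positive = 1) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    acc = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return acc, precision, recall
```

On a dataset with 2% fraud, a model that predicts "not fraud" for everything scores 98% accuracy while catching no fraud at all, which is why the accuracy figures above should be read together with the class balance of each dataset.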

Author Keywords: Ensemble Method, GAN, Isolation forest, Logistic Regression, Outliers

2020

Representation Learning with Restorative Autoencoders for Transfer Learning

Names:
Creator (cre): Fichuk, Dexter Lamont, Thesis advisor (ths): McConnell, Sabine, Degree committee member (dgc): Hurley, Richard, Degree granting institution (dgg): Trent University
Abstract:

Deep Neural Networks (DNNs) have reached human-level performance in numerous tasks in the domain of computer vision. DNNs are efficient for both classification and the more complex task of image segmentation. These networks are typically trained on thousands of images, which are often hand-labelled by domain experts. This bottleneck creates a promising research area: training accurate segmentation networks with fewer labelled samples.

This thesis explores effective methods for learning deep representations from unlabelled images. We train a Restorative Autoencoder Network (RAN) to denoise synthetically corrupted images. The weights of the RAN are then fine-tuned on a labelled dataset from the same domain for image segmentation.

We use three different segmentation datasets to evaluate our methods. In our experiments, we demonstrate that through our methods, only a fraction of data is required to achieve the same accuracy as a network trained with a large labelled dataset.
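The synthetic corruption step that a denoising pretraining pipeline needs can be sketched as follows. This is a generic corruption scheme (additive noise plus random pixel masking); the RAN's exact corruption procedure is described in the thesis itself:

```python
import random

def corrupt(pixels, noise_std=0.1, drop_prob=0.2, seed=0):
    """Corrupt a flat list of pixel intensities in [0, 1] for denoising
    pretraining: zero out random pixels and add clipped Gaussian noise."""
    rng = random.Random(seed)
    out = []
    for p in pixels:
        if rng.random() < drop_prob:
            out.append(0.0)                                   # masked pixel
        else:
            noisy = p + rng.gauss(0.0, noise_std)             # additive noise
            out.append(min(1.0, max(0.0, noisy)))             # clip to [0, 1]
    return out
```

The autoencoder is then trained to map the corrupted output back to the clean input, forcing its representations to capture image structure rather than individual pixel values.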

Author Keywords: deep learning, image segmentation, representation learning, transfer learning

2020

Cloud Versus Bare Metal: A comparison of a high performance computing cluster running in a commercial cloud and on a traditional hardware cluster using OpenMP and OpenMPI

Names:
Creator (cre): Bilaniuk, Vicky, Thesis advisor (ths): McConnell, Sabine, Degree committee member (dgc): Hurley, Richard, Degree granting institution (dgg): Trent University
Abstract:

A comparison of two high performance computing clusters, one running on AWS and one on Sharcnet, was done to determine which scenarios yield the best performance. Algorithm complexity ranged from O(n) to O(n³). Data sizes ranged from 195 KB to 2 GB. The Sharcnet hardware consisted of Intel E5-2683 and Intel E7-4850 processors with memory sizes ranging from 256 GB to 3072 GB. On AWS, C4.8xlarge instances were used, which run on Intel Xeon E5-2666 processors with 60 GB of memory per instance. AWS was able to launch jobs immediately regardless of job size. The only limiting factors on AWS were algorithm complexity and memory usage, suggesting a memory bottleneck. Sharcnet had the best performance but could be hampered by the job scheduler. In conclusion, Sharcnet is best used when the algorithm is complex and has high memory usage; AWS is best used when immediate processing is required.
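Given the O(n) to O(n³) complexity range above, a back-of-the-envelope extrapolation (an illustration, not a measurement from the thesis) shows why the cubic workloads dominate runtime as data grows:

```python
def predict_runtime(t_base, n_base, n_new, exponent):
    """Extrapolate runtime of an O(n^k) algorithm from one measured point:
    t_new = t_base * (n_new / n_base) ** k."""
    return t_base * (n_new / n_base) ** exponent
```

Doubling the input leaves an O(n) job twice as slow but makes an O(n³) job eight times slower, which is why algorithm complexity, rather than raw instance count, became the limiting factor on AWS.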

Author Keywords: AWS, cloud, HPC, parallelism, Sharcnet

2019