Information science

SMOTE and Performance Measures for Machine Learning Applied to Real-Time Bidding

Type:
Names:
Creator (cre): McInroy, Ben P., Thesis advisor (ths): Feng, Wenying, Degree committee member (dgc): Patrick, Brian, Degree committee member (dgc): Pollanen, Marco, Degree granting institution (dgg): Trent University
Abstract:

In the context of Real-Time Bidding (RTB) the machine learning problems of imbalanced classes and model selection are investigated. Synthetic Minority Oversampling Technique (SMOTE) is commonly used to combat imbalanced classes but a shortcoming is identified. Use of a distance threshold is identified as a solution and testing in a live RTB environment shows significant improvement. For model selection, the statistical measure Critical Success Index (CSI) is modified to add emphasis on recall. This new measure (CSI-R) is empirically compared with other measures such as accuracy, lift, efficiency, true skill score, Heidke's skill score and Gilbert's skill score. In all cases CSI-R is shown to provide better application to the RTB industry.
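For readers unfamiliar with the measures named in this abstract, the sketch below computes several of them from a binary confusion matrix using their standard textbook definitions. It is an illustration only, not the thesis's code: the function name `skill_scores` and the example counts are hypothetical, and CSI-R and "efficiency" are omitted because the abstract does not give their formulas.

```python
def skill_scores(tp, fp, fn, tn):
    """Standard verification measures from a 2x2 confusion matrix.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives and true negatives. Assumes every denominator
    below is non-zero (no guarding against empty classes).
    """
    n = tp + fp + fn + tn
    recall = tp / (tp + fn)        # hit rate
    precision = tp / (tp + fp)
    pofd = fp / (fp + tn)          # probability of false detection

    # Agreement expected by chance, used by the Heidke and Gilbert scores.
    expected_correct = ((tp + fp) * (tp + fn) + (tn + fp) * (tn + fn)) / n
    hits_random = (tp + fp) * (tp + fn) / n

    return {
        "accuracy": (tp + tn) / n,
        "recall": recall,
        # Critical Success Index: ignores true negatives entirely.
        "csi": tp / (tp + fp + fn),
        # Lift: precision relative to the base rate of positives.
        "lift": precision / ((tp + fn) / n),
        # True skill statistic: recall minus false-detection rate.
        "tss": recall - pofd,
        "heidke": (tp + tn - expected_correct) / (n - expected_correct),
        "gilbert": (tp - hits_random) / (tp + fp + fn - hits_random),
    }


if __name__ == "__main__":
    # Hypothetical, heavily imbalanced counts (2% positives).
    for name, value in skill_scores(tp=10, fp=40, fn=10, tn=940).items():
        print(f"{name:>8}: {value:.4f}")
```

On imbalanced counts such as these, accuracy is dominated by the true negatives, while CSI ignores them. That difference is one reason a measure other than accuracy is of interest for model selection on imbalanced data.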

Author Keywords: imbalanced classes, machine learning, online advertising, performance measures, real-time bidding, SMOTE

2016

My Canadian Story: Multiculturalism and Meaning-Making in Local Archives

Type:
Names:
Creator (cre): Morrison, Caileigh, Thesis advisor (ths): Harrison, Julia, Degree committee member (dgc): Bhandar, Davina, Degree committee member (dgc): Eamon, Michael, Degree granting institution (dgg): Trent University
Abstract:

Canada prides itself on being a multicultural nation, but the stories of people who are not "Canadian-Canadians," as defined by Eva Mackey, are underrepresented in archives. This project investigates three local archives and one online archive in Peterborough, Ontario, employing Rita Dhamoon's practice of "accounts of meaning-making" to understand how archives contribute to a community's understanding of itself and who belongs there. The findings indicate that the stories of "other" populations in Peterborough's archival records are frequently mediated by the city's "Canadian-Canadians," who have portrayed these populations as transient and only temporarily settled in the city. This account of meaning-making provides an entry point for changing this understanding and making archives more welcoming and accessible in the city and beyond.

Author Keywords: Archives, Community, Identity, Immigration, Integration, Multiculturalism

2017

Utilizing Class-Specific Thresholds Discovered by Outlier Detection

Type:
Names:
Creator (cre): Branch, Richard Arthur Conan, Thesis advisor (ths): McConnell, Sabine, Thesis advisor (ths): Hurley, Richard, Degree granting institution (dgg): Trent University
Abstract:

We investigated whether the performance of selected supervised machine-learning techniques could be improved by combining univariate outlier-detection techniques and machine-learning methods. We developed a framework to discover class-specific thresholds in class probability estimates using univariate outlier detection and proposed two novel techniques to utilize these class-specific thresholds. These proposed techniques were applied to various data sets and the results were evaluated. Our experimental results suggest that some of our techniques may improve recall in the base learner. Additional results suggest that one technique may produce higher accuracy and precision than AdaBoost.M1, while another may produce higher recall. Finally, our results suggest that we can achieve higher accuracy, precision, or recall when AdaBoost.M1 fails to produce higher metric values than the base learner.
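As a rough illustration of what discovering a class-specific threshold by univariate outlier detection could look like, the sketch below applies Tukey's interquartile-range fence to the probability estimates a classifier assigns to each instance's true class. This is one possible reading and is not taken from the thesis: the choice of Tukey's fence, the function names `lower_fence` and `class_thresholds`, and the input layout are all assumptions, and the two techniques the abstract proposes for using the thresholds are not reproduced.

```python
from statistics import quantiles


def lower_fence(values, k=1.5):
    """Tukey lower fence Q1 - k * IQR (an assumed outlier rule).

    Needs at least two values; quartiles use the 'inclusive' method.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q1 - k * (q3 - q1)


def class_thresholds(probs, labels, k=1.5):
    """Hypothetical per-class threshold discovery.

    probs  : list of dicts mapping class name -> estimated probability
    labels : list of true class names, aligned with probs

    For each class, gathers the probability the model gave to that
    class on instances that truly belong to it, then returns the lower
    fence (clamped at 0.0) as that class's threshold. Estimates below
    the threshold are unusually low for that class.
    """
    by_class = {}
    for p, y in zip(probs, labels):
        by_class.setdefault(y, []).append(p[y])
    return {c: max(0.0, lower_fence(v, k)) for c, v in by_class.items()}


if __name__ == "__main__":
    # Made-up probability estimates for a two-class problem.
    probs = [
        {"a": 0.6, "b": 0.4}, {"a": 0.7, "b": 0.3}, {"a": 0.8, "b": 0.2},
        {"a": 0.9, "b": 0.1}, {"a": 1.0, "b": 0.0},
        {"a": 0.5, "b": 0.5}, {"a": 0.5, "b": 0.5}, {"a": 0.5, "b": 0.5},
    ]
    labels = ["a", "a", "a", "a", "a", "b", "b", "b"]
    print(class_thresholds(probs, labels))
```

Because each class gets its own fence, a class whose estimates are tightly clustered receives a stricter threshold than one whose estimates are widely spread. A single global cut-off such as 0.5 cannot express that difference.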

Author Keywords: AdaBoost, Boosting, Classification, Class-Specific Thresholds, Machine Learning, Outliers

2016