Moudarres, Anissa Nour

Fraud Detection in Financial Businesses Using Data Mining Approaches

Type:
Names:
Creator (cre): Moudarres, Anissa Nour
Thesis advisor (ths): McConnell, Sabine
Thesis advisor (ths): Hurley, Richard
Degree granting institution (dgg): Trent University
Abstract:

The purpose of this research is to apply four methods to two datasets, a Synthetic dataset and a Real-World dataset, and to compare the results with the intention of arriving at methods to prevent fraud. The methods used are Logistic Regression, Isolation Forest, Ensemble Method, and Generative Adversarial Networks (GAN).

Results show that all four models achieve accuracies between 91% and 99%, except that Isolation Forest gave 69% accuracy on the Synthetic dataset.

The four models detect fraud well when built on a training set and tested with a test set. Logistic Regression achieves good results with less computational effort. Isolation Forest achieves lower accuracy when the data is sparse and not preprocessed correctly. Ensemble Models achieve the highest accuracy for both datasets. GAN achieves good results but overfits if a large number of epochs is used. Future work could incorporate other classifiers.

Author Keywords: Ensemble Method, GAN, Isolation Forest, Logistic Regression, Outliers

2020