- Title
- Statistical and Mathematical Learning: an application to fraud detection and prevention
- Creator
- Hamlomo, Sisipho
- ThesisAdvisor
- Baxter, Jeremy
- ThesisAdvisor
- Atemkeng Teufack, Marcellin
- Subject
- Credit card fraud
- Subject
- Bootstrap (Statistics)
- Subject
- Support vector machines
- Subject
- Neural networks (Computer science)
- Subject
- Decision trees
- Subject
- Machine learning
- Subject
- Cross-validation
- Subject
- Imbalanced data
- Date
- 2022-04-06
- Type
- Master's thesis
- Type
- text
- Identifier
- http://hdl.handle.net/10962/233795
- Identifier
- vital:50128
- Description
- Credit card fraud is an ever-growing problem. The rate of fraudulent activity has increased rapidly in recent years, resulting in considerable losses to organizations, companies, and government agencies. Many researchers have focused on detecting fraudulent behaviour early using advanced machine learning techniques. However, credit card fraud detection is not a straightforward task, since fraudulent behaviour usually differs from one attempt to the next and the dataset is highly imbalanced, that is, non-fraudulent cases far outnumber fraudulent cases. In the European credit card dataset, the ratio is approximately one fraudulent case to five hundred and seventy-eight non-fraudulent cases. Several methods were implemented to overcome this problem, namely random undersampling, one-sided sampling, SMOTE combined with Tomek links, and parameter tuning. Predictive classifiers, namely logistic regression, decision trees, k-nearest neighbours, support vector machines and multilayer perceptrons, were applied to predict whether a transaction is fraudulent or non-fraudulent. Each model's performance was evaluated on recall, precision, F1-score, the area under the receiver operating characteristic curve, the geometric mean and the Matthews correlation coefficient. The results showed that the logistic regression classifier performed better than the other classifiers except when the dataset was oversampled.
- Description
- Thesis (MSc) -- Faculty of Science, Statistics, 2022
- Format
- computer, online resource, application/pdf, 1 online resource (161 pages), pdf
- Publisher
- Rhodes University, Faculty of Science, Statistics
- Language
- English
- Rights
- Hamlomo, Sisipho
- Rights
- Use of this resource is governed by the terms and conditions of the Creative Commons "Attribution-NonCommercial-ShareAlike" License (http://creativecommons.org/licenses/by-nc-sa/2.0/)
File | Description | Size | Format
---|---|---|---
SOURCE1 | HAMLOMO-MSC-TR22-44.pdf | 1 MB | Adobe Acrobat PDF
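
The abstract lists several evaluation metrics suited to imbalanced data. As a rough illustration only, the sketch below computes them from confusion-matrix counts in plain Python. The function name `imbalance_metrics`, the example counts, and the definition of the geometric mean as the square root of recall times specificity are assumptions made here for illustration; they are not taken from the thesis, and no thesis results are reproduced.

```python
import math


def _safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def imbalance_metrics(tp, fp, tn, fn):
    """Compute common imbalanced-data metrics from confusion-matrix counts.

    tp, fn: fraudulent cases predicted as fraud / as non-fraud.
    tn, fp: non-fraudulent cases predicted as non-fraud / as fraud.
    """
    recall = _safe_div(tp, tp + fn)        # sensitivity on the fraud class
    precision = _safe_div(tp, tp + fp)
    specificity = _safe_div(tn, tn + fp)   # recall on the non-fraud class
    f1 = _safe_div(2 * precision * recall, precision + recall)
    # Assumed definition: geometric mean of the two per-class recalls.
    g_mean = math.sqrt(recall * specificity)
    # Matthews correlation coefficient; set to 0.0 if any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = _safe_div(tp * tn - fp * fn, denom)
    accuracy = _safe_div(tp + tn, tp + fp + tn + fn)
    return {
        "recall": recall,
        "precision": precision,
        "f1": f1,
        "g_mean": g_mean,
        "mcc": mcc,
        "accuracy": accuracy,
    }


if __name__ == "__main__":
    # Hypothetical counts with roughly a 1:578 fraud to non-fraud ratio.
    for name, value in imbalance_metrics(tp=80, fp=40, tn=57760, fn=20).items():
        print(f"{name}: {value:.4f}")
```

With counts this skewed, accuracy stays close to 1 even when a fifth of the fraudulent cases are missed, which is one reason recall, precision and the Matthews correlation coefficient are usually reported for this kind of problem.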