Author: Sneha Kumar
The main work has been done in Fraud_Detection.ipynb. Python 3.11.5 has been used in this project. Additional files:
- saved_objects folder: contains the saved states of all the models.
- autoencoder_tuning folder: contains the trials for the autoencoder.
- requirements.txt: lists the libraries from the Python environment used while running this notebook.
With the rapidly evolving capabilities of technology, financial fraud has emerged as one of the most prevalent issues in recent years. Whether through online banking, e-commerce platforms or credit cards, financial scams and fraud are becoming increasingly sophisticated and pervasive. In Singapore, scam victims lost over $380 million in just the first half of 2024, with scam cases increasing by 16.3% (Sun, 2024). Globally, a 17% rise in digital fraud attacks in financial services has been reported (Fintech News Singapore, 2024). Detecting fraud has therefore become more critical for safeguarding individuals and preventing financial losses. With this motivation, I decided to explore how machine learning can be applied to detecting fraudulent transactions. While there are many forms of financial fraud, this project focuses only on credit card fraud.
In the context of machine learning, the problem of fraud detection can be formulated in two different ways:
- Classification Problem: A supervised learning task (binary classification), where, given a set of input variables, the model classifies the transaction as normal or fraudulent.
- Anomaly Detection: A semi-supervised learning task, where the model learns the patterns of normal transactions and uses them to identify anomalies (i.e. data points that deviate significantly from the normal transaction patterns). The task is semi-supervised because data labels are used to obtain an optimal decision threshold for anomalies.
The goal of this project was to compare these two formulations and determine which of them (a) results in better performance and (b) provides a better representation of the credit card fraud detection task.
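To make the anomaly detection formulation more concrete, here is a minimal sketch in plain Python. It is not the notebook's implementation: a simple z-score distance stands in for a learned model's anomaly score (for example, an autoencoder's reconstruction error), the toy amounts and labels are made up for illustration, and F1 is assumed as the criterion for choosing the threshold. All function names are hypothetical.

```python
from statistics import mean, pstdev


def fit_normal_profile(normal_amounts):
    """Learn the 'normal' pattern from normal transactions only (no labels used)."""
    return mean(normal_amounts), pstdev(normal_amounts)


def anomaly_score(amount, profile):
    """Distance from the normal pattern; a stand-in for reconstruction error."""
    mu, sigma = profile
    return abs(amount - mu) / sigma


def f1_score(y_true, y_pred):
    """F1 for the fraud class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def pick_threshold(scores, labels):
    """The only step that uses labels: choose the cut-off that maximises F1."""
    best_t, best_f1 = None, -1.0
    for t in sorted(set(scores)):
        preds = [1 if s >= t else 0 for s in scores]
        f1 = f1_score(labels, preds)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Toy data (made up): transaction amounts, label 1 = fraud, 0 = normal.
normal_train = [20.0, 25.0, 30.0, 22.0, 28.0, 26.0, 24.0, 25.0]
val_amounts = [23.0, 27.0, 31.0, 400.0, 21.0, 650.0]
val_labels = [0, 0, 0, 1, 0, 1]

profile = fit_normal_profile(normal_train)
val_scores = [anomaly_score(x, profile) for x in val_amounts]
threshold, best_f1 = pick_threshold(val_scores, val_labels)
predictions = [1 if s >= threshold else 0 for s in val_scores]
```

The classification formulation would instead fit a model directly on labelled examples of both classes. Here, labels enter only through `pick_threshold`, which is what makes the anomaly detection setup semi-supervised rather than unsupervised.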