Abstract
Fraud, waste, and abuse in many financial systems are estimated to result in significant
losses annually. Predictive analytics offers government and private financial institutions
the opportunity to identify, prevent, or recover such losses. This work proposed
a novel Big Data-driven approach to fraud detection based on Deep Learning
methods. A supervised Deep Learning solution leveraging Big Data was shown to
be an effective fraud predictor. Additionally, an unsupervised method based on anomaly
detection using deep autoencoders was proposed for cases where there is little or no
labelled data. The two methods presented offered adaptive and predictive fraud detection
through improved analytics. Future work will look into how the two methods can
be integrated into an effective tool for enhanced fraud detection.
Chapter 1
Introduction
1.1 Background to the Study
Fraud refers to the intentional, illegal exploitation of a system that results in injury
to an unsuspecting entity. Financial fraud involves the exploitation of financial
systems and results in the loss of financial resources, most prominently money,
although other damages such as loss of integrity are possible.
Fraud, waste, and abuse in many financial systems are estimated to result in
significant losses annually, running into billions of US dollars. Furthermore,
the proliferation of the internet has exposed financial systems to diverse
fraudsters who use different mechanisms to exploit them. This has produced
an explosion in attack patterns, which has rendered the once-effective case-based fraud
detection solutions ineffective, since their computational complexity increases
with each newly detected fraud. More seriously, there is a higher tendency for first-time
frauds to go undetected. Case-based detection methods are also slow: a successful
exploit can multiply while the corresponding fix is still being integrated into the system.
This problem can only be addressed with an online (on-the-fly), adaptive (able to
detect new frauds) solution. Also of concern to financial fraud detection
solutions is prediction strength, which indicates a fraud detector's ability
to correctly identify both known and novel frauds. This is usually a direct function
of how many fraud samples are available to model a solution. The emergence of Big
Data and its analytics has provided financial fraud detection experts with
vast amounts of data that can enhance detection models. Solutions
modelled on Big Data are therefore more comprehensive. A complete fraud
detection model must thus have the following properties:
1. Adaptive: This refers to the following abilities:
· Ability to detect fraudulent activities within a short period of time. This is also
referred to as its alertness.
· Ability to detect first-time fraudulent activities with high accuracy.
2. Predictive: This refers to the following abilities:
· Ability to detect all new instances of the kinds of fraudulent activity that have
happened in the past. This is very difficult to achieve without data that describes
previous transactions in considerable detail.
Over the years, many solutions to financial fraud have been proposed.
Most of the models proposed to address fraud detection model property
1 have been statistical models that try to detect outliers in the data set (see
[27], [21] and [8]). These rest on the assumption that fraudulent
transactions behave abnormally compared with legitimate transactions. An abnormal
pattern of behavior (i.e. an outlier) is flagged as "suspicious." More recently,
Machine Learning methods have been used to develop more effective models (see
[4], [11], [7] and [9]). The emergence of Big Data analytic tools has provided
a means to address fraud detection model property 2. Such technology allows
data from various sources to be integrated and used to model and predict financial
fraud. For example, a fraudster's location data, social-media activity and
credit card information can be reconciled to trace a fraudulent transaction to
that person. However, Big Data analytics brings with it challenges that limit
the application of the techniques used to address fraud detection model property 1.
Such challenges are enumerated in [24], some of which are:
1. High-dimensionality and data reduction,
2. Data quality and validation,
3. Data cleansing,
4. Feature engineering,
5. Data representations and distributed data sources,
6. Data sampling
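Before Big Data scale enters the picture, the outlier assumption behind the statistical models mentioned earlier (a transaction that behaves abnormally is flagged as suspicious) can be illustrated with a minimal sketch. This is not a method taken from the cited works: the function name `flag_outliers`, the sample amounts, and the use of the median-absolute-deviation "modified z-score" with the conventional 0.6745 scaling and 3.5 cutoff are all assumptions chosen for illustration.

```python
from statistics import median


def flag_outliers(amounts, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds `threshold`.

    The modified z-score uses the median and the median absolute deviation
    (MAD), so a single extreme value does not distort the baseline the way
    it would distort a mean and standard deviation.
    """
    med = median(amounts)
    mad = median([abs(a - med) for a in amounts])
    if mad == 0:
        # No spread in the data: nothing can be called abnormal on this basis.
        return []
    return [
        i for i, a in enumerate(amounts)
        if 0.6745 * abs(a - med) / mad > threshold
    ]


# Hypothetical transaction amounts for one account (illustrative data only).
history = [42.0, 38.5, 45.0, 40.0, 39.9, 41.2, 43.7, 980.0]
suspicious = flag_outliers(history)
print(suspicious)  # indices of the transactions flagged as "suspicious"
```

A rule of this kind needs no labelled fraud cases, which is why it speaks to first-time frauds, but it looks at one feature at a time and so runs directly into the high-dimensionality challenge listed above.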
Much research has gone into addressing some of the above issues so that existing models
that work for "small" data can scale up to work with Big Data. For example,
[16] and [26] proposed improvements that address high-dimensionality; others are
[13] and [29]. However, these attempts might not scale well, as the underlying models were
not originally designed to handle Big Data complexity. Deep learning is one
technique that has the capability to handle such complex abstractions. It is
good at analyzing and organizing large amounts of unlabelled data. Most raw
data in Big Data analytics is unlabeled and uncategorised, which makes it
ideally suited to Deep learning algorithms.
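The idea of learning from unlabelled data and then flagging what the model cannot reproduce can be sketched in a few lines. The example below is a deliberately tiny stand-in, not the deep autoencoder this work proposes: it trains a one-unit, tied-weight linear autoencoder in pure Python on synthetic two-feature "legitimate" transactions and scores new points by reconstruction error. The data, the function names, the learning rate, and the rule of using the largest training error as the threshold are all assumptions made for illustration.

```python
import random


def train_linear_autoencoder(data, lr=0.05, epochs=500):
    """Fit a one-unit, tied-weight linear autoencoder by batch gradient descent.

    Encoder: z = w . x        Decoder: x_hat = z * w
    The loss is the mean squared reconstruction error over `data`.
    """
    dim = len(data[0])
    n = len(data)
    w = [0.1] * dim
    for _ in range(epochs):
        grad = [0.0] * dim
        for x in data:
            z = sum(wi * xi for wi, xi in zip(w, x))      # encode
            r = [xi - z * wi for xi, wi in zip(x, w)]     # residual x - x_hat
            rw = sum(ri * wi for ri, wi in zip(r, w))
            for k in range(dim):
                # d/dw_k of ||x - (w . x) w||^2, averaged over the batch
                grad[k] += -2.0 * (z * r[k] + rw * x[k]) / n
        w = [wi - lr * gk for wi, gk in zip(w, grad)]
    return w


def reconstruction_error(w, x):
    """Squared distance between x and its reconstruction."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return sum((xi - z * wi) ** 2 for xi, wi in zip(x, w))


# Synthetic unlabelled data: the second feature is roughly twice the first.
rng = random.Random(0)
legit = []
for _ in range(200):
    t = rng.uniform(-1.0, 1.0)
    legit.append((t + rng.gauss(0, 0.05), 2 * t + rng.gauss(0, 0.05)))

w = train_linear_autoencoder(legit)

# Assumed decision rule: anything reconstructed worse than every training
# point is treated as anomalous.
threshold = max(reconstruction_error(w, x) for x in legit)

typical = (0.5, 1.0)      # follows the learned pattern
anomalous = (1.0, -2.0)   # breaks the relationship between the features
print(reconstruction_error(w, typical) > threshold)
print(reconstruction_error(w, anomalous) > threshold)
```

No fraud labels are used at any point: the model only learns what ordinary transactions look like, and a point is suspicious to the extent that it is poorly reconstructed. A deep autoencoder replaces the single linear unit with stacked non-linear layers, which lets it capture much richer structure in high-dimensional data.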
Department: Computer Science (M.Sc Thesis)
Chapters: 1 - 5, Preliminary Pages, Abstract, References, Appendix.
No. of Pages: 46