Fraud Detection with Artificial Intelligence
January 02, 2005
From 1999 to 2004, I collected information on the topic of ‘Fraud detection’ on my website.
When I started this in 1999 as a research assistant at the University of Karlsruhe, there was not much information available on the topic of ‘Data Science’. Back then, it was more commonly referred to as ‘Knowledge Discovery in Databases’ (KDD) in academic circles or ‘Data Mining’ in the business world.
In November 2001, my web pages on fraud detection were the site of the month (Le site du mois) of the website web-datamining.net, which no longer exists. These pages are gone and are only available at the Internet Archive.
My web pages on fraud are referenced in the books
- Investigative Data Mining for Security and Criminal Detection by Jesus Mena
- Computer and Intrusion Forensics by George Mohay et al.
Introduction
Due to technological advancements, more and more areas of daily life are being permeated by computers. Examples include digital communication, internet commerce (E-Commerce), and online banking.
Because these systems are complex, it is very difficult and expensive to find all security vulnerabilities before the systems go into operation. Criminals can thus discover security gaps and exploit them for their (often financial) advantage. For example, digital payment systems have been used for money laundering.
When technical systems are misused, methods are needed to detect this misuse and prevent further fraud.
In fraud detection, user data is analyzed to reconstruct user behavior and identify misuse. Fraud management goes a step further and also includes preventative measures, such as stronger access controls.
Many thanks to the following individuals for their support: Heinz Cech, Tom Fawcett, Carlos Santa Cruz Fernandez, Al Guiva, Reinhold Huber, Andreas Lenk, and Alexey Vasilyev.
Types of fraud
Misuse appears in various areas, but the task remains the same: Based on the available data about user behavior, fraudulent cases must be distinguished from normal cases.
General
Credit card fraud
- Steven W. Klebe. Evaluating Online Credit Fraud With Artificial Intelligence.
- John G. Faughnan. International Credit Card/Check Card Fraud with Small Charges.
- The National Check Fraud Center.
- Business Week Online. New Software Aims to Cut Terrorists' Cash.
Internet fraud
- Internet ScamBusters.
- Steven W. Klebe. Evaluating Online Credit Fraud With Artificial Intelligence.
Insurance fraud
- Stephen Barrett. Insurance Fraud and Abuse: A Very Serious Problem.
- Blue Cross and Blue Shield of Montana. Help Us Stop Fraud.
- Coalition Against Insurance Fraud.
- National Health Care Anti-Fraud Association (NHCAA).
Money laundering
- Otis Port. New Software Aims to Cut Terrorists' Cash. Business Week Online.
Computer Crime
Telecommunications fraud
- Communications Fraud Control Association.
- Introduction by the Statistics group at Lucent Technologies: Why Do We Work on Fraud?.
- Deborah Young. Worst-Case Security Planning. In: Wireless Review, Nov 1, 2001.
- Deborah Young. Detection Connection. In: Wireless Review, Aug 15, 2001.
- CaveBear Blog. Yet Another Kind of Internet Thievery (YAKOIT).
Identity theft
The theory of fraud detection
Numerous algorithms have already been developed in the fields of Knowledge Discovery in Databases (KDD), Data Mining, Machine Learning, and Statistics.
Many of these methods are very general and have been successfully applied in various areas. However, in the field of fraud detection, there are some peculiarities that make the application of these existing methods either impossible or unprofitable.
A particular difficulty is that fraud cases constitute only a very small proportion of the total data volume. In statistics, this is referred to as a skewed distribution.
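The effect of such a skewed distribution can be made concrete with a small calculation. The following is a minimal sketch with hypothetical numbers (a fraud rate of 1 in 1,000, a detection rate of 99%, and a false alarm rate of 1%); the function names are chosen for illustration only. It applies Bayes' rule to compute which fraction of alarms are real fraud, and compares this with the accuracy of a useless detector that never raises an alarm.

```python
def alarm_precision(fraud_rate: float, detection_rate: float, false_alarm_rate: float) -> float:
    """Fraction of raised alarms that are real fraud (Bayes' rule)."""
    true_alarms = fraud_rate * detection_rate
    false_alarms = (1.0 - fraud_rate) * false_alarm_rate
    return true_alarms / (true_alarms + false_alarms)


def never_alarm_accuracy(fraud_rate: float) -> float:
    """Accuracy of a detector that classifies every case as normal."""
    return 1.0 - fraud_rate


# Hypothetical numbers, assumed for illustration only.
precision = alarm_precision(fraud_rate=0.001, detection_rate=0.99, false_alarm_rate=0.01)
baseline = never_alarm_accuracy(fraud_rate=0.001)

print(f"share of alarms that are real fraud: {precision:.3f}")
print(f"accuracy of a detector that never alarms: {baseline:.3f}")
```

Under these assumptions, only about 9% of the alarms point to real fraud, while the detector that never raises an alarm reaches 99.9% accuracy without finding a single fraud case.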
For each method of fraud, it is usually necessary to develop a specific detection algorithm, whose parameters must be specially adapted to this ‘pattern of fraud’.
At the same time, fraudsters keep changing their methods slightly so that they are no longer detected. Therefore, the detection algorithm must be continuously adjusted.
To limit the damage, fraud detection systems need a fast response time. In the case of credit card fraud, for example, detection should ideally occur in real time.
In binary classification (Normal Usage vs. Fraud), there are two different types of errors: false alarms (also known as false positives) and undetected fraud (also known as false negatives). See the following table.
| | Fraud | No fraud |
|---|---|---|
| Alarm | correct | false alarm |
| No alarm | undetected fraud | correct |
When a fraud detection system triggers an alarm, the alarm often needs to be reviewed by an employee. The costs for the two types of misdiagnoses are therefore different. With a false alarm, an employee works in vain on a case and wastes valuable working time, and with undetected fraud, the fraud continues. Therefore, cost-sensitive methods are needed.
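To illustrate why the two error types should not simply be counted together, here is a minimal sketch of a cost calculation. The cost model is an assumption made for this example: every alarm costs a fixed review effort, and every undetected fraud case causes a fixed loss. The class `Outcome`, the two detectors, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    detected_fraud: int    # alarm raised, case was fraud (correct)
    false_alarms: int      # alarm raised, case was not fraud
    undetected_fraud: int  # no alarm, case was fraud
    correct_normal: int    # no alarm, case was not fraud


def error_rate(o: Outcome) -> float:
    """Share of misclassified cases, treating both error types alike."""
    total = o.detected_fraud + o.false_alarms + o.undetected_fraud + o.correct_normal
    return (o.false_alarms + o.undetected_fraud) / total


def total_cost(o: Outcome, review_cost: float, fraud_loss: float) -> float:
    """Assumed cost model: each alarm is reviewed, each missed fraud causes a loss."""
    alarms = o.detected_fraud + o.false_alarms
    return alarms * review_cost + o.undetected_fraud * fraud_loss


# Two hypothetical detectors evaluated on the same 100,000 cases (100 of them fraud).
sensitive = Outcome(detected_fraud=80, false_alarms=2000, undetected_fraud=20, correct_normal=97900)
conservative = Outcome(detected_fraud=50, false_alarms=200, undetected_fraud=50, correct_normal=99700)

for name, outcome in [("sensitive", sensitive), ("conservative", conservative)]:
    cost = total_cost(outcome, review_cost=5.0, fraud_loss=500.0)
    print(f"{name}: error rate {error_rate(outcome):.4f}, total cost {cost:.0f}")
```

With these assumed costs, the conservative detector has the lower error rate but the higher total cost, because each missed fraud case is far more expensive than a wasted review.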
The constantly changing and skewed distributions and the need for cost-sensitive methods complicate the evaluation of the success of a detection method. Even with “normal” classification methods, several difficulties must be considered when evaluating the success of detection [Sal97]. Common metrics, such as error rate, accuracy, and ROC curves, are not suitable for fraud detection [PFK98, PF01]. A technique specifically developed for fraud detection is the ROC Convex Hull [PF01].
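The basic idea of the ROC Convex Hull can be sketched in a few lines. This is a simplified illustration, not the implementation from [PF01]: each classifier is represented by a hypothetical point (false alarm rate, detection rate), the upper convex hull of these points is computed, and the cheapest hull point is chosen for an assumed fraud rate and assumed error costs. All function names and numbers are made up for this example.

```python
def _cross(o, a, b):
    """Cross product of vectors o->a and o->b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points):
    """Upper convex hull of ROC points, including the trivial classifiers
    (0, 0) = never alarm and (1, 1) = always alarm."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull = []
    for p in pts:
        # Drop the last hull point while it lies on or below the line to p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def expected_cost(point, fraud_rate, cost_missed_fraud, cost_false_alarm):
    """Expected cost per case for a classifier at the given ROC point."""
    false_alarm_rate, detection_rate = point
    missed = fraud_rate * (1.0 - detection_rate) * cost_missed_fraud
    false = (1.0 - fraud_rate) * false_alarm_rate * cost_false_alarm
    return missed + false


def best_operating_point(hull, fraud_rate, cost_missed_fraud, cost_false_alarm):
    """Hull point with the lowest expected cost under the given conditions."""
    return min(hull, key=lambda p: expected_cost(p, fraud_rate, cost_missed_fraud, cost_false_alarm))


# Hypothetical classifiers as (false alarm rate, detection rate).
classifiers = [(0.1, 0.6), (0.3, 0.7), (0.4, 0.9), (0.5, 0.6)]
hull = roc_convex_hull(classifiers)
print("hull:", hull)
print("best at 1% fraud:", best_operating_point(hull, 0.01, 50.0, 1.0))
print("best at 5% fraud:", best_operating_point(hull, 0.05, 50.0, 1.0))
```

Classifiers that lie below the hull, here (0.3, 0.7) and (0.5, 0.6), are not optimal under any combination of class distribution and costs in this model. Which of the remaining hull points is best depends on the conditions: in the example, the preferred classifier changes when the assumed fraud rate rises from 1% to 5%.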
In traditional databases, data is typically analyzed in the following three steps: “Load the data, create the indexes, and then query the data”. Loading and index creation in particular can be very time-consuming with large volumes of data, making real-time processing impossible. To handle large volumes of data better, a new data model has been designed: continuous data streams. This area is still a subject of research, but there are already prototype data stream management systems, stream processing engines, and an extension of SQL known as Continuous Query Language (CQL).
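The difference to the load, index, and query approach can be illustrated with a small sketch of a continuous query over a sliding time window. It is written in plain Python rather than CQL, and the event schema, the function name, and the thresholds are assumptions made for this example. Each event is processed once as it arrives, and an alarm is emitted as soon as a user exceeds a given number of events within the window. The sketch assumes that events arrive in timestamp order.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Iterable, Iterator, Tuple

Event = Tuple[float, str]  # (timestamp in seconds, user id); hypothetical schema


def alarms_over_window(
    events: Iterable[Event], window_seconds: float, max_events: int
) -> Iterator[Tuple[float, str, int]]:
    """Yield (timestamp, user, count) whenever a user has more than
    max_events events within the last window_seconds."""
    recent: Dict[str, Deque[float]] = defaultdict(deque)
    for timestamp, user in events:
        window = recent[user]
        window.append(timestamp)
        # Evict events that have left the sliding window.
        while window[0] <= timestamp - window_seconds:
            window.popleft()
        if len(window) > max_events:
            yield timestamp, user, len(window)


# Hypothetical stream: user "a" is active four times within one minute.
stream = [(0, "a"), (10, "a"), (20, "a"), (25, "b"), (30, "a"), (200, "a")]
alarms = list(alarms_over_window(stream, window_seconds=60, max_events=3))
print(alarms)
```

Because the function is a generator, it works on an unbounded stream and keeps only the events of the current window in memory, instead of loading and indexing the complete data set first.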
Further reading
- R. J. Bolton's and D. J. Hand's article Statistical Fraud Detection: A Review
- Statistics and Data Mining Research at Bell Labs
References
Articles
- Tom Fawcett created a bibliography.
- Fraud detection in mobile communication networks in the ASPeCT project.
- Some articles from the Statistics Group at Lucent.
- A bibliography on fraud detection at the University of Karlsruhe.
- Computer Fraud & Security
Workshops and conferences
- 1997 AAAI Workshop "AI Approaches to Fraud Detection and Risk Management"
- 1998 AAAI Workshop "The Methodology of Applying Machine Learning"
- 1998 AI Fall Symposium on Artificial Intelligence and Link Analysis
- International Conference on Fighting Mobile Fraud, London, 1997
- Research Priorities in Wireless and Mobile Communications and Networking. Report of a Workshop held in March 1997, sponsored by the National Science Foundation, Division of Networking and Communications Research and Infrastructure.
Bibliography
- [AFR97] Emin Aleskerov, Bernd Freisleben, Bharat Rao. CARDWATCH: A Neural Network Based Database Mining System for Credit Card Fraud Detection. In: Proceedings of Computational Intelligence for Financial Engineering (CIFEr), pp. 220--226, 1997.
- [AME98] Dean W. Abbott, I. Philip Matkovsky, John F. Elder. An Evaluation of High-End Data Mining Tools for Fraud Detection. In: Proceedings of the 1998 IEEE International Conference on Systems, Man, and Cybernetics, vol. 3, pp. 2836-2841, 1998.
- [ATW97] Suhaya Abu-Hakima, Mansour Toloo, Tony White. A Multi-Agent Systems Approach for Fraud Detection in Personal Communication Systems. In: [Faw97], 1997.
- [Axe99] Stefan Axelsson. The Base-Rate Fallacy and its Implications for the Difficulty of Intrusion Detection. In: Proceedings of the 6th ACM Conference on Computer and Communications Security, pp. 1-7, 1999.
- [BH] Richard J. Bolton, David J. Hand. Statistical Fraud Detection: A Review. Statistical Science, 17(3), 235-255.
- [BLH99a] R. Brause, T. Langsdorf, M. Hepp. Credit Card Fraud Detection by Adaptive Neural Data Mining. Internal Report 7/99, FB Informatik, University of Frankfurt a.M., 1999.
- [BLH99b] R. Brause, T. Langsdorf, M. Hepp. Neural Data Mining for Credit Card Fraud Detection. In: Proceedings of the 11th IEEE International Conference on Tools with Artificial Intelligence. pp. 103--106. 1999.
- [BS97a] Peter Burge, John Shawe-Taylor. Detecting Cellular Fraud Using Adaptive Prototypes. In: [Faw97].
- [BS97b] Peter Burge, John Shawe-Taylor. Fraud-Management Tools: First Prototype. ASPeCT -- Project, January 1997. See [ASPeCT].
- [BSCMPS97] P. Burge, J. Shawe-Taylor, C. Cooke, Y. Moreau, B. Preneel, C. Stoermann. Fraud Detection and Management in Mobile Telecommunications Networks.
- [CCLPS00] Michael Cahill, Fei Chen, Diane Lambert, José Pinheiro, Don X. Sun. Detecting Fraud in the Real World. In: Handbook of Massive Datasets. Kluwer. 2002.
- [CFPS99] Philip K. Chan, Wei Fan, Andreas L. Prodromidis, Salvatore J. Stolfo. Distributed Data Mining in Credit Card Fraud Detection. In: IEEE Intelligent Systems, vol. 14, no. 6, pp. 67--74, 1999.
- [CLPS99] Fei Chen, Diane Lambert, José Pinheiro, Don Sun. Reducing Transaction Databases, Without Lagging Behind the Data or Losing Information. Unpublished, 1999.
- [DB98] Steven K. Donoho, Scott W. Bennett. Fraud Detection and Discovery.
- [DC98] J. R. Dorronsoro, C. Santa Cruz. Discrimination of overlapping data and credit card fraud detection. Technical report, Department of Computer Engineering, Universidad de Madrid, 1998.
- [DGSC97] Jose R. Dorronsoro, Francisco Ginel, Carmen Sanchez, Carlos Santa Cruz. Neural Fraud Detection in Credit Card Operations. In: IEEE Transactions on Neural Networks, vol. 8, no. 4, July 1997.
- [EN96] Kazuo J. Ezawa, Steven W. Norton. Constructing Bayesian Networks to Predict Uncollectible Telecommunications Accounts. IEEE Expert, vol. 11, no. 5, pp. 45--51, 1996.
- [Faw97] Tom Fawcett. AI Approaches to Fraud Detection & Risk Management --- Papers from the 1997 AAAI Workshop, Technical Report WS-97-07, July 1997, AAAI-Press.
- [FP97a] Tom Fawcett, Foster Provost. Adaptive Fraud Detection. Data Mining and Knowledge Discovery, vol. 1, no. 3, pp. 291-316, 1997.
- [FP97b] Tom Fawcett, Foster Provost. Combining Data Mining and Machine Learning for Effective Fraud Detection. In: [Faw97].
- [Gos97] Phil Gosset. Fraud Detection Concepts: Final Report. ASPeCT -- Project, November 1997. See [ASPeCT].
- [GH99] Phil Gossett, Mark Hyland. Classification, Detection and Prosecution of Fraud on Mobile Networks. Proceedings of ACTS Mobile Summit, Sorrento, Italy, June 1999.
- [GR94] Sushmito Ghosh, Douglas L. Reilly. Credit Card Fraud Detection with a Neural-Network. In: Proceedings of the 27th Hawaii International Conference on Information Systems, pp. 621--630, 1994.
- [HDA98] Mark Hyland, Jos Dumortier, Diana Alonso Blas. Legal Aspects of Fraud Detection. ASPeCT-Project. See [ASPeCT].
- [HS08] Constantinos S. Hilas, Paris As. Mastorocostas. An Application of Supervised and Unsupervised Learning Approaches to Telecommunications Fraud Detection. Knowledge-Based Systems, 21, pp 721 – 726, 2008. doi:10.1016/j.knosys.2008.03.026.
- [HS09] Constantinos S. Hilas, Paris As. Mastorocostas. Designing an expert system for fraud detection in a private telecommunications network. Expert Systems with Applications. 2009. doi: 10.1016/j.eswa.2009.03.031.
- [HS05] Constantinos S. Hilas, John N. Sahalos. User profiling for fraud detection in telecommunication networks. In: 5th International Conference on Technology and Automation, Thessaloniki, Greece, October 2005. pp 382-387.
- [HS06] Constantinos S. Hilas, John N. Sahalos. Testing the fraud detection ability of different user profiles by means of FFNN classifiers. In: Collias St. et al. (Eds.): Lecture Notes in Computer Science, vol. 4132, Part II, 2006, pp. 872-883.
- [HS07] Constantinos S. Hilas, John N. Sahalos. An application of decision trees for rule extraction towards telecommunications fraud detection. In: B. Apolloni et al. (Eds.): KES 2007/ WIRN 2007, Lecture Notes in Artificial Intelligence, vol. 4693, Part II, Springer. 2007, pp. 1112–1121.
- [Jen97] David Jensen. Prospective Assessment of AI Technologies for Fraud Detection: A Case Study.
- [KKN99] Daniel A. Keim, Eleftherios E. Koutsofios, Stephen C. North. Visual Exploration of Large Telecommunication Data Sets. In: User Interfaces to Data Intensive Systems, pp. 12--20, 1999.
- [MP96] Yves Moreau, Bart Preneel. Definition of Fraud Detection Concepts. ASPeCT -- Project, August 1996. See [ASPeCT].
- [OTA95] U. S. Congress, Office of Technology Assessment. Information Technologies for Control of Money Laundering. U. S. Government Printing Office, OTA-ITC-630, Washington DC, September 1995.
- [PF97] Foster Provost, Tom Fawcett. Analysis and Visualization of Classifier Performance: Comparison under Imprecise Class and Cost Distributions. In: Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD-97), 1997.
- [PF01] Foster Provost, Tom Fawcett. Robust Classification for Imprecise Environments. In: Machine Learning, vol. 42, no. 3, pp. 203-231, 2001.
- [PFK98] Foster Provost, Tom Fawcett, Ron Kohavi. The Case Against Accuracy Estimation for Comparing Induction Algorithms. Proceedings of the Fifteenth International Conference on Machine Learning (ICML-98), July 1998.
- [Sal97] Steven Salzberg. On Comparing Classifiers: Pitfalls to Avoid and a Recommended Approach. In: Data Mining and Knowledge Discovery, no. 3, pp. 317--328, 1997.
- [SFLPC97] Salvatore J. Stolfo, David W. Fan, Wenke Lee, Andreas L. Prodromidis, Philip K. Chan. Credit Card Fraud Detection Using Meta-Learning: Issues and Initial Results. In: [Faw97].
- [Stö97] Christof Störmann. Fraud Management Tool: Evaluation Report. ASPeCT -- Project, October 1997. See [ASPeCT].
Fraud management systems and services
This list contains complete fraud management systems, not individual components such as data mining tools. The list is in alphabetical order and is not complete. The names of the software products are listed in square brackets.
- ACI Worldwide
- ACL Services
- Advanced Software Applications (A.S.A)
- Alcatel
- The ai corporation
- Amdocs
- Beck Computer Systems
- Brighterion
- Carreker Corporation (now Fiserv)
- ChoicePoint
- Communications Expert
- CyberSource
- Ectel [FraudView]
- Equinox Information Systems [Protector, Guardian]
- FICO (formerly Fair, Isaac and Company; formerly HNC Software)
- FML
- i2
- infoRate
- Inform GmbH [RiskShield]
- Inforsud [TimRisk]
- Mahindra - British Telecom
- Metavante
- NFC Global, Inc.
- NetMap Analytics
- Neural Technologies
- Oskar Kilo Ltd.
- ReD Retail Decisions
- Secure Science Corporation
- Subex Systems [Ranger]
- Telemate
- Telesciences [Sterling]
- VerifyFraud
- Vips [STARS]
- Visual Analytics [VisuaLinks]
- Xanalys [Watson]
- Xtract
Components of fraud management systems
Fraud management systems are often built from standard software components, such as databases, data mining tools, or visualization tools. The list is in alphabetical order and is not complete.
- KXEN [KXEN Analytic Framework]
- Oracle Corporation [Darwin]
- SAS Institute [SAS Enterprise Miner]
  - SAS Fraud Prevention and Detection for Financial Services
- SPSS [Clementine]
  - ClearCommerce, online transaction software.
  - Lloyds TSB, credit card fraud.
- Computer Associates [CleverPath, Neugent]
People and research groups
Research groups
- ASPECT, Advanced Security for Personal Communications Technologies
People
- Andi Baritchi
- Fei Chen
- Tom Fawcett
- Constantinos S. Hilas
- Jaakko Hollmén
- David Jensen
- Diane Lambert
- Jose Pinheiro
- Foster Provost
- Carlos Santa Cruz
- Salvatore Stolfo
- Don X. Sun
Related areas
- Fraud Detection & Prevention at AAAI
Artificial Intelligence
- Artificial Intelligence Resources at the Institute for Information Technology
- David W. Aha's Machine Learning Resources
- Computational Learning Theory (COLT)
- Evaluation of Intelligent Systems
- ILPnet2 : Inductive Logic Programming Net
- Kernel Based Learning Methods
- Knowledge Discovery central
- KDNuggets : Data Mining, Web Mining, Knowledge Discovery, and CRM guide
- MLnet OiS : Machine Learning network Online Information Service
- Mixture Modelling : Cluster
- Recursive Partitioning
Statistics
Intrusion Detection Systems
- List of intrusion detection systems
- National Info-Sec Technical Baseline "Intrusion Detection and Response"