Using Neural Networks to Stop Telecom Fraud
Orange Silicon Valley is the Bay Area division of Orange, one of the world’s leading telecommunications operators, serving 250 million customers across 29 countries. Each week, hundreds of instances of fraud occur on its network, costing the company hundreds of millions in lost revenue each year. To combat this, Orange uses artificial intelligence to identify and predict cases of fraud.
Orange Silicon Valley is working with Skymind to prevent Subscriber Identity Module box (SIM box) fraud on its mobile network. Using an artificial neural network (ANN) called an autoencoder, Orange Silicon Valley analyzes call detail records (CDRs) to find patterns that identify fraud. The ANN predicts the likelihood that an instance is fraudulent. Whereas a static rule system uses predefined decision trees to flag each case as likely fraud or not, the ANN lets Orange's analysts adaptively prioritize high-probability cases of SIM box fraud.
Worldwide, cellular network operators lose billions in annual revenue due to fraudulent activity on their networks. Voice traffic termination fraud, also known as SIM box fraud or interconnect bypass fraud, is a particular problem for international traffic, where enforcement across national borders can be difficult.
A SIM box is typically used as part of a VoIP gateway that allows calls to be received and transmitted over the internet. Fraudulent SIM boxes hijack international voice calls and transfer them over the internet to a cellular device, which then returns them to the cellular network.
As a result, the calls appear to be local, so the cellular operators of the intermediate and destination networks receive no payment for long-distance call routing and termination.
In addition to causing financial losses, SIM boxes negatively impact phone service by overloading cell towers and degrading the quality of voice calls, leaving customers dissatisfied and injuring the service provider’s reputation.
The sheer number of mobile devices and the increasing volume of cellular traffic make detecting SIM box fraud extremely challenging. Moreover, the patterns and characteristics of fraudulent SIM boxes can be very similar to those of legitimate devices, such as cellular network probes. Detecting fraudulent SIM boxes is like searching for a few needles in a huge haystack, one whose needles keep changing their appearance to more closely resemble the hay.
Given the enormous volume of data, and because the CDRs are already stored in big data frameworks, deep learning was a natural fit to augment the existing fraud detection systems.
Working with Skymind, Orange Silicon Valley has implemented an ANN autoencoder that learns patterns and uncovers unusual activity. By ranking CDRs according to fraud probability, analysts at Orange can prioritize outliers instead of sifting through hundreds of CDRs for instances of fraud. That is, autoencoders can minimize time wasted on false positives. Even better, autoencoders can learn from ongoing streams of data, automatically adapting to new tactics adopted by fraudsters. Rapid, responsive adaptation is a chief advantage of deep-learning techniques.
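The ranking idea described above can be illustrated with a minimal sketch. It is not Orange's or Skymind's implementation: it assumes each record has already been reduced to three standardized numeric features (hypothetical stand-ins for quantities derived from CDRs), uses synthetic data, and swaps in a tiny linear autoencoder with a single bottleneck unit written in plain Python. The function names are illustrative. Records the autoencoder reconstructs poorly receive a high anomaly score and go to the top of the analysts' queue.

```python
import random

def train_autoencoder(records, epochs=200, lr=0.01):
    """Fit a linear autoencoder with one bottleneck unit by stochastic gradient descent."""
    dim = len(records[0])
    enc = [0.1 * (i + 1) for i in range(dim)]    # encoder weights (dim -> 1)
    dec = [0.1 * (dim - i) for i in range(dim)]  # decoder weights (1 -> dim)
    for _ in range(epochs):
        for x in records:
            code = sum(w * xi for w, xi in zip(enc, x))       # compress to one number
            resid = [d * code - xi for d, xi in zip(dec, x)]  # reconstruction minus input
            grad_code = sum(r * d for r, d in zip(resid, dec))
            # Gradient steps on the squared reconstruction error.
            dec = [d - lr * 2 * r * code for d, r in zip(dec, resid)]
            enc = [w - lr * 2 * grad_code * xi for w, xi in zip(enc, x)]
    return enc, dec

def reconstruction_error(x, enc, dec):
    """Squared error between a record and its reconstruction (the anomaly score)."""
    code = sum(w * xi for w, xi in zip(enc, x))
    return sum((d * code - xi) ** 2 for d, xi in zip(dec, x))

def rank_by_anomaly(records, enc, dec):
    """Return (record index, score) pairs, most anomalous first."""
    scores = [(i, reconstruction_error(x, enc, dec)) for i, x in enumerate(records)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Synthetic, unlabeled data: 200 "ordinary" records whose three features move
# together, plus two records that break that pattern (assumed fraud-like).
rng = random.Random(0)
normal = []
for _ in range(200):
    t = rng.uniform(-1, 1)
    normal.append([t + rng.gauss(0, 0.05) for _ in range(3)])
anomalies = [[1.5, -1.5, 0.0], [-1.2, 0.2, 1.4]]
records = normal + anomalies

enc, dec = train_autoencoder(records)
ranking = rank_by_anomaly(records, enc, dec)
top_two = [index for index, _ in ranking[:2]]
print("Records to review first:", top_two)
```

Because the model is trained on all records without labels, it learns the dominant pattern of the bulk of the traffic; the score is only a priority signal for analysts, not a verdict of fraud.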
Orange Silicon Valley has so far amassed hundreds of millions of unlabeled CDRs. In this mass of data, the fine distinctions that often differentiate fraud from legitimate activity are difficult to discover via feature engineering alone; static rules often cast too wide a net, resulting in false positives, or catch fraud only when activity deviates grossly from the norm.
In the future, Orange Silicon Valley’s analysts will curate a labeled data set in which some CDRs are marked as either fraudulent or legitimate. This labeled data will then be used to build a supervised classifier, continuing to train the ANN so that it makes ever more granular predictions of fraud.
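As a rough illustration of that supervised step, the sketch below trains a classifier on a handful of labeled records and outputs a fraud probability for new ones. It makes several assumptions that are not from Orange: logistic regression stands in for the neural network, the two features (for example an anomaly score and an outgoing-call ratio) are hypothetical, and the eight labeled records are invented for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(features, labels, epochs=500, lr=0.1):
    """Fit logistic regression by stochastic gradient descent on log loss."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log loss with respect to the logit
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias

def fraud_probability(x, weights, bias):
    """Estimated probability that a record is fraudulent."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

# Hypothetical analyst-labeled records: label 1 = fraudulent, 0 = legitimate.
features = [
    [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.1, 0.4],  # legitimate
    [0.9, 0.8], [0.8, 0.9], [1.0, 0.7], [0.7, 1.0],  # fraudulent
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

weights, bias = train_logistic(features, labels)
suspicious = fraud_probability([0.95, 0.9], weights, bias)
ordinary = fraud_probability([0.15, 0.2], weights, bias)
print("suspicious record:", round(suspicious, 3))
print("ordinary record:", round(ordinary, 3))
```

In practice the labeled set would be far larger and the model would be a neural network rather than this linear stand-in, but the workflow is the same: analyst labels turn the ranking problem into a classification problem with calibrated probabilities.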