How Machine Learning Models Can Outperform Rule Based Systems, Explained
The task of detecting fraudulent online payments is a perfect use case for applying machine learning algorithms that thrive in environments where data volume is high and the characteristics of fraudulent transactions cannot be easily detected using only a handful of features. Nonetheless, many fraud prevention systems still rely on hard-coded rules engines that consolidate the aggregate knowledge of fraud experts. In this piece I will shed some light on the main differences between the two approaches and which use cases fit one or the other better.
A crucial cog in the machine - the decision engine
Systems that guard merchants from fraud are a lot more than just serialized machine learning models or sets of rules expressed in code using many “if-else” statements. There are many other engineering challenges in areas ranging from infrastructure to backend and frontend programming. Those challenges differ somewhat depending on the chosen decision engine and the specifics of the business sector, but they are not the main topic of this blog post. Here, we will focus on just one crucial piece, a single cog in the machine - the decision engine that determines whether the transaction is fraudulent or not.
Rules based systems
As the name suggests, those systems rely on hard coded rules that are set to flag transactions if they meet certain criteria. Such rules can be developed by:
- following industry best practices - like blocking multiple transactions from a single account in a short period of time or the ones coming through VPNs or from risky areas,
- analyzing caught / prevented fraudulent transactions and developing new rules to cover all of their suspicious characteristics.
The rules are often expressed using “if-else” statements present in almost all imperative programming languages and are easily interpretable. They mirror the way in which a human would process a transaction — the engine checks if a transaction meets any of the risky patterns expressed in the rules and if it does, it blocks it or sends it to be manually reviewed by humans. This is one of the reasons why their presence is still very strong - stakeholders trust them because they mimic the way in which they themselves would tackle this task.
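To make the idea concrete, here is a minimal sketch of such a rules engine in Python. Everything in it is hypothetical and chosen only for illustration: the `Transaction` fields, the rule names, the thresholds, the placeholder country codes and the block/review policy.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Transaction:
    # Hypothetical fields; a real system would carry many more.
    amount: float
    country: str
    uses_vpn: bool
    tx_count_last_hour: int  # transactions from the same account in the last hour


# Placeholder country codes and thresholds: illustrative assumptions only.
RISKY_COUNTRIES = {"XX", "YY"}

# Each rule is a named predicate that returns True when the transaction looks risky.
RULES: Dict[str, Callable[[Transaction], bool]] = {
    "velocity": lambda tx: tx.tx_count_last_hour > 5,
    "vpn": lambda tx: tx.uses_vpn,
    "risky_country": lambda tx: tx.country in RISKY_COUNTRIES,
}


def triggered_rules(tx: Transaction) -> List[str]:
    """Return the names of all rules the transaction triggers."""
    return [name for name, rule in RULES.items() if rule(tx)]


def decide(tx: Transaction) -> str:
    """Block on two or more hits, send to manual review on one, accept otherwise."""
    hits = triggered_rules(tx)
    if len(hits) >= 2:
        return "block"
    if len(hits) == 1:
        return "review"
    return "accept"
```

Because `triggered_rules` returns the names of the rules that fired, the reason behind every decision can be read off directly, which is the interpretability described above.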
Pros
- Full explainability out of the box - if a certain rule triggered an alert for a particular transaction, it’s 100% transparent why this happened.
- No cold start problem - they are operational from day 1; there’s no need to gather the training datasets that machine learning algorithms require.
- Low threshold of entry - you don’t need a team of data scientists, machine learning engineers or MLOps engineers - the first rules can be easily implemented by the backend team, since they are already familiar with translating business logic into code.
Cons
- Continuous need to reverse engineer fraudsters’ attacks - new rules have to be developed as new fraud patterns emerge.
- Ever-growing number of rules - the cost of maintenance grows over time (recalibration and adjusting to new fraud patterns).
- Detection limited to fraud cases of low complexity - there is a limit to the number of rules and transaction features. Rule based systems are limited by human comprehension (because rules are developed and maintained manually).
Machine learning models
ML models address the shortcomings of rule based systems. They thrive in environments where the volume and dimensionality of data are high. Algorithms like decision trees, random forests, gradient boosting or neural networks are designed to find complex, nonlinear patterns using hundreds of transaction features (if available). Such an approach demands a shift in focus. For one, deploying ML models requires high quality, labeled historical data to use as a training dataset. As a rule, the more data you have (in terms of the number of transactions and the number of features capturing transactions’ characteristics), the better the model will perform. In such a scenario we are trying to keep a record of past transactions (with a detailed description in the form of a feature vector) rather than trying to directly understand the fraud phenomenon.
Pros
- Automatic fraud pattern recognition - the task of figuring out what makes a transaction fraudulent is handled by the algorithm. Our task is to provide it with as detailed a description as possible (in the form of a feature vector).
- Concept drift, defined as a change in fraud characteristics over time (new fraud methods, new tools used by fraudsters), can often be handled by retraining the models on new data - there’s no need to reverse engineer fraudsters’ methods.
- Less manual work involved - many of the processes can be automated. Companies that have mature machine learning pipelines spend most of their time researching new features and algorithms while keeping an eye on the performance metrics of current models through monitoring apps.
- ML models’ economic efficiency grows along with data volume. The more data you have and the more complex it is, the harder it is to develop rule-based systems. The return on developing automated fraud detection using ML models thus increases as data volume increases.
Cons
- Cold start problem - to run ML models you need a significant amount of historical data.
- Lack of explainability out of the box - not all algorithms’ predictions can be easily explained; some of them are “black boxes” for which there is no easy way to explain the relationship between inputs and outputs.
ML models deep dive
Most modern fraud prevention systems function as hybrid solutions that gather outputs from both rule based engines and machine learning models, and then propose a synthetic recommendation based on the client specific business logic. Since rule based systems mimic the reasoning process of humans let’s dive deeper into how machine learning algorithms find fraudulent traits in online traffic.
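A hybrid setup of this kind can be sketched as a single function that takes the rule hits and the model's fraud score and returns one recommendation. The rule name `stolen_card_list`, the thresholds and the policy itself are hypothetical stand-ins for client specific business logic, not a description of any particular product.

```python
from typing import List


def recommend(rule_hits: List[str], model_score: float,
              block_threshold: float = 0.9, review_threshold: float = 0.5) -> str:
    """Combine rule hits and a model's fraud score (0.0 to 1.0) into one recommendation."""
    # Hypothetical policy: some "hard" rules always block, regardless of the model.
    hard_rules = {"stolen_card_list"}
    if any(hit in hard_rules for hit in rule_hits):
        return "block"
    # A very high model score blocks on its own.
    if model_score >= block_threshold:
        return "block"
    # Any softer rule hit, or a middling score, goes to a human reviewer.
    if rule_hits or model_score >= review_threshold:
        return "review"
    return "accept"
```

In this sketch the thresholds are keyword arguments, so each client could tune how aggressive the engine is without touching the rules or the model.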
There has been a lot of hype around machine learning for the past few years but certain tasks, like fraud detection, remain difficult even for many novel methods and techniques. Extreme class imbalance, concept drift (defined as a change in the characteristics of the detected phenomenon over time), and business stakeholders’ expectations of full explainability of models’ predictions are just some examples of common difficulties.
Fraudulent transactions tend to make up a tiny fraction of traffic. This poses 2 challenges:
- Datasets need to be bigger than usual due to the fact that fraudulent patterns are to be observed only in a small fraction of the data.
- Since most of the traffic is legitimate, models need to be carefully calibrated so as not to “suffocate” the business by frequent false positive errors (the situation when a legitimate transaction is blocked on the suspicion of fraud).
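A short worked example shows why calibration matters so much when fraud is rare. The figures below (100,000 transactions, 0.5% fraud, 80% recall, 2% false positive rate) are assumptions made up for illustration, not measurements.

```python
def confusion_counts(n_transactions: int, fraud_rate: float,
                     recall: float, false_positive_rate: float):
    """Expected counts for a classifier with the given recall and false positive rate."""
    n_fraud = n_transactions * fraud_rate
    n_legit = n_transactions - n_fraud
    true_pos = n_fraud * recall                # frauds correctly blocked
    false_pos = n_legit * false_positive_rate  # legitimate transactions blocked
    precision = true_pos / (true_pos + false_pos)
    return true_pos, false_pos, precision


# Assumed figures: 100,000 transactions, 0.5% fraud, 80% recall, 2% false positive rate.
tp, fp, precision = confusion_counts(100_000, 0.005, 0.80, 0.02)
print(f"frauds caught: {tp:.0f}, legitimate blocked: {fp:.0f}, precision: {precision:.1%}")
```

Under these assumptions a seemingly small 2% false positive rate blocks about 1,990 legitimate transactions in order to catch 400 frauds, so only about 17% of blocked transactions are actually fraudulent.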
These data characteristics disqualify a range of ML algorithms. Gradient boosting methods tend to thrive in such environments due to the feedback loop mechanism that is embedded into the algorithm. During the iterative process of training, the algorithm “focuses” on the parts of the data where it was previously wrong - this mechanism helps it cope with class imbalance.
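The feedback loop can be illustrated with a toy gradient boosting loop in plain Python. This is a sketch under simplifying assumptions (squared-error loss, a single feature, one-split decision stumps, made-up data); production systems would use a library implementation rather than this code.

```python
from typing import List, Tuple

Stump = Tuple[float, float, float]  # (threshold, value if x <= threshold, value otherwise)


def fit_stump(xs: List[float], residuals: List[float]) -> Stump:
    """Find the single split on x that best fits the residuals (least squares)."""
    best, best_err = (xs[0], 0.0, 0.0), float("inf")
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv = sum(left) / len(left)  # left is never empty because t comes from xs
        rv = sum(right) / len(right) if right else 0.0
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if err < best_err:
            best, best_err = (t, lv, rv), err
    return best


def predict(stumps: List[Stump], x: float, lr: float) -> float:
    """Sum the (shrunken) outputs of all stumps built so far."""
    return sum(lr * (lv if x <= t else rv) for t, lv, rv in stumps)


def boost(xs: List[float], ys: List[float], rounds: int = 20, lr: float = 0.5) -> List[Stump]:
    stumps: List[Stump] = []
    for _ in range(rounds):
        # Residuals are largest where the current ensemble is most wrong,
        # so the next stump is fitted mainly to those examples.
        residuals = [y - predict(stumps, x, lr) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return stumps


def mse(stumps: List[Stump], xs: List[float], ys: List[float], lr: float) -> float:
    return sum((y - predict(stumps, x, lr)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Made-up data: fraud (label 1) is rare and sits in a band no single split can isolate.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [0, 0, 0, 1, 1, 0, 0, 0, 0, 0]
LR = 0.5
one_round = boost(xs, ys, rounds=1, lr=LR)
many_rounds = boost(xs, ys, rounds=20, lr=LR)
print(mse(one_round, xs, ys, LR), mse(many_rounds, xs, ys, LR))
```

Each round fits a new stump to what the previous rounds got wrong, so the training error after 20 rounds is lower than after one.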
Fraudsters play a constant “cops and robbers” game with companies working on fraud prevention software. Their toolset is growing and when a new security measure becomes a new industry standard, they quickly adapt to the situation and find new ways of being efficient at their activities. This calls for frequent retraining of ML models - one trained a year ago may not address the fraud patterns found in newer data samples.
Superiority over rule based systems
Maintaining a complex rule engine with hundreds of interdependent rules that express constantly changing fraud patterns isn’t easy and it definitely isn’t scalable. In contrast, machine learning based solutions scale automatically via cloud service providers - the only difference in cost between processing 1k and 100k transactions is the figure on the invoice from your cloud service provider. Data scientists or machine learning engineers need to do exactly the same job provided they use proper tools and automate repetitive tasks like retraining models or data collection.
Automatic adaptation via retraining
Concept drift is less troublesome for machine learning based solutions. In rule based engines, changes in fraud patterns call for manual recalibration of existing rules and research-driven creation of new ones. This is manual work that can’t be easily automated. In comparison, ML models require rerunning the training on new data samples and (sometimes) coming up with new features that capture the change in the detected phenomenon. Retraining can be easily automated so, again, ML models prove to be more cost-effective.
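One way such automation might look is a periodic check that compares the model's recall on recently labeled transactions with a baseline and flags the model for retraining when it slips. The function names, the 0.05 tolerance and the choice of recall as the monitored metric are assumptions made for this sketch.

```python
from typing import List, Tuple


def recall(labels_and_preds: List[Tuple[int, int]]) -> float:
    """Share of actual frauds (label 1) that the model flagged (prediction 1)."""
    flagged = [pred for label, pred in labels_and_preds if label == 1]
    # Assumption: with no confirmed fraud in the window there is no evidence of drift.
    return sum(flagged) / len(flagged) if flagged else 1.0


def needs_retraining(recent: List[Tuple[int, int]], baseline_recall: float,
                     tolerance: float = 0.05) -> bool:
    """True when recall on recent labeled traffic falls noticeably below the baseline."""
    return recall(recent) < baseline_recall - tolerance
```

A scheduler could call `needs_retraining` on each new batch of labeled transactions and start the training pipeline whenever it returns `True`.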
Automatic detection of fraud patterns
Today, you can attend an online bootcamp that teaches you how to effectively commit fraud the same way one might attend an online course to learn programming. This means that obvious fraud patterns, expressed by rule based engines that haven’t evolved as much as ML in recent years, will be swiftly bypassed by modern fraudsters. In light of this, automatic fraud pattern detection that comes with ML models is a necessity rather than a luxury.
Power of ensembles
Many modern day ML algorithms work as ensembles (e.g. random forest, gradient boosting). This means that, under the hood, the algorithm builds numerous separate classifiers that each learn slightly different things about fraud patterns: a random forest trains them independently on different data subsets, while gradient boosting trains them one after another, each correcting the errors of its predecessors. When deployed, their outputs are combined (by voting or summing) into a score for every transaction, which reduces the impact of any single classifier’s bias. If a fraudster is coming from the other part of the world and is half the age of the analyst that composes the rules, the bias transferred from analyst to code can create a gateway for fraudsters coming from different backgrounds. Ensembles partially alleviate this single point of failure.
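The combining step can be sketched in a few lines. In a real ensemble the members are learned from data; the three hand-written members below, and the feature names they use, are hypothetical stand-ins that only illustrate how individual votes are averaged into one score.

```python
from statistics import mean
from typing import Callable, Dict, List

Classifier = Callable[[Dict[str, float]], float]  # returns a fraud vote: 0.0 or 1.0

# Three hypothetical members, each looking at the transaction from a different angle.
members: List[Classifier] = [
    lambda tx: 1.0 if tx["amount"] > 1000 else 0.0,
    lambda tx: 1.0 if tx["account_age_days"] < 2 else 0.0,
    lambda tx: 1.0 if tx["tx_last_hour"] > 5 else 0.0,
]


def ensemble_score(tx: Dict[str, float]) -> float:
    """Average the members' votes; no single member decides alone."""
    return mean(m(tx) for m in members)
```

A transaction flagged by two of the three members gets a score of about 0.67, so a blind spot in one member is diluted rather than passed straight through to the decision.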
Rule based systems hold a strong advantage over ML models in terms of explainability. In such systems, there is little ambiguity over why a certain transaction was blocked. Some ML algorithms (especially deep neural networks, the most hyped of all ML techniques) work as black boxes - there is no easy way of saying why the model returned a certain value for a certain input. Fortunately, most fraud detection datasets are imbalanced and made of structured data - this means that algorithms that utilize decision trees work really well. Predictions of such models can be explained using packages like ELI5 (which stands for “Explain Like I'm 5”) that enable us to see which transaction traits contribute to its likelihood of being fraudulent (just like in rule based systems). Even if the algorithm is not tree-based, there are many tools that try to demystify the internal workings of those black boxes (deep neural networks included). XAI, which stands for “Explainable Artificial Intelligence”, is a new field that has gained a lot of attention recently because many real-world applications of ML models demand explainability.
In this piece I tried to outline the main differences between rule based engines and machine learning models. As stated above, the best setup should contain both, since they are not mutually exclusive. Each of the methods has its pros and cons but it looks like the future belongs to machine learning, complemented by rule based systems. One way of looking at this is treating the machine learning model as just another rule in a rule based engine - it’s just a bit smarter, that’s all.
Nethone is a global provider of AI-driven KYU (Know Your Users) solutions that allows online merchants to understand their end-users and prevent online fraud. By using machine learning technology, Nethone is able to detect and prevent card-not-present fraud, including protection against account takeover. Founded in 2016 by data scientists, security experts, and business executives, Nethone successfully cooperates with eCommerce, digital goods, travel, and financial industries on a global scale.