How Machine Learning Models Can Outperform Rule Based Systems, Explained
The task of detecting fraudulent online payments is a perfect use case for applying machine learning algorithms that thrive in environments where data volume is high and the characteristics of fraudulent transactions cannot be easily detected using only a handful of features. Nonetheless, many fraud prevention systems still rely on hard-coded rules engines that consolidate the aggregate knowledge of fraud experts. In this piece I will shed some light on the main differences between the two approaches and which use cases fit one or the other better.
A crucial cog in the machine - the decision engine
Systems that guard merchants from fraud are a lot more than just serialized machine learning models or sets of rules expressed in code using many “if-else” statements. There are many other engineering challenges, ranging from infrastructure to backend and frontend programming. Those challenges tend to differ a bit depending on the chosen decision engine and the specifics of the business sector, but they are not the main topic of this blog post. Here, we will focus on just one crucial piece, a single cog in the machine - the decision engine that determines whether the transaction is fraudulent or not.
Rules based systems
As the name suggests, those systems rely on hard coded rules that are set to flag transactions if they meet certain criteria. Such rules can be developed by:
- following industry best practices - like blocking multiple transactions from a single account in a short period of time or the ones coming through VPNs or from risky areas,
- analyzing caught / prevented fraudulent transactions and developing new rules to cover all of their suspicious characteristics.
The rules are often expressed using “if-else” statements present in almost all imperative programming languages and are easily interpretable. They mirror the way in which a human would process a transaction — the engine checks if a transaction meets any of the risky patterns expressed in the rules and if it does, it blocks it or sends it to be manually reviewed by humans. This is one of the reasons why their presence is still very strong - stakeholders trust them because they mimic the way in which they themselves would tackle this task.
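A rules engine of this kind can be sketched in a few lines. The example below is only an illustration: the `Transaction` fields, the rule names, the velocity limit of 5 and the placeholder country codes are all assumptions made for this sketch, not rules taken from any real system.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Transaction:
    account_id: str
    amount: float
    country: str
    uses_vpn: bool
    tx_last_hour: int  # transactions from this account in the past hour


@dataclass(frozen=True)
class Rule:
    name: str
    action: str  # "block" or "review"
    matches: Callable[[Transaction], bool]


RISKY_COUNTRIES = {"XX", "YY"}  # placeholder codes, purely hypothetical

# Hypothetical rules mirroring the practices mentioned above.
RULES = [
    Rule("velocity", "block", lambda t: t.tx_last_hour > 5),
    Rule("vpn", "review", lambda t: t.uses_vpn),
    Rule("risky_country", "review", lambda t: t.country in RISKY_COUNTRIES),
]


def decide(tx: Transaction, rules: List[Rule] = RULES) -> Tuple[str, List[str]]:
    """Return the decision and the names of the rules that triggered it."""
    triggered = [r for r in rules if r.matches(tx)]
    if any(r.action == "block" for r in triggered):
        decision = "block"
    elif triggered:
        decision = "review"
    else:
        decision = "accept"
    return decision, [r.name for r in triggered]
```

Because `decide` returns the names of the triggered rules alongside the decision, the reason for every block or manual review is visible immediately.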
Pros:
- Full explainability out of the box - if a certain rule triggered an alert for a particular transaction, it’s 100% transparent why this happened.
- No cold start problem - they are operational from day 1; there’s no need to gather the training datasets that machine learning algorithms require.
- Low threshold of entry - you don’t need a team of data scientists, machine learning engineers or MLOps specialists. The first rules can be easily implemented by the backend team since they are already familiar with translating business logic into code.
Cons:
- Continuous need to reverse engineer fraudsters’ attacks - new rules have to be developed as new fraud patterns emerge.
- Growing number of rules - the cost of maintenance grows over time (recalibration & adjusting to new fraud patterns).
- Detection limited to fraud cases of modest complexity - there is a limit to the number of rules & transaction features that can be handled. Rule based systems are limited by human comprehension (due to the manual development of rules & the necessary maintenance).
Machine learning models
ML models address the shortcomings of rule based systems. They thrive in environments where the volume and dimensionality of data are high. Algorithms like decision trees, random forests, gradient boosting or neural networks are designed to find complex, nonlinear patterns utilizing hundreds of transaction features (if available). Such an approach demands a shift in focus. For one, deploying ML models requires high quality, labeled historical data used as a training dataset. In general, the more data you have (in terms of the number of transactions and the number of features capturing transactions’ characteristics), the better the model will perform. In such a scenario we are trying to keep a record of past transactions (with a detailed description in the form of a feature vector) rather than trying to directly understand the fraud phenomenon.
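To make the notion of a feature vector concrete, here is a minimal sketch of turning one raw transaction record into a list of numbers. The field names (`timestamp`, `uses_vpn`, `tx_last_hour`, `billing_country`, `ip_country`) and the choice of features are assumptions for illustration; a production system would typically use far more features.

```python
from datetime import datetime


def to_feature_vector(tx: dict) -> list:
    """Describe one transaction as a fixed-length list of numeric features."""
    ts = datetime.fromisoformat(tx["timestamp"])
    return [
        float(tx["amount"]),                    # transaction value
        float(ts.hour),                         # hour of day (0-23)
        1.0 if ts.weekday() >= 5 else 0.0,      # weekend flag
        1.0 if tx["uses_vpn"] else 0.0,         # VPN flag
        float(tx["tx_last_hour"]),              # recent activity on the account
        1.0 if tx["billing_country"] != tx["ip_country"] else 0.0,  # country mismatch
    ]
```

Each historical transaction, stored in this form together with its fraud / not fraud label, becomes one row of the training dataset.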
Pros:
- Automatic fraud pattern recognition - the task of figuring out what makes a transaction fraudulent is handled by the algorithm. Our task is to provide it with as detailed a description as possible (in the form of a feature vector).
- Easier handling of concept drift - a change in fraud characteristics over time (new fraud methods, new tools used by fraudsters) can often be addressed by retraining the models on new data. There’s no need to reverse engineer fraudsters’ methods.
- Less manual work involved - many of the processes can be automated. Companies that have mature machine learning pipelines spend most of their time researching new features & algorithms while keeping an eye on the performance metrics of current models, available through monitoring apps.
- Economic efficiency that grows along with data volume - the more data you have and the more complex it is, the harder it is to develop rule-based systems. The return on developing automated fraud detection using ML models thus increases as data volume increases.
Cons:
- Cold start problem - to run ML models you need a significant amount of historical data.
- Lack of explainability out of the box - not all algorithms’ predictions can be easily explained. Some of them are “black boxes” for which the relationship between inputs and outputs has no easy explanation.
ML models deep dive
Most modern fraud prevention systems function as hybrid solutions that gather outputs from both rule based engines and machine learning models, and then propose a synthetic recommendation based on the client specific business logic. Since rule based systems mimic the reasoning process of humans let’s dive deeper into how machine learning algorithms find fraudulent traits in online traffic.
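One possible shape of such a hybrid recommendation is sketched below. The two thresholds stand in for the client specific business logic and are purely hypothetical values; the function name and the three decision labels are likewise assumptions of this sketch.

```python
def hybrid_decision(rule_decision: str, ml_score: float,
                    block_threshold: float = 0.9,
                    review_threshold: float = 0.6) -> str:
    """Combine a rules-engine verdict with an ML fraud score (0.0 to 1.0).

    rule_decision is one of "accept", "review" or "block".
    The stricter of the two signals wins.
    """
    if rule_decision == "block" or ml_score >= block_threshold:
        return "block"
    if rule_decision == "review" or ml_score >= review_threshold:
        return "review"
    return "accept"
```

In this sketch either engine can escalate a transaction on its own, which is only one of several reasonable policies; a merchant that is more sensitive to false positives might require both signals to agree before blocking.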
There has been a lot of hype around machine learning for the past few years but certain tasks, like fraud detection, remain difficult even for many novel methods and techniques. Extreme class imbalance, concept drift (a change in the characteristics of the detected phenomenon over time), and business stakeholders’ expectations of full explainability of models’ predictions are just some examples of common difficulties.
Fraudulent transactions tend to make up a tiny fraction of traffic. This poses 2 challenges:
- Datasets need to be bigger than usual due to the fact that fraudulent patterns are to be observed only in a small fraction of the data.
- Since most of the traffic is legitimate, models need to be carefully calibrated so as not to “suffocate” the business by frequent false positive errors (the situation when a legitimate transaction is blocked on the suspicion of fraud).
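A short calculation shows why calibration matters so much when fraud is rare. The numbers used here (a 0.5% fraud rate, 90% of fraud caught, 1% of legitimate traffic wrongly flagged) are illustrative assumptions, not measurements.

```python
def precision_at_base_rate(fraud_rate: float, recall: float,
                           false_positive_rate: float) -> float:
    """Share of flagged transactions that are actually fraudulent."""
    true_positives = fraud_rate * recall
    false_positives = (1.0 - fraud_rate) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Out of 100,000 transactions: 500 are fraud, 450 of them are caught,
# but 995 of the 99,500 legitimate ones are flagged as well.
precision = precision_at_base_rate(fraud_rate=0.005, recall=0.9,
                                   false_positive_rate=0.01)
print(f"{precision:.1%} of flagged transactions are fraudulent")
```

Under these assumptions only about 31% of the flagged transactions are fraudulent, even though the false positive rate looks small on paper. The remaining flags are legitimate customers being turned away.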
These data characteristics disqualify a range of ML algorithms. Gradient boosting methods tend to thrive in such environments due to the feedback loop mechanism that is embedded into the algorithm. During the iterative process of training, each new tree is fitted to the errors of the ensemble built so far, so the algorithm “focuses” on the parts of data where it was previously wrong. This mechanism helps with class imbalance, because the rare fraud cases that are still misclassified keep receiving attention in later iterations.
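The feedback loop can be shown with a toy version of gradient boosting. This is a minimal sketch under strong simplifying assumptions: a single feature (the amount), squared-error loss, one-split "stumps" instead of full trees, and a made-up dataset with at least two distinct amounts. Production libraries are considerably more sophisticated.

```python
def fit_stump(xs, residuals):
    """Pick the one-feature threshold split that minimises squared error."""
    best = None
    # Skip the largest value: splitting there would leave the right side empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left_value, right_value)


def boost(xs, ys, rounds=20, learning_rate=0.5):
    """Each round fits a stump to what the current ensemble still gets wrong."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]  # current errors
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        preds = [p + learning_rate * (left_value if x <= threshold else right_value)
                 for x, p in zip(xs, preds)]
    return base, learning_rate, stumps


def predict(model, x):
    base, learning_rate, stumps = model
    return base + sum(learning_rate * (lv if x <= t else rv) for t, lv, rv in stumps)


# Made-up, imbalanced data: 8 legitimate amounts and 2 fraudulent ones.
amounts = [10, 20, 30, 40, 50, 60, 70, 80, 900, 950]
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
model = boost(amounts, labels)
```

The initial prediction is simply the overall fraud rate, which is badly wrong for the two fraud cases. Those cases therefore carry the largest residuals, and every subsequent stump is pulled towards correcting them.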
Fraudsters play a constant “cops and robbers” game with companies working on fraud prevention software. Their toolset is growing and when a new security measure becomes a new industry standard, they quickly adapt to the situation and find new ways of being efficient at their activities. This calls for frequent retraining of ML models - one trained a year ago may not address the fraud patterns found in newer data samples.
Superiority over rule based systems
Maintaining a complex rule engine with hundreds of interdependent rules that express constantly changing fraud patterns isn’t easy and it definitely isn’t scalable. In contrast, machine learning based solutions can scale automatically on cloud infrastructure - the only difference in cost between processing 1k and 100k transactions is the figure on the invoice from your cloud service provider. Data scientists or machine learning engineers do exactly the same job at either volume, provided they use proper tools and automate repetitive tasks like retraining models or data collection.
Automatic adaptation via retraining
Concept drift is less troublesome for machine learning based solutions. In rule based engines, changes in fraud patterns call for manual recalibration of existing rules and research-driven creation of new ones. This is manual work that can’t be easily automated. In comparison, ML models require rerunning the training on new data samples and (sometimes) coming up with new features that capture the change in the detected phenomenon. Retraining can be easily automated so, again, ML models prove to be more cost-effective.
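An automated retraining step might look roughly like the sketch below. It assumes that a model is any callable returning a fraud score, that recently labeled transactions are available as `(features, label)` pairs, and that `train_fn` is whatever training routine the team already has. The recall-based trigger and the 0.8 limit are illustrative choices, not a recommendation.

```python
def recall(labels, scores, threshold=0.5):
    """Fraction of known fraud cases scored at or above the threshold."""
    fraud_scores = [s for y, s in zip(labels, scores) if y == 1]
    if not fraud_scores:
        return None  # no fraud in this window, nothing to measure
    return sum(1 for s in fraud_scores if s >= threshold) / len(fraud_scores)


def maybe_retrain(model, recent, train_fn, min_recall=0.8, threshold=0.5):
    """Retrain on recent data when the current model misses too much fraud.

    recent: list of (features, label) pairs with label 1 for fraud.
    Returns (model_to_use, was_retrained).
    """
    labels = [label for _, label in recent]
    scores = [model(features) for features, _ in recent]
    current = recall(labels, scores, threshold)
    if current is not None and current < min_recall:
        return train_fn(recent), True
    return model, False
```

Run on a schedule, a check like this needs no human to reverse engineer the new fraud pattern. The people involved only step in when retraining alone stops helping and new features are required.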
Automatic detection of fraud patterns
Today, you can attend an online bootcamp that teaches you how to effectively commit fraud the same way one might attend an online course to learn programming. This means that obvious fraud patterns, expressed by rule based engines that haven’t evolved as much as ML in recent years, will be swiftly bypassed by modern fraudsters. In light of this, automatic fraud pattern detection that comes with ML models is a necessity rather than a luxury.
Power of ensembles
Many modern day ML algorithms work as ensembles (e.g. random forest, gradient boosting). This means that, under the hood, algorithms create numerous separate classifiers that learn slightly different things about fraud patterns: in a random forest they are trained independently on different data subsets, while in gradient boosting they are trained one after another, each correcting the errors of its predecessors. When deployed, their outputs are combined into a single score for every transaction, which reduces the impact of any single biased viewpoint. If a fraudster is coming from the other part of the world and is half the age of the analyst that composes the rules, the bias transferred from analyst to code can create a gateway for fraudsters coming from different backgrounds. Ensembles partially alleviate this single point of failure.
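The voting idea can be sketched with bagging, the resampling scheme behind random forests. Everything here is simplified for illustration: each member is a crude one-feature classifier rather than a decision tree, and the dataset, member count and seed are made up.

```python
import random


def train_member(sample):
    """Fit a deliberately simple one-feature classifier on (amount, label) pairs."""
    frauds = [x for x, y in sample if y == 1]
    legit = [x for x, y in sample if y == 0]
    if not frauds or not legit:
        # The resample contains a single class: always predict that class.
        constant = 1 if frauds else 0
        return lambda amount: constant
    fraud_mean = sum(frauds) / len(frauds)
    legit_mean = sum(legit) / len(legit)
    threshold = (fraud_mean + legit_mean) / 2
    fraud_is_higher = fraud_mean > legit_mean
    return lambda amount: int((amount > threshold) == fraud_is_higher)


def train_ensemble(data, n_members=25, seed=0):
    """Train each member independently on its own bootstrap resample."""
    rng = random.Random(seed)
    return [train_member(rng.choices(data, k=len(data))) for _ in range(n_members)]


def ensemble_score(members, amount):
    """Fraction of members voting 'fraud' for this transaction."""
    return sum(member(amount) for member in members) / len(members)


data = [(10, 0), (20, 0), (30, 0), (40, 0), (50, 0), (60, 0), (70, 0), (80, 0),
        (870, 1), (900, 1), (950, 1)]
members = train_ensemble(data)
```

Some members will have seen few or no fraud examples and will vote poorly on their own. The averaged vote is what makes the final score more robust than any individual member.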
Explainability
Rule based systems hold a strong advantage over ML models in terms of explainability. In such systems, there is little ambiguity over why a certain transaction was blocked. Some ML algorithms (especially deep neural networks, the most hyped of all ML techniques) work as black boxes - there is no easy way of saying why they returned a certain value for a certain input. Fortunately, most fraud detection datasets are imbalanced and made of structured data - this means that algorithms that utilize decision trees work really well. Predictions of such models can be easily explained using packages like ELI5 (which stands for “Explain Like I'm 5”) that enable us to see which transaction traits contribute to its likelihood of being fraudulent (just like in rule based systems). Even if the algorithm is not tree-based, there are many tools that try to demystify the internal workings of those black boxes (deep neural networks included). XAI, which stands for “Explainable Artificial Intelligence”, is a new field that has gained a lot of attention recently because many real-world applications of ML models demand explainability.
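Why tree-based predictions lend themselves to explanation can be seen by tracing a single decision tree by hand. The sketch below does not use ELI5 or any other package. The tree, its thresholds and its leaf scores are invented for illustration, and real explanation tools aggregate this kind of information across many trees.

```python
# A tiny hand-written decision tree; internal nodes test one feature,
# leaves carry a fraud score. All values are hypothetical.
TREE = {
    "feature": "tx_last_hour", "threshold": 5,
    "left": {
        "feature": "amount", "threshold": 500.0,
        "left": {"score": 0.01},
        "right": {"score": 0.30},
    },
    "right": {"score": 0.85},
}


def explain(tree, tx):
    """Return the leaf score and the list of conditions that led to it."""
    path = []
    node = tree
    while "score" not in node:
        value = tx[node["feature"]]
        go_left = value <= node["threshold"]
        op = "<=" if go_left else ">"
        path.append(f'{node["feature"]} = {value} {op} {node["threshold"]}')
        node = node["left"] if go_left else node["right"]
    return node["score"], path


score, reasons = explain(TREE, {"tx_last_hour": 2, "amount": 900.0})
```

The returned path reads much like a triggered rule: the transaction received its score because the account was not unusually active but the amount was high.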
Summary
In this piece I tried to outline the main differences between rule based engines and machine learning models. As stated above, the best setup should contain both since they are not mutually exclusive. Each of the methods has its pros and cons but it looks like the future belongs to machine learning, complemented by rule based systems. One way of looking at this is treating the machine learning model as just another rule in a rule based engine - it’s just a bit smarter, that’s all.
Nethone is a global provider of AI-driven KYU (Know Your Users) solutions that allow online merchants to understand their end-users and prevent online fraud. By using machine learning technology, Nethone is able to detect and prevent card-not-present fraud, including protection against account takeover. Founded in 2016 by data scientists, security experts, and business executives, Nethone successfully cooperates with eCommerce, digital goods, travel, and financial industries on a global scale.