3 types of fraud graph technology can tackle

Organizations across industries are adopting graphs to reinforce anti-fraud programs. Let’s take a look at three common fraud schemes to understand why:

• Insurance fraud,
• Credit card fraud,
• VAT fraud.

Detecting fraud is about connecting the dots

In many areas, fraud investigators have at their disposal large datasets in which clues are hidden. These clues are left behind by criminals who, for their part, try to hide their activity behind layers of more or less intricate schemes. To unveil illegal activities, investigators have to connect the pieces of the puzzle to discover evidence of wrongdoing.

Most anti-fraud applications are able to connect simple data points together to detect suspicious behaviors: an IP address to a user, withdrawal activities to a place of residence, or a loan request history to a client.

But these applications fall short on more complex analysis that involves several levels of relationships or data types. This is mostly due to the technology on which these applications often rely and the data silos it creates. The relational databases that emerged in the ’80s are efficient at storing and analyzing tabular data, but their underlying data model makes it difficult to connect data scattered across multiple tables.

The graph databases we’ve seen emerge in recent years are designed for this purpose. Their data model is particularly well-suited to store and organize data where connections are as important as data points. Connections are stored and indexed as first-class citizens, making it an interesting model for investigations in which you need to connect the dots.
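To make the idea of connecting the dots concrete, here is a minimal sketch in plain Python. It does not use any graph database; the entity names (`user:alice`, `ip:10.0.0.7`, and so on) are invented for illustration. It stores connections directly in an adjacency list and follows them across several hops, which is the kind of question that would otherwise require chaining several joins across tables.

```python
from collections import defaultdict, deque

# Hypothetical records: each edge links two entities of any type.
edges = [
    ("user:alice", "ip:10.0.0.7"),
    ("user:bob", "ip:10.0.0.7"),
    ("user:bob", "phone:555-0101"),
    ("user:carol", "phone:555-0101"),
]

# Build an undirected adjacency list so every connection is directly reachable.
graph = defaultdict(set)
for a, b in edges:
    graph[a].add(b)
    graph[b].add(a)

def shortest_path(graph, start, goal):
    """Breadth-first search returning one shortest path, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in sorted(graph[node]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None

# Alice and Carol share no attribute directly, yet they are linked through Bob.
path = shortest_path(graph, "user:alice", "user:carol")
print(path)
```

The path passes through a shared IP address and a shared phone number, two different data types that would typically sit in separate tables.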

That’s one of the technological reasons behind graph technology adoption. We can also put ourselves in the investigator’s shoes to see how this approach helps tackle fraud.

1) A graph to unveil insurance fraud

Insurance fraud encompasses any act committed with the intent of defrauding an insurance process. It ranges from staged car accidents to faked deaths or exaggerated property damage. The FBI estimates that it represents $40 billion in losses per year in the U.S.

As an example, people frequently team up and put together a fake road traffic accident (RTA) claim reporting hard-to-disprove light personal injuries. These fraud rings involve several criminals playing the various roles of drivers, passengers, witnesses, and even doctors who certify injuries or accomplice lawyers who file the claim.

There are too many claims filed every day to analyze them manually. Fraud investigation units have to rely on simple business rules to identify suspicious claims. But if the fraudsters made sure to avoid red flag elements (e.g., unusual injury, recently purchased insurance policy, low velocity but significant injury, etc.), there is a chance they will go undetected and repeat the scheme.

This is where graph technology steps in. The graph approach brings data from various sources under a common model, so investigators can look at all the data at the same time, instead of isolated data silos. And this is exactly what they need because in these situations, what often gives away the fraudsters is abnormal connections to other elements.

These suspicious connections could be that the witness’s wife is connected to two similar cases, or that the doctor’s phone number is the same as that of a driver involved in another RTA claim. If investigators have all their data represented as a graph, they can pick up those suspicious signs faster. They get a better understanding of the “big picture” and can identify abnormal connections to detect insurance fraud.

Above is an example of what insurance data could look like as a graph. This visualization specifically depicts one of those abnormal patterns that indicate insurance fraud: two customers (blue nodes) filed three claims (green nodes). We can identify a network of three customers connected through personal information such as phone numbers (brown nodes) and email addresses (pink nodes), with the same lawyer (green node) involved every time. It is very likely they are recycling stolen or fake identities to file fraudulent claims.
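The kind of abnormal connection described in this section, such as a doctor in one claim sharing a phone number with a driver in another, can be sketched in a few lines of Python. The claims, names, and phone numbers below are entirely made up, and a real investigation would draw on many more attributes (emails, addresses, lawyers) than this illustration does.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical claim records: (role, name, phone) for each party involved.
claims = {
    "claim-1": [("driver", "D. Martin", "555-0101"), ("doctor", "Dr. Roy", "555-0199")],
    "claim-2": [("driver", "P. Lee", "555-0199"), ("witness", "S. Kim", "555-0142")],
    "claim-3": [("driver", "A. Diaz", "555-0177"), ("witness", "J. Park", "555-0123")],
}

# Index: phone number -> set of (claim, role, name) entries that use it.
by_phone = defaultdict(set)
for claim_id, parties in claims.items():
    for role, name, phone in parties:
        by_phone[phone].add((claim_id, role, name))

def shared_phone_links(by_phone):
    """Return pairs of parties from different claims that share a phone number."""
    links = []
    for phone, parties in sorted(by_phone.items()):
        for a, b in combinations(sorted(parties), 2):
            if a[0] != b[0]:  # only flag links that span two different claims
                links.append((phone, a, b))
    return links

links = shared_phone_links(by_phone)
for phone, a, b in links:
    print(phone, a, b)
```

Here the only link found is between the doctor of claim-1 and the driver of claim-2, which is exactly the sort of connection a rule applied to each claim in isolation would miss.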

2) A graph to understand payment card fraud

Payment card fraud takes the form of criminals getting hold of credit card information and proceeding to create unauthorized transactions. Card-present scenarios, in which criminals use a stolen or counterfeit credit card at an ATM or at the point-of-sale (POS) terminal of a physical store, affected 45.8 million cards in the U.S. in 2018. Despite a massive migration to the safer chip-based card, stolen credit card fraud is still a major issue.

In a commonly encountered situation, the scheme unfolds as follows:

  • criminals set up skimming devices at an ATM or gas pump to steal the details stored in the cards’ magnetic stripes;
  • they replicate the stolen card information onto counterfeit cards;
  • they use the counterfeit cards to withdraw money at ATMs or to buy goods or gift cards at shops;
  • cardholders notice unusual activity on their bank accounts and notify the authorities.

These situations are a perfect case for graphs. While traditional technologies will hardly allow you to create a ‘big picture’ of heterogeneous data, the graph approach lets you collect the data in a model linking together: cardholders, transactions, terminals, and locations.

This way, when authorities are confronted with a surge of card-present fraud cases in a given region, graph technology can help identify the common point of compromise by highlighting the common links within the various reported cases, no matter how large the dataset is.

Above is another example of what a tiny part of the graph of data could look like. This approach makes it easier to identify a common point of compromise: clients (blue nodes) report fraudulent purchases (orange nodes). Through their connections, we can identify the common ATM (purple) where they made a withdrawal before their cards were compromised.
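As a rough illustration of the common-point-of-compromise idea, the Python sketch below counts how many compromised cards passed through each terminal before fraud was reported. The card and terminal identifiers are hypothetical, and a simple count like this is only a starting point: a real analysis would also account for how busy each terminal normally is.

```python
from collections import Counter

# Hypothetical data: terminals each card used before fraud was reported.
card_history = {
    "card-1": ["atm-12", "pos-7", "atm-3"],
    "card-2": ["atm-3", "pos-9"],
    "card-3": ["pos-7", "atm-3", "pos-2"],
    "card-4": ["atm-12", "pos-2"],  # not reported as compromised
}
compromised = ["card-1", "card-2", "card-3"]

def rank_points_of_compromise(card_history, compromised):
    """Count, per terminal, how many compromised cards used it."""
    counts = Counter()
    for card in compromised:
        # set() so a terminal visited twice by one card counts once
        counts.update(set(card_history.get(card, [])))
    # Highest count first; ties broken alphabetically for stable output
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

ranking = rank_points_of_compromise(card_history, compromised)
print(ranking[0])
```

In this toy dataset, `atm-3` is the only terminal shared by all three compromised cards, so it comes out on top as the most likely skimming location.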

3) VAT fraud

Finally, carousel fraud, also known as missing trader fraud or VAT fraud, is the theft of VAT collected on the sale of goods initially bought VAT-free in another jurisdiction. This scheme is difficult to identify in time and losses can be massive, as recent cases have shown.

In 2018, a single VAT fraud ring cost the European economy more than 60 million euros. The criminal organization was selling products online through a wide network of shell companies and producing false invoices to perform VAT fraud. Generally, this is how the carousel works:

  • Company A sells the goods to company B VAT-free
  • Company B sells the goods to company C, charging the VAT
  • Company C sells the goods and claims a VAT refund from the tax agency of country A

These schemes are intricate and transactions come quickly one after another to avoid raising suspicion. To make sense of the layers behind which criminals hide, investigators need an overview of the situation. Once again, graph technology can help bring together various data types to get a better understanding of the financial context.

Many tools facilitate the identification of patterns, and investigators can search across vast data collections for scenarios indicative of the carousel: for example, multiple transactions occurring in a short amount of time between companies from two different countries with a newly created intermediary company. From there they have a starting point to assess the existence of potential carousel fraud.

Above is an example of data visualization for the pattern mentioned above. We see a chain of transactions in which companies (blue nodes) and their parent organizations (flag nodes) sell goods VAT-free and collect back VAT through complex layers of sales between EU and non-EU countries.
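The pattern search described in this section (a cross-border sale followed quickly by a resale through a newly created intermediary) can be sketched as a simple two-hop match. Everything in the Python example below is an assumption made for illustration: the companies, dates, and the thresholds `max_gap_days` and `max_age_days` are invented, and real cases involve longer chains and more signals.

```python
from datetime import date, timedelta

# Hypothetical company registry: name -> (country, registration date).
companies = {
    "A": ("FR", date(2010, 5, 1)),
    "B": ("DE", date(2018, 3, 1)),   # recently created intermediary
    "C": ("DE", date(2012, 9, 15)),
    "D": ("DE", date(2009, 1, 20)),
}

# Hypothetical sales: (seller, buyer, date).
sales = [
    ("A", "B", date(2018, 4, 2)),
    ("B", "C", date(2018, 4, 4)),
    ("A", "D", date(2018, 4, 3)),
    ("D", "C", date(2018, 6, 30)),
]

def carousel_candidates(companies, sales, max_gap_days=7, max_age_days=90):
    """Find cross-border sale -> quick resale chains through a young company."""
    hits = []
    for seller, mid, d1 in sales:
        for mid2, buyer, d2 in sales:
            if mid != mid2:
                continue  # the second sale must start where the first ended
            cross_border = companies[seller][0] != companies[mid][0]
            quick_resale = timedelta(0) <= d2 - d1 <= timedelta(days=max_gap_days)
            young = (d1 - companies[mid][1]).days <= max_age_days
            if cross_border and quick_resale and young:
                hits.append((seller, mid, buyer))
    return hits

candidates = carousel_candidates(companies, sales)
print(candidates)
```

Only the chain A, B, C is flagged: B was registered about a month before buying across the border and resold within two days. The chain through D is ignored because D is an established company and its resale came months later. A match like this is a lead for investigators to assess, not proof of fraud.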

Today, organizations use graph technology to fight fraud across sectors: insurance, banking, law enforcement, and financial administrations. It is a complementary approach to traditional statistical and relational technologies because it gives investigators the opportunity to look for clues within data connections, which is where the value often lies when it comes to fraud.

(Initially published on linkurio.us blog)

Tech enthusiast, digital marketing manager. Working at Statice, startup specialized in synthetic data for privacy-preserving data applications 👉 www.statice.ai