Big Time ROI in Identity Resolution

8th May 2010 / 5th May 2012 / Henrik Gabs Liliendahl

Yesterday I had the chance to make a preliminary assessment of the data quality in one of the local databases holding information about entities involved in carbon trade activities. It is believed that up to 90 percent of the market activity may have been fraudulent with criminals pocketing 5 billion Euros. There is a description of the scam here from telegraph.co.uk.

Most of my work with data matching is aimed at finding duplicates. In doing this you must avoid so-called false positives, so you don’t end up merging information about two different real-world entities. But when doing identity resolution, for several reasons including preventing fraud and scams, you may be interested in finding connections between entities that are not supposed to be connected at all.

The result of making such connections in the carbon trade database was quite astonishing. Here is an example where I have changed the names, addresses, e-mails and phone numbers, but such a pattern was found in several cases:

Here we have an example of a group of entities where the name, address, e-mail or phone is shared in a way that doesn’t seem natural.
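A minimal sketch in Python of how such groups could be surfaced. It is not the tool used for the assessment above. The function name `find_connected_groups`, the field names and the sample records are all made up for illustration, and matching is done on exact (lower-cased, trimmed) values only.

```python
from collections import defaultdict


def find_connected_groups(entities, fields=("name", "address", "email", "phone")):
    """Return groups of entity ids linked by sharing a value in any field.

    Links are transitive: if A shares an address with B and B shares an
    e-mail with C, then A, B and C end up in the same group.
    """
    # Union-find structure: every entity starts as its own group.
    parent = {e["id"]: e["id"] for e in entities}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Remember the first entity seen with each (field, value) pair and
    # link every later entity carrying the same value to it.
    first_seen = {}
    for e in entities:
        for field in fields:
            value = (e.get(field) or "").strip().lower()
            if not value:
                continue  # empty values must not connect anything
            key = (field, value)
            if key in first_seen:
                union(first_seen[key], e["id"])
            else:
                first_seen[key] = e["id"]

    groups = defaultdict(list)
    for e in entities:
        groups[find(e["id"])].append(e["id"])
    # Only groups with more than one member are worth a closer look.
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


# Entirely fictitious records, for illustration only.
entities = [
    {"id": 1, "name": "Alpha Carbon Ltd", "address": "1 Main Street",
     "email": "info@alpha.example", "phone": "555-0101"},
    {"id": 2, "name": "Beta Trading", "address": "1 Main Street",
     "email": "sales@beta.example", "phone": "555-0102"},
    {"id": 3, "name": "Gamma Green", "address": "9 Harbour Road",
     "email": "sales@beta.example", "phone": "555-0103"},
    {"id": 4, "name": "Delta Energy", "address": "4 Hill Lane",
     "email": "hello@delta.example", "phone": "555-0104"},
]

groups = find_connected_groups(entities)
print(groups)
```

Entities 1 and 2 share an address and entities 2 and 3 share an e-mail, so all three land in one group even though 1 and 3 have nothing in common directly. A shared value is only a signal to investigate, since a shared address may well be a legitimate office building.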

My look into the carbon trade scam was prompted by a blog post yesterday by my colleague Jan Erik Ingvaldsen, based on the story that journalists, merely by gazing at the database, had found addresses that simply don’t exist.

So the question is whether authorities might have avoided losing 5 billion taxpayer Euros if some identity resolution, including automated fuzzy connection checks and real-world checks, had been implemented. I know that everyone is so much more enlightened about what could have been done once a scam has been discovered, but I actually think there may be many more billions of Euros (Pounds, Dollars, Rupees) out there that could be saved by doing some decent identity resolution.
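As a rough illustration of what an automated fuzzy connection check might look like, here is a small Python sketch using the standard library's `difflib.SequenceMatcher`. The helper names, the sample records and the 0.85 threshold are assumptions chosen for the example, not values from the carbon trade case, and a production setup would need blocking or indexing rather than comparing every pair.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text):
    """Lower-case, drop dots and commas, and collapse whitespace."""
    cleaned = text.lower().replace(".", " ").replace(",", " ")
    return " ".join(cleaned.split())


def similarity(a, b):
    """Similarity between 0.0 and 1.0 based on matching character blocks."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def fuzzy_connections(records, field, threshold=0.85):
    """Return (id, id, score) for record pairs whose field values look alike.

    The threshold is an assumed starting point and has to be tuned on real
    data: too low gives false positives, too high gives false negatives.
    """
    pairs = []
    for left, right in combinations(records, 2):
        score = similarity(left[field], right[field])
        if score >= threshold:
            pairs.append((left["id"], right["id"], round(score, 2)))
    return pairs


# Entirely fictitious records, for illustration only.
records = [
    {"id": 1, "address": "12 Harbour Road, Copenhagen"},
    {"id": 2, "address": "12 Harbor Rd. Copenhagen"},
    {"id": 3, "address": "48 Station Street, Aarhus"},
]

pairs = fuzzy_connections(records, "address")
for left_id, right_id, score in pairs:
    print(left_id, right_id, score)
```

With these sample records only the two spellings of the Harbour Road address are flagged as connected. A real-world check, such as verifying that the address actually exists, would be a separate step against an external reference source.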

Jim Harris 8th May 2010 / 15:00

Excellent post Henrik,

As you know, one of the most common objections to data cleansing efforts is that they often produce considerable costs without delivering tangible and significant ROI.

One of the ways that the ROI of removing duplicates can be measured is the cost savings on redundant postal deliveries, which although sometimes significant (with high duplicate rates) doesn’t exactly wow executives.

Identity resolution, especially in situations of fraud detection, can not only deliver more significant ROI, but can make the need more obvious for defect prevention using integrated real-time matching services where data originates.

Whether performing duplicate identification or identity resolution, false positives remain a justifiable concern, but the negative impact of false negatives can be much more damaging with identity resolution.

Thanks for providing a great real-world example of this challenging problem.

Best Regards,

Jim

Henrik Liliendahl Sørensen 9th May 2010 / 08:52

Thanks for the comment Jim.

We are often short of success stories about data quality because no one measures what could have happened if we hadn’t prevented it from happening. Instead we can merely present these train wrecks and hope someone learns from them.

In the case described here you could not possibly have stated beforehand that without any kind of identity resolution you would lose 5 billion Euros. But I think common sense could have told you that not doing so is like leaving the door to the treasury open at night.

Garnie Bolling 10th May 2010 / 01:39

Henrik, another great story, and real world….

Like Jim mentioned, Data Profiling and Identity Resolution do have up-front costs, and little to show for immediate ROI. While you can see tangible dollars from Jim’s example (redundant mailings to the same address), also look at sharing with the business group the realistic view into their customer base.

Looking at Banks and Retail, being able to see real trends based on real customer bases is powerful, analytically. Not to mention Fraud, where finding the trends of erroneous data based on validation techniques (or persistent searches) can avoid lots of risk in the near future (try to avoid the train wreck based on what we have learned before).

Thanks Guys, for sharing, and I hope this will spawn a few more entries where we can share / learn and communicate the real value of good DQ.

– Garnie

philsimon 11th May 2010 / 15:28

I only wish that examples like this were used more prevalently. I have seen way too many people dismiss DQ efforts because they didn’t see any benefit.

Henrik – you just proved them wrong.

Henrik Liliendahl Sørensen 11th May 2010 / 15:47

Garnie and Phil, thanks for the kind words.

Pingback: In Search of an Anecdotal Antidote | The Data Roundtable
Thomas 19th April 2012 / 12:52

This is the nice information, actually any one can make to their identity only one time in the life, so keep on to this right & good way.

Henrik Liliendahl Sørensen 19th April 2012 / 13:03

Thanks Thomas. I have been following InfoGlide for some years.


Liliendahl on Data Quality

A blog about Master Data Management, Product Information Management, Data Quality Management and more
