Identity Resolution – Page 2 – Liliendahl on Data Quality

Hierarchical Data Matching

13th August 2013 | Henrik Gabs Liliendahl | 9 Comments

A year ago I wrote a blog post about data matching published on the Informatica Perspective blog. The post was called Five Future Data Matching Trends.

One of the trends mentioned is hierarchical data matching.

The reason we need what may be called hierarchical data matching is that more and more organizations are looking into master data management and then realize that the classic name and address matching rules do not necessarily fit when party master data are going to be used for multiple purposes. What constitutes a duplicate in one context, like sending a direct mail, doesn’t necessarily constitute a duplicate in another business function, and vice versa. Duplicates come in hierarchies.

One example is a household. You probably don’t want to send two sets of the same material to a household, but you might want to engage in a 1-to-1 dialogue with the individual members. Another example is that you might do some very different kinds of business with the same legal entity. Financial risk management is the same, but different sales or purchase processes may require very different views.
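To make the household example concrete, here is a minimal Python sketch. The records, field names and the two key functions are hypothetical and only for illustration. The same records are grouped once at household level (same address) and once at individual level (same name at the same address), which gives two different answers to the question of what a duplicate is.

```python
from collections import defaultdict

# Hypothetical party records; the field names are assumptions for illustration.
records = [
    {"id": 1, "name": "Anna Jensen", "address": "Main Street 1"},
    {"id": 2, "name": "Anna Jensen", "address": "Main Street 1"},
    {"id": 3, "name": "Peter Jensen", "address": "Main Street 1"},
    {"id": 4, "name": "Maria Olsen", "address": "Park Road 7"},
]

def household_key(rec):
    # Household level: everyone at the same address belongs together.
    return rec["address"].lower()

def individual_key(rec):
    # Individual level: the same name at the same address.
    return (rec["name"].lower(), rec["address"].lower())

def group_ids(recs, key_func):
    # Collect record ids per key and return the groups in a stable order.
    groups = defaultdict(list)
    for rec in recs:
        groups[key_func(rec)].append(rec["id"])
    return sorted(groups.values())

households = group_ids(records, household_key)    # one mail piece per group
individuals = group_ids(records, individual_key)  # one dialogue per group
```

Records 1, 2 and 3 are duplicates for a direct mail, but only records 1 and 2 are duplicates for a 1-to-1 dialogue.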

I usually divide a data matching process into three main steps:

Candidate selection
Match scoring
Match destination

(More information on the page: The Art of Data Matching)

Hierarchical data matching is mostly about the last step where we apply survivorship rules and execute business rules on whether to purge, merge, split or link records.
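The three steps can be sketched as follows. This is a simplified illustration under stated assumptions, not how any particular tool works: the blocking key (postal code), the name similarity from Python's standard `difflib`, the threshold of 0.9 and the field names are all choices made for the example.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical records; the field names are assumptions for illustration.
records = [
    {"id": 1, "name": "Anna Jensen", "address": "Main Street 1", "postal_code": "2100"},
    {"id": 2, "name": "Anna Jensen", "address": "Main Street 1", "postal_code": "2100"},
    {"id": 3, "name": "Peter Jensen", "address": "Main Street 1", "postal_code": "2100"},
    {"id": 4, "name": "Maria Olsen", "address": "Park Road 7", "postal_code": "8000"},
]

def select_candidates(recs):
    """Step 1: only compare records that share a blocking key (postal code here)."""
    blocks = defaultdict(list)
    for rec in recs:
        blocks[rec["postal_code"]].append(rec)
    for block in blocks.values():
        yield from combinations(block, 2)

def score(a, b):
    """Step 2: name similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()

def destination(a, b, name_score, merge_threshold=0.9):
    """Step 3: decide whether to merge, link or keep the pair separate."""
    same_address = a["address"].lower() == b["address"].lower()
    if same_address and name_score >= merge_threshold:
        return "merge"          # same individual
    if same_address:
        return "link"           # same household, different individuals
    return "keep separate"

decisions = {
    (a["id"], b["id"]): destination(a, b, score(a, b))
    for a, b in select_candidates(records)
}
```

The hierarchical part sits in the last function: a pair that is not good enough for a merge can still be linked at a higher level instead of being discarded.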

In my experience there are a lot of data matching tools out there capable of handling candidate selection, match scoring, purging records and to some degree merging records. But solutions are sparse when it comes to more sophisticated things like splitting an original entity into two or more entities, for example by Splitting Names, or linking records in hierarchies in order to build a Hierarchical Single Source of Truth.

Know Your (Foreign Luxury Bag) Customer

11th August 2013 | Henrik Gabs Liliendahl | 4 Comments

A story featured a lot in the media over the last few days is the incident where one of the richest women on the planet, Oprah Winfrey, was told that she couldn’t afford the handbag she wanted to look at in a Zürich shop. Was it racism or a misunderstanding because Oprah isn’t good at speaking German?

Either way it was for sure an example of bad things happening when you don’t know your customer. This story also highlights the issues we have with foreign customers, as Oprah may not be as famous in Zürich as in New York.

We have these challenges in customer master data management all over as described in the post Know Your Foreign Customer.

And oh: Maybe it’s time to start a sister blog called Liliendahl on Fashion. This is my second post on luxury handbags. The first post was called Data Quality Luxury.

Call me on Phone, Mobile or Skype

9th May 2013 | Henrik Gabs Liliendahl

When calling people in order to have a long distance conversation there are three main ways today:

The landline phone, which has been around since the 19th century and penetrated most homes and businesses in the last century
The mobile phone, which came around in the 70’s and spread rapidly in the 90’s
Skype, a voice over internet service that grew in the 00’s

Using these services involves an identifier, which may be stored in customer tables and other party master data repositories, with some implications for data management and identity resolution:

The Landline Phone Number

The landline phone number is a very common attribute in databases and is often used as the main identifier of a customer in ERP and CRM solutions.

Using a landline phone number for identity resolution has some challenges, including:

As with most attributes, it may change. Depending on the country in question, the number may change during relocation, and most phone number systems get an upgrade over the years.
In business-to-business (B2B) a company typically has more than one phone number.
In business-to-consumer (B2C) the landline phone number merely belongs to a household rather than a single individual. That may be good or not good depending on purpose of use.
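A small sketch of what this means in practice. The normalization below is deliberately simplified and rests on assumptions: a default country code of 45 and no national trunk prefix, which does not hold for every country. The customer names and numbers are made up. Once the numbers are brought to a common form, the landline number turns out to identify a household rather than a person.

```python
import re
from collections import defaultdict

def normalize_phone(raw, default_country_code="45"):
    """Reduce a phone number to digits including a country code (simplified)."""
    text = raw.strip()
    digits = re.sub(r"\D", "", text)       # drop spaces, dashes, brackets, plus sign
    if text.startswith("+"):
        return digits                      # already international
    if digits.startswith("00"):
        return digits[2:]                  # international call prefix
    return default_country_code + digits   # assume a national number

# Hypothetical customer rows: (name, landline number as entered).
customers = [
    ("Anna Jensen", "+45 12 34 56 78"),
    ("Peter Jensen", "12 34 56 78"),
    ("Maria Olsen", "0045 87 65 43 21"),
]

by_number = defaultdict(list)
for name, phone in customers:
    by_number[normalize_phone(phone)].append(name)
```

Here Anna and Peter share one normalized number. Whether that counts as a match depends on whether the purpose is household level or individual level.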

The Mobile Phone Number

Mobile phone numbers also pile up in databases. In relation to identity resolution there are issues with mobile phone numbers, namely:

They change a lot.
It’s not always clear to whom a number actually belongs:
- A company paid phone may be used for both business and pleasure and may be transferred to another individual
- In a household a person may be registered for a range of mobile phones used by individual members of the household including children

The Skype ID

I seldom see databases with Skype IDs. In my experience Skype IDs aren’t used a lot in internal master data. They reside in Skype and in social network profiles, for example on LinkedIn.

A final rant

Today I hardly ever use a landline phone, I use my mobile once in a while and I use Skype a lot. Not because it’s convenient, but because the telecom companies have decided to charge for international mobile calls in ways so greedy that it makes Somali sea pirates look like honest businessmen.

Multi-Channel Data Matching

4th April 2013 | Henrik Gabs Liliendahl

Most data matching activities going on are related to matching customer, or rather party, master data.

In today’s business world we see data matching related to party master data in these three different channel types:

Offline is the good old channel type where we have the mother of all business cases for data matching: avoiding unnecessary costs from sending the same material with the postman twice (or more) to the same recipient.
Online has been around for some time. While the cost of sending the same digital message to the same recipient may not be a big problem, there are still some other factors to be considered, like:
- Duplicate digital messages to the same recipient look like spam (even if the recipient provided the different eMail addresses himself or herself).
- You can’t measure a true response rate
Social is the new channel type for data matching. Most business cases for data matching related to social network profiles are probably based on multi-channel issues.

The concept of having a single customer view, or rather single party view, involves matching identities over offline, online and social channels, and typical elements used for data matching are not entirely the same for those channels as seen in the figure to the right.

Most data matching procedures are in my experience quite simple, with only a few data elements and no history tracking taken into consideration. However, we do see more sophisticated data matching environments, often referred to as identity resolution, where historical data, more data elements and even unstructured data are taken into consideration.

When doing multi-channel data matching you can’t avoid going from the popular simple data matching environments to more identity-resolution-like environments.
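As a rough illustration of matching identities across channels, the sketch below links records that share any identifier value, using a small union-find structure. The records, the identifier names (`name_address`, `email`, `handle`) and the exact-match comparison are assumptions for the example. A real environment would add fuzzy matching and history.

```python
from collections import defaultdict

# Hypothetical records from offline, online and social channels.
records = [
    {"id": "offline-1", "name_address": "anna jensen|main street 1"},
    {"id": "online-1", "name_address": "anna jensen|main street 1", "email": "anna@example.com"},
    {"id": "online-2", "email": "anna@example.com", "handle": "@annaj"},
    {"id": "social-1", "handle": "@annaj"},
    {"id": "online-3", "email": "peter@example.com"},
]

def resolve(recs, keys=("name_address", "email", "handle")):
    """Cluster records that are connected through any shared identifier."""
    parent = {rec["id"]: rec["id"] for rec in recs}

    def find(x):
        # Follow parent pointers to the cluster root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # (identifier type, value) -> id of the first record carrying it
    for rec in recs:
        for key in keys:
            value = rec.get(key)
            if not value:
                continue
            token = (key, value)
            if token in first_seen:
                parent[find(rec["id"])] = find(first_seen[token])
            else:
                first_seen[token] = rec["id"]

    clusters = defaultdict(list)
    for rec in recs:
        clusters[find(rec["id"])].append(rec["id"])
    return sorted(sorted(ids) for ids in clusters.values())

single_party_view = resolve(records)
```

Note that `offline-1` and `social-1` have no identifier in common. They end up in the same cluster only because the online records bridge them, which is why the channels have to be matched together rather than one at a time.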

Some advice for getting it right without too much complication:

Emphasize data capture by getting it right the first time. It helps a lot.
Get your data models right. Here reflecting the real world helps a lot.
Don’t reinvent the wheel. There are services for this out there. They help a lot.

Read more about such a service in the post instant Single Customer View.

Data Quality Vendors Beware of SEO Agencies

1st March 2013 | Henrik Gabs Liliendahl | 7 Comments

As reported in the post Fighting Identity Fraud with Identity Fraud and experienced with the post 255 Reasons for Data Quality Diversity, I have seen several sloppy attempts at link building by SEO agencies working for data quality tool vendors.

The other day it happened again, this time on LinkedIn.

There was a comment in the Master Data Management Interest group:

The comment is now deleted by the author and I do understand why.

I guess an SEO guy was working for Simon at DataLadder and Nathan from somewhere else at the same time and had been given access to their LinkedIn accounts. However, he/she posted a comment meant to be from Simon while logged in as Nathan (who is not working with MDM and data quality).

So, data quality tool and service vendors: You can’t fight identity fraud with identity fraud and you can’t advocate for a single view of customer with a messy view of you as a vendor. Be authentic.

The New Year in Identity Resolution

31st December 2012 | Henrik Gabs Liliendahl | 5 Comments

You may divide doing identity resolution into these categories:

Hard core identity check
Light weight real world alignment
Digital identity resolution

Hard Core Identity Check

Some business processes require a solid identity check. This is usually the case, for example, for credit approval and employment enrolment. Identity checks are also part of criminal investigation and fighting terrorism.

Services for identity checks vary from country to country because of different regulations and different availability of reference data.

An identity check usually involves the entity who is being checked.

Light Weight Real World Alignment

In data quality improvement and Master Data Management (MDM) you often include some form of identity resolution in order to have your data aligned with the real world. For example when evaluating the result of a data matching activity with names and addresses, you will perform a lightweight identity resolution which leads to marking the matched results as true or false positives.

This kind of identity resolution usually doesn’t involve the entity being examined.
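Once the matched results have been marked as true or false positives, the marks can be summed up into a simple quality measure. A minimal sketch, assuming a hypothetical list of reviewed pairs and using precision (the share of true positives among all reported matches) as the measure:

```python
# Hypothetical review outcome: each reported match has been checked against the real world.
reviewed_matches = [
    {"pair": (1, 2), "verdict": "true positive"},
    {"pair": (3, 7), "verdict": "true positive"},
    {"pair": (4, 9), "verdict": "false positive"},
    {"pair": (5, 6), "verdict": "true positive"},
]

def precision(matches):
    """Share of the reported matches that turned out to be real duplicates."""
    if not matches:
        return 0.0
    true_positives = sum(1 for m in matches if m["verdict"] == "true positive")
    return true_positives / len(matches)

match_precision = precision(reviewed_matches)
```

This only covers the matches that were reported. Duplicates the matching missed (false negatives) need a separate review.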

Digital Identity Resolution

Our existence has increasingly moved to the online world. As discussed in the post Addressing Digital Identity this means that we also will need means to include digital identity into traditional identity resolution.

There are of course discussions out there about how far digital identity resolution should be possible. For example real name policy enforcement in social networks is indeed a hot topic.

Future Trends

With regard to digital identity resolution the jury is still out. In my eyes we can’t avoid that the economic consequences of the rising social sphere will affect the demand for knowing who is out there. Also the opportunities in establishing identity via digital footprints will be exploited.

My guess is that the distinction between hard core identity check and real world alignment in data quality improvement and MDM will disappear as reference data will become more available and the price of reference data will go down.

That’s why I’m right now working with a solution (www.instantdq.com) that combines identity check features and data universe into master data management with the possibility of adding digital identity into the mix.

Fighting Identity Fraud with Identity Fraud

6th December 2012 | Henrik Gabs Liliendahl | 3 Comments

I have earlier had issues with SEO agencies posting comments on this blog in their quest to help data quality tool vendors in getting better search rank for data quality related terms. Example here.

This happened again today with a recent post called Addressing Digital Identity.

I find it quite funny that the SEO guy is talking about fighting identity fraud while posting a comment under a name that I bet is not his/her real name:
