Names, Addresses and National Identification Numbers

When working with customer, or rather party, master data management and related data quality improvement and prevention for traditional offline and some online purposes, you will most often deal with names, addresses and national identification numbers.

While this may be tough enough for domestic data, doing this for international data is a daunting task.

Names

In reality there should be no difference between dealing with domestic data and international data when it comes to names, as people in today’s globalized world move between countries and bring their names with them.

Traditionally the emphasis on data quality related to names has been on dealing with the most frequent issues, be that heaps of nicknames in the United States and other places, having a “van” in bulks of names in the Netherlands or having loads of surname-like middle names in Denmark.
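As a minimal sketch of how two of these issues are often handled, the snippet below maps nicknames to formal given names and separates prefixes like “van” from the core surname. The lookup tables and function names are hypothetical and tiny; real reference data would be far larger and country specific.

```python
from typing import Tuple

# Hypothetical, illustrative lookup tables (assumptions, not real reference data).
NICKNAMES = {"bob": "robert", "bill": "william", "liz": "elizabeth"}
SURNAME_PREFIXES = {"van", "de", "der", "den"}


def normalize_given_name(name: str) -> str:
    """Return the formal form of a nickname when the table knows it."""
    key = name.strip().lower()
    return NICKNAMES.get(key, key)


def split_surname_prefix(surname: str) -> Tuple[str, str]:
    """Separate leading prefixes such as "van der" from the core surname."""
    tokens = surname.strip().lower().split()
    prefix = []
    # Always keep at least one token as the core surname.
    while len(tokens) > 1 and tokens[0] in SURNAME_PREFIXES:
        prefix.append(tokens.pop(0))
    return " ".join(prefix), " ".join(tokens)
```

Comparing the core surname and the normalized given name, rather than the raw strings, is one way to avoid treating “Bob van der Berg” and “Robert Berg” as unrelated before any fuzzy matching is applied.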

With company names there are some differences to be considered like the inclusion of legal forms in company names as told in the post Legal Forms from Hell.

Addresses

Address formats vary between countries. That’s one thing.

The availability of public sources for address reference data varies too. These variations are related to, for example:

  • Coverage: Is every part of the country included?
  • Depth: Is it street level, house number level or unit level?
  • Costs: Are reference data expensive or free of charge?

As told in the post Postal Code Musings the postal code system in a given country may be the key (or not) to how to deal with addresses and related data quality.

National Identification Numbers

The post called Business Entity Identifiers includes how countries have different implementations of either all-purpose national identification numbers or single-purpose national identification numbers for companies.

In the same way, there are different administrative practices for individuals, for example:

  • As I understand it, it is forbidden by the constitution down under to have all-purpose identification numbers for individuals.
  • The United States Social Security Number (SSN) is often mentioned in articles about party data management. It’s an example of a single-purpose number in fact used for several purposes.
  • In Scandinavian countries all-purpose national identification numbers are in place as explained in the post Citizen ID within seconds.

Dealing with diversity

Managing party master data in the light of the above-mentioned differences around the world isn’t simple. You need comprehensive data governance policies and business rules, you need elaborate data models and you need a well-equipped toolbox for data quality prevention and for exploiting external reference data.

Keep It Real, Stupid

One of my pet peeves is the KISS principle: Keep It Simple, Stupid.

Don’t get me wrong: It’s worth striving for simplicity wherever possible. But some problems are not simple and do not have simple solutions. Sometimes KISS is the shortcut to getting it all wrong.

Another take on simplicity is a quote floating around in social media these days:

[Image: Simply Einstein]

Oh, so Einstein said that. So you can’t argue with that.

Well, he probably didn’t as Wikiquote reports:

[Image: Simply Not Einstein]

So let’s stick to a real Einstein quote:

“Everything should be as simple as it can be, but not simpler”

A great quote related to data quality and master data management by the way.

Big Data and Data Matching

Data matching has been an established discipline for many years. Most data quality tools have more or less sophisticated features for data matching, and many MDM (Master Data Management) platforms have data matching capabilities as well.

[Image: The LinkedIn Big Data Quality group]

In a way the data matching realm has become slightly dull in recent years. People don’t get excited anymore over a discussion about whether deterministic matching or probabilistic matching is the right way. Soundex is old, edit distance has been around for ages and matchcodes may have outlived themselves.
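For readers who haven’t met the two old-timers, here is a minimal sketch of both. The Soundex variant is the common American one (h and w do not separate equal codes), and the edit distance is plain Levenshtein. The function names are my own for illustration and are not taken from any particular tool.

```python
# Letter groups of American Soundex mapped to their digit codes.
_SOUNDEX_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _SOUNDEX_CODES[_ch] = _digit


def soundex(name: str) -> str:
    """Return the four-character Soundex code, or "" if there are no letters."""
    letters = [ch for ch in name.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""
    encoded = letters[0].upper()
    previous = _SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w are skipped and do not separate equal codes
        code = _SOUNDEX_CODES.get(ch, "")
        if code and code != previous:
            encoded += code
        previous = code  # a vowel resets this to "", so a repeated code counts again
    return (encoded + "000")[:4]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character edits."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]
```

Soundex gives a yes/no answer on whether two names sound alike in English, while edit distance gives a number you have to put a threshold on, which is part of why the deterministic versus probabilistic discussion never really ended.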

So, it’s good to see a new beast turning up. Data matching with big data.

It may be about deduplicating (deduping) volumes that are bigger than traditional data matching can handle. You know: Dedoop’ing.

But it is also very much about matching big data with small data, first and foremost master data. And having well matched master data. Kimmo Kontra wrote a good post about that recently. The post is called Big Grease, Big Data, and Big Apple – manholes and MDM.

The case presented by Kimmo holds many exciting implementations of data matching, such as proximity matching of locations.
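Proximity matching of locations can be sketched as follows: two records are considered candidates for the same place if their coordinates are within a given distance. This is only an illustration, not the method used in the case; the haversine formula is standard, but the 50 metre threshold and the function names are assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_proximity_match(p, q, max_distance_m: float = 50.0) -> bool:
    """p and q are (latitude, longitude) tuples; the threshold is an assumption."""
    return haversine_m(p[0], p[1], q[0], q[1]) <= max_distance_m
```

The right threshold depends on the kind of object: a manhole and a building need very different tolerances.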

Why You shouldn’t go to the MDM Summit Europe 2013

The weather in London has been awful this March. The forecast for the first week of April doesn’t meet historical standards either. The MDM Summit Europe 2013 will be in London 15th to 17th April. You shouldn’t go there because of the weather based on the trend in the weather forecast:

[Image: London Forecast April 2013]

On the other hand, it could heat up indoors.

There are quite a lot of exciting sessions, including the ones about:

And hey, it has happened before that the weather has suddenly improved.

Small Data with Big Impact

In an ongoing discussion on LinkedIn there are some good points on: How important is data quality for big data compared to data quality for small data?

A repeated sentiment in the comments is that data quality for small data is going to be more important with the rise of big data.

The small data we are talking about here is first and foremost master data.

Master Data Challenges with Big Data

As with traditional transaction data, master data also describes the who, what, where and when of big data.

If we are having issues with completeness, timeliness and uniqueness in our master data, any prediction based on big data matched with master data is going to be as chaotic as weather forecasts.
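To make the three dimensions a bit more tangible, here is a rough sketch of how they could be measured on a list of master data records. The definitions are deliberately simplistic assumptions (share of filled fields, share of recently updated records, share of distinct normalized keys), and the function names are hypothetical.

```python
from datetime import date, timedelta


def completeness(records, field):
    """Share of records where the field is present and not empty."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


def timeliness(records, field, as_of, max_age_days):
    """Share of records whose date field is at most max_age_days old."""
    if not records:
        return 0.0
    limit = timedelta(days=max_age_days)
    fresh = sum(1 for r in records
                if r.get(field) is not None and as_of - r[field] <= limit)
    return fresh / len(records)


def uniqueness(records, key_fields):
    """Distinct normalized keys divided by the number of records."""
    if not records:
        return 0.0
    keys = {tuple((r.get(f) or "").strip().lower() for f in key_fields)
            for r in records}
    return len(keys) / len(records)
```

A uniqueness below 1.0 on a naive key like the name only hints at duplicates; confirming them is where the real data matching starts.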

We also need to expand the range of entities embraced by our master data management implementations as exemplified in the post Social MDM and Future Competitive Intelligence.

Matching Big Data with Master Data

Some of the issues in matching big data with master data I have stumbled upon are:

  • Who: How do we link the real world entities reflected in our traditional systems of record with the real world entities behind who’s talking in systems of engagement? This question was touched on in the post Making Sense with Social MDM.
  • What: How do we manage our product hierarchies and product descriptions so they fulfill both (different) internal purposes and external usage? More on this in the post Social PIM.
  • Where: How do we identify a given place? If you think this is easy, why not read the post Where is the Spot?
  • When: Dates and times come in many formats, and relating events to the wrong schedule may have us Going in the Wrong Direction.
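The “when” issue is easy to demonstrate. The sketch below parses the same string under a day-first and a month-first format and returns every distinct date it could mean. The two formats chosen and the function name are assumptions for illustration.

```python
from datetime import date, datetime
from typing import Iterable, Set

# Two common but conflicting conventions (an assumption; real data has many more).
FORMATS = ("%d/%m/%Y", "%m/%d/%Y")


def candidate_dates(raw: str, formats: Iterable[str] = FORMATS) -> Set[date]:
    """Return every distinct date the raw string could represent."""
    candidates = set()
    for fmt in formats:
        try:
            candidates.add(datetime.strptime(raw, fmt).date())
        except ValueError:
            continue  # the string is not valid under this format
    return candidates
```

A string like "03/04/2013" comes back with two candidates, 3 April and 4 March, so the format has to be known from the source rather than guessed from the value.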

How: You may for example follow this blog. Subscription is in the upper right corner 🙂

Sharing is the Future of MDM

Over at the DataRoundtable blog Dylan Jones recently posted an excellent piece called The Future of MDM?

Herein Dylan examines how a lot of people in different organizations spend a lot of time on trying to get complete, timely and unique data about customers and other business partners.

A better future for MDM (Master Data Management) could certainly be that each organization doesn’t have to do the same work over and over again. While self-registration by customers is a way of easing the burden on private enterprises and public sector bodies, we may do even better by not having the customer act as the data entry clerk, typing in the same information over and over again.

Today there are several available options for customer and other business partner reference data:

  • Public sector registries, which are becoming more and more open, for example for the address part or even deeper, with due respect for privacy considerations that may differ between business entities and individual entities.
  • Commercial directories, often built on top of public registries.
  • Personal data lockers like the Mydex service mentioned by Dylan.
  • Social network profiles.

My guess is that the future of MDM is going to be a mashup of exploiting the above options.

Oh, and as representatives of such a mashup service we recently at iDQ made sure we had the accurate, complete and timely information filled in on our LinkedIn Company profile.

Making sense with Social MDM

A few days ago Jeff Jonas of IBM made a new blog post called Master Data Management (MDM) vs. Sensemaking.

Herein Jeff Jonas ponders the differences in the data matching algorithms we use in traditional MDM, predominantly name and address matching, and the kind of identity resolution we need when we for example try to listen to and make sense of the signals in the social media data streams.

Jeff Jonas says: “Different missions, different tools. Some organizations will use one or the other; most organizations will want both.”

I tend to disagree slightly with Jeff Jonas. As told in the post The New Year in Identity Resolution I think we will need a connection between the old systems of record and the new systems of engagement.

Indeed the algorithms will be used differently and indeed we need different thresholds of confidence for different tasks. But I think we will have to make the integration story a bit more complicated in order to make sensible decisions across the two missions.
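One way to picture “different thresholds of confidence for different tasks” is a single match score that is judged against task-specific cut-offs. The task names and threshold values below are purely hypothetical and only serve to show the idea.

```python
# Hypothetical (accept, review) thresholds per task; the values are assumptions.
THRESHOLDS = {
    "merge_master_records": (0.95, 0.80),  # system of record: be strict
    "link_social_profile": (0.70, 0.50),   # system of engagement: be lenient
}


def decide(score: float, task: str) -> str:
    """Classify a match score as accept, review or reject for a given task."""
    accept_at, review_at = THRESHOLDS[task]
    if score >= accept_at:
        return "accept"
    if score >= review_at:
        return "review"
    return "reject"
```

With these numbers a score of 0.8 would be accepted for linking a social profile but only queued for review when merging master records, which is the kind of connected but differentiated decision making I have in mind.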

Data Management in the Cloud

We are seeing more and more data management services offered in the cloud.

As I have long experience with data matching services around the Dun & Bradstreet WorldBase, it was good to see a presentation yesterday in Stockholm featuring D&B Europe’s new cloud-based data manager service.

Managing World-Wide B2B Master Data

The D&B WorldBase is a business directory with 225 million business entities from all over the world.

D&B’s Data Manager is a self-service application in the cloud around the WorldBase taking care of:

  • Data matching with comprehensive functionality for manual inspection, approval and master data survivorship
  • Data enrichment embracing a wide range of data attributes
  • Data Maintenance subscription for keeping enriched data up to date

The data matching functionality is built on the good old D&B methodology with confidence codes and matchgrades.

Right for QlikTech

QlikTech is the Swedish firm (pretending to be American) behind the prominent business intelligence solution called QlikView.

At the Stockholm event QlikTech presented how and why they use the D&B Data Manager for ensuring the right data quality in their cloud-based B2B CRM solution (Salesforce.com).

As QlikTech is operating all over the world, having a consistent world-wide business directory as the reference for party master data is extremely important, and the self-service concept is a perfect match for having the right insight and control into achieving the needed level of data quality in CRM master data.

From there the QlikTech CRM team takes its own medicine using QlikView for self-service business intelligence.

instant Single Customer View

Achieving a Single Customer View (SCV) is a core driver for many data quality improvement and Master Data Management (MDM) implementations.

As most data quality practitioners will agree, the best way of securing data quality is getting it right the first time. The same is true about achieving a Single Customer View. Get it right the first time. Have an instant Single Customer View.

The cloud based solution I’m working with right now does this by:

  • Searching external big reference data sources with information about individuals, companies, locations and properties as well as social networks
  • Searching internal master data with information already known inside the enterprise
  • Inserting really new entities or updating current entities by picking as much data as possible from external sources
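The three steps above can be sketched roughly like this. It is not how the actual solution is built; the in-memory dictionaries, the naive match key and the made-up example values are all assumptions chosen to keep the example short.

```python
def match_key(record):
    """Naive key: normalized name plus postal code (an assumption)."""
    return (record["name"].strip().lower(), record["postal_code"].strip())


def capture_party(entered, internal_master, external_reference):
    """Search external and internal data, then insert or update.

    Both stores are plain dicts keyed by match_key(); a real solution
    would use error-tolerant search instead of an exact key lookup.
    """
    key = match_key(entered)
    # Step 1: look for the entity in the external reference data.
    reference = external_reference.get(key, {})
    # Prefer reference values over what was typed in.
    merged = {**entered, **reference}
    # Step 2: look for the entity in the internal master data.
    if key in internal_master:
        # Step 3a: the entity is known, so update it.
        internal_master[key].update(merged)
        return "updated"
    # Step 3b: the entity is really new, so insert it.
    internal_master[key] = merged
    return "inserted"
```

Because the lookup happens before anything is stored, the duplicate never gets created in the first place.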

Some essential capabilities in doing this are:

  • Searching is error tolerant so you will find entities even if the spelling is different
  • The receiving data model is real world aligned. This includes:
    • Party information and location information have separate lives as explained in the post called A Place in Time
    • You may have multiple means of contact attached like many phones, email addresses and social identities
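As an illustration of these capabilities, the sketch below models parties, locations and contact points as separate things, with a time-bound link between party and location, and adds a simple error-tolerant search using the standard library’s difflib. The class names, the 0.8 similarity threshold and the search function are assumptions, not the actual data model of the solution.

```python
from dataclasses import dataclass, field
from datetime import date
from difflib import SequenceMatcher
from typing import List, Optional


@dataclass
class Location:
    address: str


@dataclass
class ContactPoint:
    kind: str   # for example "phone", "email" or "social"
    value: str


@dataclass
class PartyLocation:
    """Links a party to a location for a period of time."""
    location: Location
    valid_from: date
    valid_to: Optional[date] = None  # None means still current


@dataclass
class Party:
    name: str
    locations: List[PartyLocation] = field(default_factory=list)
    contacts: List[ContactPoint] = field(default_factory=list)


def fuzzy_search(query: str, parties: List[Party], min_ratio: float = 0.8) -> List[Party]:
    """Return parties whose name is similar to the query, best match first."""
    q = query.strip().lower()
    hits = []
    for party in parties:
        ratio = SequenceMatcher(None, q, party.name.lower()).ratio()
        if ratio >= min_ratio:
            hits.append((ratio, party))
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return [party for _, party in hits]
```

Keeping the location as its own entity means that when a party moves, only the link gets an end date; the place itself lives on for the next party.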

How do you achieve a Single Customer View?

Master Data Management in the Utility Sector

Making vertical MDM (Master Data Management) solutions, meaning MDM solutions prepared for a given industry, seems to be becoming a trend in the MDM realm.

Traditionally many MDM solutions actually are strong in a given industry or a few related industries.

This is also true for the MDM solution I’m working with right now, as this solution has gained traction in the utility sector.

So, what’s special (and not entirely special) about the utility sector?

Here are three of my observations:

Exploiting big external reference data

As examined in the post instant Data Quality at Work, the utility sector may gain much from using all the external reference data available in the party master data domain, including:

  • Consumer/citizen directories
  • Business directories
  • Address directories
  • Property directories

However, if data quality shouldn’t be a joke, this means using the best national data sources available, as many of the world-wide data sources in this domain are far from providing the precision, accuracy and timeliness needed in the utility sector.

Location precision

Managing locations is a big thing in the utility sector. The post called Where is the Spot explains how identifying locations isn’t as simple as we tend to think in daily life.

This is indeed also true in the utility sector, where the issue also includes managing many different locations for the same customer, fulfilling different purposes at the same time.

The products

The electricity supply part of the utility sector shares a lot of issues with the telco sector when it comes to fixed installations. In some cases the products and services are in fact the same, and as a consequence some organizations belong to both sectors.

This is also a danger with vertical MDM solutions, as there may be several best-of-breed options for a given organization, which eventually will result in choosing more than one platform and thereby introducing the silos which MDM was supposed to eliminate in the first place.