The Data Enrichment ABC

A popular and indeed valuable method of avoiding decay of data quality in customer master data and other master data entities is setting up data enrichment services based on third party reference data sources. Examples of such services are:

  • Relocation updates like National Change Of Address services from postal services
  • Change of name, address and a variety of status updates from business directories and in some countries citizen directories too

When using such services you will typically want to consider the following options for how to deal with the updates:

A: Automatic Update

Here your internal master data will be updated automatically when a change is received from the external reference data source.

C: Excluded Update

Here an automated rule will exclude the update, as there may be a range of reasons why you don’t want to update certain entity segments under certain circumstances.

B: Interactive Update

Here the update will require a form of manual intervention either to be fulfilled or excluded based on human decision.

An example will be if a utility supplier receives a relocation update for the occupier at an installation address. This will trigger/support a complex business process far beyond changing the billing address.
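
The three options can be sketched as a small routing rule. This is only a minimal illustration under my own assumptions: the segment names, the `Update` fields and the `route_update` function are hypothetical and do not describe any particular tool.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AUTOMATIC = "A"    # apply the change without human involvement
    INTERACTIVE = "B"  # park the change for a data steward to decide
    EXCLUDED = "C"     # drop the change by an automated rule


@dataclass
class Update:
    entity_id: str
    segment: str     # hypothetical segment label, e.g. "utility_occupier"
    attribute: str   # the attribute the reference source says has changed
    new_value: str


# Hypothetical rule sets; real ones would come from your business rules.
EXCLUDED_SEGMENTS = {"legal_hold"}
INTERACTIVE_SEGMENTS = {"utility_occupier"}


def route_update(update: Update) -> Route:
    """Decide how an incoming reference data change should be handled."""
    if update.segment in EXCLUDED_SEGMENTS:
        return Route.EXCLUDED
    if update.segment in INTERACTIVE_SEGMENTS:
        return Route.INTERACTIVE
    return Route.AUTOMATIC


relocation = Update("42", "utility_occupier", "address", "1 New Street")
print(route_update(relocation).name)
```

In this sketch the relocation at an installation address ends up in the B pot, because that segment is listed as one needing a human decision.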

As explained in the post When Computer Says Maybe, we need functionality within data quality tools and Master Data Management (MDM) solutions to support data stewards in handling these situations cost-effectively, and this certainly also applies to the B pot in data enrichment.

Right now I’m working on designing such data stewardship functionality within the instant Data Quality environment.

Know Your Supplier

Social Responsibility for Retailers and Distributors is No Longer an Option is the title of a new blog post by Paul Sirface on the Stibo Systems Datafusion blog.

Herein Paul writes:

“While many companies know that they have to respond to consumers’ demands, those with an active Master Data Management strategy have the best chance of responding effectively. Multi-domain Master Data Management (MDM) is the perfect place to begin organizing and collecting the data on product related information within the supply chain, including supplier compliance.”

Know Your Customer (KYC) is a well-established term within data management, linked to fraud protection and anti-money laundering.

Know Your Supplier (KYS) is indeed an equally important side of party master data management.

While customer master data management is evolving from handling mostly domestic customer data quality issues to also handling international ones, supplier master data management has always been about international data quality challenges for most businesses.

As with customer master data, having supplier master data that is well aligned with the real world, and that can be maintained to reflect changes in the real world, is indeed the starting point.

The Future of Data Stewardship

Data Stewardship is performed by data stewards.

What is a Data Steward?

A steward may in a general sense be:

  • One employed in a large household or estate to manage domestic concerns – typically an old role.
  • An employee on a ship, airplane, bus, or train who attends to passengers’ needs – typically a new role.

My guess is that data stewardship will also tend to move from the first kind of role to the latter kind of role in relation to data.

The current data steward role is predominately seen as the oversight of the house-holding related to the internal enterprise data assets. It’s about keeping everything there clean and tidy. It involves having routines and rules that ensure that things with data are done properly according to the traditions and culture in the enterprise.

Big Data Stewardship

In the future enterprises will rely much more on external data. Exploiting third party reference data and open government data, and digging into big data sources such as social data and sensor data, will shift the focus away from looking mostly into keeping the internal data fit for purpose.

As such you as a data steward will become more like the steward on a ship, airplane, bus or train. Data will come and go. After a nice welcoming smile you will have to carefully explain the safety procedures. Some data will be fairly easy to handle – mostly just spending the time sleeping. Other data will be demanding, asking for this and that and changing its mind shortly after. Some data will be a frequent traveler and some data will be there for the first time.

So, are you ready to attend the next batch of travelling data on board your enterprise?

How is Social MDM different?

In a recent interview with yours truly on the Fliptop blog I had the chance to answer a question about how Social MDM is different from traditional MDM (Master Data Management). Check out the interview here.

As said in the interview I think that:

“The main difference between MDM as it has been practiced until now and Social MDM is that traditional MDM has been around handling internal master data and Social MDM will be more around exploiting external reference data and sharing those data.”

This is in line with a take away from the MDM Summit Europe 2013 as reported in the post Adding 180 Degrees to MDM.

But, as asked by a member of the Social MDM group on LinkedIn:

“What is the industry or analysts’ consensus on the meaning of Social MDM? Is it just gathering Master Data from social sources? Not really MDM – where is the Management part?”

You may follow the discussion here.

I definitely think that the management part is there, but it is different. Management is different in the social sphere in general. Data governance is different when it comes to social data (and other big data for that matter). Relying on social collaboration when maintaining master data is different from implementing “a data steward regime”.

In my eyes the management part is about balancing the use of internal master data and the use of external reference data. Every organization should very carefully assess whether it is good at maintaining the different aspects of its internal master data (Hint: Many aren’t). Getting help from traditional data collectors and the new social sources, and using social collaboration, may very well be an important part of the solution.

Names, Addresses and National Identification Numbers

When working with customer, or rather party, master data management and related data quality improvement and prevention for traditional offline and some online purposes, you will most often deal with names, addresses and national identification numbers.

While this may be tough enough for domestic data, doing this for international data is a daunting task.

Names

In reality there should be no difference between dealing with domestic data and international data when it comes to names, as people in today’s globalized world move between countries and bring their names with them.

Traditionally the emphasis on data quality related to names has been on dealing with the most frequent issues, be that heaps of nicknames in the United States and other places, having a “van” in lots of names in the Netherlands or having loads of surname-like middle names in Denmark.

With company names there are some differences to be considered like the inclusion of legal forms in company names as told in the post Legal Forms from Hell.

Addresses

Address formats vary between countries. That’s one thing.

The availability of public sources for address reference data varies too. These variations are related to for example:

  • Coverage: Is every part of the country included?
  • Depth: Is it street level, house number level or unit level?
  • Costs: Are reference data expensive or free of charge?

As told in the post Postal Code Musings the postal code system in a given country may be the key (or not) to how to deal with addresses and related data quality.

National Identification Numbers

The post called Business Entity Identifiers includes how countries have different implementations of either all-purpose national identification numbers or single-purpose national identification numbers for companies.

The same way there are different administrative practices for individuals, for example:

  • As I understand it, it is forbidden by the constitution down under to have all-purpose identification numbers for individuals.
  • The United States Social Security Number (SSN) is often mentioned in articles about party data management. It’s an example of a single-purpose number in fact used for several purposes.
  • In Scandinavian countries all-purpose national identification numbers are in place as explained in the post Citizen ID within seconds.

Dealing with diversity

Managing party master data in the light of the above mentioned differences around the world isn’t simple. You need comprehensive data governance policies and business rules, you need elaborate data models and you need a quite well equipped toolbox regarding data quality prevention and exploiting external reference data.

Putting it Right

Data Governance (DG), Reference Data Management (RDM) and Master Data Management (MDM) are closely related disciplines.

Consequently the Data Governance Conference Europe 2013 and the Master Data Management Summit Europe 2013 are co-located, and a hot topic this year is Reference Data Management.

The difficulty of putting each conference session in the one right place can be seen from the session called Establishing Reference Data Governance in the Large Enterprise. It is part of an MDM track, but is actually mostly about data governance. The session is labeled Product MDM & Reference Data, but will be about governing reference data for multi-domain MDM, and the data governance program described was in fact based on a party master data challenge involving reference data for industry classification.

In the session Petter Larsen, Head of Data Governance at Norway’s largest financial services group called DNB, and Thomas T. Thykjaer, Lead MDM Consultant at Capgemini, will connect the dots in the landscape of business vocabularies, data models, the data governance toolbox, data domains and reference data architecture.

I certainly look forward to Petter and Thomas putting it right.

The Data Governance Jigsaw Puzzle

Picture this: You find yourself taking over a challenging Data Governance initiative part way through and the path to complete the implementation is far from clear.

Most learning and best practices for data governance implementation, and a lot of other implementations of whatever, are based on doing the stuff from start to end. But in fact many people are thrown into the journey somewhere along the route without any history of their own on how the journey began, no clear understanding of why the actual direction was taken and no clue about where the end of the rainbow is supposed to be.

If this isn’t hard enough, the good people organizing the Data Governance Conference Europe 2013 (co-located with the MDM Summit) have put the session from Nicola Askham on this tough challenge almost at the end of the program. Check it out here.

Last Friday I met Nicola for an after work drink at a secret place in the City of London and I can assure you that Nicola despite all odds is fit for fight and ready to kick y… well, putting the puzzle together.

Business in the Driver’s Seat for MDM

It has always been a paradox in Master Data Management (MDM), and many other IT enabled disciplines, that while most people agree that the business part of business should take the lead, often it is the IT part of business that is running the projects.

However, at Tetra Pak, a multi-national company of Swedish origin, MDM has been approached as a business problem rather than as an IT problem.

Yesterday I touched base with Program Manager Jesper Persson at Tetra Pak.

A main reason for Tetra Pak to focus on MDM was having a very specific business problem related to master data, not an IT problem. Taking it from there the business has been in the driver’s seat for the MDM journey.

Master data quality and related data quality dimensions are seen as triggers for the essential KPIs related to process performance. The model for getting this right starts with the business requirements, puts the needed data governance in place and gets on with managing master data, which leads to the actual master data maintenance, all as part of business process management.

Jesper will tell a lot more at the Master Data Management Summit Europe 2013 in London in the session Business in the Driver’s Seat for MDM – Integrating MDM with BPM.

MDM Summit Europe 2013 Wordle

The Master Data Management Summit Europe 2013, co-located with the Data Governance Conference Europe 2013, takes place in London from the 15th to the 17th of April.

Here is a wordle with the session topics:

MDMDG 2013 wordle

Some of the words catching my eye are:

Global is part of several headlines. There is no doubt that governing master data on a global scale is a very timely subject. Handling master data in a domestic context can be hard enough, but enterprises are facing a daunting task when embracing party master data, product master data and location master data covering the diversity of languages, script systems, measuring systems, national standards and regulatory requirements. However, there is no way around the challenges when synergies in global enterprises are to be harvested.

RDM (Reference Data Management) is becoming a popular subject as well. Being successful with governing master data requires a steady hand with the reference data layer that sits on top of the master data. Some reference data sets may be small, but the importance of getting them right must not be underestimated.

Business. Oh yes. All the data stuff is there to enable business processes, drive business transformation and make business opportunities.

The New Year in Identity Resolution

You may divide doing identity resolution into these categories:

  • Hard core identity check
  • Light weight real world alignment
  • Digital identity resolution

Hard Core Identity Check

Some business processes require a solid identity check. This is usually the case for credit approval and employment enrolment, for example. Identity checks are also part of criminal investigation and fighting terrorism.

Services for identity checks vary from country to country because of different regulations and different availability of reference data.

An identity check usually involves the entity who is being checked.

Light Weight Real World Alignment

In data quality improvement and Master Data Management (MDM) you often include some form of identity resolution in order to have your data aligned with the real world. For example when evaluating the result of a data matching activity with names and addresses, you will perform a lightweight identity resolution which leads to marking the matched results as true or false positives.

Doing such kind of identity resolution usually doesn’t involve the entity being examined.
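
As a rough illustration, marking matched results as true or false positives could look like the sketch below. It assumes, purely for the sake of the example, that each record has already been looked up against some external reference data and given a reference identifier (or none). The function names, the `resolved` lookup and the extra "unresolved" outcome are my own hypothetical additions.

```python
from typing import Dict, List, Optional, Tuple


def classify_match(ref_a: Optional[str], ref_b: Optional[str]) -> str:
    """Compare the reference identifiers resolved for two matched records."""
    if ref_a is None or ref_b is None:
        # One of the records could not be found in the reference data.
        return "unresolved"
    return "true positive" if ref_a == ref_b else "false positive"


def evaluate_matches(
    pairs: List[Tuple[str, str]],
    resolved: Dict[str, str],
) -> Dict[Tuple[str, str], str]:
    """Mark each matched pair of record ids from a data matching run."""
    return {
        pair: classify_match(resolved.get(pair[0]), resolved.get(pair[1]))
        for pair in pairs
    }


# Hypothetical record ids mapped to hypothetical reference identifiers.
resolved = {"rec1": "REF-100", "rec2": "REF-100", "rec3": "REF-200"}
pairs = [("rec1", "rec2"), ("rec1", "rec3"), ("rec3", "rec4")]
for pair, verdict in evaluate_matches(pairs, resolved).items():
    print(pair, verdict)
```

Pairs that end up as "unresolved" would typically still need a human eye, since the reference data cannot confirm or reject them.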

Digital Identity Resolution

Our existence has increasingly moved to the online world. As discussed in the post Addressing Digital Identity this means that we also will need means to include digital identity into traditional identity resolution.

There are of course discussions out there about how far digital identity resolution should be possible. For example real name policy enforcement in social networks is indeed a hot topic.

Future Trends

With regard to digital identity resolution the jury is still out. In my eyes we can’t avoid that the economic consequences of the rising social sphere will affect the demand for knowing who is out there. Also the opportunities in establishing identity via digital footprints will be exploited.

My guess is that the distinction between hard core identity check and real world alignment in data quality improvement and MDM will disappear as reference data will become more available and the price of reference data will go down.

That’s why I’m right now working on a solution (www.instantdq.com) that combines identity check features and a data universe with master data management, with the possibility of adding digital identity into the mix.