Tear Down These Walls

Over at The Data Roundtable there is some good thinking going on. Recently Dylan Jones blogged: Want to improve data quality? Start by re-imagining your data boundaries.

In his blog post Dylan explains how data journeys are costly and risky. There are huge opportunities, not least for data quality, in simplifying the sharing of data by breaking down data boundaries.

The Berlin Wall. Fortunately it is not there anymore.

Data boundaries exist within organisations and between organisations. As doing business today involves businesses working together, we see more and more data being sent between businesses. Unfortunately this is often done using spreadsheets, as told in the post Excellence vs Excel.

We definitely need better ways to share data within organisations and between organisations. Furthermore, as Dylan points out, the data exchange needs to go in both directions. The ability to share data in an intelligent way depends on data being identified and described by commonly shared reference and master data.

In my experience, the ability of businesses to collaborate by sharing reference and master data, and to utilize available public sources, will be crucial in the quest for re-imagining data boundaries. This is indeed the future of data quality and The Future of Master Data Management.

Identity Resolution and Social Data

Identity Resolution

Identity resolution is a hot potato when we look into how we can exploit big data and, within that frame, not least social data.

Some of the most frequently mentioned use cases for big data analytics revolve around listening to social data streams and combining them with traditional sources within customer intelligence. In order to do that we need to know who is talking out there, and that must be done using identity resolution features encompassing social networks.

The first challenge is what we are able to do: how we can technically expand our data matching capabilities to use profile data and other clues from social media. This subject was discussed in a recent post on DataQualityPro called How to Exploit Big Data and Maintain Data Quality, an interview with Dave Borean of InfoTrellis. Here the InfoTrellis “contextual entity resolution” approach was mentioned by Dave.
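To give an idea of what expanding data matching to social clues could look like, here is a minimal sketch in Python. It is a toy illustration, not the InfoTrellis approach: it simply scores CRM contact names against social profile display names, and the `display_name` and `handle` fields are hypothetical.

```python
from difflib import SequenceMatcher

def normalize(name: str) -> str:
    """Lowercase and strip punctuation so cosmetic differences don't hurt the score."""
    return "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace()).strip()

def match_score(crm_name: str, social_name: str) -> float:
    """Similarity between a CRM contact name and a social display name (0..1)."""
    return SequenceMatcher(None, normalize(crm_name), normalize(social_name)).ratio()

def best_candidate(crm_name: str, profiles: list, threshold: float = 0.8):
    """Return the social profile best matching the CRM name, if any clears the threshold."""
    scored = [(match_score(crm_name, p["display_name"]), p) for p in profiles]
    score, profile = max(scored, key=lambda pair: pair[0])
    return profile if score >= threshold else None
```

Real-world matching would of course also use location, employer and other profile clues, but even this tiny example shows why a threshold is needed: below it, a non-match is safer than a wrong link.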

The second challenge is what we are allowed to do. Social networks have a natural interest in protecting members’ privacy; besides, they also have a commercial interest in doing so. The degree of privacy protection varies between social networks. Twitter is quite open but on the other hand holds very little usable material for identity resolution, and making sense of the streams is an issue too. Networks such as Facebook and LinkedIn are, for good reasons, not so easy to exploit due to the (changing) game rules applied.

As said in my interview on DataQualityPro called What are the Benefits of Social MDM: It is kind of a goldmine in a minefield.


In the future, data quality will be more social

Every time I walk on or off a plane at London Gatwick Airport I nod at an advert from HSBC bank saying that in the future, selling will be more social:

Selling will be more social

A natural consequence of this will also be that data quality improvement (and master data management) will be more social.

One example is how complex sales, sales processes typically found in business-to-business (B2B) environments, will be heavily dependent on integrating the exploitation of professional social networks, as discussed in the DataQualityPro interview about the benefits of Social MDM.

Traditional Master Data Management (MDM) and related data quality improvement in B2B environments has largely been about a single view of the business account and the legal entity behind it. As Social Customer Relationship Management (CRM) is much about relations to business contacts, the people side of business, we need a solid master data foundation behind the people who are those contacts.

The same individual may in fact be an important influencer related to a range of business accounts, those being the legal entities with whom you are aiming for a sales contract. You need a single view of that individual. Many sales contracts are based on a relationship with a buyer moving from one business account to another. You need to be the winner in that game, and the answer may very well be your ability to do better social MDM and embrace the data quality issues related to that.

Social selling of course also relates to business-to-consumer (B2C) activities, and in doing that we will see new data quality issues. When exploiting social networks, in both B2B and B2C activities, you need to link traditional attributes such as name and address with new attributes from the online and social world, as explained in the post Multi-Channel Data Matching.
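As a toy illustration of such multi-channel linking, the sketch below clusters party records that share any identifier across channels, so a party seen via email in one system and via a social handle in another ends up in one cluster. The field names (`email`, `phone`, `twitter`) are assumptions for the example; real matching would of course also involve fuzzy comparison, not only shared exact identifiers.

```python
from collections import defaultdict

def link_records(records: list) -> list:
    """Group records that share any identifier (email, phone, social handle)
    into clusters, using a simple union-find over record indices."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    seen = {}  # identifier value -> first record index holding it
    for idx, rec in enumerate(records):
        for key in ("email", "phone", "twitter"):
            value = rec.get(key)
            if value:
                if value in seen:
                    union(idx, seen[value])
                else:
                    seen[value] = idx

    clusters = defaultdict(set)
    for idx in range(len(records)):
        clusters[find(idx)].add(idx)
    return list(clusters.values())
```

Note the transitivity: two records that share no identifier directly still end up together if a third record bridges them, which is exactly how one online identity stitches several channel views together.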

Besides exploiting social networks we will also see social collaboration as a means to improve data quality. Social collaboration will go beyond collaboration within a single company and extend to ecosystems of manufacturers, distributors, resellers and end users. A good example of this is the social collaboration platform called Actualog, which is about sharing product master data and thereby improving product data quality.


Connecting CRM and MDM with Social Network Profiles

As told on DataQualityPro recently in an interview post about the Benefits of Social MDM, doing social MDM (Master Data Management) may still be under the radar of most MDM implementations. But there are plenty of things happening around connecting CRM (Customer Relationship Management) and social engagement.

While a lot of the talk is about the biggest social networks such as Facebook and LinkedIn, there is also activity around more local social networks like Xing, the German alternative to LinkedIn.


Last week I followed a webinar by Dirk Steuernagel of MRM24. It was about connecting your Salesforce.com contact data with Xing.

As said in the MRM24 blog post called Social CRM – Integration von Business Netzwerken in Salesforce.com:

“Our business contacts are usually found in various internal and external systems and on non-synchronized platforms. It requires a lot of effort and nerves to maintain all of our business contacts at the different locations and keep the relevant information up to date.”

(Translated to English by Google and me).


We see a lot of connectors between CRM systems and social networks.

In due time we will also see a lot of connectors between MDM and social networks, as a natural consequence of the spread of social CRM. This trend was also strongly emphasized in the Gartner (the analyst firm) tweet chat today:

GartnerMDM chat and social MDM


Social Data Quality

A cornerstone in the social sphere around data quality is the site DataQualityPro founded by Dylan Jones.

This week the site had a major facelift. As Dylan explains:

“We’ve moved over to one of the most advanced content hosting sites available to make it easier for you to discover, share and engage with the huge amounts of educational content and resources we now have on the site.”

You may read more about the changes in the post Welcome to the New Look Data Quality Pro.

I remember joining DataQualityPro even before it was a site, as it started as a section of the sister site called DataMigrationPro.

Over the years I have learned a lot by being a member of DataQualityPro, and as with most things social, you don’t pay anything for that. The only difference compared to other services is that there are no paid upgrades. You get the full package when joining.

There are sponsors too of course.

Here too I, representing the data quality service provider iDQ, have very good experiences with DataQualityPro. Last summer we had a technology briefing on the site with a massive response.

So, if you haven’t seen the new design or you are not a member (or a sponsor) yet, hurry on and visit DataQualityPro.



Making Data Quality Gangnam Style

The 21st December 2012 wasn’t the end of the world. But it was the day a music video for the first time passed one billion views on YouTube. It has been said that one reason for the success of Gangnam Style was that the Korean pop singer PSY didn’t pursue any copyright claims related to the video. But that doesn’t mean that PSY doesn’t earn money from the video. On the contrary, related commercials are making money Gangnam Style.

A hindrance to better data quality through better real world alignment has traditionally been the lack of free and open reference data. Some issues have been availability and the heavy price tags on government collected data.

In my current daily work I mostly use such data within the United Kingdom and Denmark. And here the authorities are taking different paths.

The prices of UK public reference data have traditionally been fairly high, and there’s certainly room for innovation around open government data, as reported on DataQualityPro in the post Introduction to the Open Data User Group UK.

In Denmark the 21st December 2012 was the day it was announced that a unanimous parliament had agreed on the laws behind having Free and Open Public Sector Master Data. From the 1st January 2013 there are no price tags on reference data about addresses, properties, companies (and citizens), and there are plans for making those data even more available, consistent and timely.

Great news for data quality, Gangnam Style.

Data Quality Gangnam Style


MDM Summit Europe 2012 Preview

I am looking forward to being at the Master Data Management Summit Europe 2012 next week in London. The conference runs in parallel with the Data Governance Conference Europe 2012.

Data Governance

As I live within short walking distance of the venue, I won’t have as much time for thinking as Jill Dyché had when she was recently at a conference within driving distance, as reported in her blog post After Gartner MDM, in which Jill considers MDM and takes the road less traveled. In London Jill will be delivering a keynote called: Data Governance, What Your CEO Needs to Know.

On the Data Governance tracks there will be a panel discussion called Data Governance in a Regulatory Environment with some good folks: Nicola Askham, Dylan Jones, Ken O’Connor and Gwen Thomas.

Nicola is currently writing an excellent blog post series on the Six Characteristics Of A Successful Data Governance Practitioner. Dylan is the founder of DataQualityPro. Ken was the star on the OCDQblog radio show today discussing Solvency II and Data Quality.

Gwen, being the founder of The Data Governance Institute, is chairing the Data Governance Conference while Aaron Zornes, the founder of The MDM Institute, is chairing the MDM Summit.

Master Data, Social MDM and Reference Data Management

The MDM Institute lately had an “MDM Alert” with Master Data Management & Data Governance Strategic Planning Assumptions for 2012-13, with the subtitle: Pervasive & Pandemic MDM is in Your Future.

Some of the predictions are about reference data and Social MDM.

Social master data management has been a favorite subject of mine for the last couple of years, and I hope to catch up with fellow MDM practitioners and learn how far this has come outside my circles.

Reference Data is a term often used either instead of Master Data or as related to Master Data. Reference data are data defined and initially maintained outside a single enterprise. Examples from the customer master data realm are a country list, a list of states in a given country, or postal code tables for countries around the world.

The trend as I see it is that enterprises seek to benefit from having reference data in more depth than the often modestly populated lists mentioned above. In the customer master data realm such big reference data may be core data about:

  • Addresses being every single valid address typically within a given country.
  • Business entities being every single business entity occupying an address in a given country.
  • Consumers (or Citizens) being every single person living on an address in a given country.

There is often no single source of truth for such data.
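With no single source of truth, a common tactic is building a “golden record” by taking each attribute from the most trusted candidate source that actually has a value. A minimal survivorship sketch, where the source names and precedence order are hypothetical examples:

```python
def merge_reference_records(candidates: list, precedence: list) -> dict:
    """Build a 'golden' record from candidate records, taking each attribute
    from the highest-precedence source that has a non-empty value."""
    by_source = {c["source"]: c for c in candidates}
    # every attribute seen in any candidate, excluding the bookkeeping field
    attributes = {k for c in candidates for k in c if k != "source"}
    golden = {}
    for attr in attributes:
        for source in precedence:
            value = by_source.get(source, {}).get(attr)
            if value:
                golden[attr] = value
                break
    return golden
```

Real survivorship rules are usually richer (per-attribute precedence, recency, completeness scores), but the principle is the same: decide trust per attribute, not per record.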

As I’m working with an international launch of a product called instant Data Quality (iDQ™), I look forward to exploring how MDM analysts and practitioners see this field developing.


Big Reference Data Musings

The term “big data” is huge these days. As Steve Sarsfield suggests in a blog post yesterday called Big Data Hype is an Opportunity for Data Management Pros: well, let’s ride on the wave (or is it a tsunami?).

The definition of “big data” is, as with many buzzwords, not crystal clear, as examined in a post called It’s time for a new definition of big data on Mike2.0 by Robert Hillard. The post suggests that big may be about volume, but is actually more about big complexity.

As I have worked intensively with large amounts of rich reference data, I have a homemade term called “big reference data”.

Big Reference Data Sets

Reference Data is a term often used either instead of Master Data or as related to Master Data. Reference data are data defined and (initially) maintained outside a single organization. Examples from the party master data realm are a country list, a list of states in a given country, or postal code tables for countries around the world.

The trend is that organizations seek to benefit from having reference data in more depth than the often modestly populated lists mentioned above.

An example of a big reference data set is the Dun & Bradstreet WorldBase. This reference data set holds around 300 different attributes describing over 200 million business entities from all over the world.

This data set is at first glance well structured, with a single (flat) data model for all countries. However, when you work with it you learn that the actual data are very different depending on the original sources for each country. For example, addresses from some countries are standardized, while this isn’t the case for others. Completeness and other data quality dimensions vary a lot too.
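One way to make that variation visible is profiling completeness per country over the flat model. A small sketch, assuming each record carries a `country` field; the attribute names are just examples:

```python
from collections import defaultdict

def completeness_by_country(records, attributes):
    """For each country, compute the share of records where each attribute
    is populated. Surfaces how much quality varies per original source
    behind a single flat world model."""
    counts = defaultdict(lambda: {"total": 0, "filled": defaultdict(int)})
    for rec in records:
        stats = counts[rec["country"]]
        stats["total"] += 1
        for attr in attributes:
            if rec.get(attr):
                stats["filled"][attr] += 1
    return {
        country: {a: stats["filled"][a] / stats["total"] for a in attributes}
        for country, stats in counts.items()
    }
```

Running this per attribute and per country before matching or enrichment tells you where the flat model hides thin data, so expectations (and matching rules) can be tuned per country.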

Another example of a large reference data set is the United Kingdom electoral roll, mentioned in the post Inaccurately Accurate. As told in the post, there are fit-for-purpose data quality issues. The data set is pretty big, not least if you span several years, as there is a distinct roll for every year.

Big Reference Data Mashup

Complexity, and opportunity, also arises when you relate several big reference data sets.

Lately DataQualityPro had an interview called What is AddressBase® and how will it improve address data quality?, in which Paul Malyon of Experian QAS explains a new combined address reference source for the United Kingdom.

Now, let’s mash up the AddressBase, the WorldBase and the Electoral Rolls – and all the likes.

Image called Castle in the Sky found on photobotos.


Five Moments of Truth in Subscriber Data Management

The term “Subscriber Data Management”, with SDM as the TLA, is the telecommunication sector’s flavor of the general term “Customer Data Management”.

Recently Teresa Cottam, research director of Telesperience, made a good introduction to the subject in an interview on DataQualityPro.com.

As we have a term like “Customer Master Data Management”, we will also have a term like “Subscriber Master Data Management”.

Based on my experience with phone companies, “Subscriber Master Data Management” will be very much about (better) handling the subscriber’s life cycle.

These are probably the five most important moments in a subscriber’s life cycle(s):

  • A lead is born
  • Engaging a prospect
  • One more subscriber
  • Churn happens
  • Win-Back happiness

A lead is born

One of the most important things to do when capturing data at this point is checking whether you already have the person/business behind the subscriber somewhere in the life cycle, or maybe even in other party roles, as examined in the post 360° Business Partner View.
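A minimal sketch of that check-before-create step, using exact matching on normalized name and postcode; the field names are hypothetical, and real party matching would of course be fuzzier than this:

```python
def find_existing_party(new_lead: dict, party_master: list):
    """Before creating a new party record for a lead, look for the same
    person/business already present in any role. Matching here is exact
    on normalized name + postcode, purely for illustration."""
    def key(rec):
        return (rec["name"].strip().lower(), rec.get("postcode", "").strip())
    index = {key(rec): rec for rec in party_master}
    return index.get(key(new_lead))
```

If a match is found, the lead should be attached to the existing party record, keeping its roles and history, instead of spawning a duplicate in a new silo.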

Engaging a prospect

Much of the information prospects are asked for already exists somewhere in the cloud. Why not take advantage of these rich sources, as described in Reference Data at Work in the Cloud? By doing that you will have fewer keystrokes and a much better chance of getting it right the first time.

One more subscriber

After a successful sales process a new subscriber can be added to the subscriber list, often with more data being captured, such as adding a billing address and stating credit risk as a credit limit and terms of payment.

This is the point where many party entities are split into data silos. Maybe the current subscriber master data lives on in sales oriented systems while new subscriber data are reentered and enriched in an ERP system and other business applications.

Keeping these data silos aligned is the master data challenge as discussed in the post Boiling Data Silos.

Churn happens

Churn is often seen as the termination of a given subscription. But did the person/business behind the subscription really quit, or is the service still covered by other subscriptions held by the same person, by the household or within a company family tree?

Is the person no longer among us, or did a business dissolve?

Such questions can be answered better if you are practicing Ongoing Data Maintenance.

Win-Back happiness

If a person or business really did quit but then comes back, be sure to build on the data from the first engagement and not start from scratch capturing master data and history again. Avoiding this covers some of the 55 reasons to improve data quality related to party master data uniqueness.


Diversity in Data Quality in 2010

Diversity in data quality is a favorite topic of mine and diversity has been my theme word in social media engagement this year.

Fortunately I’m not alone. Others have been writing about diversity in data quality in the past year. Here are some of the contributions I remember:

The Dutch data quality tool vendor Human Inference has a blog called Data Value Talk. Several posts there are about diversity in data quality, including the post World Languages Day – Linguistic diversity rules in Switserland!

Another blog based in the Netherlands is from Graham Rhind. Graham (a Brit stranded in Amsterdam) is an expert in international data quality issues, and one of his blog posts this year is called Robert the Carrot.

The MDM Vendor IBM Initiate has a lively blog about Master Data Management and Data Quality. One of the posts this year was an introduction to a webinar. The post by Scott Schumacher (in which I’m proud to be mentioned) is called Join Us to Demystify Multi-Cultural Name Matching.

Rich Murnane posted a funny but instructive video with Derek Sivers about Japanese addresses called What is the name of that block? (Again, thanks Rich for the mention).

In the eLearningCurve free webinar series there was a very educational session with Kathy Hunter called Overcoming the Challenges of Global Data.  There is also an interview with Kathy Hunter on the DataQualityPro site.

I also remember we debated the state of the art of data quality tools when it comes to international data in the post by Jim Harris called OOBE-DQ, Where Are You? As Jim mentions in his later post called Do you believe in Magic (Quadrants)?: “It must be noted that many vendors (including the “market leaders”) continue to struggle with their International OOBE-DQ”.

I guess that international capabilities in data quality tools and party master data management solutions will be on the agenda in 2011 as well.
