Christmas is just around the corner. But how far away is that corner?
New Oxford Dictionary Entries in 2013
Well, selfie was selected as the new word of the year in the Oxford English Dictionary and indeed that choice was celebrated with the buzzworthy selfie taken at the memorial services for Nelson Mandela this week.
Big data also made it to the list of well explained terms as told in this post: OK, so big data is about size (and veracity).
And finally, after a little social sharing of this post on my phablet, I srsly think I will have a digital detox.
Ways of Sharing Master Data
The “buy vs. build” option is well known within many disciplines, not least around your IT application stack. The trend here is that where you in the old days did a lot of in-house programming, today you tend to buy more and more stuff to avoid reinventing the wheel. Yesterday there was a post on that on Informatica Perspectives. The post is called Stop The Hand-Coding Madness!.
We certainly also see that trend when it comes to Master Data Management (MDM) solutions. And my guess is that we will see that trend too when it comes to the master data itself.
What has puzzled me over the years is how a lot of organizations spend time on, and make their own errors in, typing in the name, address and other core data about the individuals and companies they do business with. Alternatively they let us business partners type in our name, address and other data again and again, sometimes with a little remembering help from Google.
With product data you see that the same data is retyped again and again with heaps of errors and shortcuts: first when the description and specifications are registered at the manufacturer, then again at a couple of wholesalers, at a lot of retailers and, for some product types such as spare parts, in heaps of end user organizations.
In order to avoid this madness there are some different ways in which master data can be shared between organizations:
Using commercial third party data
Using third party directories is a well known way of buying your master data.
Business directories have been used for ages. The Dun & Bradstreet WorldBase is probably the most widely known example, but there are plenty of alternatives out there when it comes to specific regions and countries. Where it was earlier common to use these sources for downstream data enrichment, we now see more services for picking the id, names, addresses and other data during the data entry process.
Address directories are becoming very useful, for example in rapid addressing, which saves time and ensures data quality for addresses as they are entered.
Product directories with related services can also help in managing product master data.
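The idea of picking directory data during data entry, instead of typing it in and enriching it downstream, can be sketched roughly as below. This is only an illustration under stated assumptions: the directory contents, the DirectoryEntry type and the function names suggest and new_customer_record are all hypothetical and do not reflect any real directory service.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class DirectoryEntry:
    """One row in a (hypothetical) third-party business directory."""
    directory_id: str
    name: str
    address: str


# Made-up sample data standing in for a commercial directory.
SAMPLE_DIRECTORY = [
    DirectoryEntry("D-0001", "Example Trading Ltd", "1 Sample Street, London"),
    DirectoryEntry("D-0002", "Example Tools GmbH", "Musterweg 2, Berlin"),
    DirectoryEntry("D-0003", "Other Supplies Inc", "3 Demo Avenue, Boston"),
]


def suggest(typed: str, directory: List[DirectoryEntry]) -> List[DirectoryEntry]:
    """Return entries whose name starts with what the user has typed so far."""
    prefix = typed.strip().lower()
    if not prefix:
        return []
    return [entry for entry in directory if entry.name.lower().startswith(prefix)]


def new_customer_record(picked: DirectoryEntry) -> Dict[str, str]:
    """Build the master data record from the picked entry, so nothing is retyped."""
    return {
        "directory_id": picked.directory_id,  # kept as a key for later updates
        "name": picked.name,
        "address": picked.address,
    }


if __name__ == "__main__":
    matches = suggest("exam", SAMPLE_DIRECTORY)
    for entry in matches:
        print(entry.directory_id, entry.name)
    print(new_customer_record(matches[0]))
```

The point of keeping the directory id in the record is that later changes at the source can be matched back to the record without comparing names and addresses again.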
Digging into open government data
In many countries the public company registry is available as a raw business directory, and in some countries there are also possibilities with citizen data. The public sector is often the root source for address data, which is becoming more widely available, in some cases even with related property data, as told in the post Making Data Quality Gangnam Style.
As it often isn’t in the genes of public sector bodies to provide nice and easy ways of getting to these data, there are good opportunities for private enterprises to add that service on top of the open government data.
Having your own data locker
Instead of having businessmen controlling your data, or trusting the government to do so, the idea of a personally controlled data locker has gained interest. In the UK there is such a service called Mydex.
Relying on social collaboration
Most people, and companies too, are doing a good job of maintaining their profile data on social networks. So in many cases this is the place to go to find out where someone is and what they are doing right now.
Social collaboration is also a possible way to share product data between manufacturers, wholesalers, retailers and end users. There is a service for that called Actualog.
Do You Like the Lake?
Today Capgemini as a result of a co-innovation partnership with Pivotal released their take on information management in the big data era in a piece called The Principles of the Business Data Lake.
The business data lake concept is a new attempt at getting rid of all the Excel spreadsheets business people operate because of limitations in today’s enterprise data warehouses and the business intelligence solutions sitting on top of that extracted, transformed and loaded data.
In the business data lake you load raw data including unstructured data sources. Single view and related governance are restricted to master and reference data.
It’s not that you are going to load all the data in the world in your business data lake. You will link internal and external data based on where and when needed.
Thomas Redman has made a famous metaphor in the data quality realm about a polluted lake, where the best way to deal with the pollution is to prevent polluted water from streaming into the lake. I guess the rise of big data challenges that take, as told some years ago in the post Extreme Data Quality.
In the business data lake we will have polluted data. In that view I think it’s a good thing that master and reference data have a special place in the lake.
What do you think? Do you like the lake – the old and/or the new one?
The Role of Technology in Data Quality Management
A recent article called Data’s Credibility Problem by Thomas Redman on Harvard Business Review has rightfully got a lot of mentions in the data quality community on social media including Twitter.
I agree with many things in the article except I have to question the credibility of this saying:
“The solution is not better technology: It’s better communication between the creators of data and the data users”.
There is a lot of truth in this saying. But it is in my eyes not valid.
If the human race had relied solely on communication we would still be discussing whether a wheel should have the shape of a square or a circle. There is a balance between fruitful communication and throwing technology at problems, and you may emphasize one side or the other depending on whether you sell data quality consultancy or data quality tools.
I would say:
“The solution is better communication between the creators of data, the data users and the innovators of data quality technology”.
Now, how do I best spread this message….
Trust in External Data is Like Trust in Analysts
The analyst industry is like any other industry. Analysts compete. Mostly analysts do it by presenting what are supposed to be more trustworthy reports than the other ones do, including their special visualization method, be that a quadrant, landscape, bulls eye or whatever approach. And sometimes they compete by bashing the other ones.

This week I had a blog post called A Little Bit of Truth vs A Big Load of Trust. The post cites a blog post from Andrew White of Gartner called From MDM to Big Data – From truth to trust. This post again cites an article on SearchDataManagement called Enterprise master data management and big data: A well-matched pair?
Andrew White’s post praises the views of fellow Gartner analyst Ted Friedman in the SearchDataManagement article and bashes the views of the other contributors being Evan Levy, Andy Hayler (Information Difference), Aaron Zornes of the MDM Institute and Kelly O’Neal by saying:
“… presumably since the thinking out there in the cited analyst community has not gotten very far yet.”
Indeed, you have to consider multiple opinions out there when it comes to Master Data Management (MDM), big data and other external data. In the same way there are, when it comes to the data, multiple versions of the truth out there and you have, in Andrew White’s words, to: “..manage and govern trust in someone else’s data”.
About Big Data and Doing It
The saying below has become a popular share on social media:
“Big data is like teenage sex. Everybody talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it.”
Indeed, there is quite a lot of hype around big data as for example told in The Big MDM Trend.
The teenage sex joke isn’t new at all. It has been used about a lot of new trends. I remember that when the e-Business hype started the joke was used there as well, and you can still find some evidence of that by googling the saying and getting this and that.
Today e-Business has matured and maybe a few brick and mortar bookstores have stopped laughing about the e-Business and teenage-sex joke now.
Also, maybe the joke says more about parents’ knowledge about teenage-sex.
A Little Bit of Truth vs A Big Load of Trust
The soul of Master Data Management (MDM) is often explained as the search for a single version of the truth. It has always puzzled me that that search in many cases has been about finding the truth as the best data within different data silos inside a given organization.
Big data, including how MDM and big data can be a good match, has been a well-covered subject lately. As discussed in the post Adding 180 Degrees to MDM, this has shed light on how external data may help you get better master data by looking at data from the outside in.
At Gartner, the analyst firm, they have phrased that movement as a shift from truth to trust, for example as told in the post by Andrew White called From MDM to Big Data – From truth to trust.
Don’t get me (and master data) wrong. The truth isn’t out there in a single silver bullet shot. You have to mash up your internal master data with some of the most trustworthy external big reference data. This includes commercial directory offerings, open data possibilities, public sector data (made available for private entities) and social networks.
Indeed there are potholes in that path. Timeliness of directories, completeness of open data, consistency, availability and price tags of public sector data, and validity of social network data are common challenges.
The MDM Market Wordle
Analyst firms have a lot of fun in making different surveys and rankings of vendors in different markets using their own special visualizing method. For the Master Data Management (MDM) market we have this year had the:
- Gartner MDM for Product Data Quadrant
- Gartner MDM for Customer Data Quadrant
- Bloor Market Update for MDM using the bulls eye
- Information Difference MDM Landscape
Encouraged by a recent comment on the post What’s New in The Data Quality Magic Quadrant? I have now made my take on the market utilizing the wordle as my special visual approach.
Lazy as I am I haven’t made my own survey but simply taken the brand names from the rankings mentioned above and filled in the name either 1, 2 or 3 times from each report depending on how well the brand was positioned.
So the size of the letters tells something about market positioning according to analyst reports. The size of the words also tells something about the length of the brand name. The placement is, according to the wordle principle, of course totally random.
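The weighting described above, where each brand name is filled in 1, 2 or 3 times per report, can be sketched as follows. This is a rough illustration only: the report names, the brand names BrandX, BrandY and BrandZ, and the position labels are made up, and how a position in a given report maps to 1, 2 or 3 is my assumption for the sketch.

```python
from collections import Counter
from typing import Dict

# Hypothetical positions per report; no real brands or rankings are used here.
REPORTS: Dict[str, Dict[str, str]] = {
    "Report A": {"BrandX": "strong", "BrandY": "middle"},
    "Report B": {"BrandX": "middle", "BrandZ": "weak"},
    "Report C": {"BrandY": "strong", "BrandX": "strong"},
}

# Assumed mapping from position to the number of times the name is filled in.
REPEATS = {"strong": 3, "middle": 2, "weak": 1}


def wordle_weights(reports: Dict[str, Dict[str, str]]) -> Counter:
    """Sum the repeats for each brand across all reports."""
    weights: Counter = Counter()
    for positions in reports.values():
        for brand, position in positions.items():
            weights[brand] += REPEATS[position]
    return weights


def wordle_input(weights: Counter) -> str:
    """Repeat each brand name by its weight, ready to paste into a word cloud tool."""
    return " ".join(" ".join([brand] * count) for brand, count in weights.items())


if __name__ == "__main__":
    weights = wordle_weights(REPORTS)
    print(weights.most_common())
    print(wordle_input(weights))
```

A brand that appears in more reports collects more repeats, so coverage across reports counts as much as the position in any single one.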
And of course I now expect a load of tweets from vendor marketing departments saying that their company is positioned very randomly in the MDM Market Wordle 🙂
Third-Party Data and MDM
A recent blog post called Top 14 Master Data Management Misconceptions by William McKnight has as the last misconception this one:
“14. Third-party data is inappropriate for MDM
Third-party data is largely about extending the profile of important subject areas, which are mastered in MDM. Taking third-party data into organizations has actually kicked off many MDM programs.”
Indeed, using third-party data, which also could be called big external reference data, is in my eyes a very good solution for a lot of use cases. Some of the most popular exploitations today are:
- Using a business directory as big reference data for B2B party master data in customer data integration (CDI) and supplier master data management.
- Using address directories as big reference data for location master data management also related to party master data management for B2C customer data.
- Using product data directories such as the Global Data Synchronization Network (GDSN®) services, the UNSPSC® directory and heaps of industry specific product directories.
The next wave of exploiting external data, which is just kicking off as Social MDM, is digging into social media for sharing data, including:
- Using professional social networks such as LinkedIn in B2B environments, where you often find the most timely reference data, not least about contact data related to your business partners’ accounts.
- Using consumer oriented social networks such as Facebook for getting to know your B2C customers better.
- Using social collaboration as a way to achieve better product master data as told in the post Social PIM.


