Modern Data Management, Paella, Herodotus, Darwin and Einstein

Reltio has a blog series with the tag #moderndatamasters. The posts are interviews with people in the data management world. The other day it was my turn to share my story.

Kate Tickner from Reltio took me through some serious questions, such as:

  • How would you define “modern” data management and what does it / should it mean for organisations that adopt it?
  • What are your top 3 tips or resources to share for aspiring modern data masters?
  • Can you tell us a little more about the concepts behind Product Data Lake and your vision for how it could be used in the future?
  • What trends or changes do you predict for the data management arena in the next few years?

You can read the interview here on the Reltio blog.

At the end we touched on:

  • What do you like to do outside of work?
  • Which 3 people – living or dead, real or fictional – would you invite to a dinner party and why?
  • What are you cooking?

For the dinner party I would make paella and, based on my interest in history, I picked three historical figures who have also been featured on this blog: Herodotus, Darwin and Einstein.

I am afraid that Gartner does not help

“The average financial impact of poor data quality on organizations is $9.7 million per year.” This is a quote from Gartner, the analyst firm, used by them to promote their services in building a business case for data quality.

While this quote rightly emphasizes that a lot of money is at stake, the quote itself carries a full load of data and information quality issues.

On the pedantic side, the use of the $ sign in international communication is problematic. The $ sign represents a lot of different currencies, such as CAD, AUD, HKD and of course USD.

Then it is unclear on what basis this average is measured. Is it among the 200+ million organizations in the Dun & Bradstreet Worldbase? Is it among the organizations on a certain fortune list? In what year?

Even if you knew that this is an average in a given year for the likes of your organization, such an average would not help you justify allocation of resources for a data quality improvement quest in your organization.
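To see why a bare average says so little, here is a minimal Python sketch. The ten loss figures are entirely made up for illustration and are not taken from Gartner or any other source. They are chosen so that the mean lands at about 9.7 while the median is only about 0.45, because one very large organization dominates the sum.

```python
from statistics import mean, median

# Hypothetical yearly losses in million USD for ten made-up organizations.
# These numbers are invented for illustration only.
losses = [0.1, 0.2, 0.2, 0.3, 0.4, 0.5, 0.6, 0.8, 1.0, 92.9]

average_loss = mean(losses)    # pulled upwards by the single large value
typical_loss = median(losses)  # the middle of the distribution

print(f"mean:   {average_loss:.2f} million USD")
print(f"median: {typical_loss:.2f} million USD")
```

Nine of the ten imagined organizations lose far less than the average, so the average alone would be a weak argument in any of their business cases.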

I know the methodology provided by Gartner actually is designed to help you with specific return on investment for your organization. I also know from being involved in several business cases for data quality (as well as Master Data Management and data governance) that accurately stating how any one element of your data may affect your business is fiendishly difficult.

I am afraid there is no magic around, as told in the post Miracle Food for Thought.

Automate or Obliterate, That is the Question

Back in 1990 Michael Hammer wrote a famous article called Reengineering Work: Don’t Automate, Obliterate.

Indeed, while automation is a most wanted outcome of Master Data Management (MDM) implementations and many other IT-enabled initiatives, you should always consider the alternative: eliminating (or simplifying) the process. This often means thinking outside the box.

As an example, today I stumbled upon the Wikipedia explanation of Business Process Mapping. The example used there is how to make breakfast (the food part).


You could think about different Business Process Re-engineering opportunities for that process. But you could also realize that this is an English / American breakfast. What about making a French breakfast instead? It will be as simple as:

Input money > Buy croissant > Fait accompli

PS: From the data quality and MDM world one example of making French breakfast instead of English / American breakfast is examined in the post The Good, Better and Best Way of Avoiding Duplicates.


You probably won’t find the truth (and salsa) inside your firewall

In a Data Roundtable blog post published today and called Big Data in Your Kitchen, Phil Simon says:

“CXOs who believe that “data” is simply the content in their own internal databases are increasing being seen as anachronistic. More progressive leaders understand that data is everywhere, including–and especially–external to the enterprise.”

Bringing in external data was also touched upon recently by Kim Loughead of Informatica in the post Bring The Outside In: Why Integrating External Data Sources Should Be Your Next Data integration Project.

Herein Kim emphasizes that: “Innovation is driven by data and that data largely resides outside your firewall”.

My humble work in bringing in the outside revolves around a service called instant Data Quality (iDQ™). This service is about exploiting the increasing choice of external directories holding valuable information about the individuals, companies, addresses and properties we have so much trouble reflecting in our party master data hubs.

What about you? Are you anachronistic or do you bring in the outside? Or as it will sound in Phil’s Big Data Kitchen: Will you miss salsa tonight?


Häagen-Dazs Datakvalitet

There is a term called foreign branding. Foreign branding describes an implied cachet or superiority of products and services with foreign-sounding names.

Häagen-Dazs ice cream is an example of foreign branding. Though the brand was established in New York, the name was supposed to sound Scandinavian.

However, Häagen-Dazs does sound and look somewhat strange to a Scandinavian. The reason is probably that the letter combinations “äa” and “zs” are not part of any native Scandinavian words.

By the way, datakvalitet is the Scandinavian compound word for data quality.

Getting datakvalitet right in worldwide data isn’t easy. What works in some countries doesn’t work in other countries, not least when we are talking about datakvalitet for party master data such as customer master data, supplier master data and employee master data.

One of the reasons why datakvalitet for party master data differs is the varying possibilities for applying big reference data sources. For example, the availability of citizen data is different in New York than in Scandinavia. This affects the ways of reaching optimal datakvalitet, as reported in the post Did They Put a Man on the Moon.

As part of the ongoing globalization, handling international datakvalitet is becoming more and more common. Many enterprises try to deploy enterprise-wide datakvalitet initiatives, and shared service centers handle party master data unfamiliar to the people working there. This often results in finding a strange word like Häagen-Dazs.


Yin and Yang Data Quality

The old Chinese concept of yin and yang, or simply yīnyáng, is used to describe how polar opposites or seemingly contrary forces are interconnected and interdependent in the natural world. The concept is probably best known materialized as sweet and sour sauce.

Lately we had a debate in the data quality community on social media about whether data quality is a journey or a destination, nicely summarized by Jim Harris in the post Quo Vadimus. I guess the prevailing sentiment is that it is kind of both a journey and a destination.

We also have the good old question about whether data are of high quality if they are “fit for the purpose of use” or “aligned with the real world”. Sometimes these benchmarks go in opposite directions, and we would like to fulfill both goals at the same time.

The Data Quality discipline is tormented by belonging to both the business side and the technology side of practice. These sides are often regarded as contrary, but in my experience we get the best sauce by having both sides represented.

And oh yes, do we actually have to call it by one of two diametrically different terms, Data Quality or Information Quality? Bon appetit.


The Data Quality Cuisine

Analogies between making and serving good food and improving data and information quality are among the recurring topics on this blog. Just as good food is a subjective matter, so is good information. Yet those who have the task of preparing either know that fresh and clean raw materials / data are a must, as explained in the post Bon Appetit.

Food preferences and data and information preferences differ around the world. Highly esteemed local dishes from one country may not have the same traction in other parts of the world. As discussed in the post Data Quality and World Food, this is also true for data and information quality.

In the post Metadata Meatballs it is examined how the same diversity applies to metadata.

Sometimes you can’t trust data even if it is captured correctly. If, for example, we are asked about our food consumption habits, we tend to give answers at some distance from reality. That calls for Survey Data Laundering.

Estimating the return on investment for improving data quality has always been hard. The post Miracle Food for Thought is about how this resembles following “good” advice on what you should eat and drink, which isn’t as simple as often stated.

Anyway, we all know that better food and better service in a restaurant do create more business, and sometimes we have to put the restaurant and the information bistro Under new Master Data Management.

And finally, tomorrow this blog is two years old. That calls for a Birthday Party in the cloud.


Survey Data Laundering

There are a lot of different words for data quality improvement activities like data cleaning, data cleansing, data scrubbing and data hygiene.

Today I stumbled upon “data laundering” and a site owned by an old colleague of mine from way back, when we were doing stuff not focused on data quality.

Joseph specializes in laundering data from surveys. The issue is that surveys always have some unreliable responses that lead to wrong conclusions, which in turn lead to wrong decisions. This is a trail well known in data and information quality.

Unreliable responses resemble outliers in business intelligence. These are responses from respondents who provide answers distant from the most conceivable result. What I like about the presentation of the business value is that the example is about food: what we say that we eat and what we actually consume. Then there is a lot of math and even an induction mechanism to support the proposition. Read all about it here.
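The idea of a response being “distant from the most conceivable result” can be sketched in a few lines of Python. This is not the method used on the site mentioned above, which I have not tried to reproduce. It is a simple, commonly used outlier check based on the median absolute deviation (the modified z-score), and the function name, the threshold of 3.5 and the sample answers are all assumptions made for illustration.

```python
from statistics import median

def flag_outliers(values, threshold=3.5):
    """Return the indexes of values whose modified z-score exceeds the threshold."""
    center = median(values)
    deviations = [abs(v - center) for v in values]
    mad = median(deviations)  # median absolute deviation
    if mad == 0:
        return []  # no spread to measure distance against
    # 0.6745 scales the MAD so the score is comparable to a standard z-score
    return [i for i, d in enumerate(deviations) if 0.6745 * d / mad > threshold]

# Hypothetical self-reported vegetable servings per day (invented numbers).
reported = [2, 3, 2, 4, 3, 2, 3, 25, 3, 2]
suspicious = flag_outliers(reported)
print(suspicious)
```

With these invented answers only the respondent claiming 25 servings a day is flagged. A check like this only points at candidates for a closer look; it does not decide which responses are wrong.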


Miracle Food for Thought

We all know the headlines in the media about food and drink and your health. One day something is healthy, the next day it will kill you. You are struck with horror when you learn that even a single drop of alcohol will harm your body until you are relieved by the wise words saying that a glass (or two) of red wine a day keeps the doctor away.

These misleading, exaggerated and contradictory headlines are now documented in a report called Miracle Food, Myth and the Media.

It’s the same with data quality, isn’t it?

Sometimes some data are fit for purpose. At another time at another place the very same data are rubbish.

As an excerpt from the Miracle Food report says:

“The facts about the latest dietary discoveries are rarely as simple as the headlines imply. Accurately testing how any one element of our diet may affect our health is fiendishly difficult. And this means scientists’ conclusions, and media reports of them, should routinely be taken with a pinch of salt.”

It’s about the same with data quality, isn’t it?

Accurately testing how any one element of our data may affect our business is fiendishly difficult. So predictions of return on investment (ROI) from data quality improvement are, unfortunately, routinely taken with a big spoonful of salt.

Bon appétit.
