“The average financial impact of poor data quality on organizations is $9.7 million per year.” This is a quote from Gartner, the analyst firm, used by them to promote their services in building a business case for data quality.
While this quote rightly emphasizes that a lot of money is at stake, the quote itself carries a full load of data and information quality issues.
On the pedantic side, the use of the $ sign in international communication is problematic. The $ sign represents a lot of different currencies, such as CAD, AUD, HKD and, of course, USD.
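The point can be sketched in a few lines of code. This is a minimal illustration of my own (the function and the currency list are made up for the example, and the list is far from exhaustive): an ISO 4217 code disambiguates where a lone symbol does not.

```python
# Several currencies share the "$" symbol; an ISO 4217 code is unambiguous.
# This list is illustrative only.
DOLLAR_CURRENCIES = ["USD", "CAD", "AUD", "HKD", "NZD", "SGD"]

def format_amount(amount_millions: float, iso_code: str) -> str:
    """Format a monetary amount with an unambiguous ISO 4217 code."""
    if iso_code not in DOLLAR_CURRENCIES:
        raise ValueError(f"Unexpected currency code: {iso_code}")
    return f"{iso_code} {amount_millions:.1f} million"

print(format_amount(9.7, "USD"))  # USD 9.7 million
```

With a code like USD in front, nobody needs to guess which dollar the famous 9.7 million refers to.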
Then it is unclear on what basis this average is measured. Is it among the 200+ million organizations in the Dun & Bradstreet WorldBase? Is it among organizations on a certain Fortune list? In what year?
Even if you knew that this is the average in a given year for organizations like yours, such an average would not help you justify allocating resources for a data quality improvement quest in your own organization.
I know the methodology provided by Gartner is actually designed to help you calculate a specific return on investment for your organization. I also know from being involved in several business cases for data quality (as well as Master Data Management and data governance) that accurately stating how any one element of your data may affect your business is fiendishly difficult.
Indeed, while automation is a much wanted outcome of Master Data Management (MDM) implementations and many other IT enabled initiatives, you should always consider the alternative of eliminating (or simplifying) the process instead. This often means thinking outside the box.
As an example, I stumbled today upon the Wikipedia article about Business Process Mapping. The example used is how to make breakfast (the food part).
You could think about different Business Process Re-engineering opportunities for that process. But you could also realize that this is an English / American breakfast. What about making a French breakfast instead? That would be a much simpler process.
“CXOs who believe that “data” is simply the content in their own internal databases are increasingly being seen as anachronistic. More progressive leaders understand that data is everywhere, including–and especially–external to the enterprise.”
Herein Kim emphasizes that: “Innovation is driven by data and that data largely resides outside your firewall”.
My humble work in bringing in the outside revolves around a service called instant Data Quality (iDQ™). This service is about exploiting the increasing choice of external directories holding valuable information about the individuals, companies, addresses and properties we have so much trouble reflecting in our party master data hubs.
What about you? Are you anachronistic or do you bring in the outside? Or as it will sound in Phil’s Big Data Kitchen: Will you miss salsa tonight?
Getting data quality right in worldwide data isn’t easy. What works in some countries doesn’t work in other countries, not least when we are talking data quality regarding party master data such as customer master data, supplier master data and employee master data.
One of the reasons why data quality for party master data is different is the varying possibilities for applying big reference data sources. For example, the availability of citizen data is different in New York than in Scandinavia. This affects the ways of reaching optimal data quality, as reported in the post Did They Put a Man on the Moon.
As part of the ongoing globalization, handling international data quality is becoming more and more common. Many enterprises try to deploy enterprise wide data quality initiatives, and shared service centers handle party master data unfamiliar to the people working there. This often results in stumbling over a strange word like Häagen-Dazs.
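One small, practical corner of this problem can be sketched in code. This is a crude illustration of my own (not from iDQ™ or any particular matching tool): folding away diacritics so a name entered as Häagen-Dazs in one system can be compared with Haagen-Dazs in another.

```python
# Hypothetical sketch of accent-insensitive name matching: strip the
# diacritics to build a rough cross-border comparison key. Real party
# matching involves far more than this, of course.
import unicodedata

def fold_accents(name: str) -> str:
    """Decompose characters and drop the combining (accent) marks."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

print(fold_accents("Häagen-Dazs"))  # Haagen-Dazs
```

A comparison key like this is only a first step; it deliberately loses information, which is exactly why strange words trip up shared service centers in the first place.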
The old Chinese concept of yin and yang, or simply yīnyáng, is used to describe how polar opposites or seemingly contrary forces are interconnected and interdependent in the natural world. The concept is probably best known as materialized in sweet and sour sauce.
Lately we had a debate in the data quality community on social media about whether data quality is a journey or a destination, nicely summarized by Jim Harris in the post Quo Vadimus. I guess the prevailing sentiment is that it is kind of both a journey and a destination.
We also have the good old question about whether data are of high quality when they are “fit for the purpose of use” or when they are “aligned with the real world”. Sometimes these benchmarks pull in opposite directions, and we would like to fulfill both goals at the same time.
The Data Quality discipline is tormented by belonging to both the business side and the technology side of practice. These sides are often regarded as contrary, but in my experience we get the best sauce by having both sides represented.
And oh yes, must we actually call it by one of two diametrically different terms: Data Quality or Information Quality? Bon appetit.
Analogies between making and serving good food and improving data and information quality are among the recurring topics on this blog. Just as good food is a subjective matter, so is good information, though those tasked with preparing either know that fresh and clean raw materials / data are a must, as explained in the post Bon Appetit.
Food preferences and data and information preferences differ around the world. Highly esteemed local dishes from one country may not have the same traction in other parts of the world. As discussed in the post Data Quality and World Food, this is also true for data and information quality.
Sometimes you can’t trust data even if the data is captured correctly. If you, for example, ask people about their food consumption habits, they tend to give answers with some distance from reality. That calls for a Survey Data Laundering.
Estimating the return on investment for improving data quality has always been hard. The post Miracle Food for Thought is about how that resembles how following “good” advice about what you should eat and drink isn’t as simple as often stated.
Anyway, we all know that better food and better service in a restaurant do create more business, and sometimes we have to put the restaurant, and the information bistro, Under new Master Data Management.
And finally, tomorrow this blog is two years old. That calls for a Birthday Party in the cloud.
There are a lot of different words for data quality improvement activities like data cleaning, data cleansing, data scrubbing and data hygiene.
Today I stumbled upon “data laundering” and the site http://www.datalaundering.com, which is owned by an old colleague of mine from way back, when we were doing stuff not focused on data quality.
Joseph specializes in laundering data from surveys. The issue is that surveys always have some unreliable responses, which lead to wrong conclusions, which again lead to wrong decisions. This is a trail well known in data and information quality.
Unreliable responses resemble outliers in business intelligence: responses from respondents who provide answers far from the most plausible result. What I like about the presentation of the business value is that the example is about food: what we say that we eat versus what we actually consume. Then there is a lot of math, and even an induction mechanism, to support the proposition. Read all about it here.
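The flavour of this can be shown with a toy example. This is not Joseph’s methodology, just a minimal sketch using the common 1.5 × IQR (interquartile range) convention for flagging outliers, with invented survey numbers:

```python
# Hypothetical sketch: flag unreliable survey responses as statistical
# outliers using the 1.5 * IQR rule. The fence factor and the crude
# quartile positions are common conventions, chosen for illustration.

def iqr_outliers(responses):
    """Return responses falling outside the 1.5 * IQR fences."""
    data = sorted(responses)
    n = len(data)
    q1, q3 = data[n // 4], data[(3 * n) // 4]  # crude quartile picks
    low, high = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
    return [r for r in responses if r < low or r > high]

# Invented self-reported weekly portions of vegetables:
print(iqr_outliers([5, 7, 6, 4, 8, 6, 5, 42, 7, 6]))  # [42]
```

The respondent claiming 42 portions a week gets flagged for laundering; whether they are lying or just enthusiastic about vegetables is another matter.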
I can’t help making analogies between data quality and food and drink, even though I am actually not on any kind of diet these days.
Today’s subject is the similarities between metadata and meatballs.
Metadata is loosely defined as data about data: data describing what a dataset and its data elements are meant to contain, what the purpose is and what standards are used.
The problem with metadata is whether everybody understands the same thing when you use a certain term in creating metadata. Despite best intentions there will probably always be someone, somewhere getting something different from your wordings.
That’s where meatballs come into the context.
If you read the article about meatballs on Wikipedia you’ll get the picture. Yes, meatballs have some common characteristics around the world: some minced meat (or fish, or a vegetarian substitute) mixed with some additional ingredients, exposed to heat in some way and served with something different depending on where on earth you are.
Having a metadata repository is good for data and information quality.
The challenge in filling out a metadata repository is balancing between describing how meatballs should be (your mom’s recipe) and how meatballs could be.
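That balance can be made concrete. Here is a hypothetical sketch of what one entry in a metadata repository could look like; the field names and the example values are my own invention, not any particular repository product:

```python
# Hypothetical metadata repository entry: record both the strict
# definition ("your mom's recipe") and the variations that occur
# in practice ("how meatballs could be").
from dataclasses import dataclass, field

@dataclass
class MetadataEntry:
    element: str
    definition: str                 # what the element should contain
    standard: str                   # agreed reference standard, if any
    known_variations: list = field(default_factory=list)  # what shows up

entry = MetadataEntry(
    element="country_code",
    definition="Country of the party's primary address",
    standard="ISO 3166-1 alpha-2",
    known_variations=["alpha-3 codes in legacy feeds", "free text in surveys"],
)
print(entry.standard)  # ISO 3166-1 alpha-2
```

Keeping the known variations next to the definition at least tells the someone, somewhere reading your repository what they may actually find in the pan.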
We all know the headlines in the media about food and drink and your health. One day something is healthy, the next day it will kill you. You are struck with horror when you learn that even a single drop of alcohol will harm your body until you are relieved by the wise words saying that a glass (or two) of red wine a day keeps the doctor away.
Sometimes some data are fit for purpose. At another time at another place the very same data are rubbish.
As stated in an excerpt from the Miracle Food report:
“The facts about the latest dietary discoveries are rarely as simple as the headlines imply. Accurately testing how any one element of our diet may affect our health is fiendishly difficult. And this means scientists’ conclusions, and media reports of them, should routinely be taken with a pinch of salt.”
It’s about the same with data quality, isn’t it?
Accurately testing how any one element of our data may affect our business is fiendishly difficult. So predictions of return on investment (ROI) from data quality improvement are unfortunately routinely taken with a big spoonful of salt.