Data Governance in the Self-Service Age

The term self-service is used increasingly within data management. Self-service may refer to people within your organization using self-service capabilities, as in self-service business intelligence. But, probably more disruptively, it may refer to customer self-service and supplier self-service, meaning that people outside your organization are increasingly dependent on the level of data quality you can offer within your services.

Customer self-service will not succeed without you offering decent data quality related to product information, as exemplified in the post Falsus in Uno, Falsus in Omnibus. There will be more happy customer self-service events with more complete product information. Knowing your customer better helps you help your customer self-serve. And in that sense, it may be Time To Turn Your Customer Master Data Management Social?

Supplier self-service will not fly if you do not know your suppliers and their differences, which is quite similar to the concept of knowing your customer, as explained in the post Single Business Partner View. When it comes to approaches to data management within supplier engagement, there are several options, such as those examined in the post Sharing Product Master Data.

Do you think data governance is hard enough when dealing with the dear people within your own organization? I have news for you. It is going to be even tougher when dealing with all the lovely people outside your organization whom you will ask to be part of your data collection and consumption workspace.


The Place for Data Matching in and around MDM

Data matching has increasingly become a component of Master Data Management (MDM) solutions. This has mostly been the case for MDM of customer data solutions, but it is also a component of MDM of product data solutions, not least as these solutions emerge into the multi-domain MDM space.

The deployment of data matching was discussed nearly 5 years ago in the post Deploying Data Matching.

While MDM solutions have since been picking up a larger share of the data matching being done, it is still a fairly small proportion of all data matching that is performed within MDM solutions. Even if you have an MDM solution with data matching capabilities, you might still consider where data matching should be done. Some considerations I have come across are:

Acquisition and silo consolidation circumstances

A common use case for data matching is as part of an acquisition or internal consolidation of data silos where two or more populations of party master data, product master data and other important entities are to be merged into a single version of truth (or trust) in terms of uniqueness, consistency and other data quality dimensions.

While the MDM hub must be the end goal for storing that truth (or trust), there may be good reasons for doing the data matching before the actual on-boarding of the master data.
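As an illustration, consolidation-time matching can be sketched as a pairwise comparison of party records from two silos before they are on-boarded to the hub. This is a minimal sketch using Python's standard library; the field names, weights, threshold and sample records are hypothetical, and a real MDM matching engine would add parsing, standardization and probabilistic scoring on top.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Normalized string similarity between two field values."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

def match_parties(source, target, threshold=0.85):
    """Compare each source record against all target records on name and city.
    Records scoring at or above the threshold are treated as duplicates to be
    merged; the rest are candidates for on-boarding as new master data."""
    matches, unmatched = [], []
    for s in source:
        best, best_score = None, 0.0
        for t in target:
            # Weight the name higher than the city (hypothetical weighting)
            score = 0.7 * similarity(s["name"], t["name"]) + 0.3 * similarity(s["city"], t["city"])
            if score > best_score:
                best, best_score = t, score
        if best is not None and best_score >= threshold:
            matches.append((s, best, round(best_score, 2)))
        else:
            unmatched.append(s)
    return matches, unmatched

silo_a = [{"name": "Acme Corp.", "city": "London"}, {"name": "Globex Ltd", "city": "Leeds"}]
silo_b = [{"name": "ACME Corp", "city": "London"}, {"name": "Initech", "city": "York"}]

matched, new_records = match_parties(silo_a, silo_b)
```

Doing this comparison as a batch job before loading the hub keeps the survivorship decisions out of the operational on-boarding flow.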

These considerations include:

The point of entry

For many good reasons, the MDM solution is not always the system of entry. Doing the data matching at the stage where data is put into the MDM hub may be too late. Exposing the data matching capabilities as a Service Oriented Architecture component may be a better way, as pondered in the post Service Oriented Data Quality.

Avoiding data matching

Even as a long-time data matching practitioner, I am afraid I have to bring up the subject of avoiding data matching, as further explained in the post The Good, The Better and The Best Way of Avoiding Duplicates.
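One way the avoidance idea is often realized is a search-before-create check at the point of entry: look for close matches among existing hub records before allowing a new one, so the duplicate is never created in the first place. Below is a minimal sketch using Python's standard library; the cutoff value and party names are hypothetical.

```python
from difflib import get_close_matches

def search_before_create(name: str, hub_names: list, cutoff: float = 0.8) -> list:
    """Return existing hub records that closely match the proposed name.
    An empty result means it looks safe to create a new record; a non-empty
    result should be shown to the user as possible duplicates instead."""
    return get_close_matches(name.lower(), [n.lower() for n in hub_names], n=3, cutoff=cutoff)

hub = ["Acme Corp", "Globex Ltd", "Initech"]
candidates = search_before_create("ACME Corp.", hub)  # near-duplicate found in the hub
safe = search_before_create("Umbrella Inc", hub)      # no candidates found
```

The design choice here is to spend a cheap lookup at entry time rather than an expensive match-and-merge exercise later.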


Customer MDM Magic Wordles

The Gartner Magic Quadrant for Master Data Management of Customer Data 2014 is out. One place to get it for free is the Informatica registration page offered in the Informatica communication here.

So, what is good and what is bad to look for in an MDM vendor if you are focusing on customer data right now?

Some words in the strengths assessment of vendors are:

Magic plus

Some words in the cautions assessment of vendors are:

Magic minus


Falsus in Uno, Falsus in Omnibus

The title of this blog post is a Latin legal phrase meaning “false in one thing, false in everything”. It refers to the principle that everything a witness says may be regarded as not credible if one thing said by the witness is proven to be untrue. This has been part of the plot in plenty of courtroom films and TV shows.

This principle has meaning related to data quality too. An example from direct marketing would be the recipient of a direct mail saying: “If you can’t get my name right, how can I trust you to get anything right during a purchase?”

Some data quality dimensions

An example from the multi-channel world, or should we say omni-channel today, would be a shopper saying: “If you say one thing about the product in the shop and another thing on the website, how can I trust any of your product information?” Falsehood in omni-channel so to speak.
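The omni-channel inconsistency described above can be illustrated by a small check that compares the same product's attributes across channel systems and flags disagreements. This is a hypothetical sketch: the channel names, attribute names and values are invented for illustration.

```python
def consistency_check(records_by_channel: dict, attributes: list) -> dict:
    """Return the attributes whose values disagree across channels,
    together with the conflicting per-channel values."""
    issues = {}
    for attr in attributes:
        values = {channel: rec.get(attr) for channel, rec in records_by_channel.items()}
        if len(set(values.values())) > 1:  # more than one distinct value = inconsistency
            issues[attr] = values
    return issues

# The same product as represented in the shop system and on the website
product_x1 = {
    "shop":    {"sku": "X1", "price": 49.95, "color": "black"},
    "website": {"sku": "X1", "price": 44.95, "color": "black"},
}
conflicts = consistency_check(product_x1, ["sku", "price", "color"])
```

A single conflicting attribute, such as the price here, is exactly the kind of falsehood that can make a shopper distrust all of your product information.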

Measuring the impact of such attitudes, and thereby the Return on Investment (ROI) of data quality improvement based on this principle, is very hard. We usually only have random anecdotal evidence that this happens.

But what we can say is: Don’t lie in court and don’t neglect your data quality. It will hurt your credibility and, in the end, your creditworthiness.


The Path to Multi-Domain MDM

Multi-Domain Master Data Management (MDM) is about dealing with master data in several different data domains, such as customer (or party), product, location, asset or calendar. The typical track today is to start in one domain. There are many, even contradictory, good reasons for that.

Depending on what industry vertical you are in, the main pain points that urge you to start doing MDM belong to one of the MDM domains. Customer MDM is the most common one, typically seen where you have a large number of customer records in your databases. Starting with product MDM is seen in organizations with many products in their databases. This is for example the case for large retailers and distributors.

It can be other domains as well. One example I recall from an MDM conference is that Royal Mail in the UK started with the calendar domain. Besides this domain having pain points for that organization, a reason for doing so was to start small before taking on the big chunks.

Even though you start with one domain, you must think about the end state. One thing to consider multi-domain-wise is data governance, as you will not come out well if you choose different approaches to data governance for each master data domain. Of course, the technology part is there too. Choosing a solution that will eventually take you all the way is appealing to many organizations looking for an MDM platform.

Another approach to multi-domain MDM can be through what I know at least one MDM tool vendor calls Evolutionary MDM™. But we can call it other things: agile or lean MDM, for example. Using that approach, you do not solve everything within one domain before going on to the next one.

It is about eliminating as many pain points as possible in the shortest feasible time-frame.


The Scary Data Lake

The concept of the data lake seems to have a revival these days. Perhaps it reemerged about a year ago as told in the post Do You Like the Lake?

The idea of having a data lake scares the hell out of data quality people, as seen in the title used by Gary Allemann in the post Data Lake vs Data Cesspool.

The data lake is mostly promoted as a data source for analytics, as opposed to something being part of daily operations. That is horrifying enough. Imagine Joe spending 80 % of his time last month fixing data quality issues when doing one batch of analytics. And this month Sue spends 80 % of her time fixing data quality issues in the same data lake in her analytics quest, and 50 % of Sue’s data quality issues are in fact the same as Joe’s challenges from last month.
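The Joe-and-Sue scenario can be put into numbers: if the issues each analyst fixes were tracked, the overlap would show how much effort is simply repeated because fixes in the lake are not shared. The issue labels below are invented for illustration.

```python
# Hypothetical issue logs: what Joe fixed last month and what Sue fixed
# this month, working against the same data lake
joe_issues = {"duplicate customers", "invalid dates", "mixed encodings", "missing country codes"}
sue_issues = {"duplicate customers", "invalid dates", "bad product codes", "truncated names"}

repeated = joe_issues & sue_issues                # work done twice
repeated_share = len(repeated) / len(sue_issues)  # fraction of Sue's effort that was redundant
```

With half of Sue's issues already solved once by Joe, the cost of not persisting cleansing results back into (or alongside) the lake becomes very concrete.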

As Halloween is just around the corner, it is time to ask: What is your data lake horror story?

Hadooween


Post No. 666

This is post number 666 on this blog. 666 is the number of the beast. Something diabolic.

The first post on this blog came out in June 2009 and was called Qualities in Data Architecture. That post was about how we should talk a bit less about bad data quality and instead focus a bit more on success stories around data quality. I haven’t been able to stick to that all the time. There are so many good data quality train wrecks out there, such as the one told in the post called Sticky Data Quality Flaws.

Some of my favorite subjects around data quality were lined up in Post No. 100. They are:

The biggest thing that has happened in the data quality realm during the five years this blog has been live is probably the rise of big data. Or rather the rise of the term big data. This proves to me that change usually starts with technology. Then, after some time, we start thinking about processes, and finally about people’s roles and responsibilities.


When the Rhino Hunt and the HiPPO Principle Make a Perfect Storm

A frequent update on my LinkedIn home page these days is about the HiPPO principle. The HiPPO principle describes a leadership style based on giving priority to the leader’s opinion as opposed to using data, as explained in the Forbes article here.

HiPPO

The hippo (hippopotamus) is one of the largest animals on this planet. So is the rhino (rhinoceros). The rhino is critically endangered because it is hunted by humans for a very small part of its body: the horn.

I guess anyone who has been in business for some years has met the hippo. Probably you have also experienced a rhino hunt: a project or programme of very big size aiming at a quite narrow business objective, one that may have been expressed as a simple slogan by a hippo.


The “Fit for Purpose” Trap

Gartner (the analyst firm), represented by Saul Judah, takes data quality back to basics in the recent post called Data Quality Improvement.

While I agree with the sentiment around measuring the facts as expressed in that post, I have cautions about relying on everything being good when data are fit for the purpose of business operations.

Some clues lie in the data quality dimensions mentioned in the post:

Accuracy (for now):

As said in the Gartner post, data are indeed temporal. The real world changes and so do business operations. By the time you have got your data fit for the purpose of use, business operations have changed. And by the time you have got your data re-fit for the new purpose of use, business operations have changed again.

Furthermore, most organizations can’t take all business operations into account at the same time. If you go down the fit-for-purpose track, you will typically address a single business objective and make data fit for that purpose. Not least when dealing with master data, there are many business objectives and derived purposes of use. In my experience, that leads to this conclusion:

“While we value that data are of high quality if they are fit for the intended use we value more that data correctly represent the real-world construct to which they refer in order to be fit for current and future multiple purposes”

Existence – an aspect of completeness:

The Gartner post mentions a data quality dimension called existence. I tend to see this as an aspect of the more broadly used term completeness.

For example, achieving fit-for-purpose completeness of product master data has been a huge challenge for many organizations within retail and distribution during the last years, as explained in the post Customer Friendly Product Master Data.
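Completeness as a dimension can be measured quite directly: the share of required attributes that are actually filled for a product record. A minimal sketch follows; the required attribute list and sample record are hypothetical examples.

```python
# Hypothetical list of attributes a customer-friendly product record must have
REQUIRED_ATTRIBUTES = ["name", "description", "weight_kg", "image_url"]

def completeness(record: dict, required=REQUIRED_ATTRIBUTES) -> float:
    """Share of required attributes that are present and non-empty."""
    filled = sum(1 for attr in required if record.get(attr) not in (None, ""))
    return filled / len(required)

drill = {"name": "Cordless Drill", "description": "", "weight_kg": 1.4}
score = completeness(drill)  # 2 of 4 required attributes are filled
```

The fit-for-purpose question then becomes which attributes go on the required list for each channel and use, which is exactly where the list keeps growing over time.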

