Data Quality 3.0 Revisited

Back in 2010 I played around with the term Data Quality 3.0. This concept is about how we increasingly use external data within data management, as opposed to the traditional use of internal data, meaning data that has been typed into our databases by employees or collected internally in other ways.

The rise of big data has definitely fueled the thinking around using external data as reported in the post Adding 180 Degrees to MDM.

There are other internal and external aspects, for example internal and external business rules, as examined in the post Two Kinds of Business Rules within Data Governance. This post has been discussed in the Data Governance Know How group on LinkedIn.

In a comment Thomas Tong says:

“It’s really fun when the internal components of governance are running smooth, giving the opportunity to focus on external connections to your data governance program. Finding the right balance between internal and external influences is key, as external governance partners can reduce the load/complexity of your overall governance program. It also helps clarify the difference between a “external standard” vs “internal standard”, as well as what is “reference data” vs “master data”… and a little preview of your probable integration strategy with external.”

This resonates very much with my mindset. Since 2010 my own data quality journey has increasingly embraced Master Data Management (MDM) and Data Governance as told in the recent blog post called Data Governance, Data Quality and MDM.

So, in my quest to coin one term for these three disciplines I may, besides the word information, also put 3.0 into the name: “Information Quality 3.0”, hmmm …

The Unruly Information Quality Community

Yesterday Daragh O Brien posted an Open Letter to my Information Quality Peers. The essence is that Daragh isn’t completely satisfied with how things are in The International Association for Information and Data Quality (IAIDQ).

That reminds me that I was a charter member of IAIDQ.

But checking now, I probably haven’t renewed the membership. This is not deliberate. It may just have slipped. Maybe, and this is one of Daragh’s critique points, that is because broadcasting from IAIDQ has decreased in recent years.

> Correction: Double checking I am actually still a member. I renewed for 2 years last time (usually I’m not that careless with money). I just lost my Charter Mbr designation in the process.

Another critique point raised by Daragh is the failed mission to make the organization truly international, as the organization has had difficulties maintaining chapters around the world.

Forming and maintaining regional chapters is about getting and upholding a critical mass of active members. An example showing that this is possible is the German Information Quality Society – Deutsche Gesellschaft für Informations- und Datenqualität e. V. However, this organization doesn’t seem to be an IAIDQ chapter, but rather another church obeying the same god.

The current unrest in IAIDQ is not the first of its kind. I remember that some years ago one of the founding members, Larry English, sent a strange email to members saying that he had quit the organization because he was not satisfied with something.

It is ironic that information quality practitioners are preaching communication and collaboration, but we don’t seem to get it when it comes to organizing our own little world.

Data Governance, Data Quality and MDM

The data governance discipline, the data quality discipline and the Master Data Management (MDM) discipline are closely related and happen to be my fields of work.

Data quality improvement is important within data governance and MDM. Furthermore, you seldom see an MDM implementation without a (master) data governance work stream today.

Over time it has often been suggested that data quality should rightfully be named information quality as told in the post New Blog Name. In addition, data governance could be referred to as information governance as suggested in the Mike2 Open Methodology here.

Within MDM we have the term Product Information Management (PIM), which is partly, but maybe not fully, the same as Product MDM, as examined by Monica McDonnell of Informatica in the post PIM is Not Product MDM – Product MDM is not PIM.

Product is one of several domains within MDM, where customer (or rather party), location and asset are other domains going into multi-domain MDM as reported in the post Multi-Entity MDM vs Multidomain MDM.

While replacing the term data with the term information for data quality, data governance and, for that matter, (multi-domain) master data management has had limited success outside academic circles, I do see it as very suitable for a term covering these three disciplines as a whole.

So what should these three disciplines be called as a whole? Have you noticed any good terms or smart hypes out there? Or are they just three out of more disciplines within data or information management?

What do we know about Data Governance?

When advising about and doing actual work within the data governance realm you often need to refer to openly available resources.

As data governance is still an emerging discipline, the available resources are of that nature too. There are plenty of good and insightful articles, blog posts and other pieces of information around. But when you try to put them together to work in a data governance journey, the recommendations may point in a lot of different directions.

When it comes to openly available resources that offer a reasonably consistent framework for a data governance programme, I have seen these two out there:

Have you found, or made available, other more or less complete journey plans for data governance out there?

Putting Two Things in One Field

A very common data quality issue is when a field in a data record is populated with more than one piece of information.

Sometimes this is done as a workaround, because we have a piece of information but no field with that distinct purpose of use. Then we find a more or less related existing field into which we can squeeze this additional piece of information.

But we also have some very common cases where this bad habit is required by external business rules or widespread tradition.

Legal Form in Company Names

This example is examined in the post Legal Forms from Hell.

One should think that it is time to change the bad (legally demanded) practice of mixing legal forms with company names and to serve the original purpose in another, more data quality friendly way.

An Address Line

An address line will typically hold a couple of elements such as a street (thoroughfare) name, a house number and maybe some kind of unit identification.

By the way, the order of street name and house number is reversed between two roughly equal parts of the world, with the exception of places where numbering within blocks between streets is the standard.
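To illustrate what it takes to pull such a field apart again, here is a minimal Python sketch that splits an address line into street name, house number and unit. The patterns, the `AddressLine` type and the function name are assumptions made up for this illustration. They cover only the two simple orderings mentioned above, with an optional unit after a comma, and will not handle block numbering or street names that contain digits.

```python
import re
from typing import NamedTuple, Optional


class AddressLine(NamedTuple):
    street: str
    house_number: str
    unit: Optional[str]


# House number first, e.g. "221B Baker Street" (hypothetical pattern).
NUMBER_FIRST = re.compile(
    r"^(?P<number>\d+[A-Za-z]?)\s+(?P<street>[^,]+?)(?:,\s*(?P<unit>.+))?$"
)
# Street name first, e.g. "Nørregade 7B, 2. th" (hypothetical pattern).
STREET_FIRST = re.compile(
    r"^(?P<street>[^,\d]+?)\s+(?P<number>\d+[A-Za-z]?)(?:,\s*(?P<unit>.+))?$"
)


def split_address_line(line: str) -> Optional[AddressLine]:
    """Return the elements of an address line, or None if no pattern fits."""
    text = line.strip()
    for pattern in (NUMBER_FIRST, STREET_FIRST):
        match = pattern.match(text)
        if match:
            return AddressLine(
                street=match.group("street").strip(),
                house_number=match.group("number"),
                unit=match.group("unit"),
            )
    return None


print(split_address_line("221B Baker Street"))
print(split_address_line("Nørregade 7B, 2. th"))
```

Anything the two patterns do not recognise comes back as `None`, which is probably the honest answer for a field that was never designed to be parsed.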

Education in Person Name

You can put professor in front of your name and even MBA – Master of Business Administration!! – after your name in the name field.

In the next few days I will put AFCM (Accidental Field Content Misuser) after my name.

Customer Friendly Product Master Data

Data are of high quality if they are fit for the purpose of use. This mantra has been around in the data management realm for many years.

In a recent article by Andy Hayler on CIO about MDM at Harrods there is a good example of a piece of data of such a high quality. It is a product description:

XX 6621/74 BLK VNN SS TOP 969B S

This product description was, I guess, nicely fit for the purpose of use when Harrods handled their product data in a material master in an ERP system. But when switching from buy-side focus to sell-side focus in a multi-channel world, this product description makes no sense to the customer.

Such problems with changing purposes of use for product master data are not only a luxury problem at Harrods but a common challenge within retail and distribution. The challenge involves having customer friendly product descriptions, a range of atomized product attributes that vary by product category, and related digital assets that help the customer.

Organizations everywhere are, as explained by Andy Hayler, tackling this challenge by implementing Master Data Management (MDM) solutions, in this case those specialized in Product Information Management (PIM).

MDM is said to be about a single version of the truth. While this in the customer (or rather party) MDM world is much about achieving uniqueness by matching and merging several different representations of the same real world individual or legal entity, the main challenge in product MDM is a bit different. Here completeness is a big issue. This involves gathering several different pieces of the truth from different sources. And a certain level of completeness may be fit for the purpose of use today but not fit enough tomorrow.
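The idea of gathering pieces of the truth from several sources and measuring completeness against a changing purpose can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source records, the attribute names and the "first non-empty value wins" rule are all hypothetical and not taken from any particular MDM or PIM product.

```python
def merge_product_record(sources):
    """Merge records in priority order: the first non-empty value per attribute wins."""
    merged = {}
    for record in sources:
        for attribute, value in record.items():
            if value not in (None, "") and attribute not in merged:
                merged[attribute] = value
    return merged


def completeness(record, required):
    """Share of required attributes that hold a non-empty value (0.0 to 1.0)."""
    if not required:
        return 1.0
    filled = sum(1 for attribute in required if record.get(attribute) not in (None, ""))
    return filled / len(required)


# Hypothetical source records for one product.
erp = {"sku": "SKU-1", "description": "Black sleeveless top", "colour": ""}
supplier = {"sku": "SKU-1", "colour": "Black", "size": "S"}

product = merge_product_record([erp, supplier])

# The same record measured against the requirements of today and of tomorrow.
required_today = ["sku", "description"]
required_tomorrow = ["sku", "description", "colour", "size", "image"]
print(completeness(product, required_today))
print(completeness(product, required_tomorrow))
```

The merged record does not change between the two measurements. Only the list of required attributes does, which is the point about being fit today but not fit enough tomorrow.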

So, how can organizations overcome the huge task of gathering so much product data? I think it is much about Sharing Product Master Data.

Selecting a MDM Vendor

How do you select a Master Data Management (MDM) vendor? There is of course the RFP way of scoring vendors against a bunch of carefully specified requirements within data model, user interface, architecture and so on. But as I have seen it, maybe the multi-domain way is much more used.

The multi-domain MDM vendor selection process has three basic parameters:

  • Distance between locations
  • Chemistry between parties
  • Price of products

Distance between locations:

Here you measure four numbers:

  • N1 = Northern UTM geocode of the buyer’s headquarters
  • E1 = Eastern UTM geocode of the buyer’s headquarters
  • N2 = Northern UTM geocode of the vendor’s headquarters or major regional office
  • E2 = Eastern UTM geocode of the vendor’s headquarters or major regional office

Then using the Pythagorean Theorem you get:

UTM distance = √((N1 − N2)² + (E1 − E2)²)

(You could work out the distance on Google Maps as well, but that doesn’t look very scientific.)
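For those who want to look even more scientific, the calculation could be sketched in Python as below. The function name is made up for this illustration, and the sketch assumes that both locations are given in metres within the same UTM zone. Across zones the plain Pythagorean distance is not meaningful.

```python
import math


def utm_distance_km(n1, e1, n2, e2):
    """Straight-line distance in kilometres between two UTM points.

    Assumes northing and easting are in metres and both points share a UTM zone.
    """
    return math.hypot(n2 - n1, e2 - e1) / 1000.0


# Hypothetical coordinates: 3 km apart north-south and 4 km apart east-west.
print(utm_distance_km(6_170_000, 720_000, 6_173_000, 724_000))
```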

Chemistry between parties:

Here you, at the meetings between the buying team and the vendor team, measure the occurrence of these sentences:

  • Could you repeat that question please?
  • Could you repeat that answer please?

(Observe that there may be a correlation with distance in cases where distance calls for the use of a webex for a meeting).

Price of products:

I guess everyone knows how to sum up euros/dollars/pounds/whatever.

MDM Aware MDM Solutions

The concept of MDM aware applications has been around for some time. What the Master Data Management establishment, including yours truly, is hoping for is that applications like CRM, ERP and other systems will start to utilize the master entities in MDM solutions, and exploit other good structures and services in the MDM realm, instead of having their own more or less useful data models within data silos around master data entities such as parties, products, locations and assets.

But what about MDM solutions themselves? Are MDM solutions so smug that they don’t take in good capabilities from other MDM solutions?

One reason to do so is if an MDM vendor has several MDM solutions to offer. I experienced an example of that recently when attending the Informatica MDM Day for EMEA in London. Informatica has recently acquired the Product MDM specialist firm Heiler and therefore has two MDM solutions to offer to the market. It has been too early for the newest version 10 of the general Informatica MDM solution to embrace the Heiler solution, so what I learned from one of the good folks now at Informatica was that the Heiler solution is becoming MDM aware, or at least aware of the Informatica MDM version 10 solution, I guess.

On another front I’m working with the iDQ™ MDM Edition. Here we do have a default data model for party master entities, but we are not so smug that we can’t be aware of other MDM solutions and their capabilities in a given IT landscape. Even in the party domain.

Buying a PIM Solution at Harrods

Today I attended the Informatica MDM Day for EMEA here in London.

London has a lot of attractions. If you, for example, want to see a lot of big price tags and go to a public toilet with a very nice odour, the place to go is the famous luxury department store called Harrods.

Harrods, represented by Peter Rush, presented their Product Information Management (PIM) journey at the Informatica event. So, what does a luxury PIM implementation look like?

It starts with realising that traditional product master data in retail has mostly been about the buy-side, but today, not least in light of the multi-channel challenge, you must add the sell-side to product master data, meaning having customer friendly product information.

After setting that scene Harrods went into selecting a PIM solution, meaning eliminating possible vendors one by one until the lucky one was chosen. In this case Heiler (now Informatica). In the last stages evaluated vendors were sent home based on criteria like roadmap, being in Texas and as the last step the price.

Hierarchical Completeness within Product Information Management

Some years ago I wrote a blog post called Hierarchical Completeness. This post also had some excellent comments and David Loshin made a good follow up post called Hierarchy Data Completeness and Semantic Convergence.

The importance of hierarchical completeness, not least within Product Information Management (PIM), has become relevant to me again.

It is a numbers game. Having an advanced PIM solution on board is often justified by the fact that you have many products to manage, too many for a single data steward to control. Add to that today’s challenges of doing multi-channel business and tomorrow’s challenges of embracing social media engagement. This means a lot more attributes and digital assets per product, and perhaps more products to manage, as told in the post called Social PIM.

All products aren’t equal. The term “one size fits all” doesn’t apply to selling shoes or any other range of products. The attributes and assets needed differ per product categorization, and so do the performance measures and expectations for each product.
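One way to picture this is to let each level in a product hierarchy add its own required attributes and then measure completeness per category. The Python sketch below does that under loudly stated assumptions: the hierarchy, the attribute names and the sample products are hypothetical and serve only to show the mechanics.

```python
from collections import defaultdict

# Hypothetical product hierarchy: child category -> parent category.
CATEGORY_PARENT = {
    "shoes": "apparel",
    "apparel": None,
    "laptops": "electronics",
    "electronics": None,
}

# Hypothetical attributes required at each level; children inherit from parents.
REQUIRED = {
    "apparel": {"name", "colour"},
    "shoes": {"size"},
    "electronics": {"name", "voltage"},
    "laptops": {"screen_size"},
}


def required_attributes(category):
    """Collect required attributes from the category and all of its ancestors."""
    required = set()
    while category is not None:
        required |= REQUIRED.get(category, set())
        category = CATEGORY_PARENT.get(category)
    return required


def completeness_by_category(products):
    """Average share of filled required attributes for each category."""
    scores = defaultdict(list)
    for product in products:
        required = required_attributes(product["category"])
        attributes = product.get("attributes", {})
        filled = sum(1 for a in required if attributes.get(a) not in (None, ""))
        scores[product["category"]].append(filled / len(required) if required else 1.0)
    return {category: sum(v) / len(v) for category, v in scores.items()}


products = [
    {"category": "shoes", "attributes": {"name": "Runner", "colour": "Black", "size": ""}},
    {"category": "shoes", "attributes": {"name": "Boot", "colour": "Brown", "size": "42"}},
    {"category": "laptops", "attributes": {"name": "Ultrabook"}},
]
print(completeness_by_category(products))
```

A per-category score like this can be compared with a per-category expectation, since a level of completeness that is acceptable for one range of products may not be acceptable for another.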
