Data governance tools: The new snake oil?

Traditionally, data governance has been about the people and process side of data management. However, we now see tools marketed as data governance tools, either as pure-play tools for data governance or as part of a wider data management suite, as told in the post Who needs a data governance tool?

The post refers to a report by Sunil Soares. In this report, data governance tools are seen as tools related to six areas within enterprise data management: data discovery, data quality, business glossary, metadata, information policy management and reference data management.

While IBM have tools for everything, according to the report it does not seem like a single tool cures it all – yet.

But will we go there? If we need tools at all, do we need an all-cure snake oil tool for data governance? Or will we be better off with different lubricants for data discovery, data quality, business glossary, metadata, information policy management and reference data management?

Agile MDM. Using IT.

MDM (Master Data Management) projects may have a bad name as large IT projects that use a huge amount of resources, take a lot of time and end up producing very few measurable results.

This phenomenon isn’t new at all in the IT world. There are often two answers to that challenge:

  1. Don’t treat it as an IT project. It’s all about people and culture.
  2. Do it the agile way using IT.

After having a lot of fun with option one, you will sooner or later realize that the master data pain points still exist, and then come to option two.

I have earlier written some agile posts, Lean MDM and Eating the MDM Elephant, and in my eyes the relevance of having MDM technology that supports the agile way has only become more apparent since then.

What are your experiences? Who is doing agile MDM – using IT? Is it good?

Multi-Domain MDM Uptake

Within Master Data Management (MDM), doing multi-domain MDM has been trending for a couple of years. Yesterday Gartner (the analyst firm) had a chat session on Twitter preceding the upcoming Gartner MDM summits around the world.

Along the way @BillOKane of @Gartner_inc revealed some numbers about multi-domain MDM from the Gartner camp:

[Images: Multi-Domain 1, Multi-Domain 2]

So, stating these numbers using the MoSCoW method, we have that among companies considering MDM:

  • 3 % see multi-domain MDM as a MUST have now
  • 10 % think they SHOULD have multi-domain MDM now
  • 17 % regard multi-domain MDM as something they COULD have now
  • 70 % WON'T have multi-domain MDM now

So, we have four and a half multi-domain MDM vendors

It has been discussed a few times on this blog whether Gartner should make a single (multi-domain) Master Data Management quadrant, most recently in the post MDM for Product Data Quadrant: No challengers. A half visionary.

Well, if Gartner will not, Forrester will, as Forrester has just released its MDM wave focusing on multi-platform MDM. Yep, there is nothing like analyst firms insisting on using their own wording, such as multi-entity, multi-domain or multi-platform MDM, within their special visualizations such as quadrant, landscape, wave and other bulls…. eye stuff to blur things a bit.

If you are a Forrester client or otherwise like to pay money to Forrester, you can get the report here. If you would like to feed one of your email addresses into the Informatica marketing machine, you can get the report here.

MDM Brands
This is not the wave. Just some names.

There are only four and a half multi-platform MDM vendors in the universe according to Forrester. Three and a half of them are no surprise. Maybe Talend is. I guess one or two more of the Trois acteurs français dans le marché du MDM would like to be there as well.

When High Quality Data doesn’t Yield High Quality Service

Better data quality is a prerequisite for better quality of service, but unfortunately high quality data doesn’t necessarily lead to high quality service when the data flow is broken. This happened to me last night.

When landing at London Heathrow Airport I usually, economical as I am, take the train to reach my doorstep. However, when I have to catch an early morning flight I order a cab, which actually has a very reasonable price. So yesterday I decided to book a cab in order to cut 30 to 40 minutes off the journey home at the expense of a few extra pounds.

Excellent data capture

Usually I just call the cab, but as I arrived by airplane and my local cab service is part of an online booking service, I used that service for the first time. The user interface is excellent. There is rapid addressing for entering the pick-up place, which quickly presented me with the possible terminals at Heathrow. The destination was just as smooth. As the pick-up is an airport, they prompted me for the flight number. Very nice, as that makes tracking delays possible for them, and you can also check that the airline and terminal are a correct match.

Also they have an app that I geekly downloaded to my phablet.

Going down

Landing times at Heathrow are difficult to predict, as it often happens that your flight circles over London a couple of times before landing due to heavy traffic. Yesterday was good though, as we came directly down and therefore were ahead of schedule.

So it was OK that my name wasn’t on the signs held by drivers already waiting at the passenger exit. Actually I was so early that I could have caught the not so frequent direct train home. But as I had already troubled the driver to go there, I of course waited, spending time on the app.

There was actually also driver tracking in the app. Marvelous. At first glance it seemed the driver was there. But then I noticed a message saying driver tracking wasn’t available, and therefore the spot in the terminal 3 building would be my own position or the requested pick-up place.

Going crazy

5 minutes after the requested time the driver called:

“Where are you Mr. Sorensen?”

“I’m at the passenger exit where all drivers are waiting.”

“OK. I’m just parking the car. Go to the front of the coffee shop and I’ll be there in a few minutes.”

I spotted a coffee shop in front of the lifts to the short stay parking and went over there.

10 minutes later the driver called:

“Where are you Mr. Sorensen?”

“I am in front of the coffee shop.”

“Costa Coffee?”

“No. It has a different name…”. After some ping-pong I mentioned terminal 3.

“Terminal 3?” the driver responded. “I’m at terminal 5. I was told to go here. I’ll be with you in 5 minutes”.

By car in 5 minutes? I wondered. That would indicate crossing the runways or using the train tunnel.

Well, while spending more happy time on the phablet the clock approached the point where I would be at my doorstep using the slow train.

40 minutes after the requested time the driver arrived. I was waiting for the mandatory sorry that Brits use even when they are not sorry at all.

Instead the driver greeted me with: “Did you order the cab yourself Mr. Sorensen?”

“Yes I did. On the internet.”

“Internet?” the driver replied.

“Your company has an excellent online booking system,” I remarked in a friendly tone.

“When I called you first I asked for confirmation about where you were”.

As I realized that he was trying to establish that everything was my fault I presented the confirmation on the app.

We continued (without the usual small talk) to the destination. Here the driver (instead of a discount) presented an upgraded version of the price on the booking confirmation.

At that point it was too difficult to keep calm and carry on…..

Tsundoku

There is a Japanese word called tsundoku. There is no equivalent English word, but in 6 words it means “buying books and not reading them”.

I guess tsundoku could have an eTsundoku variant describing buying software tools and not using them, and that could also include data quality tools, as told in the post The Worst Best Sale.

My own example isn’t the only one, I’m sure. What may be the reasons for buying data quality tools, but not using them? A few suggestions:

  • Organizational changes after ordering (as in my example)
  • Focus has changed before receiving the delivery
  • The tool was never meant to be used as the buy was merely a sign of showing interest in data quality
  • The data quality tool came free (or hidden) as a part of a larger software suite
  • Data quality tools don’t solve anything anyway (not my favorite though, as told in the post The Role of Technology in Data Quality Management)

More suggestions?

Do You Like the Lake?

Today Capgemini, as a result of a co-innovation partnership with Pivotal, released their take on information management in the big data era in a piece called The Principles of the Business Data Lake.

The business data lake concept is a new attempt at getting rid of all the Excel spreadsheets business people operate because of limitations in today’s enterprise data warehouses and the business intelligence solutions sitting on top of that extracted, transformed and loaded data.

In the business data lake you load raw data including unstructured data sources. Single view and related governance is restricted to master and reference data.

It’s not that you are going to load all the data in the world in your business data lake. You will link internal and external data based on where and when needed.

Thomas Redman has made a famous metaphor in the data quality realm about a polluted lake, where the best way to deal with the pollution is to prevent polluted water from streaming into the lake. I guess the rise of big data challenges that take, as told some years ago in the post Extreme Data Quality.

In the business data lake we will have polluted data. In that view I think it’s a good thing that master and reference data have a special place in the lake.

What do you think? Do you like the lake – the old and/or the new one?

The Role of Technology in Data Quality Management

A recent article called Data’s Credibility Problem by Thomas Redman in Harvard Business Review has rightfully got a lot of mentions in the data quality community on social media, including Twitter.

I agree with many things in the article, except I have to question the credibility of this saying:

“The solution is not better technology: It’s better communication between the creators of data and the data users”.

There is a lot of truth in this saying. But it is in my eyes not valid.

If the human race had relied solely on communication, we would still be discussing whether a wheel should have the shape of a square or a circle. There is a balance between fruitful communication and throwing technology at problems, and you may emphasize one side or the other depending on whether you sell data quality consultancy or data quality tools.

I would say:

“The solution is better communication between the creators of data, the data users and the innovators of data quality technology”.

Now, how do I best spread this message….

[Image: papertweet]

Everyday Year 2000 Problems

14 years ago these were busy times for computer professionals, including yours truly, because of the upcoming year 2000 apocalypse. The handling of the problem indeed had elements of hysteria, but all in all it was a joint effort by heaps of IT people in meeting a non-postponable deadline around fixing date fields that were too short.

Data entry and data storage fields that are too short, have an inadequate format or are missing are frequent data quality issues. Some everyday issues are:

Too short name fields

Names can be very long. But even a moderately lengthy name such as Henrik Liliendahl Sørensen can be a problem here and there. Not least typing your name on Twitter, where the 20 character name field corresponds very well to the 140 character message length, forces many of us to shorten our name. I found a remedy from a fellow Sørensen, who describes a workaround in the post Getting around the real name length limit in Twitter. Not sure if I’m prepared to take the risk.
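A minimal sketch of how such a field length problem could be caught at data entry instead of silently truncating the name. The function name `fits_field` and the constant are made up for illustration; only the 20 character limit comes from the example above.

```python
import unicodedata

# Name field length taken from the Twitter example above.
NAME_FIELD_LENGTH = 20


def fits_field(value: str, max_length: int = NAME_FIELD_LENGTH) -> bool:
    """Return True if the value fits in the field without truncation.

    The value is normalized to NFC first, so a letter like 'ø' or an
    accented letter typed as base letter plus combining mark counts once.
    """
    normalized = unicodedata.normalize("NFC", value.strip())
    return len(normalized) <= max_length


if __name__ == "__main__":
    for name in ("Henrik Liliendahl Sørensen", "Henrik L. Sørensen"):
        print(name, "fits" if fits_field(name) else "is too long")
```

Rejecting or flagging the value keeps the decision about how to shorten a name with the person who owns it, rather than with the storage layer.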

Too short and restricted postal code fields

When working with IT solutions in Denmark you see a lot of postal code fields defined as 4 digits. This works fine with Danish addresses but is a real show stopper when you deal with neighboring Swedish and German 5 digit postal codes, and not least postal codes with letters from the Netherlands and the United Kingdom and most other postal codes from around the world.
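As a rough illustration of the difference, here is a sketch that compares a 4 digit only field with country-aware checks. The patterns are simplified assumptions for the countries mentioned above, not authoritative postal rules (the United Kingdom pattern in particular is a common simplification), and all names are hypothetical.

```python
import re

# Simplified, illustrative patterns per country code (assumptions, not official rules).
POSTAL_PATTERNS = {
    "DK": re.compile(r"\d{4}"),
    "SE": re.compile(r"\d{3} ?\d{2}"),
    "DE": re.compile(r"\d{5}"),
    "NL": re.compile(r"\d{4} ?[A-Z]{2}"),
    "GB": re.compile(r"[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}"),
}

# The kind of field described above: exactly 4 digits, nothing else.
LEGACY_FIELD = re.compile(r"\d{4}")


def fits_legacy_field(code: str) -> bool:
    """True if the code can be stored in a 4 digit only field."""
    return LEGACY_FIELD.fullmatch(code.strip()) is not None


def is_plausible(country: str, code: str) -> bool:
    """True if the code matches the simplified pattern for the country.

    Unknown countries are accepted as long as the code is not empty.
    """
    cleaned = code.strip().upper()
    pattern = POSTAL_PATTERNS.get(country.upper())
    if pattern is None:
        return bool(cleaned)
    return pattern.fullmatch(cleaned) is not None


if __name__ == "__main__":
    samples = [("DK", "2100"), ("SE", "114 55"), ("DE", "10115"),
               ("NL", "1012 AB"), ("GB", "SW1A 1AA")]
    for country, code in samples:
        print(country, code, is_plausible(country, code), fits_legacy_field(code))
```

Only the Danish sample passes both checks; the others are plausible postal codes that the 4 digit field cannot hold.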

Missing placeholder for social identities

The rise of social media has been incredible during the last years. However, IT systems are lagging behind in support for this. Most systems don’t have a place where you can fill in a social handle. Recently James Taylor wrote the blog post Getting a handle on social MDM. Herein James describes a workaround in an IBM MDM solution. Indeed we need ways to link the old systems of record with the new systems of engagement.
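A minimal sketch of what such a placeholder could look like in a party record. This is not the workaround James describes and not any vendor's data model; the class, its fields and the handle used in the example are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PartyRecord:
    """A hypothetical master data record with room for social handles."""

    name: str
    postal_code: str = ""
    # Keyed by network name, so new networks need no schema change.
    social_handles: Dict[str, str] = field(default_factory=dict)

    def add_handle(self, network: str, handle: str) -> None:
        """Store the handle without a leading '@', keyed by lower-case network."""
        self.social_handles[network.strip().lower()] = handle.strip().lstrip("@")


if __name__ == "__main__":
    record = PartyRecord(name="Henrik Liliendahl Sørensen")
    record.add_handle("Twitter", "@example_handle")  # made-up handle
    print(record)
```

Keeping the handles in a keyed collection rather than one fixed column per network is a design choice that avoids repeating the too-short-field problem the next time a new network shows up.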

MDM for Product Data Quadrant: No challengers. A half visionary.

The Gartner Magic Quadrant for Master Data Management of Product Data Solutions is out. You may have a free look at it, for example by going through Tibco’s press release on the matter here, or directly thanks to IBM here.

MDM Brands
This is not the quadrant. Just some names.

My thoughts are kind of the same as told in the post MDM for Customer Data Quadrant: No challengers. No visionaries. Let’s just have a (Multi-Domain) MDM quadrant.

The quadrant for customer MDM solutions had no challengers and no visionaries. The quadrant for product MDM solutions has a half visionary, as Informatica is positioned as a niche player with its recent Heiler acquisition and just on the right side of the border between niche players and visionaries with what must be the Siperian acquisition from a couple of years ago. This solution has multi-domain capabilities as an important strength.
