Business and Pleasure

The data quality and master data management (MDM) realm has many wistful songs about unrequited love with “the business”.

This morning I noticed yet another tweet on Twitter expressing the pain:

Here Gartner analyst Ted Friedman foresees the doom of MDM if we don’t get at least the traction from “the business” that BI (Business Intelligence) is getting.

In my eyes everything we do in Information Technology is about “the business”. Even computer games and digital entertainment are a core part of their respective industries. I also believe that IT is part of “the business”.

“The rest of the business” does see that some disciplines belong in the IT realm. This goes for database management, programming languages and network protocols. These disciplines are not doomed at all because of that. “The rest of the business” couldn’t work today without these things around.

Certainly I have seen some IT-based disciplines and related tools emerge and then be doomed during my years in the IT business. Does anyone remember CASE tools?

With CASE tools I remember great expectations about business involvement in application design. But according to Wikipedia the main problems with CASE tools are (were): inadequate standardization, unrealistic expectations, slow implementation and weak repository controls.

In other words: “The rest of the business” never really got in touch with the CASE tools because they didn’t work as they were supposed to.

The business traction we now see around BI (and the enabling tools) is in my eyes very much because the tools have matured, actually work, have become more user friendly and seem to create useful results for “the rest of the business”.

Data quality tools and MDM tools must continue to follow that direction too, because one thing is sure: data quality tools and MDM tools do not solve any severe problems internally in the IT part of “the business”.

It’s my pleasure being part of that.


Data Quality: The Movie

Learning from courses, books, articles and so on is good – but sometimes a bit like watching a movie and then realizing that the real world – especially your world – isn’t exactly as in the movie.


The parking experience:

The movie: You are going to visit someone in a huge building in the centre of a large city. You take your car to the front of the building and smoothly place the car on the free parking spot next to the main entrance.

Real life: You drive round and round for ages until you finally find a free parking spot barely within walking distance of your destination.

My life: During my 30 years in the IT business I have visited a lot of companies and spent time in their IT departments. Nobody does everything by the book. Not even close.

In my experience, large companies within financial services are perhaps the ones that come closest to doing something by the book. This is probably because most books about IT seem to be written by people whose experience comes from working in large financial services businesses.

(And no, I have absolutely no documentation on that. It is just a gut feeling).

Hitting them hard:

The movie: You are a good guy observing a bad guy harassing a good looking girl. You engage the bad guy in an intense fist fight, you are hit over and over again, but in the end you win. The good looking girl thanks you by kissing your beautiful face.

Real life: Well, you may win the fight. But after that you have to go to the hospital and have them fix your face – and during the following month no girl can look at you without feeling very bad.

My life: Recently I was involved in a data management project aimed at producing some new business intelligence results. Executive sponsorship was no problem; the CEO was the initiator. Objectives were pretty clear. High level business requirements were well known and, not to forget, everyone was fully aware of the impact of data quality. The only issue was the absence of more concrete, detailed requirements and business rules for reporting. And of course a politically set deadline.

Facing the business rule issue we took a data centric and test driven approach. We produced incremental results, verified test cases, negotiated business rules based on real data examples and in the end a first report came out. The result was far from expected in the sense that the numbers were expected to be different. We dived into the data again, found an unexpected data quality issue and corrected accordingly. The result was still far from expected. Starting from a specific expected result we dived into a section of the data, made detailed reports and compared them to the real world. In the end it turned out that the report was right; the gut feeling perception of the real world had been wrong for a long time.

Now that’s a winner, right? Well, the project is on hold now for political reasons and also the project has a bad name for going over budget and deadline.

Looking great:

The movie: Morning scene from the nuclear family. Mommy is looking really great (stylish hair, perfect face) while cooking and serving a nice breakfast and helping the kids do some last minute homework at the same time.

Real life: I think you know.

My life: Actually I have learned that you don’t have to strive for perfection. With data quality, don’t expect to be able to fix everything and have all data fit for every purpose of use at any time.


When computer says maybe

When matching customer master data in order to find duplicates, or to find corresponding real world entities in a business directory or a consumer directory, you may use a data quality deduplication tool to do the hard work.

The tool will typically – depending on the capabilities of the tool and the nature of and purpose for the data – find:

A: The positive automated matches.  Ideally you will take samples for manual inspection.

C: The negative automated matches.

B: The dubious part selected for manual inspection.
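A minimal sketch of how such a three-way split might look, assuming a simple name similarity score and two cut-off values. The thresholds, the function names and the use of Python's `difflib` as the scoring method are my own illustrative assumptions; a real deduplication tool will use far more elaborate matching than this.

```python
from difflib import SequenceMatcher

# Hypothetical thresholds; in practice they depend on the data and the purpose.
AUTO_MATCH = 0.90      # at or above this: pot A
AUTO_NON_MATCH = 0.60  # below this: pot C


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score for two names, ignoring case."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def triage(a: str, b: str) -> str:
    """Assign a candidate pair to pot A, B or C based on its score."""
    score = similarity(a, b)
    if score >= AUTO_MATCH:
        return "A"  # positive automated match
    if score < AUTO_NON_MATCH:
        return "C"  # negative automated match
    return "B"      # dubious: goes to manual inspection
```

Moving the two thresholds closer together shrinks the B pot and saves human time, at the price of more wrong decisions in the A and C pots.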

Humans are costly resources. Therefore the manual inspection of the B pot (and the A sample) may be supported by a user interface that helps get the job done fast but accurately.

I have worked with the following features for such functionality:

  • Random sampling for quality assurance – both from the A pot and from the manually settled matches in the B pot
  • Check-out and check-in for multiuser environments
  • Presenting a ranked range of computer selected candidates
  • Color coding elements in matched candidates – like:
    • green for (near) exact name,
    • blue for a close name and
    • red for a far from similar name
  • Possibility for marking:
    • as a manual positive match,
    • as a manual negative match (with reason) or
    • as questionable for later or supervisor inspection (with comments)
  • Entering a match found by other methods
  • Removing one or several members from a duplicate group
  • Splitting a duplicate group into two groups
  • Selecting survivorship
  • Applying hierarchy linkage
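Three of the features above (the ranked range of candidates, the colour coding and the random sampling) could be sketched roughly as follows. The cut-off values, the function names and the `difflib` based score are assumptions made for illustration only, not how any particular tool does it.

```python
import random
from difflib import SequenceMatcher

# Hypothetical cut-offs for the colour bands.
NEAR_EXACT = 0.95
CLOSE = 0.75


def score(a: str, b: str) -> float:
    """Return a 0..1 similarity score for two names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def name_colour(candidate: str, reference: str) -> str:
    """Colour code a candidate name against the reference name."""
    s = score(candidate, reference)
    if s >= NEAR_EXACT:
        return "green"  # (near) exact name
    if s >= CLOSE:
        return "blue"   # close name
    return "red"        # far from similar name


def rank_candidates(reference: str, candidates: list[str]) -> list[tuple[str, str]]:
    """Return (candidate, colour) pairs with the best scoring candidate first."""
    ranked = sorted(candidates, key=lambda c: score(c, reference), reverse=True)
    return [(c, name_colour(c, reference)) for c in ranked]


def qa_sample(decisions: list, fraction: float = 0.05, seed=None) -> list:
    """Draw a random sample of settled matches for quality assurance."""
    if not decisions:
        return []
    rng = random.Random(seed)
    k = max(1, round(len(decisions) * fraction))
    return rng.sample(decisions, k)
```

The same `qa_sample` function can be applied both to the A pot and to the manually settled matches from the B pot, so that the machine and the humans get checked in the same way.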

Anyone else out there who has worked with making or using a man-machine dialogue for this?

The Myth about a Myth

A sentiment repeated again and again related to Data (Information) Quality improvement goes like this:

“It’s a myth that Data Quality improvement is all about technology”.

In fact you see the same sentiment related to a lot of other disciplines, such as:

  • “It’s a myth that Master Data Management is all about technology”.
  • “It’s a myth that Business Intelligence is all about technology”.
  • “It’s a myth that Customer Relationship Management is all about technology”.

I have a problem with that: I have never heard anyone say that DQ/MDM/BI/CRM… is all about technology and I have never seen anyone write so.

When I make the above remark the reaction is almost always this:

“Of course not, but I have seen a lot of projects carried out as if they were all about technology – and of course they failed”.

Unquestionably true.

But the next question is then about root cause. Why did those projects seem to be all about technology? I think it was:

  • Poor project management or
  • Bad balance between business and IT involvement or
  • Immature technology alienating business users.

In my eyes there is no myth that Data Quality (and a lot of other things) is all about technology. It’s a myth that it’s a myth.


Driving Data Quality in 2 Lanes

Yesterday I visited a client in order to participate in a workshop on extending the use of a Data Quality Desktop tool to more users within that organisation.

This organisation makes use of 2 different Data Quality tools from Omikron:

  • The Data Quality Server, a complete framework of SOA enabled Data Quality functionality where we need the IT-department to be a critical part of the implementation.
  • The Data Quality Desktop tool, a user friendly piece of Windows software installable by any PC user, but with sophisticated cleansing and matching features.

During the few hours of this workshop we were able to link several different departmental data sources to the server based MDM hub, set up and confirm the business rules for this, and report the foreseeable outcome of this process if it were to be repeated.

Some of the scenarios exercised will continue to run as ad hoc departmental processes and others will be upgraded into services embraced by the enterprise wide server implementation.

As I – for various reasons – went to this event by car over a long distance, I had the time to compare the data quality progress made by different organisations with the traffic on the roads, where we have:

  • Large buses with persons and large lorries with products, being the most sustainable way of transport – but they are slow and not too dynamic. Like the enterprise wide server implementations of Data Quality tools.
  • Private cars heading for different destinations at different but faster speeds. Like the desktop Data Quality tools.

I noticed that:

  • One lane with buses or lorries works fine but slowly.
  • One lane with private cars is a bit of a mess with some hazardous driving.
  • One lane with buses, lorries and private cars tends to be deadly.
  • 2 (or more) lanes work nicely with good driving habits.

So, encouraged by the workshop and the ride, I feel comfortable with the idea of using both kinds of Data Quality tools to have coherent, user involved, agile processes backed by some tools and a sustainable enterprise wide solution at the same time.
