Data Quality is an Ingredient, not an Entrée

Fortunately, it is increasingly recognized that you won’t succeed with Business Intelligence, Customer Relationship Management, Master Data Management, Service Oriented Architecture and many other disciplines without first improving your data quality.

But it would be a big mistake to see data quality improvement as an entrée served before the main course, be it BI, CRM, MDM, SOA or whatever is on the menu. You need ongoing prevention to keep your data from being polluted again over time.

Improving and maintaining data quality involves people, processes and technology. Now, I am not neglecting the people and process side, but as my expertise is in the technology part, I would like to mention some of the technological ingredients that help keep data quality at a tasty level in your IT implementations.


Many data quality flaws are (not surprisingly) introduced at data entry. Enterprise data mashups with external reference data may help during data entry, like:

  • An address may be suggested from an external source.
  • A business entity may be picked from an external business directory.
  • Various rules exist in different countries for using consumer/citizen directories – why not use the best available where you do business?

External IDs

Getting data entry right at the root is important, and most (if not all) data quality professionals agree that this is superior to doing cleansing operations downstream.

The problem, however, is that most data erodes as time passes. What was right at the time of capture will at some point no longer be right.

Therefore data entry ideally must not only be a snapshot of correct information but should also include raw data elements that make the data easily maintainable.

Error tolerant search

A common workflow when in-house personnel enter new customers, suppliers, purchased products and other master data is that you first search the database for a match. If the entity is not found, you create a new entity. When the search fails to find a match that actually exists, we have a classic and frequent cause of duplicates.

An error tolerant search is able to find matches despite spelling differences, alternatively arranged words, various concatenations and the many other challenges we face when searching for names, addresses and descriptions.
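As a rough illustration of the idea, here is a minimal sketch in Python using only the standard library. The function names (`normalize`, `similarity`, `search`) and the 0.8 threshold are my own assumptions for the example, not a description of any particular product. Word order is handled by sorting the words, concatenations by also comparing the strings with spaces removed, and spelling differences by a general string similarity ratio.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> list:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def similarity(a: str, b: str) -> float:
    """Return a score between 0 and 1; 1 means the two strings are
    considered identical after normalization."""
    words_a, words_b = normalize(a), normalize(b)
    # Sorting the words makes "Smith John" equal to "John Smith".
    sorted_score = SequenceMatcher(
        None, " ".join(sorted(words_a)), " ".join(sorted(words_b))
    ).ratio()
    # Removing the spaces makes "DataQuality" equal to "Data Quality".
    compact_score = SequenceMatcher(
        None, "".join(words_a), "".join(words_b)
    ).ratio()
    return max(sorted_score, compact_score)


def search(query: str, records: list, threshold: float = 0.8) -> list:
    """Return the records scoring at or above the (assumed) threshold,
    best match first."""
    scored = [(similarity(query, record), record) for record in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record for score, record in scored if score >= threshold]
```

Searching for "Jon Smith" with this sketch would still find an existing "John Smith", so the person at the keyboard is offered the match instead of creating a duplicate. A real implementation would need an index rather than a scan of every record, and probably matching rules tuned to each country and data type.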


Birthday Party

Today this blog has been online one year. It’s time for a birthday party.

The economy around a birthday party usually goes like this:

  • You, the guest, spend some money on a nice birthday present
  • I, the host, spend some money on fine food and beverage

Now, a blog is a virtual thing and I reckon that most of my readers live far, far away from the Copenhagen South Coast. So it’s going to be a remote birthday party and, as with most other things happening in the social media realm, no money is actually going to be exchanged.

Anyway, here is what I would have liked to serve in the real world:


The dish I have prepared the most times when we have guests is the Spanish paella. I love paella very much and so do all our polite guests.

Also, I am a shrimp addict, so I usually like to add two or three different kinds of shrimp, from the smaller but extremely tasty Greenlandic shrimp to delicious giant Thai tiger prawns.


My second favorite meal is a steak. You probably don’t get a better steak than one from cattle grazing on the Argentinean pampas.

As I live in the Northern Hemisphere it’s summertime now and perfect weather for preparing the steak outside on the grill.


There is so much good wine coming from many places around the world. I like Californian wine, wine from Chile, South African wine, Australian wine, French wine and last but not least Italian wine including the unbeatable Amarone.


As I am a native Dane you will probably expect me to propose a Carlsberg. Don’t get me wrong: Carlsberg is probably a good beer. But there are many other good beers around. When I am in England I like the ultimate mainstream beer: A John Smith (now owned by Dutch Heineken). The best mainstream beer in my opinion is the Belgian Leffe.


Thanks to everyone who has read this blog, subscribed, retweeted and, not least, commented.


Data Quality and World Food

I have touched on the analogy between food (quality) and data (quality) several times before, for example in the posts “Bon Appétit” and “Under New Master Data Management”.

Why not continue down that road?

Let’s have a look at some local food that has become popular around the world.


Imagine you go to a restaurant where you order a fish dish. When starting to consume your dinner you realize that the fish hasn’t been boiled, fried or in any other way exposed to heat. Then I guess it is perfectly normal to shout out: THE FISH IS RAW – and demand apologies from the chef, the head waiter, Gordon Ramsay or anyone else in charge. Unless, of course, you are in a sushi restaurant, where the famous Japanese dish that may include raw fish is prepared.


Köttbullar is the Swedish word for meatballs. This would rightfully have remained a fact known only to Swedes if it weren’t for cheap furniture sold around the world by IKEA. For reasons still unclear to me, IKEA has chosen to serve Köttbullar in the store cafeterias and even sell the stuff along with the particle board furniture on their e-commerce sites.


A dish of Italian origin, usually brought to you by someone on a bike or, in extreme cases, in a very old car.


Selling food of different kinds in the form of a burger works in the United States – and, for reasons that I can’t explain, even in France.

Data Quality analogies

Well, let’s just say that data quality tools and services:

  • May be regarded very differently around the world,
  • Usually are sold along with tools and services made for something completely different,
  • Are brought to you in various ways by local vendors and
  • For reasons I can’t explain often are made for use in the United States (no other pun intended but pure admiration of execution).

Bon appétit.


Under new Master Data Management

“Under new management” is a common sign in the window of a restaurant. The purpose of the sign is to tell: Yes, we know: Really bad food was served in a really bad way here. But from now on we have a new management dedicated to serving really good food in a really good way.

By the way: Restaurants are one of the more challenging business entities to handle in Party Master Data Management:

  • They change owner more often than most other business entities, making them a new legal entity each time, which is important for some business contexts like credit risk.
  • On the other hand, it’s the same address despite a new owner, which makes it the same entity in the eyes of other business contexts like logistics.
  • In many cases you may have a name (trade style) of the restaurant and another official name of the business – a variant of this is when the restaurant is franchised.
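The three points above can be sketched as a small data model. This is only an illustration under my own assumptions: the class names `LegalEntity` and `Site`, the field names and the two key methods are hypothetical and not taken from any specific MDM product. The point is that the same restaurant is identified differently depending on the business context.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class LegalEntity:
    """The business behind the restaurant (hypothetical model)."""
    legal_id: str        # assumed: a company registration number
    official_name: str   # the official name of the business


@dataclass
class Site:
    """The physical restaurant, which survives a change of owner."""
    site_id: str
    address: str
    trade_name: str      # the name (trade style) on the sign
    operators: List[LegalEntity] = field(default_factory=list)  # oldest first

    def change_owner(self, new_operator: LegalEntity) -> None:
        """A new owner means a new legal entity at the same site."""
        self.operators.append(new_operator)

    def credit_risk_key(self) -> str:
        """Credit risk cares about the current legal entity."""
        return self.operators[-1].legal_id

    def logistics_key(self) -> str:
        """Logistics cares about the place where the goods are delivered."""
        return self.site_id
```

With a model along these lines, a change of owner changes the key used for credit risk while the key used for logistics stays the same, and the trade name can differ from the official name of whoever currently runs the place.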

Master Data Management is not trivial – serving restaurants or not.

Improving Master Data Management starts with the sign in the window: Yes, we know: Really bad information was served here in a really bad way. But from now on we have a new master data management dedicated to serving really good information in a really good way.

Then you may have a look at the menu. Do we have the right mix of menu items for the guests we like to serve? How are we going to govern a steady flow of fresh raw data that’s going to be prepared and selected from the menu and end up at the tables?

What about the waiters’ attitude? Serving is much more fun if you are proud of the dishes coming from the kitchen. It’s pleasant to bring compliments from guests back to the kitchen – not least when they come along with great tips.

The information chef has to be very much concerned about the raw data quality and the tools available for what may be similar to rinsing, slicing, mixing and boiling food.

Bon appétit.


Bon Appétit

If I enjoy a restaurant meal, it is basically unimportant to me which raw ingredients were used, where they came from and which tools the chef used while preparing the meal. My concerns are whether the taste meets my expectations, the plate looks delicious in my eyes, the waiter seems nice and so on.

This is comparable to when we talk about information quality. The raw data quality and the tools available for exposing the data as tasty information in a given context are basically not important to the information consumer.

But in the daily work you and I may be the information chef. In that position we have to be very much concerned about the raw data quality and the tools available for what may be similar to rinsing, slicing, mixing and boiling food.

Let’s look at some analogies.

Best before

Fresh raw ingredients are similar to up-to-date raw data. Raw data also has a best before date depending on the nature of the data. Raw data older than that date may be spiced up but will eventually make bad tasting information.
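A best before date for data could be sketched like this in Python. The shelf lives in `SHELF_LIFE` are made-up numbers for the sake of the example, and the function name `is_past_best_before` is my own; the right values depend on the nature of the data and on your business.

```python
from datetime import date, timedelta

# Assumed shelf lives per kind of data; purely illustrative values.
SHELF_LIFE = {
    "address": timedelta(days=365),
    "phone": timedelta(days=180),
}


def is_past_best_before(kind: str, last_verified: date, today: date) -> bool:
    """Return True when the data element is older than its shelf life."""
    return today > last_verified + SHELF_LIFE[kind]
```

A check like this only tells you which ingredients to inspect or refresh first. It says nothing about whether the data was right when it was captured.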


Buying all your raw ingredients and tools for preparing food – or taking the shortcut with ready-made cookie-cutter stuff – from a huge supermarket is fast and easy (and never mind that the basket usually also ends up filled with a lot of other products not on the shopping list).

A good chef always selects the raw ingredients from the best specialized suppliers and uses what he considers the most professional tools in the preparation process.

Making information from raw data has the same options.


Governments around the world have for a long time implemented regulations and inspections regarding food, mainly focused on receiving, handling and storing raw ingredients.

The same is now going on regarding data. Regulations and inspections will naturally be directed at data as it is originated, stored and handled.


Have you ever tried to prepare your favorite national meal in a foreign country?

Many times this is not straightforward. Some raw ingredients are simply not available and even some tools may not be among the kitchen equipment.

When making information from raw data under varying international conditions you often face the same kind of challenges.