As promised earlier today, here is the first post in an endless series of positive posts about success in data quality improvement.
This beautiful morning I finished yet another of those nice recurring jobs I do from time to time: deduplicating batches of files ready for direct marketing, making sure that only one, the whole one and nothing but one unique message reaches a given individual decision maker, be that in the online or the offline mailbox.
Most jobs are pretty similar, and I have a fantastic tool that automates most of the work. I only have the pleasure of learning about the nature of the data and configuring the standardisation and matching process accordingly in a user-friendly interface. After the automated process I enjoy looking for any false positives and checking for false negatives. Sometimes I’m so lucky that I have the chance to repeat the process with a slightly different configuration so that we reach the best possible result.
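Henrik doesn’t describe his tool’s internals, but the workflow he outlines — standardise the records, match with a similarity threshold, then review the grey zone for false positives and negatives — can be sketched roughly like this. Everything here (the field names, the title-word list, the thresholds, the sample records) is invented for illustration, not taken from the actual product:

```python
from difflib import SequenceMatcher

def standardise(record):
    """Normalise a record before matching: lowercase, drop punctuation
    and common titles, collapse whitespace."""
    name = record["name"].lower().replace(".", "").replace(",", "")
    tokens = [t for t in name.split() if t not in {"mr", "mrs", "ms", "dr"}]
    return " ".join(tokens)

def find_duplicates(records, match_threshold=0.9, review_threshold=0.75):
    """Compare every pair of standardised records.
    Pairs scoring at or above match_threshold are treated as duplicates;
    pairs in the grey zone between the thresholds are flagged for the
    manual review where false positives and negatives are hunted down."""
    keys = [standardise(r) for r in records]
    duplicates, review = [], []
    for i in range(len(keys)):
        for j in range(i + 1, len(keys)):
            score = SequenceMatcher(None, keys[i], keys[j]).ratio()
            if score >= match_threshold:
                duplicates.append((i, j, round(score, 2)))
            elif score >= review_threshold:
                review.append((i, j, round(score, 2)))
    return duplicates, review

records = [
    {"name": "Mr. John Smith"},
    {"name": "John Smith"},
    {"name": "Jon Smith"},
    {"name": "J. Smith"},
    {"name": "Jane Doe"},
]
dupes, grey = find_duplicates(records)
```

Rerunning with a slightly different configuration, as Henrik mentions, amounts to nudging the two thresholds and comparing the resulting duplicate and review lists.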
It’s a great feeling that this work reduces the cost of mailings for my clients, makes them look smarter and more professional, and facilitates the correct measurement of response rates that is so essential in planning future, even better direct marketing activities.
But that’s not all. I’m also delighted to be able to have a continuing chat about how we may, over time, introduce data quality prevention upstream at the point of data entry, so that we don’t have to do these recurring downstream cleansing activities any more. It’s always fascinating going through all the different applications that many organisations are running, some of them so old that I never dreamed they still existed. Most times we are able to build a solution that will work in the given landscape, and anyway, soon the credit crunch will be totally gone and off we go.
I’ll be back again with more success from the data quality improvement frontier very soon.
Beautifully written, Henrik,
It really is a great feeling when you have found and removed so many errors. I remember it well. Such a sense of achievement!
However, this presents us with the Great Data Quality Dilemma!
Do we really want to deprive so many data quality practitioners of this wonderful sense of achievement by removing all data errors at source?
And what about all of the developers and vendors of clever software that helps us find and remove these errors? Will they all go out of business?
Truly a dilemma! Is Data Quality Assurance, with zero data defects, worth the price?
Having practiced it, I am happy to report that it is.
The sense of achievement to be had from designing, building and implementing quality systems is immense. You will have to enjoy it quietly though as there is no razzmatazz. The systems simply work properly from day one. No errors to report, no errors to be corrected.
However, the feeling to be had from seeing the users of the system, and the enterprise as a whole, enjoying the benefits of using a truly quality system is immeasurable.
What about the developers of the clever software? No problem. The same clever tricks they now use to find and rectify data errors will be designed into the systems being built, and so prevent data errors from ever getting into the databases.
I have heard many people claim that zero data defects is “pie in the sky”. It is not. When a person says this they are really saying “I don’t know how to do it”. They probably don’t, but it does not mean that it cannot be done.
Once again, a great post, Henrik.
Thanks John. Right on. One of the things I do besides the joyful purging and merging of all these duplicates is actually participating in making astonishing components for solutions that will prevent having the duplicates in the first place. I am planning to include the resulting happy success stories in future postings.
Another good post Henrik.
I’m right with you here. There’s nothing quite like the feeling of satisfaction when a database is a happy database again, is there?
Not so sure about the Great Data Quality Dilemma though John.
I’m of the view that there are so many companies that have issues with data quality that our job is to help them all! Fix it now, implement ongoing prevention and relax. Our work here is done.
Thanks Daryl. I like the term “Happy Database” 🙂
Yep, I get my kicks from transforming how we create and manage global data in my company. I advocate maximising good data quality at the point of creation in order to minimise downstream costs for rectification. Interestingly, these downstream ‘cleansing’ activities are often seen as data quality actions, when in fact they are all part of the cost and consequence of ‘bad data’ (of course they provide value and benefit versus not performing them at all; in an ideal world, however, they would be minimised).
Henrik, I’d be interested to hear about your ‘components that will prevent having the duplicates in the first place’ – particularly your thoughts on managing a multi-channel environment and the integration of customer data across an enterprise.
The core features of the components I’m involved with developing include:
• Enterprise data mashups with external reference data (from the cloud)
• Assigning external IDs wherever possible and making subscriptions for changes
• Using error tolerant search
Some more information can be found in the post Data Quality is an Ingredient, not an Entrée.
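The comment above doesn’t show how the error tolerant search actually works under the hood. As a rough, hypothetical sketch of the idea — surfacing near-matches at the point of data entry so a duplicate is questioned before it is ever created — here is a plain Levenshtein-distance search (the customer names and the distance threshold are invented for illustration):

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def error_tolerant_search(query, existing, max_distance=2):
    """Return existing records within max_distance edits of the query,
    so near-duplicates surface before a new record is created."""
    q = query.lower().strip()
    return [name for name in existing
            if levenshtein(q, name.lower()) <= max_distance]

customers = ["Henrik Sorensen", "John Owens", "Daryl Crockett"]
# A typo in the new entry still finds the existing customer.
hits = error_tolerant_search("Henrick Sorensen", customers)
```

A production implementation would index the data rather than scan it linearly, but the principle is the same: the matching logic that powers downstream deduplication, applied upstream at data entry.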