My Ash Cloud Prediction

The Master Data Management Summit Europe 2010 starts tomorrow. I have attended the IRM events in London several times (and also spoken there once). This year I didn’t plan to go to London in April because I predicted the no fly havoc in Northern Europe that would follow the Iceland volcanic eruption given the wind direction. Not?

Unpredictable Inaccuracy

Let’s look at some statements:

• Business Intelligence and Data Mining is based on looking into historical data in order to make better decisions for the future.

• Some of the best results from Business Intelligence and Data Mining are made when looking at data in different ways than done before.

• It’s a well known fact that Business Intelligence and Data Mining is very much dependent on the quality of the (historical) data.

• We all agree that you should not start improving data quality (like anything else) without a solid business case.

• Upstream prevention of poor data quality is superior to downstream data cleansing.

Unfortunately the wise statements above have some serious interrelated timing issues:

• The business case can’t be established before we start to look at the data in the different way.

• Data is already stored downstream when that happens.

• Anyway we didn’t know precisely what data quality issues we have in that context before trying out new possible ways of looking at data.

Solutions to these timing issues may be:

• Always try to have the data reflect the real world objects they represent as close as possible – or at least include data elements that makes enrichment from external sources possible.

• Accept that downstream data cleansing will be needed from time to time and be sure to have the necessary instruments for that.

Bookmark and Share

The Myth about a Myth

A sentiment repeated again and again related to Data (Information) Quality improvement goes like this:

“It’s a myth that Data Quality improvement is all about technology”.

In fact you see the same related to a lot of other disciplines as:

  • “It’s a myth that Master Data Management is all about technology”.
  • “It’s a myth that Business Intelligence is all about technology”.
  • “It’s a myth that Customer Relationship Management is all about technology”.

I have a problem with that: I have never heard anyone say that DQ/MDM/BI/CRM… is all about technology and I have never seen anyone writing so.

When I make the above remark the reaction is almost always this:

“Of course not, but I have seen a lot of projects carried out as if they were all about technology – and of course they failed”.

Unquestionable true.

But the next question is then about root cause. Why did those projects seem to be all about technology? I think it was:

  • Poor project management or
  • Bad balance between business and IT involvement or
  • Immature technology alienating business users.

In my eyes there is no myth about that Data Quality (and a lot of other things) is all about technology. It’s a myth it’s a myth.

Bookmark and Share

55 reasons to improve data quality

The business value in data quality improvement is an ever recurring topic in the realm of data quality.

In the following I will list the first 55 reasons that comes to my mind for improving data quality related to the single most frequent data quality issue around, which is duplicates (and unresolved hierarchies) in party master data – names and addresses.

It goes like this:

1.  It’s a waste of money sending the same printed material twice or more times to the same individual consumer.

2.  Allowing the same customer enter twice or more times for an introduction offer challenges the return of investment in such campaigns.

3.  When measuring churn and win-back two or more unrelated accounts for the same business hierarchy will produce an incomplete result leading to a wrong decision.

4.  Sending the same promotion eMail twice or more times to the same individual consumer looks like spam even if different eMail addresses are used. Spam has more offending than selling power.

5.  It’s probably a waste of money sending the same printed material with presentation and offerings to a household already having a customer.

6.  Assigning different credit terms for two or more unrelated accounts for the same business hierarchy will make uncontrolled financial risk.

7.  When measuring cross selling results two or more unrelated accounts for the same household will produce an incomplete result leading to a wrong decision.

8.  When measuring life time value two or more unrelated accounts for the same individual consumer will produce a wrong result leading to a wrong decision.

9.  It’s probably a waste of money sending the same printed material twice or more times to the same household.

10.  When measuring life time value two or more unrelated accounts for the same individual being a consumer and a business owner will produce an incomplete result leading to a wrong decision.

11.  When wanting a 1-1 dialogue two or more unrelated accounts for the same individual consumer will not lead to a 1-1 dialogue.

12.  Having companies represented in two or more unrelated accounts for the same company with a different line-of-business assigned will produce an incomplete segmentation.

13.  When trying to point at your best customers being households in order to find similar households two or more unrelated accounts for the same household will produce an incomplete segmentation.

14.  When measuring cross selling results two or more unrelated accounts for the same individual consumer will produce a wrong result leading to a wrong decision.

15.  It’s a waste of money sending printed material with presentation and offerings to an individual consumer already being a customer.

16.  When wanting a 1-1 dialogue two or more unrelated accounts for the same business hierarchy will not lead to a complete 1-1 dialogue.

17.  When measuring life time value two or more unrelated accounts for the same business hierarchy will produce an incomplete result leading to a wrong decision.

18.  Assigning different credit terms for two or more unrelated accounts for the same individual consumer will increase financial risk.

19.  When measuring cross selling results two or more unrelated accounts for the same individual being a consumer and a business owner will produce only an incoherent result leading to a wrong decision.

20.  When wanting a 1-1 dialogue two or more unrelated accounts for the same household will not lead to a true 1-1 dialogue.

21.  Assigning different credit terms for two or more unrelated accounts for the same business entity could increase financial risk.

22.  Having activities related to companies attached to two or more unrelated accounts for the same company will show an incomplete customer history with the risk of taking damaging actions.

23.  It’s a waste of money and credibility sending printed material with presentation and offerings to an individual business decision maker in a business entity already being a customer.

24.  When buying from a supplier having two or more unrelated accounts despite being the same business entity you may miss discount opportunities.

25.  Having companies represented in two or more unrelated accounts for the same company with a different lead source assigned will produce a false measure of marketing and sales performance.

26.  Sending the same promotion eMail or newsletter twice or more times to the same individual business decision maker looks like spam even if different eMail addresses are used. Spam has more offending than selling power.

27.  When measuring  churn and win-back two or more unrelated accounts for the same household will produce an incomplete result leading to a wrong decision.

28.  Having activities related to influencers attached to two or more unrelated business contact records for the same person will show an incomplete business partner history with the risk of retaking already made actions.

29.  When buying from a supplier having two or more unrelated accounts despite they are belonging the same business hierarchy you could miss discount opportunities.

30.  Having activities related to households attached to two or more unrelated accounts for the same household will show an incomplete customer history with the risk of taking insufficient  actions.

31.  When trying to point at your best customers being individual consumers in order to find similar individuals two or more unrelated accounts for the same individual consumer will produce a wrong segmentation.

32.  Having companies represented in two or more unrelated accounts for the same company with a different address assigned will produce an incomplete segmentation.

33.  When measuring life time value two or more unrelated accounts for the same business entity will produce a false result leading to a wrong decision.

34.  Having activities related to decision makers in companies attached to two or more unrelated contacts for the same person will show an incomplete customer contact history with the risk of not taking appropriate actions.

35.  When wanting a 1-1 dialogue two or more unrelated accounts for the same business entity will not lead to a real 1-1 dialogue.

36.  When trying to point at your best customers being companies in order to find similar companies two or more unrelated accounts for the same company will produce a false segmentation.

37.  Maintaining data related to two or more unrelated accounts for the same real world entity will probably be more costly than necessary when exploiting external reference data.

38.  It’s probably a waste of money sending printed material with presentation and offerings to a business entity already being a customer at a higher or lower hierarchy level.

39.  Having individual consumers represented in two or more unrelated accounts for the same individual consumer with a different lead source assigned will produce a wrong measure of marketing and sales performance.

40.  Allowing the same customer re-enter for an offer already turned down (e.g. credit services) will create unnecessary double validation work.

41.  When measuring churn and win-back two or more unrelated accounts for the same business entity will produce a false result leading to a wrong decision.

42.  When wanting a 1-1 dialogue two ore more unrelated accounts for the same individual being a consumer and a business owner will not lead to a sensible 1-1 dialogue.

43.  When measuring cross selling results two or more unrelated accounts for the same business entity will produce a false result leading to a wrong decision.

44.  Having activities related to individual consumers attached to two or more unrelated accounts for the same individual consumer will show an incomplete customer history with the risk of taking wrong actions.

45.  When measuring life time value two or more unrelated accounts for the same household will produce an incomplete result leading to a wrong decision.

46.  Having activities related to customers attached to two or more unrelated accounts for the same real world entity may lead to that different sales representatives are working against each other.

47.  Allowing sales representatives creating new accounts for already existing customers may create time consuming commission disputes.

48.  Having households represented in two or more unrelated accounts for the same household with a different lead source assigned will produce an incomplete measure of marketing and sales performance.

49.  Maintaining data related to two or more unrelated accounts for the same real world entity will consume more manual work than necessary.

50.  When measuring churn and win-back two or more unrelated accounts for the same individual consumer will produce a wrong result leading to a wrong decision.

51.  When buying from a supplier having two or more unrelated accounts despite being the same business entity you may have multiple unnecessary inventory costs.

52.  It’s a waste of money and credibility sending the same printed material twice or more times to the same individual business decision maker.

53.  When measuring churn and win-back two or more unrelated accounts for the same individual being a consumer and a business owner will produce only an incoherent result leading to a wrong decision.

54.  Assigning different credit terms for two or more unrelated accounts for the same household may increase financial risk.

55.  When measuring cross selling results two or more unrelated accounts for the same business hierarchy will produce an incomplete result leading to a wrong decision.

Bookmark and Share

Slowly Changing Hierarchies

The term “slowly changing dimensions” is known from building data warehouses and attempting to make sense of data with business intelligence using reference data.

family treeThe fact that the world is changing all the time is also present when we look at Master Data Management and the essential hierarchy building taking place when structuring these data.

Company family trees are a common hierarchy structure in Master Data. One source of information about company family trees is the D&B Worldbase – a database operated by Dun & Bradstreet holding over 150 million business entities from all over the world.

I used to have Dun & Bradstreet as a customer. I don’t have that anymore – but I’m still working with the very same project. Because since I started this assignment US based Dun & Bradstreet handed over the operation in a range of European countries to the Swedish publishing group Bonnier. They later handed it over to Swedish company Bisnode. I started the project when I worked for Swedish consultancy group Sigma, continued in my Danish sole proprietorship and now serve Bisnode through German data quality tool vendor Omikron. Slowly changing relationships indeed.

As with many other activities in the realm of data quality establishing the “golden view”, “the single version of the truth” is only the beginning. If that “golden view” is not put into an ongoing maintenance the shiny gold will fade – slowly but steady.

Bookmark and Share

Data Quality and Common Sense

My favourite story is the fairytale “The Emperor’s new clothes” by Hans Christian Andersen.

Hans_Christian_AndersenIn this tale an emperor hires two swindlers (aka consultants) who offer him the finest dress from the most beautiful cloth. This cloth, they tell him, is invisible to anyone who is either stupid or unfit for his position. In fact there is no cloth at all, but no one (but at the end a little child) dares to say.

The Data Quality discipline is tormented by belonging to both the business side and the technology side of practice. This means that we have to live with the buzzwords and the smartness coming from both the management consultants and the technology consultants and vendors – including myself.

So you really have to believe in a lot of things and terms said in order not to look stupid or unfit for your position.

A way to cope with this is to look behind all the fine terms and recognize that most things said and presented is just another way of expressing common sense. Some examples:

Business Process: What you do at work – e.g. selling some stuff and putting data about it into a database so it’s ready for invoicing.

Referential Integrity Error: When you sold something not in the database. You may pick another item from the current list. Bad Change Management: When someone tells you to do it in another way. Now.

Organisational Resistance: When you find that way completely ridiculous because no one tells you why.

Fuzzy logic: This is about the common nature of most questions in life. Statements are not absolutely true or absolutely false but somewhere in between depending on the angle from where you observe.

Business Intelligence: When someone puts your data along with some other data into a new context visualised in a graph in order to replace human gut feeling.

Poor Enterprise Wide Data Quality: The invoicing went well. The decision made from the graph didn’t. 

Data Governance: Meetings and documents about what went wrong with the data and how we can do better.

My experience is that the most successful data quality improvements is made when it is guided by common sense and expressed as being that. From there you may find great inspiration and practical skills and tools in each area of expertise.