Multi-Occupancy

Many people don’t live in a single-family house. They live in a flat and share a building number on a street with the people in the other flats of that building. This is a common challenge in data quality and data matching.

The same challenge applies to companies sharing a building number with other companies, not to mention companies and households located in the same building. So this is a common party master data issue.

Address verification and geocoding are seen as important methods for improving the quality of party master data, which is the top data quality pain overall, with the aim of getting a single customer view.

Multi-occupancy is a pain in the (you know) getting there.

My pain

I have had some personal experiences living at multi-occupancy addresses lately.

One and a half years ago I was living a painless life in a single-family house in a Copenhagen suburb.

Then I moved closer to downtown Copenhagen, to a flat, as mentioned in the post Down the Street.

The tradition in Denmark is to send letters, make deliveries and register master data using a common format for units within a building, and to have separate mailboxes showing the flat ID and names for each flat. I have received most of my post since then, and all the deliveries I’m aware of.

Then I moved to a flat in London. Here the flats in my building have numbers. But the postman delivers the letters in one batch at the street door, and there are no names on the doorbells at the door.

So now I sense I don’t get many letters, and today I had to order the same stuff for the third time from amazon.co.uk, because I haven’t received the first two packages, despite their state-of-the-art online package tracking systems telling me that delivery was successful.

Master data pains unresolved

Address reference data at building number level, with related geocodes, is becoming commonly available in many places these days.

But having reference data, real-world-aligned locations and related party master data at the unit level is still a challenge in most places. Therefore we still struggle to use address verification and geocoding for a single customer view where a given building number has more than a single occupancy.
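To make the gap concrete, here is a minimal sketch in Python. The records and field names are hypothetical and only illustrate the idea: a match key at building number level lumps every occupant together, while a unit-level key only helps when the unit is actually recorded.

```python
from collections import defaultdict

# Hypothetical party records; the field names are assumptions for illustration.
records = [
    {"name": "A. Jensen", "street": "Main Street", "building": "12", "unit": "1A"},
    {"name": "B. Patel", "street": "Main Street", "building": "12", "unit": "2B"},
    {"name": "Corner Shop Ltd", "street": "Main Street", "building": "12", "unit": None},
]

def group_by(records, key_fields):
    """Group record names by the address fields listed in key_fields."""
    groups = defaultdict(list)
    for rec in records:
        key = tuple(rec[field] for field in key_fields)
        groups[key].append(rec["name"])
    return dict(groups)

# Building level: two households and a company all share one key.
building_level = group_by(records, ["street", "building"])

# Unit level: the parties separate, but a record without a unit stays ambiguous.
unit_level = group_by(records, ["street", "building", "unit"])
```

At building level all three parties end up in one group. At unit level they separate, but the company record with no unit shows how much depends on unit data being captured in the first place.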


New Eyes on Iceland

This eighth Data Quality World Tour blog post is about Iceland.

Patronymics

Rather than using family names, the Icelanders use patronymics. This means that the first Icelandic President Sveinn Björnsson must have been the son of Björn, and I guess current Prime Minister Jóhanna Sigurðardóttir is the daughter of Sigurður. This must create some havoc for well-proven algorithms for finding households. (Add to that the fact that the Prime Minister is in a same-sex marriage.)
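A small sketch of the havoc, assuming a deliberately naive household rule (same last name token at the same address). The people and the address are made up for illustration.

```python
def naive_household_key(full_name, address):
    """Naive household key: last name token plus address (an assumed, simplified rule)."""
    last_name = full_name.split()[-1].lower()
    return (last_name, address.lower())

# Hypothetical family at one hypothetical address: father, son and daughter.
family = [
    ("Jón Einarsson", "Laugavegur 1"),
    ("Ólafur Jónsson", "Laugavegur 1"),
    ("Guðrún Jónsdóttir", "Laugavegur 1"),
]

# The surname-based rule sees three different "families".
surname_keys = {naive_household_key(name, address) for name, address in family}

# The address alone sees one group, which brings its own multi-occupancy problem.
address_keys = {address.lower() for _, address in family}
```

The surname-based rule splits one household into three, because father, son and daughter each carry a different last name.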

Volcanoes

In the good old days air traffic wasn’t concerned with the recurring volcanic eruptions in Iceland. Today they seem to be a repeating cause of travel havoc. A bit like poor data quality wasn’t taken seriously in the good old days, but today dirty data creates havoc in business intelligence implementations.


Relational Data Quality

Most of the work related to data quality improvement I do is done with data in relational databases and is aimed at creating new relations between data. Examples (from party master data) are:

  • Make a relation between a postal address in a customer table and a real world address (represented in an official address dictionary).
  • Make a relation between a business entity in a vendor table and a real world business (represented in a business directory most often derived from an official business register).
  • Make a relation between a consumer in one prospect table and a consumer in another prospect table because they are considered to represent the same real world person.

When striving for multi-purpose data quality it is often necessary to reflect further relations from the real world like:

  • Make a relation in a database reflecting that two (or more) persons belong to the same household (on the same real world address)
  • Make a relation in the database reflecting that two (or more) companies have the same (ultimate) mother.
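As a minimal sketch of the household relation, here is one way it could look in a relational database, using Python’s built-in sqlite3 module. The table names, column names and sample rows are assumptions for illustration, not a recommended schema.

```python
import sqlite3

# In-memory database with a hypothetical person-to-household relation.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE household (household_id INTEGER PRIMARY KEY, address TEXT);
    CREATE TABLE person (
        person_id INTEGER PRIMARY KEY,
        name TEXT,
        household_id INTEGER REFERENCES household(household_id)
    );
    INSERT INTO household VALUES (1, '1 Main Street, Anytown');
    INSERT INTO person VALUES (1, 'Margaret Smith', 1);
    INSERT INTO person VALUES (2, 'John Smith', 1);
""")

def household_members(conn, household_id):
    """Return the names of all persons related to the given household."""
    rows = conn.execute(
        "SELECT name FROM person WHERE household_id = ? ORDER BY person_id",
        (household_id,),
    )
    return [name for (name,) in rows]

members = household_members(conn, 1)
```

The same pattern, a foreign key pointing at a shared parent row, could carry the (ultimate) mother relation between companies.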

Having these relations done right is fundamental for any further data quality improvement endeavors and all the exciting business intelligence stuff. In doing that you may continue to have more or less fruitful discussions on say the classic question: What is a customer?

But in my eyes, in relation to data quality, it doesn’t matter whether that discussion concludes that a given row in your database is a customer, an old customer, a prospect or something else. Building the relations may even help you realize what that someone really is. A sporadic lead could be recognized as belonging to the same household as a good customer. A vendor could be recognized as a daughter company of a hot prospect. Someone could be recognized as being fake. And you may even have business intelligence that, based on the relations, reports a given row in a customer role in one context and in another role in another context.

Enterprise Data Mashup and Data Matching

A mashup is a web page or application that uses or combines data or functionality from two or many more external sources to create a new service. Mashups can be considered to have an active role in the evolution of social software and Web 2.0. Enterprise Mashups are secure, visually rich web applications that expose actionable information from diverse internal and external information sources. So says Wikipedia.

I think that Enterprise Mashups will need data matching – and data matching will improve from data mashups.

The joys and challenges of Enterprise Mashups were recently touched on in the post “MDM Mashups: All the Taste with None of the Calories” by Amar Ramakrishnan of Initiate. Data needs to be cleansed and matched before being exposed in an Enterprise Mashup. An Enterprise Mashup is then a fast way to deliver Master Data Management results to the organization.

Party Data Matching has typically been done in these two often separated contexts:

  • Matching internal data like deduplicating and consolidating
  • Matching internal data against an external source like address correction and business directory matching

Increased utilization of multiple functions and multiple sources – like a mashup – will help make matching better. Some examples I have tried include:

  • If you know whether an address is unique or not, this information can be used to set the confidence of an individual or household duplicate.
  • If you know whether an address is a single residence or a multiple residence (like a nursing home or campus), this information can be used to set the confidence of an individual or household duplicate.
  • If you know the frequency of a name (in a given country), this information can be used to set the confidence of a private, household or contact duplicate.
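The three examples above can be sketched as a single scoring function. This is only an illustration: the function name, the parameters and every weight are arbitrary assumptions, not values taken from a real matching tool.

```python
def adjusted_confidence(base, address_is_unique, multi_residence, name_frequency):
    """Adjust a base match score (0..1) using external reference signals.

    All weights below are arbitrary, illustrative assumptions.
    name_frequency is the share of the population carrying the name (0..1).
    """
    score = base
    if address_is_unique:
        score += 0.10  # a single-unit address supports a match
    if multi_residence:
        score -= 0.20  # nursing home or campus: many unrelated people share it
    # A common name is weaker evidence than a rare one; cap the penalty.
    score -= min(name_frequency * 10, 0.15)
    return max(0.0, min(1.0, score))

# A rare name at a unique address versus a common name at a campus address.
rare_name_unique_address = adjusted_confidence(0.8, True, False, 0.001)
common_name_campus = adjusted_confidence(0.8, False, True, 0.02)
```

With the same base score, the rare name at a unique address ends up with a clearly higher confidence than the common name at a campus address, which is the intended effect of bringing in the extra sources.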

As many data quality flaws (not surprisingly) are introduced at data entry, mashups may help during data entry, like:

  • An address may be suggested from an external source.
  • A business entity may be picked from an external business directory.
  • Various rules exist in different countries for using consumer/citizen directories – why not use the best available where you do business.

Also the rise of social media adds new possibilities for mashup content during data entry, data maintenance and for other uses of MDM / Enterprise Mashups. Like it or not, your data on Facebook, Twitter and not least LinkedIn are going to be matched and mashed up.


55 reasons to improve data quality

The business value in data quality improvement is an ever recurring topic in the realm of data quality.

In the following I will list the first 55 reasons that come to my mind for improving data quality related to the single most frequent data quality issue around, which is duplicates (and unresolved hierarchies) in party master data – names and addresses.

It goes like this:

1.  It’s a waste of money sending the same printed material twice or more times to the same individual consumer.

2.  Allowing the same customer to enter twice or more times for an introduction offer challenges the return on investment in such campaigns.

3.  When measuring churn and win-back two or more unrelated accounts for the same business hierarchy will produce an incomplete result leading to a wrong decision.

4.  Sending the same promotion eMail twice or more times to the same individual consumer looks like spam even if different eMail addresses are used. Spam has more offending than selling power.

5.  It’s probably a waste of money sending the same printed material with presentation and offerings to a household already having a customer.

6.  Assigning different credit terms for two or more unrelated accounts for the same business hierarchy will make uncontrolled financial risk.

7.  When measuring cross selling results two or more unrelated accounts for the same household will produce an incomplete result leading to a wrong decision.

8.  When measuring life time value two or more unrelated accounts for the same individual consumer will produce a wrong result leading to a wrong decision.

9.  It’s probably a waste of money sending the same printed material twice or more times to the same household.

10.  When measuring life time value two or more unrelated accounts for the same individual being a consumer and a business owner will produce an incomplete result leading to a wrong decision.

11.  When wanting a 1-1 dialogue two or more unrelated accounts for the same individual consumer will not lead to a 1-1 dialogue.

12.  Having companies represented in two or more unrelated accounts for the same company with a different line-of-business assigned will produce an incomplete segmentation.

13.  When trying to point at your best customers being households in order to find similar households two or more unrelated accounts for the same household will produce an incomplete segmentation.

14.  When measuring cross selling results two or more unrelated accounts for the same individual consumer will produce a wrong result leading to a wrong decision.

15.  It’s a waste of money sending printed material with presentation and offerings to an individual consumer already being a customer.

16.  When wanting a 1-1 dialogue two or more unrelated accounts for the same business hierarchy will not lead to a complete 1-1 dialogue.

17.  When measuring life time value two or more unrelated accounts for the same business hierarchy will produce an incomplete result leading to a wrong decision.

18.  Assigning different credit terms for two or more unrelated accounts for the same individual consumer will increase financial risk.

19.  When measuring cross selling results two or more unrelated accounts for the same individual being a consumer and a business owner will produce only an incoherent result leading to a wrong decision.

20.  When wanting a 1-1 dialogue two or more unrelated accounts for the same household will not lead to a true 1-1 dialogue.

21.  Assigning different credit terms for two or more unrelated accounts for the same business entity could increase financial risk.

22.  Having activities related to companies attached to two or more unrelated accounts for the same company will show an incomplete customer history with the risk of taking damaging actions.

23.  It’s a waste of money and credibility sending printed material with presentation and offerings to an individual business decision maker in a business entity already being a customer.

24.  When buying from a supplier having two or more unrelated accounts despite being the same business entity you may miss discount opportunities.

25.  Having companies represented in two or more unrelated accounts for the same company with a different lead source assigned will produce a false measure of marketing and sales performance.

26.  Sending the same promotion eMail or newsletter twice or more times to the same individual business decision maker looks like spam even if different eMail addresses are used. Spam has more offending than selling power.

27.  When measuring churn and win-back two or more unrelated accounts for the same household will produce an incomplete result leading to a wrong decision.

28.  Having activities related to influencers attached to two or more unrelated business contact records for the same person will show an incomplete business partner history with the risk of retaking already made actions.

29.  When buying from a supplier having two or more unrelated accounts despite their belonging to the same business hierarchy you could miss discount opportunities.

30.  Having activities related to households attached to two or more unrelated accounts for the same household will show an incomplete customer history with the risk of taking insufficient actions.

31.  When trying to point at your best customers being individual consumers in order to find similar individuals two or more unrelated accounts for the same individual consumer will produce a wrong segmentation.

32.  Having companies represented in two or more unrelated accounts for the same company with a different address assigned will produce an incomplete segmentation.

33.  When measuring life time value two or more unrelated accounts for the same business entity will produce a false result leading to a wrong decision.

34.  Having activities related to decision makers in companies attached to two or more unrelated contacts for the same person will show an incomplete customer contact history with the risk of not taking appropriate actions.

35.  When wanting a 1-1 dialogue two or more unrelated accounts for the same business entity will not lead to a real 1-1 dialogue.

36.  When trying to point at your best customers being companies in order to find similar companies two or more unrelated accounts for the same company will produce a false segmentation.

37.  Maintaining data related to two or more unrelated accounts for the same real world entity will probably be more costly than necessary when exploiting external reference data.

38.  It’s probably a waste of money sending printed material with presentation and offerings to a business entity already being a customer at a higher or lower hierarchy level.

39.  Having individual consumers represented in two or more unrelated accounts for the same individual consumer with a different lead source assigned will produce a wrong measure of marketing and sales performance.

40.  Allowing the same customer to re-enter for an offer already turned down (e.g. credit services) will create unnecessary double validation work.

41.  When measuring churn and win-back two or more unrelated accounts for the same business entity will produce a false result leading to a wrong decision.

42.  When wanting a 1-1 dialogue two or more unrelated accounts for the same individual being a consumer and a business owner will not lead to a sensible 1-1 dialogue.

43.  When measuring cross selling results two or more unrelated accounts for the same business entity will produce a false result leading to a wrong decision.

44.  Having activities related to individual consumers attached to two or more unrelated accounts for the same individual consumer will show an incomplete customer history with the risk of taking wrong actions.

45.  When measuring life time value two or more unrelated accounts for the same household will produce an incomplete result leading to a wrong decision.

46.  Having activities related to customers attached to two or more unrelated accounts for the same real world entity may lead to different sales representatives working against each other.

47.  Allowing sales representatives to create new accounts for already existing customers may create time-consuming commission disputes.

48.  Having households represented in two or more unrelated accounts for the same household with a different lead source assigned will produce an incomplete measure of marketing and sales performance.

49.  Maintaining data related to two or more unrelated accounts for the same real world entity will consume more manual work than necessary.

50.  When measuring churn and win-back two or more unrelated accounts for the same individual consumer will produce a wrong result leading to a wrong decision.

51.  When buying from a supplier having two or more unrelated accounts despite being the same business entity you may have multiple unnecessary inventory costs.

52.  It’s a waste of money and credibility sending the same printed material twice or more times to the same individual business decision maker.

53.  When measuring churn and win-back two or more unrelated accounts for the same individual being a consumer and a business owner will produce only an incoherent result leading to a wrong decision.

54.  Assigning different credit terms for two or more unrelated accounts for the same household may increase financial risk.

55.  When measuring cross selling results two or more unrelated accounts for the same business hierarchy will produce an incomplete result leading to a wrong decision.
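A tiny worked example of the measurement reasons, with made-up numbers. The account IDs, party IDs and revenue figures are hypothetical. It shows how a best-customer ranking changes once two unrelated accounts are recognized as the same real world party.

```python
from collections import defaultdict

# Hypothetical accounts: A1 and A2 belong to the same real world consumer (P1).
accounts = [
    {"account": "A1", "party": "P1", "revenue": 400},
    {"account": "A2", "party": "P1", "revenue": 350},
    {"account": "A3", "party": "P2", "revenue": 500},
]

def top_by(accounts, key):
    """Sum revenue per value of `key` and return the (id, total) with the highest total."""
    totals = defaultdict(int)
    for acc in accounts:
        totals[acc[key]] += acc["revenue"]
    best = max(totals, key=totals.get)
    return best, totals[best]

per_account = top_by(accounts, "account")  # duplicates left unresolved
per_party = top_by(accounts, "party")      # duplicates resolved to one party
```

With the duplicates unresolved, account A3 looks like the best customer at 500. Once A1 and A2 are related to the same party, P1 leads with 750, so a decision based on the first ranking would point at the wrong customer.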


Mu

The term “Mu” has several meanings, including being a lost continent. In this post I will use the meaning of “mu” as the answer to a question that can’t be answered with a simple “yes” or “no” or even “unknown”, as explained on Wikipedia here.

When working with data quality you often encounter situations where the answer to a simple question must be “mu”.

Let’s say you are looking for duplicates in a customer file and have these two rows (Name, Address, City):

Margaret Smith, 1 Main Street, Anytown
Margaret & John Smith, 1 Main Street, Anytown

Is this a duplicate situation?

In a given context like preparing for a direct mail the answer could be “yes”. But in most other contexts the answer is “mu”. Here the question should be something like: How do you handle hierarchy management with these two rows? And the answer could be something like the process presented in my recent post here.

Similar considerations apply to this example (Name, Address, City):

One Truth Consultants att: John Smith, 3 Main Street, Anytown
One Truth Consultants Ltd, 3 Main Street, Anytown

And this (Contact, Company, Address, City):

John Smith, One Truth Consultants, 3 Main Street, Anytown
John Smith, One Truth Services, 3 Main Street, Anytown

The latter example is explained in more detail in this post.
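In code, “mu” could be modelled as a third outcome next to “yes” and “no”. The sketch below uses a toy rule on name tokens, assumed purely for illustration. Real matching would need far more than this.

```python
from enum import Enum

class MatchAnswer(Enum):
    YES = "yes"
    NO = "no"
    MU = "mu"  # the yes/no question does not fit; treat as a hierarchy question

def is_duplicate(name_a, name_b, same_address):
    """Toy rule, assumed for illustration: compare the sets of name tokens."""
    if not same_address:
        return MatchAnswer.NO
    tokens_a = set(name_a.lower().replace("&", " ").split())
    tokens_b = set(name_b.lower().replace("&", " ").split())
    if tokens_a == tokens_b:
        return MatchAnswer.YES
    if tokens_a & tokens_b:
        return MatchAnswer.MU  # overlapping but not identical parties
    return MatchAnswer.NO

answer = is_duplicate("Margaret Smith", "Margaret & John Smith", same_address=True)
```

For the first pair of rows in this post the sketch returns `MatchAnswer.MU`, which a downstream process could route to hierarchy management rather than to a merge.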


Process of consolidating Master Data


In my previous blog post “Multi-Purpose Data Quality” we examined a business challenge where we have multiple purposes with party master data.

The comments suggested some form of consolidation should be done with the data.

How do we do that?

I have made a PowerPoint show “Example process of consolidating master data” with a suggested way of doing that.

The process uses the party master data types explained here.

The next questions in solving our business challenge will include:

  • Is it necessary to have master data in optimal shape real time – or is it OK to make periodic consolidation?
  • How do we design processes for maintaining the master data when:
    • New members and customers are inserted?
    • We update existing members and customers?
    • External reference data changes?
  • What changes must be made with the existing applications handling the member database and the eShop?

Also the question of what style of Master Data Hub is suitable is indeed very common in these kinds of implementations.
