OMG: Santa is Fake

This blog has earlier featured some December posts about how Santa Claus deals with data quality (Santa Quality) and master data management (Multi-Domain MDM Santa Style).

As I like to be on top of the hype curve, I was preparing a post about how Santa digs into big data, including social data streams, to be better at finding out who is nice, who is naughty and what they really want for Christmas. But then I suddenly had a light bulb moment: Wait, why don't you take your own medicine and look up who that Santa guy really is?

Starting in social media, checking Twitter accounts was shocking: all profiles are fake. Facebook, LinkedIn and other social networks all turned out to have no real Santa Claus. Going to commercial third-party directories and open government data gave the same result. No real Santa Claus there. Some address directories had a postal code relation, like the postcode "H0 H0 H0" in Canada and "SAN TA1" in the UK, but those seem to be kind of fake too.

So, shifting from relying on the purpose of use to real-world alignment, I have concluded that Santa Claus doesn't exist and therefore can't have a data store looking like a toy elephant or any other big data operations going on.

Also, based on the above instant data quality mash-up, I won't register Santa Claus (Inc.) as a prospective customer in my CRM system. Sorry.


Data Quality vs Identity Checking

Yesterday we had a call from British Gas (or probably a call centre hired by British Gas) explaining the great savings possible if we switched from our current provider – which, by the way, is British Gas. This is a classic data quality issue in direct marketing operations: accurately separating your current customers from the entities belonging to the new market.

As I have learned, your premier proof of identity in the United Kingdom is your utility bill, so this incident may be seen as somewhat disturbing – or, on further thought, maybe as a business opportunity 🙂

At iDQ we develop a solution that may be positioned in the space between data quality prevention and identity checking by addressing the identity resolution aspect during data capture.

The nearly two-year-old post The New Year in Identity Resolution explains some different kinds of identity resolution:

  • Hard core identity check
  • Light weight real world alignment
  • Digital identity resolution

Since then I have seen a slow but steady convergence of these activities.


What’s Different about MDM in Denmark?

After writing about What's Different about MDM in France? it's now time for a few words about what's different about MDM (Master Data Management) in my native country, Denmark.

International aspects of MDM aren't strange

As Denmark is a small market, the domestic aspects are a minor thing for large Danish companies such as Maersk, Lego and Carlsberg. But even smaller companies grow out of the domestic market very early, which means that international aspects of MDM and related data quality are important in many implementations.

Even the flagship Danish MDM vendor Stibo Systems operated for many years with large foreign clients before recently getting their first domestic client.

Exploiting external reference data is imperative

Some industry sectors, like finance and utilities, are still very domestically oriented. In party master data management, good quality external reference data about domestic addresses, properties, companies and citizens is available at an affordable price and to some degree even for free.

In my work at iDQ we have utilized this a lot, as told in the story from the utility sector in the post instant Data Quality at Work. iDQ also helps foreign companies, for example the largest bank in the Eurozone. As an example of an MDM domain other than customer, the service is also used in the HR domain by one of the world's largest employers.

The government isn’t a laggard

As reported in the post Making Data Quality Gangnam Style, the government supports the use of reference data in private entities. The data sources are indeed handled as an MDM program within the public sector, ensuring that data silos in the public sector are eliminated, with big wins for public administration as another result. The newest development is centralized support for handling master data about Danes abroad. Good to be involved.


What Should a Data Quality Tool Do?

Earlier this month we had this year's magic quadrant for data quality tools from Gartner (the analyst firm). The magic quadrant always stirs up posts about data quality tools, and this is true again this year. For example, yours truly had a post here, and Lorraine Lawson had a say on ITBusinessEdge in the post Eight Questions to Ask Before Investing in Data Quality Tools.

Some of the questions asked by Lorraine relate to a grounding principle in the magic quadrant: that a data quality tool should be able to do everything data quality, and even, as stated in Lorraine's question 2: Can it be embedded into business process workflows or other technology-enabled programs or initiatives, such as MDM and analytics?

Thinking that question through to the end inevitably makes you wonder where data quality tools end and where applications for different business processes, with data quality built in, take over.

That question is close to me as I’m right now working with a tool for maintaining party master data with two main advantages:

  • Making the business process as smooth as possible
  • Ensuring data quality at pre data entry and all through the data lifetime

So, it’s not a true data quality tool. It doesn’t do everything data quality. It’s not a true MDM platform. It doesn’t do everything master data. But I would say that it does do what it does better than the full monty behemoths.


Building an instant Data Quality Service for Quotes

Yesterday's post, Introducing the Famous Person Quote Checker, touched upon the issue of all the quotes floating around in social media about things apparently said by famous people.

The bumblebee can’t fly faster than the speed of light – Albert Einstein

If you were to build a service that could avoid postings with disputable quotes, what considerations would you have then? Well, I guess pretty much the same considerations as with any other data quality prevention service.

Here are three things to consider:

Getting the reference data right

Finding the right sources for, say, reference data for worldwide postal addresses was discussed in the post A Universal Challenge.

The same way, so to speak, it will be hard to find a single source of truth about what famous persons actually said. It will be a daunting task to make a registry of confirmed quotes.

Embracing diversity

Staying with postal addresses this blog has a post called Where the Streets have one Name but Two Spellings.

The same way, so to speak again, quotes are translated, transliterated and have gone through transcription from the original language and writing system. So every quote may have many true versions.

Where to put the check?

As examined in the post The Good, Better and Best Way of Avoiding Duplicates there are three options:

1) A good and simple option could be to periodically scan through postings in social media and, when a disputable quote is found, send an email to the culprit who did the posting. However, it's probably too late: even if you, for example, delete your tweet, the 250 retweets will still be out there. But it's a reasonable way to start marking up all the disputable quotes out there.

2) A better option could be a real-time check. You type in a quote on a social media site and the service prompts you: "Hey Dude, that person didn't say that". The weak point is that you have already done all the typing, and now you have to find a new quote. But it will work when people try to share disputable quotes.

3) The best option would be that you start typing "If you can't explain it simply…" and the service suggests a likely quote such as: "Everything should be as simple as it can be, but not simpler" – Albert Einstein.
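The real-time check in option 2) could be sketched as below, assuming we already had the daunting registry of confirmed quotes discussed earlier. The registry content and the similarity threshold are made up for illustration; a real service would also need the many-true-versions handling mentioned above.

```python
import difflib

# A hypothetical, tiny registry of confirmed quotes. Building this
# reference data source is the hard part, as discussed above.
CONFIRMED_QUOTES = {
    "Everything should be made as simple as possible, but not simpler.":
        "Albert Einstein",
}

def check_quote(text, threshold=0.6):
    """Return the closest confirmed quote with attribution,
    or None if the text resembles nothing in the registry."""
    matches = difflib.get_close_matches(
        text, CONFIRMED_QUOTES.keys(), n=1, cutoff=threshold)
    if matches:
        quote = matches[0]
        return f"{quote} - {CONFIRMED_QUOTES[quote]}"
    # Disputable: time to prompt "Hey Dude, that person didn't say that"
    return None

print(check_quote("Everything should be as simple as it can be, but not simpler"))
print(check_quote("The bumblebee can't fly faster than the speed of light"))
```

The fuzzy cutoff is doing the same job as in address matching: close-but-not-identical variants resolve to the confirmed version, while unrelated text falls through.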


Undertaking in MDM

In the post Last Time Right the bad consequences of not handling that one of your customers isn't among us anymore were touched upon.

This sad event is a major trigger in party master data lifecycle management like The Relocation Event I described last week.

In the data quality realm, handling so-called deceased data has mostly been about suppression services in direct marketing. But as we develop more advanced master data services, handling the many aspects of the deceased event turns up as an important capability.

Like with relocation you may learn about the sad event in several ways:

  • A message from relatives
  • Subscription to external reference data services, which will be different from country to country
  • Investigation upon returned mail via postal services

Apart from Business-to-Consumer (B2C) activities, the deceased event also has relevance in Business-to-Business (B2B), where we may call it the dissolved event.

One benefit of having central master data management functionality is that every party role and related business process can be notified about the status, which may trigger a workflow.
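Such a notification could be sketched as a simple fan-out from the central hub to every registered party role. The roles and the messages below are hypothetical; a real MDM platform would drive proper workflow engines rather than callbacks.

```python
# A minimal sketch of a central MDM hub fanning out a party status
# change to every registered role, so each business process can
# trigger its own workflow (e.g. cancel a subscription).
class MasterDataHub:
    def __init__(self):
        self.subscribers = {}  # role name -> callback

    def subscribe(self, role, callback):
        self.subscribers[role] = callback

    def set_status(self, party_id, status):
        # Notify every role; collect what each one decided to do.
        return [callback(party_id, status)
                for callback in self.subscribers.values()]

hub = MasterDataHub()
hub.subscribe("billing", lambda pid, s: f"billing: stop invoicing {pid} ({s})")
hub.subscribe("marketing", lambda pid, s: f"marketing: suppress {pid} ({s})")
print(hub.set_status("party-42", "deceased"))
```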

One area where I have worked with handling this situation is public transit, where subscriptions for public transport are cancelled upon learning about a death, thus lifting some burden from relatives and also avoiding refund processes.

Right now I’m working with data stewardship functionality in the instant Data Quality MDM Edition where the relocation event, the deceased event and other important events in party master data lifecycle management must be supported by functionality embracing external reference data and internal master data.


The Good, Better and Best Way of Avoiding Duplicates

Having duplicates in databases is the most prominent data quality issue around, and not least duplicates in party master data are often pain number one when assessing the impact of data quality flaws.

A duplicate in the data quality sense is two or more records that don't have exactly the same characters but refer to the same real-world entity. I have worked with these three different approaches to when to fix the duplicate problem:

  • Downstream data matching
  • Real time duplicate check
  • Search and mash-up of internal and external data
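Whichever of the three approaches is used, the core is a fuzzy comparison deciding whether two records refer to the same real-world entity. A minimal sketch, using Python's standard difflib and a made-up threshold (real matching engines use far more sophisticated parsing, standardization and scoring):

```python
import difflib

def normalize(record):
    # Crude normalization: lowercase and collapse whitespace.
    return " ".join(record.lower().split())

def is_possible_duplicate(a, b, threshold=0.85):
    """Two records need not be character-identical to refer to the
    same real-world entity; compare on normalized similarity."""
    ratio = difflib.SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return ratio >= threshold

print(is_possible_duplicate(
    "Mr. John Smith, 10 Downing St, London",
    "mr john smith, 10 downing st,  london"))
```

The threshold is the usual trade-off: set it too high and true duplicates slip through; too low and distinct entities are flagged for needless manual review.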

Downstream Data Matching

The good old way of dealing with duplicates in databases is having data matching engines periodically scan through databases highlighting the possible duplicates in order to facilitate merge/purge processes.

Finding the duplicates after they have lived their own lives in databases and already have different kinds of transactions attached is indeed not optimal, but sometimes it's the only option, as explained in the post Top 5 Reasons for Downstream Cleansing.

Real Time Duplicate Check

The better way is to make the match at data entry where possible. This approach is often orchestrated as a data entry process where a single element or a range of elements is checked as it is entered. For example, the address may be checked against reference data and a phone number may be checked for an adequate format for the country in question. Finally, when a proper standardized record is submitted, it is checked whether a possible duplicate exists in the database.
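The element-by-element checks could be sketched as below. The phone patterns and the tiny address set are made-up stand-ins; a real implementation would call live reference data services per country.

```python
import re

# Hypothetical per-country phone formats and a tiny address reference
# set, standing in for real reference data services.
PHONE_PATTERNS = {"DK": r"^\d{8}$", "UK": r"^0\d{9,10}$"}
KNOWN_ADDRESSES = {"oesterbrogade 1, 2100 copenhagen"}

def check_entry(record):
    """Validate each element of a party record at data entry;
    return a list of problems (empty means the record may proceed
    to the final duplicate check)."""
    errors = []
    if record["address"].lower() not in KNOWN_ADDRESSES:
        errors.append("address not found in reference data")
    pattern = PHONE_PATTERNS.get(record["country"])
    if pattern and not re.match(pattern, record["phone"]):
        errors.append("phone format invalid for country")
    return errors

record = {"country": "DK", "phone": "12345678",
          "address": "Oesterbrogade 1, 2100 Copenhagen"}
print(check_entry(record))
```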

Search and Mash-Up of Internal and External Data

The best way is, in my eyes, a process that avoids re-entering most of the data that is already in the internal databases and takes advantage of data that already exists on the internet in external reference data sources.

[Screenshot: the iDQ™ (instant Data Quality) mash-up]

The instant Data Quality concept I currently work with requires the user to enter as little data as possible, for example through rapid address entry, a Google-like search for a name, simply typing a national identification number or, in the worst case, combining some known facts. After that the system makes a series of fuzzy searches in internal and external databases and presents the results as a compact mash-up.
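A rough sketch of such a mash-up search follows. The internal and external records, the field names and the scoring cutoff are purely illustrative; the real service queries live reference data sources rather than in-memory lists.

```python
import difflib

# Hypothetical internal master data and an external directory.
INTERNAL = [{"id": 1, "name": "Hans Jensen", "source": "internal"}]
EXTERNAL = [{"id": "ext-9", "name": "Hans P. Jensen", "source": "external"}]

def mashup_search(query, cutoff=0.6):
    """Fuzzy-search both internal and external data on minimal input
    and present the hits as one compact, ranked result set."""
    hits = []
    for record in INTERNAL + EXTERNAL:
        score = difflib.SequenceMatcher(
            None, query.lower(), record["name"].lower()).ratio()
        if score >= cutoff:
            hits.append({**record, "score": round(score, 2)})
    return sorted(hits, key=lambda h: h["score"], reverse=True)

print(mashup_search("hans jensen"))
```

Ranking internal hits alongside external ones is what lets the user either pick an existing record (avoiding the duplicate) or pre-fill a new one from the external source.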

The advantages are:

  • If the real-world entity already exists, you avoid the duplicate and avoid entering data again. At the same time you may evaluate accuracy against external reference data.
  • If the real-world entity doesn't exist in internal data, you may pick most of the data from external sources, thereby avoiding too much typing while ensuring accuracy.


The Relocation Event

When maintaining party master data, one of the challenges is keeping the data about the physical address, and sometimes the physical addresses, of a registered party up to date.

You may learn that your customer, supplier, employee or whatever party you are keeping on record has moved in many ways. The most common are:

  • The person or organization in question is so kind as to tell you so. For some purposes, for example in the utility sector, this is a future event that triggers a whole workflow of actions.
  • You get the message via a subscription to external reference data for example using available National Change of Address (NCOA) services and services related to business directories and citizen registries.
  • Your mail to a person or organization is returned from postal services often with no information about the new address, so this means investigation work ahead.
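The future-dated relocation mentioned in the first scenario could be modelled as in this sketch; the field names and dates are illustrative, and a real workflow would of course trigger far more than an address lookup.

```python
from datetime import date

# A relocation may be a future event (as in the utility example)
# that only takes effect on the moving date.
party = {"id": "cust-7",
         "address": "Old Street 1",
         "pending_move": {"new_address": "New Road 2",
                          "effective": date(2030, 6, 1)}}

def current_address(party, today):
    """Return the address valid on a given date, honouring any
    future-dated relocation event on the record."""
    move = party.get("pending_move")
    if move and today >= move["effective"]:
        return move["new_address"]
    return party["address"]

print(current_address(party, date(2024, 1, 1)))   # before the move
print(current_address(party, date(2030, 6, 1)))   # on/after the move
```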

The capability to handle this important issue in party master data management (MDM), embracing all the above-mentioned scenarios, is essential for many enterprises, and doing it on an international scale with the different sources and services available in different countries is indeed a daunting task.

Handling the relocation event is a core functionality in the master data service (iDQ™ MDM Edition) I'm currently working with. There's a lot to do in this quest, so I'd better move on.


Time To Turn Your Customer Master Data Management Social?

A post on the Nimble blog asks this question: Time To Turn Your Sales Team Social? The post has a lot of evidence on why sales teams that embrace social selling are doing better than teams that don't.

We do see new applications supporting social selling, where Nimble is one example from the Customer Relationship Management (CRM) sphere, as explored in the post Sharing Social Master Data. Using social services and exploiting social data in sales-related business processes will over time affect the way we do customer master data management.

Apart from having social-aware frontend applications, we also need social-aware data integration services, and we do indeed need social-aware Master Data Management (MDM) solutions for handling data quality issues and ensuring a Single Customer View (SCV) stretching from the old systems of record to the new systems of engagement.

One service capable of doing data integration between the old world and the new world is FlipTop, and some months ago I was interviewed on the FlipTop blog about the links to Social MDM here. Currently I'm working with a social-aware Master Data Management solution: the iDQ™ MDM Edition.

What about you? Are your Customer Master Data Management and related data quality activities becoming social aware?


A Universal Challenge

Yesterday on The Postcode Anywhere blog Guy Mucklow wrote a nice piece called University Challenge. The blog post is about challenges with shared addresses and a remedy at least for addresses in the United Kingdom.

And sure, I also had my challenges with a shared address in the UK as reported in the post Multi-Occupancy.

But I guess the University Challenge is a universal challenge.

Postal formats and available reference data sources are of course very different around the world. Below is an example from the iDQ™ (instant Data Quality) tool handling a Danish address with multiple flats. Here the tool continuously displays what options are available to make the address unique:

[Screenshot: iDQ™ handling a multi-occupancy address]
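The idea of continuously narrowing down the remaining options can be sketched as below. The Danish unit designators ("st. tv", "1. th", etc.) are illustrative; the real tool draws them from the address registry for the specific building.

```python
# Hypothetical unit designators for one Danish building with
# multiple flats (ground floor left/right, 1st floor, 2nd floor).
UNITS = ["st. tv", "st. th", "1. tv", "1. th", "2. tv", "2. th"]

def remaining_options(base_address, typed_unit):
    """As the user types a unit designator, continuously narrow
    down which options would still make the address unique."""
    return [u for u in UNITS if u.startswith(typed_unit)]

print(remaining_options("Vesterbrogade 10, 1620 Copenhagen", ""))
print(remaining_options("Vesterbrogade 10, 1620 Copenhagen", "1."))
```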
