Cross Border Data Quality

In data quality improvement you always have to find a balance between the almost impossible, and usually not sensible, vision of achieving zero defects and the good old 80-20 rule: aim at the 80% most frequent issues and leave the 20% less frequent issues to a random fate.

One of the issues that usually falls into the neglected 20% is cross border challenges with contact master data.

In a recent blog post on the Postcode Anywhere blog Graham Rhind describes the data quality flaws arising from his relocation from Holland in the Netherlands to Germany. The post is called Validate … intelligently.

Personally I have had a lot of similar issues when moving from Denmark to England in the United Kingdom as for example described in the post Staying in Doggerland.

My guess is that we will see an increasing demand for cross border data quality services, not least as regulators are increasingly looking into cross border issues. The FATCA regulation from the United States tax authorities is an example, as described in the post The Taxman: Data Quality’s Best Friend.

As globalization moves forward organizations will increasingly work cross border, and people will more frequently move between countries, live in one country, work in another and buy services in a third. In coping with this reality you can’t keep up with data quality by just using a National Change of Address service and other data quality services focused on and optimized for a single country.


instant Data Quality at Work

DONG Energy is one of the leading energy groups in Northern Europe with approximately 6,400 employees and EUR 7.6 billion in revenue in 2011.

The other day I sat down with Ole Andres, project manager at DONG Energy, and talked about how they have utilized a new tool called iDQ™ (instant Data Quality) in order to keep up with data quality around customer master data.

iDQ™ is basically a very advanced search engine capable of being integrated into business processes in order to get data quality for contact data right the first time and at the same time reduce the time needed for looking up and entering contact data.

Fit for multiple business processes

Customer master data is used within many different business processes. DONG Energy has successfully implemented iDQ™ within several business processes, namely:

  • Assigning new customers and ending old customers on installation addresses
  • Handling returned mail
  • Debt collection

Managing customer master data in the utility sector has many challenges, as there are different kinds of addresses to manage, such as installation addresses, billing addresses and correspondence addresses, as well as different approaches to private customers and business customers, including considering the grey zone between who is a private account and who is a business account.
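The different address roles mentioned above could be modelled along these lines. This is a minimal sketch in Python; the class and field names are hypothetical and not DONG Energy’s actual data model:

```python
from dataclasses import dataclass, field

@dataclass
class Address:
    street: str
    city: str
    postal_code: str

@dataclass
class UtilityCustomer:
    name: str
    is_business: bool
    # One customer may carry several address roles at the same time.
    addresses: dict = field(default_factory=dict)  # role -> Address

customer = UtilityCustomer(name="Example ApS", is_business=True)
customer.addresses["installation"] = Address("Hovedgaden 1", "Skanderborg", "8660")
customer.addresses["billing"] = Address("Postboks 42", "Aarhus C", "8000")

# Correspondence falls back to the billing address when not set explicitly.
correspondence = customer.addresses.get("correspondence", customer.addresses["billing"])
print(correspondence.city)  # Aarhus C
```

The point of keeping the role as an explicit key is that each business process (connection, invoicing, returned mail handling) can ask for the address role it actually needs.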

New technology requires change management

Implementing new technology in a large organization doesn’t just happen by itself. Old routines tend to stick around for a while. DONG Energy has put a lot of energy, so to say, into training the staff in reengineering business processes around customer master data on-boarding and maintenance, including utilizing the capabilities of the iDQ™ tool.

Acceptance of new tools comes with building up trust in the benefits of doing things in a new way.

Benefits in upstream data quality 

A tool like iDQ™ helps a lot with safeguarding the quality of contact data where data is born and when something happens in the customer data lifecycle. A side effect, which is at least as important, stresses Ole Andres, is that data collection goes much faster.

Right now DONG Energy is looking into further utilizing the rich variety of reference data sources that can be found in the iDQ™ framework.


Business Entity Identifiers

The least cumbersome way of uniquely identifying a business partner, be it a company, a government body or another form of organization, is to use an externally provided number.

However, there are quite a lot of different numbers to choose from.

All-Purpose National Identification Numbers

In some countries, like in Scandinavia, the public sector assigns a unique number to every company to be used in every relation to the public sector, and the number is open to be used by the private sector for identification purposes as well.

As reported in the post Single Company View I worked with the early implementation of such a number in Denmark way back in time.

Single-Purpose National Identification Numbers

In most countries there are multiple systems of numbers for companies each with an original special purpose. Examples are registration numbers, VAT numbers and employer identification numbers.

My current UK company has both a registration number and a VAT number and, very embarrassingly for a data quality and master data geek, these two numbers have different names and addresses attached.

Other Numbering Systems

The best known business entity numbering system around the world is probably the DUNS number used by Dun & Bradstreet. As examined in the post Select Company_ID from External_Source Where Possible, the use of DUNS numbers and similar business directory IDs is a very common way of uniquely identifying business partners.

In the manufacturing and retail world legal entities may, as part of the Global Data Synchronization Network, be identified with a Global Location Number (GLN).

There has been a lot of talk in the financial sector lately around implementing yet another numbering system for legal entities, with an identifier usually abbreviated as LEI. Wikipedia has the details about the Legal Entity Identification for Financial Contracts.

These are only some of the most used numbering systems for business entities.

So, the trend doesn’t seem to be a single source of truth but multiple sources making up some kind of the truth.
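Living with multiple sources of truth in practice means storing several identifier schemes per business partner rather than betting on a single one. A minimal sketch, where all scheme names and identifier values are made up for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class BusinessEntity:
    name: str
    # scheme -> identifier; no single scheme covers every business partner.
    identifiers: dict = field(default_factory=dict)

acme = BusinessEntity(name="Acme Ltd")
acme.identifiers["uk_company_number"] = "01234567"  # registration number (made up)
acme.identifiers["vat"] = "GB123456789"             # VAT number (made up)
acme.identifiers["duns"] = "150483782"              # DUNS number (made up)

def find_by(entities, scheme, value):
    """Look up an entity by whichever identifier scheme it happens to carry."""
    return next((e for e in entities if e.identifiers.get(scheme) == value), None)

match = find_by([acme], "vat", "GB123456789")
print(match.name)  # Acme Ltd
```

The design choice here is simply that identity resolution can start from any scheme present on an inbound record, which is what a world of multiple numbering systems requires.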


Where is the Spot?

One of the things we often struggle with in data quality improvement and master data management is postal addresses. Postal addresses have different formats around the world, names of streets are spelled in alternative ways and postal codes may be wrong, too short or suffer from other flaws.

An alternative way of identifying a place is a geocode, and sometimes we may think: Hurray, geocodes are much better at uniquely identifying a place.

Well, unfortunately not necessarily so.

First of all geocodes may be expressed in different systems. The most used ones are:

  • Latitude and longitude: Even though the globe is not completely round, this system is for most purposes good at aligning positions with the real world.
  • UTM: When the world is shown on paper or on a computer screen it becomes flat. UTM projects the world onto a flat surface well aligned with the metric system, making distance calculations straightforward.
  • WGS84: This is the datum used by most GPS devices and also the one behind Google Maps.
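Whichever system the geocodes come in, a very common use is distance calculation. For latitude and longitude this can be sketched with the standard haversine formula, which assumes a spherical Earth with a radius of about 6371 km:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two latitude/longitude points,
    using a spherical approximation of the Earth (radius ~6371 km)."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

# Copenhagen to London: roughly 955 km as the crow flies.
print(haversine_km(55.676, 12.568, 51.507, -0.128))
```

Note that this only works when both points are in the same system; mixing a UTM easting/northing with a latitude/longitude pair silently produces nonsense, which is a data quality issue of its own.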

Next, where is the address exactly placed?

I have met at least three different approaches:

  • It could be where the building actually is and then, if the precision is high and/or the building is big, at different places around the building.
  • It could be where the ground meets a public road. This is actually most often the case, as route planning is a very common use case for geocodes. The spot is fit for the purpose of use, so to speak.
  • It could, as reported in the post Some Times Big Brother is Confused, be any place on (and beside) the street, as many reference data sources interpolate numbers equally along the street or in other ways get it wrong by keeping it simple.
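The third, keep-it-simple approach can be sketched as plain linear interpolation of a house number between the coordinates of a street segment’s end points. The coordinates and house numbers below are made up:

```python
def interpolate_house_number(start_coord, end_coord, start_no, end_no, house_no):
    """Naive geocoding: place a house number linearly between the coordinates
    of a street segment's first and last numbers. This is the 'keeping it
    simple' method many reference data sources use, and it may put the spot
    beside the street rather than on the actual building."""
    fraction = (house_no - start_no) / (end_no - start_no)
    return tuple(s + fraction * (e - s) for s, e in zip(start_coord, end_coord))

# A street running from number 1 at (55.0000, 12.0000) to number 99 at (55.0010, 12.0020):
spot = interpolate_house_number((55.0000, 12.0000), (55.0010, 12.0020), 1, 99, 50)
print(spot)  # number 50 lands halfway along the segment
```

The sketch shows why the interpolated spot is only as good as the assumption that numbers are evenly spaced, which real streets rarely honour.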


The Big MDM Trend

Back in 2011 Gartner (the analyst firm) released a document where Gartner Highlights Three Trends That Will Shape the Master Data Management Market.

The three things were:

  • Multi-Domain MDM
  • MDM in the Cloud
  • MDM and Social Networks

MDM and Social Networks (also called Social MDM) was described as shown below:

[Image: Gartner’s three MDM trends as described in 2011]

In a 2012 article on Computerweekly called Three trends that will shape the master data management market, also by John Radcliffe of Gartner, the three trends are repeated, however with social MDM now described in the context of MDM and big data:

[Image: Gartner’s three MDM trends as described in 2012]

The slightly different terms used by Gartner to describe the trends and what they entail follow the big trend of everyone else in the industry using the term “big data”, as discussed in the post Data Quality vs Big Data, where you can see that use of the term “big data” exploded just after the original Gartner piece on the three trends.


Going in the Wrong Direction

When travelling with the London Underground I have several times noticed that the onboard passenger information system is set wrongly, typically as if we were going in the opposite direction to the one announced on the station and the one the train is actually heading in.

People’s reactions

The reaction among the passengers to this data quality flaw varies. Most people, seemingly frequent commuters, don’t seem to bother but keep calm and carry on. Tourists, on the other hand, get confused and immediately try to appoint the culprit among them who apparently got them on the wrong train.

As the information system keeps announcing the next station as the one we just left, everyone but the new passengers keeps calm and carries on in the opposite direction of the data presented.

Big data quality issues

The problem with wrong journey settings in data collection within public transportation has actually been a challenge I have worked with a lot.

Besides confusing the passengers when shown on the onboard passenger information display and in the voice announcements, wrong journey settings may also corrupt the data collection. That leads to data quality issues when the data is stored in a data warehouse or by other techniques in order to facilitate analysis of passenger travel patterns, reporting on how well the services adhere to schedules, and other reporting based on the big numbers of transaction data collected every day.

Aligning with master data

The challenge is to correctly join the transaction data with the right master data entities. A vehicle stop, and in some cases the passenger boarding and alighting, must be associated with the right product, being a given journey on a given service according to a given time schedule.

Many other exploitations of big data share the same basic data quality challenge. If we don’t get the transaction data joined correctly with the master data entities involved, any analysis and reporting may be going in the wrong direction.
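The join itself can be sketched as follows. This is a minimal example where the journey ids, stops and figures are made up; the point is that events referencing an unknown journey are set aside instead of polluting the analysis:

```python
# Scheduled journeys (master data): journey id -> (line, direction, departure).
journeys = {
    "J100": ("Line 1", "northbound", "08:00"),
    "J101": ("Line 1", "southbound", "08:10"),
}

# Collected vehicle stop events (transaction data) carrying a journey reference.
stop_events = [
    {"journey": "J100", "stop": "Central", "boardings": 42},
    {"journey": "J999", "stop": "Harbour", "boardings": 7},  # wrong journey setting
]

matched, unmatched = [], []
for event in stop_events:
    journey = journeys.get(event["journey"])
    if journey is None:
        # Quarantine misattributed events for investigation instead of
        # joining them to the wrong line and direction downstream.
        unmatched.append(event)
    else:
        matched.append({**event, "line": journey[0], "direction": journey[1]})

print(len(matched), len(unmatched))  # 1 1
```

Keeping an explicit unmatched bucket also gives you a measurable data quality indicator: the share of transactions that could not be tied to a master data entity.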


“Fitness for Use” is Dead

The definition of data quality as being “fitness for use” is challenged. “Real world alignment” or similar expressions are gaining traction.

Back in May Malcolm Chisholm made a tweet about the shortcomings of the “fitness for use” definition, reported here on the blog in the post The Problem with Multiple Purposes of Use.

Last week the tweet was elaborated on in the Information Management article called Data Quality is Not Fitness for Use. Today Jim Harris has a follow-up post called Data and its Relationships with Quality.

When working with data quality in the domain with by far the most data quality issues, namely the quality of contact data (customer, supplier, employee and other party master data), I have many times experienced that making data fit for more than a single purpose of use almost always is about better real world alignment. Having data that actually represents what it purports to represent always helps with making data fit for use, even for more than one purpose of use.

In practice, in the contact data realm, that for example means:

  • Getting a standardized address at contact data entry makes it possible for you to easily link to sources with geo codes, property information and other location data for multiple purposes.
  • Obtaining a company registration number or other legal entity identifier (LEI) at data entry makes it possible to enrich with a wealth of available data held in public and commercial sources making data fit for many use cases.
  • Having a person’s name spelled according to available sources for the country in question helps a lot with typical data quality issues such as uniqueness and consistency.

Also, making data real world aligned from the start is a big help when maintaining data as the real world will change over time.
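As a small illustration of a real world alignment check at data entry, here is a sketch of validating a Danish CVR company registration number against the commonly documented modulus-11 rule (eight digits, weights 2, 7, 6, 5, 4, 3, 2, 1). The example numbers are made up, and this is a syntactic check only; a registry lookup is still needed to confirm the number actually exists:

```python
def looks_like_valid_cvr(cvr: str) -> bool:
    """Modulus-11 check for a Danish CVR company registration number:
    eight digits, weighted 2,7,6,5,4,3,2,1, weighted sum divisible by 11."""
    if len(cvr) != 8 or not cvr.isdigit():
        return False
    weights = [2, 7, 6, 5, 4, 3, 2, 1]
    return sum(int(d) * w for d, w in zip(cvr, weights)) % 11 == 0

print(looks_like_valid_cvr("36213728"))  # True: weighted sum is 110
print(looks_like_valid_cvr("36213729"))  # False: a single-digit typo fails the check
```

Catching a mistyped digit at entry is exactly the kind of cheap upstream check that makes the later enrichment from public and commercial sources possible.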

Data quality tools will in my eyes also have to follow this trend, as discussed with Gartner in the post Quality of Data behind the Data Quality Magic Quadrant.


Social PIM

During the last couple of years I have been talking about social MDM (Social Master Data Management) on this blog.

MDM (Master Data Management) mainly consists of two disciplines: CDI (Customer Data Integration) and PIM (Product Information Management).

With social MDM most of the talk has been around CDI as the integration of social network profiles with traditional customer (or party) master data.

But there is also a PIM side of social MDM.

Making product data lively

The other day Kimmo Kontra had a blog post called With Tiger’s clubs, you’ll golf better – and what it means to Product Information Management. Herein Kimmo examines how stories around products help with selling products. Kimmo concludes that within master data management there is going to be a need for storing and managing stories.

I agree. And having stories related to your products and services is a must for social selling. Besides having the right hard facts about products consistent across multiple channels, and having the right images and other rich media consistent as well, you will also need to include the right and consistent stories when the multiple channels embrace social media.

Sharing product data

How do we ensure that we share the same product information, including the same stories, across the ecosystem of product manufacturers, distributors and retailers?

Recently I learned about a cloud service called Actualog aiming at doing exactly that with emphasis on the daunting task of sharing product data in an international environment with different measurement systems, languages, alphabets and script systems.

Actualog very much resembles the cloud service called iDQ™ I’m working with related to customer data integration.

Listening to big data

As discussed in the post Big Data and Multi Domain Master Data Management, a prerequisite for getting sense out of analyzing social data (and other big data sources) is that you not only have a consistent view of the product data related to the products you sell yourself, but also a consistent view of competing products and how they relate to your products.

So, social PIM requires you to extend the volume of products handled by your product information management solution probably in alternate product hierarchies.
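Such an extended scope could be sketched like this. The product names are made up; the point is only that own and competing products share the same category in an alternate hierarchy, so social mentions of either can be rolled up together:

```python
# Own catalogue plus competing products placed in the same alternate hierarchy.
product_category = {
    "ACME Driver X1": "golf clubs",
    "ACME Putter P2": "golf clubs",
    "Rival Driver Z9": "golf clubs",  # competitor item, tracked but not sold
}
own_products = {"ACME Driver X1", "ACME Putter P2"}

# Social mentions (made-up sample) rolled up per category, split own vs competitor.
mentions = ["Rival Driver Z9", "ACME Driver X1", "Rival Driver Z9"]
by_category = {}
for product in mentions:
    category = product_category[product]
    by_category.setdefault(category, {"own": 0, "competitor": 0})
    side = "own" if product in own_products else "competitor"
    by_category[category][side] += 1

print(by_category)  # {'golf clubs': {'own': 1, 'competitor': 2}}
```

Without the competitor entries in the hierarchy, the two mentions of the rival product would be unclassifiable noise rather than a share-of-voice signal.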


Sometimes Google Translate is a Foolish Friendship

This morning I stumbled upon an article in a Norwegian online newspaper. A rather unlikely incident actually happened to a driver, as he avoided hitting an elk on the road, but then ran into a bear.

The original text in Norwegian is here:

As I wanted to see how that would be in English, I hit the Google Translate button:

In the headline the two animals are translated from “elg” to “elk” and from “bjørn” to “bear”. Very well.

But in the subtitle the two words are translated differently. Now “elg” is “moose” and “bjørn” is “disservice”.

Hmmm…

Not sure why elk is substituted with moose. The two words are used synonymously, as the animal called an elk in Europe is the one called a moose in North America. Wikipedia has the details here.

But how did the bear become a disservice? Well, I guess it relates to an old fable called “The Bear and the Gardener”, or the variant “The Hermit and the Bear”. Here a human becomes friends with a bear. While the man takes a nap, the bear helps by driving off the flies, but eventually crushes the man’s head in doing so. The moral is that you should not make foolish friendships.

In Danish/Norwegian such a well-meant but very bad attempt to help is a “bear’s service” (bjørnetjeneste), also known in German as a Bärendienst. Just like Google Translate in this case became a disservice.


instant Data Quality and Business Value

During the last couple of years I have been working with a cloud service called instant Data Quality (iDQ™).

iDQ™ is basically a very advanced search engine capable of being integrated into business processes in order to get data quality for contact data right the first time and at the same time reduce the time needed for looking up and entering contact data.

With iDQ™ you are able to look up what is known about a given address, company and individual person in external sources (I call these big reference data) and what is already known inside your internal master data.

Orchestrating the contact data entry and maintenance processes this way does create better data quality along with creating business value.

The testimonials from current iDQ™ clients tell that story.

DONG Energy, a leader in providing clean and reliable energy, says:

[Image: testimonial from DONG Energy]

From the oil and gas industry Kuwait Petroleum, a company with trust as a core value, adds in:

[Image: testimonial from Kuwait Petroleum (Q8)]

In the non-profit sector the DaneAge Association, an organization supporting and counselling older people to make informed decisions, also gets it:

[Image: testimonial from the DaneAge Association]

You may learn more about iDQ™ on the instant Data Quality site.
