Troubled Bridge Over Water

In the recent blog post A pain in the… I described my summer holiday fun: a cycling tour round the Baltic coast.

You meet a lot of data quality issues on such a tour.

One experience was when we arrived in the Polish town Świnoujście. I planned the tour using Google Maps. According to the plan we would arrive in Świnoujście from the west, cross the bridge over the river Świna and reach the ferry to Sweden on the east bank close to the railway station.

Nice plan. Only one thing: contrary to what is shown on Google Maps, and stated in the route planner, there is no bridge across the river in the real world.

Fortunately there was a free ferry service across the river. So we did catch the big once-a-day ferry to Sweden in time.

PS: The road name on the bridge on Google Maps is by the way Wodna. Wodna is Polish for (something with) water.

Psychographic Data Quality

I have just read an article on Mashable by Jamie Beckland called The End of Demographics: How Marketers Are Going Deeper With Personal Data.

The article explains how new sources of available data make it possible for marketers to get a much closer look at potential customers and thereby go from delivering a broad message to a huge crowd to delivering a very targeted message to a small group of people with a high probability of getting a response. In short: Marketers are going from demographic marketing to psychographic marketing.

I believe this is true and ongoing (as I have also been involved in such activities).

The data quality issues we have always known in direct marketing are surely very similar in the psychographic marketing that is going on in the social media realm and in connection with eBusiness.

In my eyes, the concept of a single customer view is also a key to success in psychographic marketing.

You are not delivering a targeted message if you are delivering two different messages to two user profiles belonging to the same real world individual.

Your message will be very frustrating if you treat someone as a prospective customer when that someone is already an existing customer, perhaps in another channel.
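A minimal sketch of the single customer view idea in Python. The `Profile` class, its fields and the choice of a normalized email address as the match key are all assumptions made for illustration; real-world matching would need something far less naive.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Profile:
    channel: str       # e.g. a social network or a web shop (hypothetical values)
    handle: str        # the user name in that channel
    email: str         # assumed to be available and used as the match key
    is_customer: bool  # True if this profile has already bought something


def match_key(profile: Profile) -> str:
    """Naive match key: the email address, trimmed and lower-cased."""
    return profile.email.strip().lower()


def single_view(profiles: List[Profile]) -> Dict[str, dict]:
    """Collapse profiles that share a match key into one real-world individual."""
    view: Dict[str, dict] = {}
    for p in profiles:
        entry = view.setdefault(match_key(p), {"profiles": [], "is_customer": False})
        entry["profiles"].append((p.channel, p.handle))
        # Customer in any channel means customer everywhere.
        entry["is_customer"] = entry["is_customer"] or p.is_customer
    return view
```

With a view like this, the prospect message is only sent when `is_customer` is false for the individual as a whole, not just for the profile at hand.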

The effectiveness of psychographic marketing depends on a match between the psychographic variables, the behavioral variables and the demographic variables. As seen in the example in the Mashable article, a good old thing like geocoding will be needed here.

An exciting thing in the rise of psychographic marketing is that it will add to the trend in data quality technology where it’s much more than simple name and address cleansing and deduplication. Rich location data will, despite the virtual playground, become even more important. The relations between customers and products, as described in the post Customer Product Matrix Management, will be further refined in psychographic marketing.

Georgian Geography and History

This is the sixth post in a series of short blog posts focusing on data quality related to different countries around the world. I am not aiming at presenting a single version of the full truth but rather presenting a few random observations that I hope someone living in or with knowledge about the country is able to clarify in a comment.

Georgia

Georgia is the English name for a sovereign state in the South Caucasus where Europe meets Asia. Georgia was a part of the Soviet Union under the English name Georgian SSR from 1922 to 1991. Back in the 4th century BC a unified kingdom of Georgia was established as an early example of an advanced state organization under one king and an aristocratic hierarchy.

Georgia

Georgia is a state located in the southeastern United States. Back in the 18th century the area was known as the Province of Georgia within the British colonies. Before the arrival of the Europeans part of present-day Georgia belonged to the Cofitachequi paramount chiefdom.

Ambiguous place names and slowly changing dimensions

As with Georgia, there are lots of examples of place names belonging to more than one place on Earth. Besides that, location reference data like the Georgias have slowly changing dimensions, such as what area is covered, where in a hierarchy the place belongs and what it is called at a certain time.
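Here is a rough Python sketch of how such reference data could carry validity periods, so that a lookup by name and year tells the two Georgias apart. The `PlaceVersion` class, its field names, the place keys and the `resolve` helper are hypothetical; the years are the ones mentioned above, and the start of the US entry is deliberately left open.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PlaceVersion:
    place_id: str               # stable key that survives renames (hypothetical)
    name: str                   # what the place is called in this period
    parent: Optional[str]       # where it sits in the hierarchy in this period
    valid_from: Optional[int]   # first year this version applies; None = open
    valid_to: Optional[int]     # first year it no longer applies; None = current


VERSIONS = [
    PlaceVersion("GE", "Georgian SSR", "Soviet Union", 1922, 1991),
    PlaceVersion("GE", "Georgia", None, 1991, None),
    # Start year left open on purpose: a simplification for this sketch.
    PlaceVersion("US-GA", "Georgia", "United States", None, None),
]


def resolve(name: str, year: int) -> List[PlaceVersion]:
    """Return every place version that carries `name` in the given year."""
    hits = []
    for v in VERSIONS:
        starts_ok = v.valid_from is None or v.valid_from <= year
        ends_ok = v.valid_to is None or year < v.valid_to
        if v.name == name and starts_ok and ends_ok:
            hits.append(v)
    return hits
```

Asking for "Georgia" in a recent year gives two candidates, which is exactly the ambiguity that has to be settled by other attributes such as the parent in the hierarchy.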

A Business Rule and a Missing Master Data Hub

It seems that the United States of America has a problem with a business rule saying you have to be born in the country to become president, combined with a missing citizen master data hub telling who was born in the country.

This is an aspect of a previous blog post called Did They Put a Man on the Moon.

Single Company View

Getting a single customer view in business-to-business (B2B) operations isn’t straightforward. Besides all the fuss about agreeing on a common definition of a customer within each enterprise, usually revolving around fitting multiple purposes of use, we also have complexities in real world alignment.

One Number Utopia

Back in the 80’s I worked as a secretary for the committee that prepared a single registry for companies in Denmark. That registry has been live for many years now.

But in most other countries there are several different public registries for companies resulting in multiple numbering systems.

Within the European Union there is a common registry embracing VAT numbers from all member states. The standard format is a two-letter country prefix (the ISO country code, with a few exceptions such as EL for Greece) followed by the VAT number in each country’s own format, some with both digits and letters.
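As a small illustration, here is a Python sketch that splits such a number into its country prefix and national part. The `split_vat` function and the `VatNumber` type are hypothetical names, and the length bounds in the pattern are an assumption for this sketch, not an official rule; the national part is not validated against any country's own format.

```python
import re
from typing import NamedTuple, Optional


class VatNumber(NamedTuple):
    country_prefix: str   # the two-letter prefix
    national_part: str    # country-specific part, format not checked here


# Two letters, then 2 to 12 letters or digits (assumed bounds for illustration).
_VAT_SHAPE = re.compile(r"^([A-Z]{2})([0-9A-Z]{2,12})$")


def split_vat(raw: str) -> Optional[VatNumber]:
    """Strip spaces, dots and hyphens, upper-case, then split off the prefix."""
    compact = re.sub(r"[\s.\-]", "", raw).upper()
    match = _VAT_SHAPE.match(compact)
    if match is None:
        return None  # no recognizable prefix: cannot tell which country's rules apply
    return VatNumber(match.group(1), match.group(2))
```

A second step, not shown here, would dispatch on `country_prefix` to a per-country check of the national part.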

The DUNS-number used by Dun & Bradstreet is the closest we get to a world-wide unique company numbering system.  

2-Tier Reality

The common structure of a company is that you have a legal entity occupying one or several addresses.

The French company numbering system is a good example of how this is modeled. You have two numbers:

  • SIREN is a 9-digit number for each legal entity (on the headquarters address).
  • SIRET is a 14-digit (9 + 5) number for each business location.

This model is good for companies with several locations but strange for single-location companies.
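The two tiers can be sketched in a few lines of Python: the first 9 digits of a SIRET identify the legal entity and the last 5 identify the location. The function names are hypothetical, the numbers in the usage note are made up, and no check digit is verified.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def split_siret(siret: str) -> Tuple[str, str]:
    """Split a 14-digit SIRET into (SIREN, 5-digit location suffix)."""
    digits = siret.replace(" ", "")
    if len(digits) != 14 or not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"expected 14 digits, got {siret!r}")
    return digits[:9], digits[9:]


def group_by_legal_entity(sirets: List[str]) -> Dict[str, List[str]]:
    """Map each SIREN (legal entity) to the location suffixes seen for it."""
    locations = defaultdict(list)
    for siret in sirets:
        siren, suffix = split_siret(siret)
        locations[siren].append(suffix)
    return dict(locations)
```

For example, `group_by_legal_entity(["123 456 789 00011", "12345678900029"])` puts both made-up locations under the one legal entity `"123456789"`. A single-location company simply ends up with a list of one.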

Treacherous Family Trees (and Restaurants)

The need for hierarchy management is obvious when it comes to handling data about customers that belong to a global enterprise.

Company family trees are useful but treacherous. A mother and a daughter may be very closely connected with lots of shared services, or it may be strictly a matter of ownership with no operational ties at all.

Take McDonald’s as a not perfectly simple (nor simply perfect) example. A McDonald’s restaurant is operated by a franchisee, an affiliate, or the corporation itself. I’m lovin’ modeling it.

Do You Have an Official SnoopBook Account?

I have earlier written about how Facebook resembles a typical Business-to-Consumer customer table in the post Out of Facebook.

Like any customer table the Facebook member table will suffer from a number of different data quality issues like:

  • Some individuals are signed up more than once using different profiles.
  • Some individuals who created a profile are not among us anymore.
  • Some profiles are not an individual person, but a company or other form of establishment.

One type of the latter seems to be government bodies and other authorities that want to snoop into your daily whereabouts in order to see whether you are paying the taxes you should and not receiving welfare services you shouldn’t.

Recently I read a story about a British woman who got jailed on such an account. Link here.

It was not said whether the authorities used a special account for the investigation or whether the civil servants’ personal accounts were used.

This morning I read an article (in Danish) about the Danish tax authority’s activities in this field. They have realized that they have illegally used personal accounts for such activities, and have stopped doing so. However, they will now create an account for the organization to be used for snooping.

Inside India

This is the second post in a series of short blog posts focusing on data quality related to different countries around the world. I am not aiming at presenting a single version of the full truth but rather presenting a few random observations that I hope someone living in or with knowledge about the country is able to clarify in a comment.

Cultural Diversity

India’s culture is marked by a high degree of syncretism and cultural pluralism. Every state and union territory has its own official languages, and the constitution also recognizes 21 languages.

National Identification Number for 1.2 Billion People

The government of India has initiated a program for assigning a unique citizen ID to each of the more than 1.2 billion people living in India. The program, called Aadhaar, is the largest of its kind in the world.

A System Integration Superpower

Tata, Satyam, Infosys and Wipro are just some of the many mega system integrators within master data management and data quality with headquarters in India. Add to that that companies like Cognizant and many others have most of their professionals based in India.

Check out the Czech Republic

This is the first post in a planned series of short blog posts focusing on data quality related to different countries around the world. I am not aiming at presenting a single version of the full truth but rather presenting a few random observations that I hope someone living in or with knowledge about the country is able to clarify in a comment.

Companies all over

Last time I checked, the Czech Republic had the highest number of Duns Numbers (unique company IDs in the Dun & Bradstreet WorldBase) per capita in the world. I wonder whether this is because of very effective public sector registration, some special rules for incorporation, or duplicates.

Exonyms, endonyms and beers

Many Czech cities are known by their English exonyms (the name in English) but of course have a local endonym (the name in Czech). The capital Prague is Praha in Czech. The town Pilsen is called Plzeň in Czech, but there are several towns around the world called Pilsen, and then of course there is a sort of beer called pilsener. (České) Budějovice is Czech for Budweis in German and English. We are certainly talking beer here also.
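A tiny Python sketch of an exonym-to-endonym lookup built from the names above. The `PLACE_NAMES` table and the `to_endonym` function are hypothetical, and the lookup is assumed to be scoped to the Czech Republic; without that country context, a name like Pilsen could point to several other towns.

```python
from typing import Dict, Optional

# Keyed by the local name (endonym); values are exonyms by language code.
PLACE_NAMES: Dict[str, Dict[str, str]] = {
    "Praha": {"en": "Prague"},
    "Plzeň": {"en": "Pilsen"},
    "České Budějovice": {"en": "Budweis", "de": "Budweis"},
}


def _build_index(names: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Map every known spelling, case-folded, back to its endonym."""
    index: Dict[str, str] = {}
    for endonym, exonyms in names.items():
        index[endonym.casefold()] = endonym
        for exonym in exonyms.values():
            index[exonym.casefold()] = endonym
    return index


_INDEX = _build_index(PLACE_NAMES)


def to_endonym(name: str) -> Optional[str]:
    """Return the Czech name for a known exonym or endonym, else None."""
    return _INDEX.get(name.strip().casefold())
```

Storing the endonym as the key and the exonyms as attributes keeps one record per town no matter which language the name arrives in.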

Ataccama

The data quality and master data management firm Ataccama was founded in the Czech Republic.

Typos in the Cloud

On 1st January this year the second largest city in Denmark changed its name. It was only a minor change from “Århus” to “Aarhus”, replacing the Scandinavian letter Å with a double A, which is the normal conversion to the English alphabet.

Data quality would be a lot easier if people, companies and cities stopped changing names. It always goes wrong. First of all a lot of data will be out-of-sync. And then the change may go wrong.

That is what happened at Google Maps. They introduced a typo, so the name of the city on the map is now “Aahrus”, with the r and the h in the middle of the name swapped.
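This kind of typo, two neighboring letters swapped, is easy to test for once you have the expected name as reference data. A minimal Python sketch follows; the function name `is_adjacent_swap` is my own, and it checks for this one error type only, not for typos in general.

```python
def is_adjacent_swap(expected: str, observed: str) -> bool:
    """True if `observed` is `expected` with exactly one pair of
    neighboring characters swapped."""
    if len(expected) != len(observed) or expected == observed:
        return False
    # Positions where the two strings disagree.
    diffs = [i for i, (x, y) in enumerate(zip(expected, observed)) if x != y]
    return (
        len(diffs) == 2
        and diffs[1] == diffs[0] + 1
        and expected[diffs[0]] == observed[diffs[1]]
        and expected[diffs[1]] == observed[diffs[0]]
    )
```

Here `is_adjacent_swap("Aarhus", "Aahrus")` is true, while the legitimate rename from "Århus" to "Aarhus" is not flagged, since it is not a swap.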

For those out there not sure where on earth Århus/Aarhus/Aahrus is, it is the red dot in the upper right corner, where you have London and Paris in the lower left corner on the map below.

Foreign Affairs

There is a famous poster called The New Yorker. This poster perfectly illustrates the centricity we often have about the town, region or country we live in.

The same phenomenon is often seen in data management.

I mentioned United States centricity as a minor criticism in my recent book review about the excellent book “Master Data Management and Data Governance”.  

An example from the book is this statement:

“It is important to differentiate between U.S. domestic addresses and international addresses. This distinction is important for U.S.-centric MDM solutions because U.S. domestic addresses are normally better defined and therefore can be processed in a more automatic fashion, while international addresses require more manual intervention.”

The same fact could be expressed by saying:

“It is important to differentiate between Danish domestic addresses and international addresses. This distinction is important for Danish-centric MDM solutions because Danish domestic addresses are normally better defined and therefore can be processed in a more automatic fashion, while international addresses require more manual intervention.”

Only, the well-formatted address in the first case is the messy address in the second case, and the well-formatted address in the second case is the messy address in the first case.

If your MDM scope is country-centric it is sensible to concentrate on automation related to that country.

If your MDM scope is international there are two options:

  • The easy way: The one size fits all option. This is a moderate investment, but also, it only yields moderate results in terms of automation and data quality.
  • The hard way: You have to implement specialized automation and investigate the best external reference data for each country. I made a Danish-centric post on that last year here.
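The hard way can be sketched in Python as one parser per country behind a common entry point. Everything here is a simplified assumption for illustration: the function names, the two-country table, and the rule of thumb that a US street line starts with the house number while a Danish one ends with it.

```python
import re
from typing import Callable, Dict, Optional, Tuple

Parsed = Tuple[str, str]  # (street name, house number)


def parse_us(line: str) -> Optional[Parsed]:
    """Assumed US convention: house number first, e.g. '12 Main Street'."""
    m = re.match(r"^(\d+\w*)\s+(.+)$", line.strip())
    return (m.group(2), m.group(1)) if m else None


def parse_dk(line: str) -> Optional[Parsed]:
    """Assumed Danish convention: house number last, e.g. 'Hovedgaden 12'."""
    m = re.match(r"^(.+?)\s+(\d+\w*)$", line.strip())
    return (m.group(1), m.group(2)) if m else None


PARSERS: Dict[str, Callable[[str], Optional[Parsed]]] = {
    "US": parse_us,
    "DK": parse_dk,
}


def parse_street_line(country: str, line: str) -> Optional[Parsed]:
    """Use the country's own parser; None means manual intervention."""
    parser = PARSERS.get(country.upper())
    if parser is None:
        return None  # no specialized automation for this country yet
    return parser(line)
```

Feeding the Danish line to the US parser returns nothing, which is the "messy international address" effect seen from the other side.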
