When working with customer (or rather party) master data management, and with the related work of improving data quality and preventing data quality issues, for traditional offline and some online purposes, you will most often deal with names, addresses and national identification numbers.
While this may be tough enough for domestic data, doing this for international data is a daunting task.
Names
In reality there should be no difference between dealing with domestic data and international data when it comes to names, as people in today’s globalized world move between countries and bring their names with them.
Traditionally the emphasis in data quality work on names has been on the most frequent local issues, be it heaps of nicknames in the United States and other places, a “van” in a large share of names in the Netherlands, or loads of surname-like middle names in Denmark.
With company names there are some differences to be considered like the inclusion of legal forms in company names as told in the post Legal Forms from Hell.
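The name issues above can be sketched in code. The following is a minimal Python sketch, not a definitive implementation: the particle list, the nickname table and the function names `split_name` and `canonical_given_name` are all assumptions made up for illustration, and real reference data would be far larger and country-specific. It does not attempt to handle surname-like middle names.

```python
# Hypothetical reference data: real lists would be much larger and per-country.
SURNAME_PARTICLES = {"van", "de", "der", "den", "von", "ap"}
NICKNAMES = {"bill": "william", "bob": "robert", "liz": "elizabeth"}


def split_name(full_name):
    """Split a full name into (given_names, surname).

    Particles such as "van" directly before the last token are kept
    with the surname, so "Vincent van Gogh" yields surname "van Gogh".
    """
    tokens = full_name.split()
    if len(tokens) < 2:
        return "", full_name.strip()
    # Start with the last token as the surname, then absorb particles
    # to its left, always leaving at least one token as the given name.
    start = len(tokens) - 1
    while start > 1 and tokens[start - 1].lower() in SURNAME_PARTICLES:
        start -= 1
    return " ".join(tokens[:start]), " ".join(tokens[start:])


def canonical_given_name(given_names):
    """Map the first given name to a canonical form via the nickname table."""
    parts = given_names.split()
    if not parts:
        return ""
    first = parts[0].lower()
    return NICKNAMES.get(first, first)


given, surname = split_name("Jan van der Berg")
print(given, "|", surname)
print(canonical_given_name("Bill"))
```

Keeping the particle with the surname, rather than treating it as a middle name, is the design choice that matters here: it lets the same rule serve names that have travelled from one country to another.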
Addresses
Address formats vary between countries. That’s one thing.
The availability of public sources for address reference data varies too. These variations are related to for example:
- Coverage: Is every part of the country included?
- Depth: Is it street level, house number level or unit level?
- Costs: Are reference data expensive or free of charge?
As told in the post Postal Code Musings the postal code system in a given country may be the key (or not) to how to deal with addresses and related data quality.
National Identification Numbers
The post called Business Entity Identifiers describes how countries have different implementations of either all-purpose national identification numbers or single-purpose national identification numbers for companies.
The same way there are different administrative practices for individuals, for example:
- As I understand it, the constitution down under forbids all-purpose identification numbers for individuals.
- The United States Social Security Number (SSN) is often mentioned in articles about party data management. It’s an example of a single-purpose number in fact used for several purposes.
- In Scandinavian countries all-purpose national identification numbers are in place as explained in the post Citizen ID within seconds.
Dealing with diversity
Managing party master data in the light of the above-mentioned differences around the world isn’t simple. You need comprehensive data governance policies and business rules, you need elaborate data models, and you need a well-equipped toolbox for preventing data quality issues and exploiting external reference data.
Correct identification of business partners, customers & vendors is one of the most important and challenging tasks that globally working companies are handling. Of course you need to use local registration numbers for legal purposes, but when handling the need for global reporting of sales, risk & spend there are other needs for IDs. In these cases it is more important to have a common ID across borders. Examples are the US & parts of the UK government, which use D&B DUNS Numbers to be able to get control of reporting. As the world gets more global, the “home market” gets global and companies need a global-local ID.
Thanks a lot for commenting Bernt-Olof. Indeed the global-local view is another part of handling international party master data. For names it will be about for example having names both in Roman characters and Cyrillic characters for Russian names in Russia. For addresses it could be having the city of Gothenburg as Göteborg as well for both global and local use.
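As a rough illustration of the Gothenburg / Göteborg case, here is a minimal Python sketch of matching a global and a local form of the same city. The table layout and the names `CITY_NAMES`, `fold` and `same_city` are assumptions for the example only; a real solution would rely on proper external reference data.

```python
import unicodedata


def fold(text):
    """Casefold, strip diacritics and collapse whitespace for comparison."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())


# Hypothetical reference table holding both the local and the global form.
CITY_NAMES = [
    {"local": "Göteborg", "global": "Gothenburg"},
]

# Index every known form of a city under its folded spelling.
LOOKUP = {}
for entry in CITY_NAMES:
    for form in entry.values():
        LOOKUP[fold(form)] = entry


def same_city(a, b):
    """Return True if a and b are known forms of the same city."""
    entry_a, entry_b = LOOKUP.get(fold(a)), LOOKUP.get(fold(b))
    if entry_a is not None and entry_b is not None:
        return entry_a is entry_b
    # Fall back to comparing the folded spellings of unknown names.
    return fold(a) == fold(b)


print(same_city("Gothenburg", "GÖTEBORG"))
print(same_city("Gothenburg", "Stockholm"))
```

Storing both forms on the same record, instead of overwriting one with the other, keeps the local name available for local use and the global name for cross-border reporting.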
Absolutely, I think the global/local challenge is considerable. For example, in Wales (my home country) it is mandatory for location signs to be in both English and Welsh, as are addresses obviously.
It’s valid for someone to say “4 Mountain View, Dyffryn, Anglesey” and “4 Mountain View, Valley, Anglesey” as Dyffryn and Valley are the same.
Also, due to the complexity of some of our longer place names abbreviations are common e.g.
Llanfair P.G or Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch
Names are also problematic: on the electoral roll people may be labelled as Dafydd, but when dealing with everyday post they may use David.
Welcome to Wales, possibly the best testing ground for data matching tools?
Completely agree Dylan. And when a database system can actually recognise that “ap XXXXX” is a surname, with two words and a lower case “a” I will be astounded …
In Jersey (and no doubt the other Channel Islands) addresses show a similar problem to Dylan’s Welsh addresses. The Postal Address File frequently records the address in mediaeval Norman French (not the same as modern French, oh dear me no). But the inhabitants translate the address into English (because they don’t speak mediaeval Norman French).
Then the place names (St Helier, St Brélade, etc.) are not towns or cities but parishes. And the Islands are not part of the UK, but if a correspondent in the US misses out “UK” there’s a fairly good chance that the letter will end up in New Jersey or in the other Channel Islands (off the coast of California).
Thanks Dylan, Darran and Jack for commenting. I’m wondering if Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch has an exonym in mediaeval Norman French.