A problem in data cleansing that I have come across several times is a batch of name and address registrations where it is uncertain which country each address belongs to.
Many address-cleansing tools and services require a country code as the first parameter in order to utilize external reference data for address cleansing and verification. Most business cases for address cleansing are indeed about a large number of business-to-consumer (B2C) addresses within a particular country. But sometimes you have a batch of typical business-to-business (B2B) addresses with no clear country registration.
The problem is that many location names apply to many different places. That is true even within a given country, which was the main driver for introducing postal codes. If a non-interactive tool or service has to look for a location all over the world, it gets really difficult.
For example, I’m in Richmond today. That could be any of a lot of places all over the world, as seen on Wikipedia.
I am actually in the Richmond in the London area in England, UK. If I were in the capital of the US state of Virginia, I could have written I’m in “Richmond, VA”. If an international address-cleansing tool looked at that address, I guess it would first look for a country code, quickly find VA as a two-character country code at the end of the string, and firmly conclude that I’m at something called Richmond in the Vatican City State.
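To make the trap concrete, here is a minimal Python sketch of the kind of trailing-code lookup I am guessing at. It is not how any particular product works. The function names and the two small lookup tables are made up for illustration and hold only a handful of real ISO 3166-1 alpha-2 codes and US state abbreviations.

```python
import re

# Tiny illustrative subsets only; a real tool would use full reference data.
ISO_COUNTRY_CODES = {
    "VA": "Holy See (Vatican City State)",
    "GB": "United Kingdom",
    "US": "United States",
    "CA": "Canada",
    "DE": "Germany",
}
US_STATE_CODES = {
    "VA": "Virginia",
    "CA": "California",
    "DE": "Delaware",
    "NY": "New York",
}


def last_token(address):
    """Return the last alphabetic token of the address, upper-cased."""
    tokens = re.findall(r"[A-Za-z]+", address)
    return tokens[-1].upper() if tokens else ""


def naive_country_guess(address):
    """Hypothetical naive rule: trust a trailing country code blindly."""
    return ISO_COUNTRY_CODES.get(last_token(address))


def candidate_readings(address):
    """List every reading of the trailing token instead of picking one."""
    token = last_token(address)
    readings = []
    if token in ISO_COUNTRY_CODES:
        readings.append(("country", ISO_COUNTRY_CODES[token]))
    if token in US_STATE_CODES:
        readings.append(("US state", US_STATE_CODES[token]))
    return readings


print(naive_country_guess("Richmond, VA"))
print(candidate_readings("Richmond, VA"))
```

The first function happily sends “Richmond, VA” to the Vatican. The second at least reports that VA has two readings, so the record can be flagged for other evidence (a postal code format, a phone prefix, the rest of the batch) rather than being settled by the first match.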
Have you tried using or constructing an international address cleansing process? Where did you end up?