One of the things we often struggle with in data quality improvement and master data management is postal addresses. Postal addresses have different formats around the world, street names are spelled in different ways, and postal codes may be wrong, too short or suffer from other flaws.
An alternative way of identifying a place is a geocode, and sometimes we may think: Hurray, geocodes are much better at uniquely identifying a place.
Well, unfortunately not necessarily so.
First of all, geocodes may be expressed in different systems. The most commonly used ones are:
- Latitude and longitude: Even though the globe is not a perfect sphere, this system is good enough for most purposes when aligning positions with the real world.
- UTM: When the world is projected onto paper or a computer screen it becomes flat. UTM projects the world onto a flat surface, zone by zone, with coordinates given in metres, which makes distance calculations straightforward.
- WGS 84: This is the reference system in use in many GPS devices and also the one behind Google Maps. Its positions are normally expressed as latitude and longitude.
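To make the difference between these systems a bit more tangible, here is a minimal Python sketch. It assumes a perfectly spherical earth with a mean radius of 6,371,008.8 metres, and the function names (haversine_m, utm_zone, planar_distance_m) are made up for illustration. It shows that a distance between two latitude/longitude pairs needs some trigonometry, while a distance between two UTM points in the same zone is plain Pythagoras.

```python
import math

# Assumption: spherical earth with the mean radius in metres.
EARTH_RADIUS_M = 6371008.8


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def utm_zone(lon):
    """Standard 6-degree UTM zone number (1 to 60) for a longitude.

    Ignores the special zone exceptions around Norway and Svalbard.
    """
    return min(int((lon + 180) // 6) + 1, 60)


def planar_distance_m(easting1, northing1, easting2, northing2):
    """Straight-line distance in metres between two points in the same UTM zone."""
    return math.hypot(easting2 - easting1, northing2 - northing1)
```

The planar shortcut only holds when both points are in the same UTM zone. Converting between latitude/longitude and UTM is left out here, as that is better done with a dedicated projection library.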
Next, where exactly is the address placed?
I have met at least three different approaches:
- It could be where the building actually is. If the precision is high and/or the building is big, the point may then fall at different places around the building.
- It could be where the plot of land meets a public road. This is actually most often the case, as route planning is a very common use case for geocodes. The spot is fit for the purpose of use, so to say.
- It could, as reported in the post Some Times Big Brother is Confused, be any place on (and beside) the street, as many reference data sources interpolate house numbers evenly along the street or in other ways get it wrong by keeping it simple.
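The third approach can be sketched in a few lines of Python. This is only a guess at the general idea, not how any particular reference data source does it: the function name interpolate_house_number and its parameters are hypothetical, and the sketch assumes a short, straight street segment where latitude and longitude can be interpolated linearly.

```python
def interpolate_house_number(start, end, first_number, last_number, number):
    """Estimate a (lat, lon) position for a house number on a street segment.

    start and end are (lat, lon) tuples for the two ends of the segment,
    first_number and last_number are the house numbers at those ends.
    Assumption: houses are spread evenly along a straight segment.
    """
    low, high = sorted((first_number, last_number))
    if not low <= number <= high:
        raise ValueError("house number is outside the range of this segment")
    if first_number == last_number:
        return start
    # Fraction of the way from the start of the segment to its end.
    fraction = (number - first_number) / (last_number - first_number)
    lat = start[0] + fraction * (end[0] - start[0])
    lon = start[1] + fraction * (end[1] - start[1])
    return (lat, lon)
```

With such a method, number 6 on a segment numbered 1 to 11 always lands at the midpoint, no matter where the building really is. Large plots, corner buildings and gaps in the numbering will all throw the estimate off.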