Postal Code Musings

When working with master data management and data quality, including data matching, one of the most frequent pieces of information you work with is a postal code.

Wikipedia has a good article about postal codes.

Some of the data quality issues related to the datum postal code are:

Metadata

Around the world, different words are used for a postal code:

  • ZIP code, the United States implementation of a postal code, is often used synonymously for a postal code in many databases and user interfaces. This is not seriously wrong, but not right either.
  • In India a postal code (in English) is called a PIN Code (Postal Index Number). This could definitely trick me.

Format

There are basically two different formats of postal codes around:

  • Numeric postal codes are the most common ones. The number of digits does, however, differ between countries. And there may be some additional considerations:
    • For example, the 9-digit United States ZIP code is split into the original 5 digits and the additional 4 digits implemented later.
    • Postal codes may begin with 0, which may create formatting errors when the code is treated as a number.
  • Some countries, for example the United Kingdom, the Netherlands, Canada and Argentina, have alphanumeric postal codes.
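A minimal sketch of what this means in practice: keep postal codes as text and check them against a pattern per country. The patterns below are my own simplified assumptions for three countries, not official rules, and the function name is made up for the illustration.

```python
import re

# Simplified, illustrative patterns (assumptions); real national rules
# have more exceptions than these.
POSTAL_CODE_PATTERNS = {
    "US": re.compile(r"\d{5}(-\d{4})?"),           # ZIP or ZIP+4
    "NL": re.compile(r"\d{4} ?[A-Z]{2}"),          # four digits, two letters
    "CA": re.compile(r"[A-Z]\d[A-Z] ?\d[A-Z]\d"),  # letters and digits alternate
}


def is_plausible_postal_code(country, code):
    """Return True if the code matches the simplified pattern for the country."""
    pattern = POSTAL_CODE_PATTERNS.get(country)
    if pattern is None:
        return False  # no pattern known for this country in the sketch
    return pattern.fullmatch(code.strip().upper()) is not None


# Treating a postal code as a number silently drops a leading zero.
as_number = int("02134")   # becomes the integer 2134
as_text = str(as_number)   # "2134", which is no longer five digits
```

With these patterns `is_plausible_postal_code("US", "02134")` passes, while the round trip through a number leaves `as_text` as a four-digit string that fails the same check.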

Embedded Information

Numeric postal codes usually form some kind of hierarchy in which you can guess the geographical position within the country and make ranges representing smaller or larger geographical areas. But you never know.

This also goes for Dutch (you know, the ones in the Netherlands) postal codes as the first 4 characters are numeric.
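Here is a rough sketch of working with such ranges. It pulls out the leading digits of a postal code and checks whether they fall inside a range. The helper names and the ranges in the usage note are made up for the illustration. Whether a range really maps to one area is exactly the "you never know" part.

```python
import re


def numeric_prefix(code):
    """Return the leading digits of a postal code as text ('' if there are none)."""
    match = re.match(r"\d+", code.strip())
    return match.group(0) if match else ""


def in_prefix_range(code, low, high):
    """Check whether the numeric prefix lies in [low, high].

    The comparison is done on text of equal length, so leading zeros survive.
    """
    prefix = numeric_prefix(code)
    return len(prefix) == len(low) == len(high) and low <= prefix <= high
```

For example, `in_prefix_range("1012 AB", "1000", "1099")` is true for a Dutch style code, and `in_prefix_range("02134", "02000", "02999")` works for a code with a leading zero.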

The UK postal codes usually start with a mnemonic of the main city in the area, except in a lot of cases.

Precision

Some postal code systems have postal codes covering larger areas with many streets and some postal code systems are very granular where each street, or part of a street, has a distinct postal code.

The UK postal code system is very granular, which has paved the way for using rapid addressing as told in a recent article in the UK Database Marketing Magazine.

Coverage

Utilizing rapid addressing requires that reference data for postal codes practically covers every spot in the country and updates are available on a near real time basis.

Some countries have postal code systems not covering every corner and some countries have no postal code system at all.

Uniqueness

The main reason for implementing postal code systems is that a town or city name in many cases isn’t unique within a country.

But that doesn’t mean that uniqueness works the other way as well. A postal code may in many countries cover several town names. France is an example.
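A small sketch of how you might check this in your own data: group the town names found per postal code and list the codes that cover more than one name. The function names are hypothetical. The records in the usage note are invented placeholders, not real reference data.

```python
from collections import defaultdict


def towns_per_postal_code(records):
    """Group town names by postal code from (postal_code, town) pairs."""
    towns = defaultdict(set)
    for postal_code, town in records:
        towns[postal_code].add(town)
    return towns


def ambiguous_codes(records):
    """Return the postal codes that cover more than one town name."""
    return {
        code: sorted(names)
        for code, names in towns_per_postal_code(records).items()
        if len(names) > 1
    }
```

With the made-up records `[("10000", "Town A"), ("10000", "Town B"), ("20000", "Town C")]`, only the code `"10000"` is reported, together with its two town names.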

Consistency

While we basically have granular and not so granular postal code systems we of course also have hybrids.

In Denmark, for example, there is a granular system in the capital Copenhagen with a postal code for each street, named after the street, and a system in the rest of the country with a postal code for an area named after the suburb or town.

Fit for purpose

A postal code is a hierarchical element in a postal address. We basically have two forms of postal addresses:

  • A geographical address, where the postal address including the postal code points to a place you can also visit to meet the people receiving the things sent there
  • A post-office box, which may have more or less geographical connection to where the people receiving the things sent there are

Penetration of post-office boxes differs around the world. In Namibia it is mandatory. In Sweden most companies have a post-office box address.

Trying to compare data with these different concepts is like comparing apples and oranges, which often goes bananas.


The Letter Å

I have previously written about the letter Æ and the letter Ø. Now it’s time to write about another letter in Scandinavian alphabets that doesn’t belong to the English alphabet: The letter Å which is å in lower case.

When transliterated to the English alphabet Å becomes AA and å becomes aa. When a name begins with Å it becomes Aa. For example, the second largest city in Denmark was called Århus, being Aarhus in English. Actually the city council, as reported here, changed the name of the city to Aarhus as of 1st January 2011.

The Master Data Management tool vendor Stibo Systems has its headquarters in an Aarhus suburb. As Stibo was founded in 1794, the company has spent some of its life in Århus.
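The rule can be sketched in a few lines of Python. This is only an illustration of the rule as stated here: the function name is made up, and the choice between Aa and AA is guessed from the case of the following letter, which is my own assumption.

```python
import unicodedata


def transliterate_aa(text):
    """Replace å with aa and Å with Aa (or AA inside upper case text)."""
    # Normalise first so that 'A' plus a combining ring is treated as 'Å'.
    text = unicodedata.normalize("NFC", text)
    result = []
    for index, char in enumerate(text):
        if char == "å":
            result.append("aa")
        elif char == "Å":
            following = text[index + 1:index + 2]
            # Assumption: a lower case neighbour means a capitalised name.
            result.append("Aa" if following.islower() else "AA")
        else:
            result.append(char)
    return "".join(result)
```

So `transliterate_aa("Århus")` gives `"Aarhus"`, and the all upper case `"ÅRHUS"` gives `"AARHUS"`.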

The term Master Data Management (MDM) wasn’t known in 1794 and IT wasn’t invented then. Stibo is basically a printing company that became a specialist in making catalogues, later electronic catalogues and the software for doing this, which led to being a Product Information Management (PIM) vendor and now a multi-domain MDM solution provider. By the way: å is pronounced as the o in catalogue. Catalåg.


Making Data Quality Gangnam Style

The 21st December 2012 wasn’t the end of the world. But it was the day a music video for the first time passed one billion views on YouTube. It has been said that a reason for the success of Gangnam Style was that the Korean pop singer PSY hasn’t pursued any copyrights related to the video. But that doesn’t mean that PSY doesn’t earn money from the video. On the contrary, related commercials are making money Gangnam Style.

A hindrance for better data quality by better real world alignment has traditionally been the lack of free and open reference data. Some issues have been availability and heavy price tags on government collected data.

In my current daily work I mostly use such data within the United Kingdom and Denmark. And here the authorities are taking different paths.

The prices on UK public reference data have traditionally been fairly high and there’s certainly room for innovation around open government data as reported on DataQualityPro in the post Introduction to the Open Data User Group UK.

In Denmark the 21st December 2012 was the day it was published that a unanimous parliament had agreed on the laws behind having Free and Open Public Sector Master Data. From the 1st January 2013 there are no price tags on reference data about addresses, properties, companies (and citizens) and there are plans for making those data even more available, consistent and timely.

Great news for data quality, Gangnam Style.


The New Year in Identity Resolution

You may divide doing identity resolution into these categories:

  • Hard core identity check
  • Light weight real world alignment
  • Digital identity resolution

Hard Core Identity Check

Some business processes require a solid identity check. This is usually the case, for example, for credit approval and employment enrolment. Identity checks are also part of criminal investigation and fighting terrorism.

Services for identity checks vary from country to country because of different regulations and different availability of reference data.

An identity check usually involves the entity who is being checked.

Light Weight Real World Alignment

In data quality improvement and Master Data Management (MDM) you often include some form of identity resolution in order to have your data aligned with the real world. For example when evaluating the result of a data matching activity with names and addresses, you will perform a lightweight identity resolution which leads to marking the matched results as true or false positives.

This kind of identity resolution usually doesn’t involve the entity being examined.
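As a small illustration of the evaluation step mentioned above: once matched pairs have been reviewed and marked as true or false positives, you can summarise the result as the share of true positives. The function name and the record layout are assumptions made for this sketch.

```python
def match_precision(reviewed_matches):
    """Share of reviewed matches that were confirmed as true positives.

    Each item is a tuple (record_a, record_b, is_true_positive).
    Returns None when nothing has been reviewed yet.
    """
    if not reviewed_matches:
        return None
    true_positives = sum(1 for _, _, is_true in reviewed_matches if is_true)
    return true_positives / len(reviewed_matches)
```

If three out of four reviewed pairs turn out to be the same real world entity, `match_precision` returns 0.75.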

Digital Identity Resolution

Our existence has increasingly moved to the online world. As discussed in the post Addressing Digital Identity this means that we also will need means to include digital identity into traditional identity resolution.

There are of course discussions out there about how far digital identity resolution should be possible. For example real name policy enforcement in social networks is indeed a hot topic.

Future Trends

With regard to digital identity resolution the jury is still out. In my eyes we can’t avoid that the economic consequences of the rising social sphere will affect the demand for knowing who is out there. Also the opportunities in establishing identity via digital footprints will be exploited.

My guess is that the distinction between hard core identity check and real world alignment in data quality improvement and MDM will disappear as reference data will become more available and the price of reference data will go down.

That’s why I’m right now working with a solution (www.instantdq.com) that combines identity check features and data universe into master data management with the possibility of adding digital identity into the mix.


Doing Census versus doing Master Data Management

“In those days Caesar Augustus issued a decree that a census should be taken of the entire Roman world. This was the first census that took place while Quirinius was governor of Syria. And everyone went to their own town to register.”

These are the famous words from the Gospel According to Luke that you, if you belong to the part of the world where Christianity is practiced, hear every Christmas.

Today scholars don’t think that there actually was a census for the whole Roman Empire, but there is evidence that a local census in Syria and Judea took place around year 1. This was in order to collect taxes in those provinces. As you know: The taxman is data quality’s best friend.

Today doing a census is still the most practiced method of knowing about the people living in a given country. The alternative is a public registry that is constantly updated with all the information needed about you. I had the chance to describe such a method in a post on a Canadian blog some years ago. The post is called How Denmark does it.

India has a similar scheme with a centralized citizen registry on the go. This program is called Aadhaar.

As reported in the post Citizen ID and Biometrics the United Kingdom was close to adopting citizen Master Data Management some years ago. But it didn’t happen, so it’s still possible to have multiple names and multiple addresses at the same time in different registries while Cameron is Prime Minister of the United Kingdom, First Lord of the Treasury and Minister for the Civil Service.

Merry Christmas.


Four Reasons for Getting Social with MDM

Social Master Data Management (Social MDM) is about linking the increasing trend of doing business via social media using what we may call “systems of engagement” with the traditional way of supporting business using what we call “systems of record”.

Engaging Human Resources

Systems for sharing knowledge between employees within your organization, and not least for sharing what knowledge exists among employees, have been around for a long time. Such systems rely on a basic employee master data foundation typically based on core employee registries made for employee contract and payroll functionality.

In a modern organization human resource administration is affected by the way we work and live today. Work may be done in many other ways than through a traditional employee contract, and we are increasingly moving between branches in different countries with different payroll systems, often leading to the same individual being assigned a completely new identity as a human resource.

Managing employee and other human resource master data will have to embrace this reality in order to ensure effective use of the human resources available, and making it possible for all your human resources to help with that will be a big help.

Following Your Customer’s Digital Footprint

The growth of social networks during the recent years has been almost unbelievable. A lot of activity takes place within social networks and some of this activity relates to buying, recommending and using your products and services and your competitors’ products and services.

Your ability to follow your customer’s and prospective customer’s footprint in the social networks, and other big data sources, will, if not today, be a competitive advantage tomorrow.

You will make the most out of this if you are able to link your traditional collection of customer transactions with the footprints of the customer in the social world. That link will have to be made in a master data hub.

Supporting Complex Sales

Doing social MDM is a natural consequence of adopting social CRM (Social Customer Relationship Management). Many CRM systems support B2B (Business-to-Business) activities, helping with keeping track of what’s going on with a lot of contacts related to a business account.

Traditional MDM has been much about a single view of the business account and the legal entity behind. As social CRM is much about the relations to the business contacts, the people side of business, we need a solid master data foundation behind the people being those contacts.

The same individual may in fact be an important influencer related to a range of business accounts, each being a legal entity with whom you are aiming for a sales contract. You need a single view of that. So many sales contracts are based on a relation to a buyer moving from one business account to another. You need to be the winner in that game and the answer to that may very well be your ability to do better social MDM.

Sharing Product Data

Product master data are often reborn many times. This happens inside organizations and it happens in the ecosystem of business partners in supply chains encompassing manufacturers, distributors, retailers and end users.

Inside an organization it may happen that the purchase function enters a product description fit for the purpose of handling procurement. But this description may not fit the purpose of appreciating and describing the product in the selling process if it’s a resell product or service or it doesn’t fit the engineering process if it’s a spare part or a tool.

As a supplier of products and services through selling channels you have an interest in your descriptions, images and stories around products and services being consistent and timely.

Optimizing Product Information Management (PIM) for multiple purposes requires sharing product data both inside the organization and with your suppliers and customers. In doing that you need that side of social MDM we may call social PIM.


Rising Adoption of MDM in the Cloud

When, back in December 2011, I had a look into 2012 and what I was going to do, the topics were very well aligned with what Gartner (the analyst firm) had predicted for MDM, being:

What for me turned out to go faster than I thought was the thing about rising adoption of MDM in the Cloud.

I remember that back when CRM in the Cloud started to grow, not least driven by the success of Salesforce.com, many voices predicted a slow adoption, as most people couldn’t believe that companies would put one of their best secrets, the customer database, up in the cloud where everyone may be able to have a look.

Right now I’m working with implementing my first cloud MDM solution. This solution is based on the instant Data Quality service, which now consequently has an MDM edition. We didn’t expect to be this far already, but here we are.


Is the Holiday Season called Christmas Time or Yuletide?

In English we have these two different terms for the coming holiday season: Christmas Time or yuletide. Christmas Time has a religious touch while yuletide is Old English and resembles the term juletid still used in Scandinavia. Also notice that Christmas Time is two words (unless written as Christmastime) while yuletide is a compound word, as is common in Germanic languages. And oh, Christmas Time must be written with upper case first letters while yuletide doesn’t have to be (unless maybe in a blog post title). I still struggle a lot with English grammar.

The holiday season may be seen as a religious celebration or, which I think has become prevailing, a special occasion for business. Yuletide is high activity in Business-to-Consumer (B2C) both for brick and mortar shops and for eCommerce, while Christmas Time is almost a stand still for Business-to-Business (B2B) as no one is able to make any decisions because it is the holiday season.

By the way: The only thing I wish for xmas is that people start to standardize on the terms used for the same concept. Not least at Christmastide it is so disturbing when we don’t have any form of standardisation.


Star Bucks

Occasionally there are stories in the press about how multinational companies don’t pay taxes according to where they earn their money.

Lately there has been a row in the UK because Starbucks, despite being very successful, is officially losing money in the UK and therefore doesn’t pay taxes there. The Guardian’s latest entry on that is here.

The Guardian article quotes a call for more international co-operation.

I wonder if that will be done as we can’t even agree on such simple concepts as:

  • Having the same format for a date across the globe: Today is 13/12/2012 in most parts of the world but 12/13/2012 in the United States.
  • Using comma or period as decimal mark. I have said that 1,731 times in the UK and 1.731 times when I lived in Denmark.
  • Agreeing about whether a house number comes before or after the street name.

[Image: UPU S42]

And many, many more fundamental things about presenting data.
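The first two examples above can be sketched in Python. The point is that the same text means different things depending on a convention you have to know in advance. The helper names are made up for this illustration, and the sketch assumes only the two date layouts and two decimal marks mentioned above.

```python
from datetime import datetime


def parse_date(text, layout):
    """Parse a date written as day/month/year or month/day/year."""
    return datetime.strptime(text, layout).date()


def parse_number(text, decimal_mark):
    """Parse a number given which character is the decimal mark.

    The other character (comma or period) is treated as a thousands separator.
    """
    thousands_separator = "," if decimal_mark == "." else "."
    cleaned = text.replace(thousands_separator, "").replace(decimal_mark, ".")
    return float(cleaned)


# The same day written in two conventions.
rest_of_world = parse_date("13/12/2012", "%d/%m/%Y")
united_states = parse_date("12/13/2012", "%m/%d/%Y")

# The same text read in two conventions.
read_in_uk = parse_number("1,731", decimal_mark=".")       # 1731.0
read_in_denmark = parse_number("1,731", decimal_mark=",")  # 1.731
```

A text like 13/12/2012 at least fails loudly when read the wrong way round. A text like 03/04/2012 parses fine under both conventions and quietly gives two different dates.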


What Happened in 1013

At this time of year it is very popular to try to predict what will happen in the next year, being 2013, within your field of expertise.

However, predictions, not least about the future, may fail. And within data quality we don’t like flaws. So instead I will tell a little bit about what happened in year 1013 with respect to data quality.

As always Wikipedia is your friend when seeking knowledge. So I have picked a few of the highlights from the Wikipedia article about 1013:

Diversity

In 1013 the Viking warlord Sweyn Forkbeard replaced Æthelred the Unready as King of England. These were the happy days when the letter Æ was part of the English alphabet. Today Æ only exists in some of the Viking alphabets.

Definition

Kaifeng, capital of China, becomes the largest city of the world in 1013, taking the lead from Córdoba in Al-Andalus. However, this is an estimate. And even today, as reported by the BBC, we actually can’t tell which one is the largest city in the world.

Multiple versions of the truth

The anti-pope John XVI dies in 1013. An anti-pope is a person who, in opposition to the one who is generally seen as the legitimately elected Pope, makes a significantly accepted competing claim to be the Pope. Even today we can’t always establish a single version of the truth.
