Counting Citizens

A main story on the BBC this morning is that the collection of UK migration figures is not fit for purpose, as reported on the BBC website here.

The problem is that the measurement of who goes in and out of the country was designed for other purposes, such as measuring tourism and fighting terrorism.

Some different solutions have been mentioned:

  • The “oh no” solution: More data collection
  • The shiny new solution: Big Data
  • The unwanted solution: Master Data Management

The “oh no” solution: More data collection

Imagine having to fill in endless forms with rigid checks when passing through airports and ferry ports, on top of the checks and security controls already in place. Oh no.

The shiny new solution: Big Data

A system of collecting data from passenger lists on ferries and airplanes called e-Borders is already being implemented and there are hopes that joining these new big data with the old system of record will improve accuracy. Oh, really.

The unwanted solution: Master Data Management

As said in an expert interview on TV, the only sustainable solution is a central citizen registry, a solution not unknown to immigrants like me coming from Scandinavia. However, as reported here, this solution is unwanted in the UK.


Names, Addresses and National Identification Numbers

When working with customer, or rather party, master data management and the related data quality improvement and prevention of data quality issues for traditional offline and some online purposes, you will most often deal with names, addresses and national identification numbers.

While this may be tough enough for domestic data, doing this for international data is a daunting task.


Names

In reality there should be no difference between dealing with domestic data and international data when it comes to names, as people in today’s globalized world move between countries and bring their names with them.

Traditionally the emphasis on data quality related to names has been on dealing with the most frequent issues, be that heaps of nicknames in the United States and other places, a “van” in lots of names in the Netherlands or loads of surname-like middle names in Denmark.
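To make the nickname and “van” issues a bit more concrete, here is a minimal sketch in Python. The nickname table and the particle list are tiny hypothetical samples made up for illustration, and the function names are my own; a real solution would need much larger, country-specific reference lists.

```python
# Illustrative sample only: a real nickname table would be far larger.
NICKNAMES = {"bill": "william", "bob": "robert", "liz": "elizabeth"}

# Illustrative sample of Dutch surname particles (assumption, not a complete list).
PARTICLES = {"van", "de", "der", "den"}


def canonical_given_name(given: str) -> str:
    """Map a nickname to its formal form when the sample table knows it."""
    key = given.strip().lower()
    return NICKNAMES.get(key, key)


def split_surname(surname: str) -> tuple[str, str]:
    """Split leading particles from the core surname.

    The last token is always kept as the core, so a surname that
    consists of a single particle-like word is not emptied.
    """
    tokens = surname.lower().split()
    i = 0
    while i < len(tokens) - 1 and tokens[i] in PARTICLES:
        i += 1
    return " ".join(tokens[:i]), " ".join(tokens[i:])


print(canonical_given_name("Bill"))
print(split_surname("van der Berg"))
```

Sorting or matching on the core surname rather than on the particle is one possible design choice; whether it is the right one depends on the country and the purpose of use.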

With company names there are some differences to be considered like the inclusion of legal forms in company names as told in the post Legal Forms from Hell.

Addresses

Address formats vary between countries. That’s one thing.

The availability of public sources for address reference data varies too. These variations are related to for example:

  • Coverage: Is every part of the country included?
  • Depth: Is it street level, house number level or unit level?
  • Costs: Are reference data expensive or free of charge?

As told in the post Postal Code Musings the postal code system in a given country may be the key (or not) to how to deal with addresses and related data quality.

National Identification Numbers

The post called Business Entity Identifiers includes how countries have different implementations of either all-purpose national identification numbers or single-purpose national identification numbers for companies.

The same way there are different administrative practices for individuals, for example:

  • As I understand it, the constitution down under forbids all-purpose identification numbers for individuals.
  • The United States Social Security Number (SSN) is often mentioned in articles about party data management. It’s an example of a single-purpose number that is in fact used for several purposes.
  • In Scandinavian countries all-purpose national identification numbers are in place as explained in the post Citizen ID within seconds.

Dealing with diversity

Managing party master data in the light of the above-mentioned differences around the world isn’t simple. You need comprehensive data governance policies and business rules, you need elaborate data models and you need a well-equipped toolbox for preventing data quality issues and exploiting external reference data.


Doing Census versus doing Master Data Management

“In those days Caesar Augustus issued a decree that a census should be taken of the entire Roman world. This was the first census that took place while Quirinius was governor of Syria. And everyone went to their own town to register.”

These are the famous words from the Gospel According to Luke that you, if you belong to the part of the world where Christianity is practiced, hear every Christmas.

Today scholars don’t think that there actually was a census for the whole Roman Empire, but there is evidence that a local census in Syria and Judea took place around year 1. This was in order to collect taxes in those provinces. As you know: The taxman is data quality’s best friend.

Today taking a census is still the most practiced method of knowing about the people living in a given country. The alternative is a public registry that is constantly updated with all the information needed about you. I had the chance to describe such a method in a post on a Canadian blog some years ago. The post is called How Denmark does it.

India has a similar scheme with a centralized citizen registry on the go. This program is called Aadhaar.

As reported in the post Citizen ID and Biometrics, the United Kingdom was close to adopting citizen Master Data Management some years ago. But it didn’t happen, so it’s still possible to have multiple names and multiple addresses at the same time in different registries while Cameron is Prime Minister of the United Kingdom, First Lord of the Treasury and Minister for the Civil Service.

Merry Christmas.



Business Entity Identifiers

The least cumbersome way of uniquely identifying a business partner being a company, government body or other form of organization is to use an externally provided number.

However, there are quite a lot of different numbers to choose from.

All-Purpose National Identification Numbers

In some countries, like those in Scandinavia, the public sector assigns a unique number to every company. The number is used in every relation with the public sector and is open for the private sector to use for identification purposes as well.

As reported in the post Single Company View I worked with the early implementation of such a number in Denmark way back in time.

Single-Purpose National Identification Numbers

In most countries there are multiple systems of numbers for companies each with an original special purpose. Examples are registration numbers, VAT numbers and employer identification numbers.

My current UK company has both a registration number and a VAT number and, very embarrassingly for a data quality and master data geek, these two numbers have different names and addresses attached.

Other Numbering Systems

The best-known business entity numbering system around the world is probably the DUNS number used by Dun & Bradstreet. As examined in the post Select Company_ID from External_Source Where Possible, the use of DUNS numbers and similar business directory IDs is a very common way of uniquely identifying business partners.

In the manufacturing and retail world legal entities may, as part of the Global Data Synchronization Network, be identified with a Global Location Number (GLN).

There has been a lot of talk in the financial sector lately around implementing yet another numbering system for legal entities, with an identifier usually abbreviated as LEI. Wikipedia has the details about a Legal Entity Identification for Financial Contracts.

These are only some of the most used numbering systems for business entities.

So, the trend doesn’t seem to be a single source of truth but multiple sources making up some kind of the truth.
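As a rough illustration of living with multiple sources, a business partner record can hold one identifier per numbering scheme and be compared with another record scheme by scheme. This is only a sketch: the class, the function names and all identifier values below are hypothetical placeholders, not real numbers or anyone's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class BusinessParty:
    name: str
    # Maps a scheme label (e.g. "DUNS", "VAT", "GLN", "LEI") to the value held.
    identifiers: dict[str, str] = field(default_factory=dict)


def shared_identifiers(a: BusinessParty, b: BusinessParty) -> set[str]:
    """Schemes where both records hold the same value."""
    return {s for s, v in a.identifiers.items() if b.identifiers.get(s) == v}


def conflicting_identifiers(a: BusinessParty, b: BusinessParty) -> set[str]:
    """Schemes where both records hold a value, but the values differ."""
    return {
        s
        for s, v in a.identifiers.items()
        if s in b.identifiers and b.identifiers[s] != v
    }


# Placeholder values for illustration only.
crm = BusinessParty("Example Ltd", {"VAT": "VAT-0001", "DUNS": "DUNS-0001"})
erp = BusinessParty("Example Limited", {"VAT": "VAT-0001", "REG": "REG-0001"})

print(shared_identifiers(crm, erp))
print(conflicting_identifiers(crm, erp))
```

A shared identifier with no conflicts is a strong hint that two records describe the same legal entity, even when the attached names and addresses differ, as they do for my own company.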


Finding Me

Many people have many names and addresses. So have I.

A search for me within Danish reference sources in the iDQ tool gives the following result:

Green T is positive in the Danish Telephone Books. Red C is negative in the Danish Citizen hub. Green C is positive in the Danish Citizen Hub.

Even though I have left Denmark I’m still registered with some phone subscriptions there. And my phone company hasn’t fully achieved single customer view yet, as I’m registered there with two slightly different middle (sur)names.

Following me to the United Kingdom, I’m registered here under yet more names.

It’s not that I’m attempting some kind of fraud, but as my surname contains The Letter Ø, and that letter isn’t part of the English alphabet, my National Insurance Number (kind of similar to the Social Security Number in the US) is registered by the name “Henrik Liliendahl Sorensen”.

But as the United Kingdom doesn’t have a single citizen view, I am separately registered at the National Health Service with the name “Henrik Sorensen”. This is due to a sloppy realtor, who omitted my middle (sur)name on a flat rental contract. That name was carried over by British Gas onto my electricity bill. That document is (surprisingly for me) my most important identity paper in the UK, and it was used as proof of address when registering for health service.
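The two issues in play here, a letter outside the English alphabet and a dropped middle (sur)name, can be sketched as a simple candidate match in Python. This is a minimal sketch under stated assumptions: the letter mapping is a small sample I chose, the spelling with Ø in the example is assumed from the text above, and the matching rule (same first and last token, middle names of one being a subset of the other) is just one possible rule, not how any particular tool works.

```python
import unicodedata

# Letters that Unicode decomposition does not reduce to ASCII (assumed sample mapping).
SPECIAL = {"ø": "o", "Ø": "O", "æ": "ae", "Æ": "AE", "ß": "ss"}


def to_ascii(name: str) -> str:
    """Replace special letters, then strip combining accents."""
    replaced = "".join(SPECIAL.get(ch, ch) for ch in name)
    decomposed = unicodedata.normalize("NFKD", replaced)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def name_tokens(name: str) -> list[str]:
    return to_ascii(name).lower().split()


def same_person_candidate(a: str, b: str) -> bool:
    """True when first and last tokens agree and the middle names of one
    record are a subset of the middle names of the other."""
    ta, tb = name_tokens(a), name_tokens(b)
    if not ta or not tb:
        return False
    if ta[0] != tb[0] or ta[-1] != tb[-1]:
        return False
    middle_a, middle_b = set(ta[1:-1]), set(tb[1:-1])
    return middle_a <= middle_b or middle_b <= middle_a


print(same_person_candidate("Henrik Liliendahl Sørensen", "Henrik Sorensen"))
```

A positive result only marks the pair as a candidate. With common names, supporting data such as an address or a date of birth would be needed before merging anything.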

How about you, do you also have several identities?


Costs of a Single Citizen View

Recently Andrew Dean made a blog post called National Identity Numbers. The post generated some comments in the Data Matching group on LinkedIn.

Andrew’s post is based on the ongoing project in India called Aadhaar, where every citizen is assigned a unique identification number to be used for multiple purposes when interacting with the government and financial institutions.

As Andrew mentions, the United Kingdom cancelled such a project a few years ago. This cancellation was, in some part, due to fear of excessive costs. The question posed by Andrew, and by comments in the LinkedIn group, is whether the benefits of getting a “single citizen view” will justify the (feared) costs.

Indeed, large governmental projects have a bad name these days all over the world, as far as I know.

Back in the late 60’s the United States was able to put a man on the moon.

It was at the same time that the Scandinavian countries implemented their “single citizen view”.

Besides digitalizing the national identification number, Sweden also, in 1967, managed to change from driving on the left side of the road to driving on the right side. I’m not sure if Sweden could afford turning to the right side today, let alone the United Kingdom doing the same.


Real World Identity

How far do you have to go when checking your customer’s identity?

This morning I read an article in the Danish Computerworld about a ferry line that is now dropping a solution for checking, by means of a lightweight fingerprint stored on an access card, whether the passenger using the card is in fact the paying customer. The reason for dropping it was, by the way, the cost of upgrading the solution compared to future business value and not any renewed privacy concerns.

I have been involved in some balancing of real world alignment versus fitness for use and privacy in public transport as well, as described in the post Real World Alignment. Here the question was about using a national identification number when registering customers in public transportation.

As citizens of the world we are today used to sometimes having our iris scanned when flying as our passport holds our unique identification that way. Some of the considerations around using biometrics in general public registration were discussed in the post Citizen ID and Biometrics.

In my eyes, or should we say iris, there is no doubt that we will meet an increasing demand for confirming and registering our identity wherever we go. Doing that in the fight against terrorism has been around for a long time. Regulatory compliance will add to that trend as told in the post Know Your Foreign Customer, mentioning the consequences of the FATCA regulation and other regulations.

When talking about identity resolution in the data quality realm we usually deal with strings of text such as names, addresses, phone numbers and national identification numbers. These are things that reflect the real world, but aren’t the real world.

We will however probably adopt more facial recognition as examined in the post The New Face of Data Matching. We do have access to pictures in the cloud, as you may find your B2C customer’s picture on Facebook and your B2B customer contact’s picture on LinkedIn or other similar services. It’s still not the real world itself, but a bit closer than a text string. And of course the picture could be false or outdated and thus more suitable for traction on a dating site.

Fingerprints are maybe a bit old-fashioned, but as said, more and more biometric passports are being issued, and the technology for iris and retinal scanning is in use for access control, even on mobile devices.

In the story starting this post the business value for reinvesting in a biometric solution wasn’t deemed positive. But looking from the print on my fingers down to my hand lines, I foresee some more identity resolution going beyond name and address strings into things closer to the real world, such as facial recognition and biometrics.


Inaccurately Accurate

The public administrative practice for keeping track of the citizens within a country differs greatly between my former country of residence, Denmark, and my current one, the United Kingdom.

In Denmark there is an all-purpose citizen registry where you are registered “once and for all” seconds after you are born as told in the post Citizen ID within Seconds.

In the United Kingdom there are separate registries for different purposes. For example there is a registry dealing with your health care master data and there is a registry, called the electoral roll, dealing with your master data as a voter.

Today I was reading a recent report about data quality within the British electoral roll. The report is called Great Britain’s electoral registers 2011.

The report revolves around the two data quality dimensions: Accuracy and completeness.

In doing so, these two bespoke definitions are used:

There is a note about accuracy saying:


This is a very interesting precision, so to speak. Having fitness for the purpose of use is indeed the most common approach to data quality.

This does of course create issues when such data are used for other purposes. For example credit risk agencies here in the UK use appearance on the electoral roll as a parameter for their assessment of credit risk related to individuals.

Surely, often there isn’t a single source of the truth as pondered in the post The Big ABC of Reference Data.

However, this mustn’t make us stop in the search for getting high quality data. We just have to realize that we may look in different places in order to mash up a best picture of the real world as explained in the post Reference Data at Work in the Cloud.  


Some Voter Musings

Tomorrow there is a general election in my home country Denmark.

Voter registration

There are different systems of voter registration around the world.

In some countries there are electoral rolls, being data silos of citizen master data that are more or less integrated with other citizen master data silos for other purposes such as driving license administration, social security and taxation.

In Denmark we have an all-purpose single master data hub for citizens. When we have to vote, the ballots are extracted from the hub based on your age (from 18 on election day) and citizen status (excluding citizens of other countries living or working here).
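The extraction rule described above can be sketched as a small filter in Python. This is a minimal sketch and not the actual implementation of the Danish hub: the `Citizen` record, its field names and the sample dates are assumptions made up for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Citizen:
    citizen_id: str          # hypothetical identifier, not a real national ID
    birth_date: date
    is_danish_citizen: bool


def age_on(birth_date: date, day: date) -> int:
    """Completed years of age on the given day."""
    years = day.year - birth_date.year
    if (day.month, day.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def eligible_voters(hub: list[Citizen], election_day: date, voting_age: int = 18) -> list[Citizen]:
    """Select citizens who have reached voting age on election day."""
    return [
        c
        for c in hub
        if c.is_danish_citizen and age_on(c.birth_date, election_day) >= voting_age
    ]


# Made-up sample records and a made-up election day.
election_day = date(2011, 9, 15)
hub = [
    Citizen("A", date(1993, 9, 15), True),   # turns 18 on election day
    Citizen("B", date(1993, 9, 16), True),   # one day too young
    Citizen("C", date(1970, 1, 1), False),   # foreign citizen living here
]

print([c.citizen_id for c in eligible_voters(hub, election_day)])
```

The point of the single hub is that this selection is one query against one source, rather than a reconciliation of several registries.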

The political scope

The voter’s role is to select members for the parliament. Then the parliament will select a prime minister.

One of the two most likely candidates for next prime minister is the current one with the nickname “Little Lars”, who came to power when the former one became Secretary General of NATO and moved to the HQ in Brussels. Lars is head of the political party called Left (Venstre), which is a right-wing party. He is going to defend the welfare state, including universal healthcare and free college.

His main opponent has the nickname “Gucci Helle”.  She is leading the left block. She is going to defend the welfare state, including universal healthcare and free college.

Head of state

As voters we are not trusted to select the head of state. The queen was born to be queen, and her eldest son will be the next king. On the other hand, the members of the Royal Family are not allowed to vote in the election.  This is the exception that confirms the rule.


The trees never grow into heaven

This morning most of digital Denmark was closed. You couldn’t do anything at the online bank, you couldn’t do much at public sector websites and you couldn’t read electronic mail from your employer, pension institution and others.

It wasn’t because someone cut a big cable or a computer virus got a lucky strike. The problem was that the centralized internet login service had a three-hour outage. It was a classic single point of failure incident.

In Denmark we have a single sign-on identity solution used by public sector, financial services and other organizations. The service is called NemID (Easy ID) and is based on an all-purpose unique national ID for every citizen.

As more and more interaction with the public sector and financial services, along with online shopping, is taking place in the cloud, we are of course more and more vulnerable to this kind of problem.

The benefits of having a single source of truth about who you are became a single point of failure here.

Well, we have this local saying: “The trees never grow into heaven”. All good things have their limit. Even in instant Identity Resolution.
