2010 predictions

Today this blog has been live for ½ year, Christmas is just around the corner in countries with Christian cultural roots and a new year – even decade – is closing in according to the Gregorian calendar.

It’s time for my 2010 predictions.

Football

Over at the Informatica blog Chris Boorman and Joe McKendrick are discussing who’s going to win next years largest sport event: The football (soccer) World Cup. I don’t think England, USA, Germany (or my team Denmark) will make it. Brazil takes a co-favorite victory – and home team South Africa will go to the semi-finals.

Climate

Brazil and South Africa also had main roles in the recent Climate Summit in my hometown Copenhagen. Despite heavy executive buy-in a very weak deal with no operational Key Performance Indicators was reached here. Money was on the table – but assigned to reactive approaches.

Our hope for avoiding climate catastrophes is now related to national responsibility and technological improvements.

Data Quality

Reactive approach, lack of enterprise wide responsibility and reliance on technological improvements are also well known circumstances in the realm of data quality.

I think we have to deal with this also next year. We have to be better at working under these conditions. That means being able to perform reactive projects faster and better while also implementing prevention upstream. Aligning people, processes and technology is a key as ever in doing that. 

Some areas where we will see improvements will in my eyes be:

  • Exploiting rich external reference data
  • International capabilities
  • Service orientation
  • Small business support
  • Human like technology

The page Data Quality 2.0 has more content on these topics.

Merry Christmas and a Happy New Year.

Bookmark and Share

Phony Phones and Real Numbers

There are plenty of data quality issues related to phone numbers in party master data. Despite that a phone number should be far less fuzzy than names and addresses I have spend lots of time having fun with these calling digits.

Challenges includes:

  • Completeness – Missing values
  • Precision – Inclusion of country codes, area codes, extensions
  • Reliability – Real world alignment, pseudo numbers: 1234.., 555…
  • Timeliness – Outdated and converted numbers
  • Conformity – Formatting of numbers
  • Uniqueness – Handling shared numbers and multiple numbers per party entity

You may work with improving phone number quality with these approaches:

Profiling:

Here you establish some basic ideas about the quality of a current population of phone numbers. You may look at:

  • Count of filled values
  • Minimum and maximum lengths
  • Represented formats – best inspected per country if international data
  • Minimum and maximum values – highlighting invalid numbers

Validation:

National number plans can be used as a basis for next level check of reliability – both in batch cleansing of a current population and for an upstream prevention with new entries. Here numbers not conforming to valid lengths and ranges can be marked.

Also you may make some classification telling about if it is a fixed net number or cell number – but boundaries are not totally clear in many cases.

In many countries a fixed net number includes an area code telling about place.

Match and enrichment:

Names and addresses related to missing and invalid phone numbers may be matched with phone books and other directories having phone numbers and thereby enriching your data and improving completeness.

Reality check:

Then you of course may call the number and confirm whether you are reaching the right person (or organization). I have though never been involved in such an activity or been called by someone only asking if I am who I am.

Data Quality and Climate Change Management

A month ago I made a blog post titled “Data Quality and climate politics”. In this post I highlighted some similarities between data governance / data quality and climate politics mainly focussing on why sometimes nothing is done.

Today, 1 day before the United Nations climate change summit commence in my hometown Copenhagen, it seems that executive buy-in has come through. Over 100 heads of states and government will attend the conference among them key stake holders as Indian prime minister Singh and US president Obama.

The plan for how to manage climate change seems at this moment to have some ingredients with similarities to how to manage data quality change.  

The bill

Related to my previous post Eugene Desyatnik commented on LinkedIn:

In both cases, everyone in their heart agrees it’s a noble cause, and sees how they can benefit — but in both cases, everyone also hopes someone else will pay for most of it.

Progress in fighting climate change seems to be closely related to that the rich countries seems to be in agreement about paying a fair share.

With enterprise data quality you also can’t rely on that one business unit will pay for solving all enterprise wide data quality issues related to common data domains. 

Key Performance Indicators

Reductions in greenhouse gas emissions are key performance indicators and goals in fighting climate change – measuring temperatures is more like looking at the final outcome.

For data quality we also knows that the business outcome is related to information in context but in order to look at improving progress we have to measure (raw) data quality at the root.  

Using technology

This article from BBC “Tackling climate change with technologypoints at a wealth of different technologies that may help fighting global warming while we still get the power we need. There is pros and cons for each. Some technologies works in some geographies but not somewhere else. Some technologies are mature now and some will be in the future. There is no silver bullet but a range of different possibilities

Very similar to data quality technology.

Santa Quality

On the 3rd of December I feel inspired to relate some data quality issues to Mr. Santa Claus – or what is exactly the name. Is it:

  • Saint Nicholas or
  • Père Noël as they say in French or
  • Weihnachtsmann as they say in German or
  • Julemand as we say in Denmark or
  • Plenty of other local names?

Santa Claus versus Saint Nicholas is an example of the use of nicknames which is a main issue in name matching in many cultures.

It’s also important to observe that the German and Danish name is one word versus two words in English and French. Many company names and other names in respective languages shares the same linguistic characteristic.

Father Christmas is an alternative identification maybe more being a job title.

Another question is where he lives.

The North Pole is acknowledged as the correct geographical address in Anglo countries – but there seems to be alternative mailing possibilities as:

  • Santa Claus, North Pole, Canada, HOH OHO
  • Father Christmas, North Pole, SAN TA1 (UK)

However the Finish claims the valid address to be:

In my home country Denmark we will accept nothing but:

  • Julemanden, Box 1615, 3900 Nuuk, Greenland

Finally I could imagine which data quality issues the Santa business has to face:

  • Too many duplicates on the “nice list” leading to heavy overhead in gift spending as well as extra costs in reindeer management.
  • Inaccurate product masters resulting in complaints from nice boys and girls and a lot of scrap and rework.
  • Fraud entries from children already on the ‘naughty list’ may be a challenge.
  • A lot of missing chimney positions may cause severe delivery problems.

But then, why should Santa be smarter than everyone else?

Bookmark and Share