Most Times the Home Team Wins

This summer is going to be huge if you like sports. The Olympics are coming to London, and only 14 days from now the European football (soccer) championship starts in Poland and Ukraine.

As usual, hopes are high for the England soccer team. But the statistics don’t support the hopes. The England team hasn’t really succeeded since the World Cup victory on home ground at Wembley in 1966. That victory was mainly (and now I’m going to be shot in the streets of London) due to a ghost goal.

In business, including the data quality and MDM business, the home team usually wins too.

Yesterday I noticed a tweet saying that the MDM tool vendor Orchestra Networks has been selected as tool vendor by a large bank. The bank is Credit Agricole, a big financial service provider based in France. Orchestra Networks is also based in France. A home win, so to speak.

The post The Pond described how otherwise dominant American tool vendors may initially succeed in expanding to Europe by coming to London, yet have a hard time competing in continental Europe because of diversity issues.

European tool vendors going to North America often try to disguise themselves as a home team. Orchestra Networks, for example, uses Boston & Paris as its place of origin in its messaging. Other examples are the leading open source data management tool vendor Talend, with dual headquarters in Paris and California, the hot Danish MDM vendor Stibo Systems, messaging out of Atlanta, and the Swedish business intelligence success QlikTech, which has officially moved to Pennsylvania.

Häagen-Dazs Datakvalitet

There is a term called foreign branding. It describes the implied cachet or superiority of products and services with foreign-sounding names.

Häagen-Dazs ice cream is an example of foreign branding. Though the brand was established in New York, the name was supposed to sound Scandinavian.

However, Häagen-Dazs does sound and look somewhat strange to a Scandinavian. The reason is probably that the letter combinations “äa” and “zs” are not part of any native Scandinavian word.

By the way, datakvalitet is the Scandinavian compound word for data quality.

Getting datakvalitet right in worldwide data isn’t easy. What works in some countries doesn’t work in others, not least when we are talking about datakvalitet for party master data such as customer master data, supplier master data and employee master data.

One of the reasons why datakvalitet for party master data differs is that the possibilities for applying big reference data sources vary. For example, the availability of citizen data is different in New York than in Scandinavia. This affects the ways of reaching optimal datakvalitet, as reported in the post Did They Put a Man on the Moon.

As part of the ongoing globalization, handling international datakvalitet is becoming more and more common. Many enterprises try to deploy enterprise-wide datakvalitet initiatives, and shared service centers handle party master data that is unfamiliar to the people working there. This often results in coming across a strange word like Häagen-Dazs.

255 Reasons for Data Quality Diversity

255 is one source of truth about how many countries we have on this planet. Even with this modest list of reference data there are several sources of the truth. Another list may have 262 entries and a third list 240 entries.

Since I wrote a blog post some years ago called 55 reasons to improve data quality, I think 255 fits nicely in the title of this post.

The 55 reasons to improve data quality in the former post revolve around name and address uniqueness. In the quest for uniqueness, and for fulfilling other data quality dimensions such as completeness and timeliness, I have often advocated using deep (or big) reference data sources such as address directories, business directories and consumer/citizen directories.

Doing so in a best-of-breed way involves dealing with a huge number of reference data sources. Services claiming to have worldwide coverage often fall a bit short compared to local services using local reference sources.

For example, when I lived in Denmark, a tiny place in one corner of the world, I was often amazed at how address correction services from abroad only had (sometimes outdated) street level coverage, while local reference data sources provided building number and even suite level validation.

Another example was discussed in the post The Art in Data Matching, where the multilingual capabilities needed to do well in Belgium were stressed in the comments.

Every country has its own special requirements for getting name and address data quality right. The data quality dimensions for reference data are different, and governments have found 255 (or so) different solutions to balancing privacy and administrative effectiveness.

Right now I’m working on internationalization and internationalisation of a data and software service called instant Data Quality. This service makes big reference data from all over the world available in a single mashup. For that we need at least 255 partners.

Bat-and-ball Data Quality

Lately Jim Harris of the OCDQblog has written two excellent blog posts, or may I say home runs, discussing data quality with inspiration from baseball.

In the post Quality Starts and Data Quality, Jim explains that you may suffer a tough loss in business despite stellar data quality and have a cheap win in business despite horrible data quality, but in the long run, by starting off with good data quality, your organization has a better chance to succeed.

In the follow-up post, called Pitching Perfect Data Quality, Jim ponders that business success is achievable without perfect data quality, but data quality has a role to play.

Now, although baseball is a very popular sport in the United States but largely unknown in the rest of the world, I think we all understand the metaphors.

We also have different but similar sports around the world, with other rules, statistics and terms attached. The common name for these sports is bat-and-ball games.

In Britain, where I live now, cricket is huge and can be used to raise awareness of data issues. As recently as yesterday the Ordnance Survey, a government body that maintains registries of addresses, coordinates and maps, published a blog post called Anyone for cricket? British blogger Peter Thomas has also written, among others, a post on cricket and data quality called Wager.

Before coming to Britain I lived in Denmark, where we don’t know baseball and don’t know cricket, but sometimes at family picnics, perhaps after a Carlsberg and a snaps or two, play a similar game called rundbold, with kid- and grandpa-friendly rules and scoreboard, usually using a tennis ball.

Data quality, not least data quality in relation to party master data, which is the most prominent domain within the discipline, is also a same-same-but-different game around the world, as told in the post Partnerships for the Cloud.

Understanding the rules, statistics and terms of baseball, cricket, rundbold and all the other bat-and-ball games of the world is a daunting task, even though we all know how to hit a ball with a bat.

The Taxman: Data Quality’s Best Friend

Collection of taxes has always been a main driver for having registries and means of identifying people, companies and properties.

5,000 years ago the Egyptians made the first known census in order to effectively collect taxes.

As reported on the Data Value Talk blog, the Netherlands has had 200 years of family names thanks to Napoleon and the higher cause of collecting taxes.

Today the taxman goes cross-border and wants to help with international data quality, as examined in the post Know Your Foreign Customer. The US FATCA regulation is about collecting taxes from activities abroad and, as said on the Trillium blog: Data Quality is The Core Enabler for FATCA Compliance.

My guess is that this is only the beginning of a tax-based opportunity for having better data quality in relation to international data.

In a tax agenda for the European Union it is said: “As more citizens and companies today work and operate across the EU’s borders, cooperation on taxation has become increasingly important.”

The EU has a program called FISCALIS in the making. Soon we will have to identify not only Americans doing something abroad but practically everyone taking part in globalization.

For that we all need comprehensive accessibility to the wealth of global reference data through “cutting-edge IT systems” (a FISCALIS choice of wording).

I am working on that right now.

Happy Easter

If you are in a country with Western Christian roots, this weekend is Easter weekend. Countries with Eastern Christian roots have it next weekend.

Many countries (and states or provinces within them) have holidays around Easter. For many, Easter Monday is a day off. Some had Good Friday as a non-working day, and a few countries even had Maundy Thursday as a non-productive day for most people.

The passed over Maundy Thursday was the day of The Last Supper. The famous Last Supper painting by Leonardo da Vinci has in my eyes, as told in a post from last year, something in common with Data Quality Evangelism.

Happy Easter.

Credit Ratings Turned Upside Down

In a recent reform, the usual way of expressing credit ratings, with AAA as the top rating, AA+ as the next best and so on, has been changed.

If we look at sovereign credit ratings, that is, the ratings assigned to countries, the world picture looks somewhat different than before.

The new top rating is LMAO followed by LOL+ and so on. As of 1st April 2012 only three countries have the top rating. These countries are Zimbabwe, Greece and Wales.

The improved Zimbabwean rating is due to a simplification (Keep It Simple, Stupid) in the way of handling currencies. Now the Zimbabwean dollar equals the US dollar. Much easier, indeed.

Until now Greece has been a bit of a scapegoat for the Eurozone problems. With a new way of measuring things that has certainly changed. Already tomorrow German chancellor Merkel must go to Athens and present a plan telling how to pay back the balance.

Wales has until now been rated as part of the United Kingdom. But as a credit bureau spokesman says: “If you have a national soccer team and a national rugby team you should definitely also have your own sovereign credit rating”. As a main reason for the Welsh economic strength, most analysts point to the new Welsh shadow currency called Nidwyfynrhoicachufelltithamarianfijystyneudefnyddiowrthfiwedieu, or Nidwyfynrhoicachufelltithamarianbasta for short.

Well Met, Stranger

Finally wordpress.com, the hosted version of WordPress that I am using, has added geography to the stats.

The counter has been running for 14 days now, so I have taken a first look at the numbers.

First of all, I’m pleased that during these 14 days I have had visitors from 67 different countries around the globe:

Most visitors have been from the United States, followed by my current home country United Kingdom and then my former home country Denmark:

Note: This figure was made by copying the results into Excel.

If grouped by regions of the world, it looks like this:

The world has certainly become a small place. Of course your interactions are biased towards your neighborhood, but in blogging as well as in business our success will increasingly become dependent on meeting, understanding and interacting with (maybe not so) strange people of the world.

Your Point, My Comma

Spam mails can be great food for thought.

This morning I had this one in one of my many mailboxes:

So, the amount in question was:

It’s interesting to see how the spammer used points and commas in the large amount of money he wanted to trick me with. I don’t know whether he was sloppy or faced the problem of showing an amount to an unsegmented worldwide audience, which is split between people:

  • Using point as decimal mark and comma as thousand separator
  • Using comma as decimal mark and point as thousand separator

The choice of sign for the decimal mark and the thousand separator is indeed divided across the globe, as seen on this map:

The blue countries are using point as decimal mark and comma as thousand separator and the green countries are doing the opposite.

There may also be diversity within a country. In Canada there are always questions about Quebec, where they follow the French custom. India also has its own way of grouping digits by hundreds, alongside the English heritage.
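To make the point and comma problem concrete, here is a minimal Python sketch. It is only an illustration: the function names `format_amount` and `parse_amount` are made up for this example, and it covers just the two conventions listed above, not special cases such as the Indian grouping.

```python
from decimal import Decimal


def format_amount(value, decimal_mark=".", thousand_sep=","):
    """Format an amount with two decimals in the chosen convention."""
    # Python formats with point as decimal mark and comma as thousand separator.
    text = f"{value:,.2f}"
    # Swap both signs in one pass so they do not overwrite each other.
    return text.translate(str.maketrans({",": thousand_sep, ".": decimal_mark}))


def parse_amount(text, decimal_mark="."):
    """Parse an amount, given which sign the writer used as decimal mark."""
    thousand_sep = "," if decimal_mark == "." else "."
    cleaned = text.replace(thousand_sep, "").replace(decimal_mark, ".")
    return Decimal(cleaned)


amount = Decimal("1234567.89")
print(format_amount(amount))                                      # point-decimal style
print(format_amount(amount, decimal_mark=",", thousand_sep="."))  # comma-decimal style

# The same string means two different amounts depending on the reader.
print(parse_amount("1.234", decimal_mark="."))
print(parse_amount("1.234", decimal_mark=","))
```

The last two lines show the trap: without knowing where the writer comes from, "1.234" can be a bit more than one or more than a thousand.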

The pattern of approximately one half of the world using one standard and the other half using the opposite standard is also seen in other notations, such as arranging person names and writing street addresses, place names and postal codes, as told in the post Having the Right Element to the Left.

Broken Links

When passing the results of data cleansing activities back to source systems, I have often encountered what one might call broken links. They have called for designing data flows that don’t go by the book, don’t match the first picture of the real world and eventually prompt last-minute alternative ways of doing things.

I have had the same experience when passing some real (and not real) world bridges lately.

The Trembling Lady: An Unsound Bridge

When walking around in London, a sign on the Albert Bridge caught my eye. The sign instructs troops to break step when marching over.

In researching the Albert Bridge on Wikipedia I learned that the bridge has an unsound construction that makes it vibrate, not least when a bunch of troops march across in rhythm. The bridge has therefore got the nickname “The Trembling Lady”.

It’s an old sign. The bridge is an old bridge. But it’s still standing.

In the same way, we often have to deal with old systems running on unstable databases with unsound data models. That’s life. Though it’s not the way we want to see it, we must break the rhythm of otherwise perfectly cleansed data, as discussed in the post Storing a Single Version of the Truth.

The Øresund Bridge: The Sound Link

The sound between the city of Malmö in Sweden and København (Copenhagen) in Denmark can be crossed by the Øresund Bridge. If you look at a satellite picture, you may conclude that the bridge isn’t finished. That’s because a part of the link is in fact an undersea tunnel, as told in the post Geocoding from 100 Feet Under.

Your first image of what can be done and what can’t be done isn’t always the way of the world. Dig into some more sources, find some more charts and you may find a way.

However, life isn’t always easy. Sometimes charts and maps can be deceiving.

Wodna: The Sound of Silence

As reported in the post Troubled Bridge over Water I planned a cycling trip last summer. The route would take us across the Polish river Świna by a bridge I found on Google Maps.

When, after a hard day’s ride in the saddle, we reached the river, the bridge wasn’t there. We had to take a ferry across the river instead.

Maybe I should have known. The bridge on the map was named Wodna. That is Polish for (something with) water.
