Data Models and Real World Alignment

Data models are usually made to fit a specific purpose. As reported in the post A Place in Time, this often leads to data quality issues when the data is later used for purposes other than the one originally intended. Among many examples, not least, we have heaps of customer tables like this one:

Customer Table

Compared to how the real world works this example has some diversity flaws, like:

  • state code as a key to a state table will only work for one country (the United States)
  • zipcode is a United States term only, as opposed to the more generic “Postal Code”
  • fname (First name) and lname (Last name) don’t work in cultures where given name and surname come in the opposite sequence
  • The lengths of the state, zipcode and most other fields are obviously too small almost anywhere

More seriously we have:

  • fname and lname (First name and Last name), and probably also phone, should belong to a separate party entity acting as a contact related to the company
  • company name should belong to a separate party entity acting in the role of customer
  • address1, address2, city, state and zipcode should belong to a separate place entity, probably as the current visiting place related to the company

In my experience, looking at the real world helps a lot when making data models that can survive for years and withstand use cases different from the one immediately in question. I’m not talking about introducing scope creep, but just about thinking a little bit about what the real world looks like when you are modelling something in that world, which usually is the case when working with Master Data Management (MDM).


Data Quality in Different Languages

The term ”data quality” exists in many different languages.

As reported in the post Häagen-Dazs Datakvalitet, the Scandinavian word for data quality is datakvalitet. Well, actually there is no such language as Scandinavian, but datakvalitet is used in Danish, Swedish and Norwegian alike. Maybe even in both Norwegian languages, though Google Translate only knows of one Norwegian language.

In other Germanic languages the words for data quality are close to datakvalitet. In German: Datenqualität. In Dutch: Datakwaliteit.

The above terms are compound words. Even though English is also classified as a Germanic language, we see a Latin influence, as “data quality” is two words in English. And that goes for all English variants. It is only when we have to choose whether to standardise this or standardize that that we are in trouble. British English is best when we have to decide if data quality improvement is a program or a programme.

In true Latin languages we have three words. French: Qualité des données. Spanish: Calidad de datos.

And then there are of course terms in alphabets other than Latin and in other script systems:

data quality in different languages


Happy New Year and Merry Christmas

A week ago I had a quick vote here on the blog about when it will be Next Christmas.

The results are as seen to the right (or above on a mobile device). Most readers think it will be on 25th December 2013, written either in the straightforward date format as 25/12/2013 or in the awkward date format used in the United States, thus being 12/25/2013. Some people, probably from Scandinavia, think it’s today, the 24/12/2013. For people living in countries mostly observing the Eastern Orthodox Church, Christmas will be on the 7th January, 07/01/2014 in the straightforward date format used there, using the secular Gregorian calendar. This is because the Eastern Church still sticks to the old Julian calendar, which is 13 days behind the Gregorian calendar.
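The date format trouble can be shown in a few lines of Python. This is only a sketch using the standard library: the same string gives two different days depending on the convention assumed, while an ISO 8601 rendering is unambiguous. The 13-day offset used for the Julian calendar is the one that applies from March 1900 to February 2100.

```python
from datetime import date, datetime, timedelta

text = "07/01/2014"

# The same string read under two conventions gives two different days.
day_first = datetime.strptime(text, "%d/%m/%Y").date()    # 7 January 2014
month_first = datetime.strptime(text, "%m/%d/%Y").date()  # 1 July 2014

# An ISO 8601 rendering (year-month-day) leaves no room for doubt.
iso_text = day_first.isoformat()

# 25 December in the Julian calendar, expressed as a Gregorian date.
# The offset is 13 days between March 1900 and February 2100.
JULIAN_OFFSET = timedelta(days=13)
orthodox_christmas = date(2013, 12, 25) + JULIAN_OFFSET

print(day_first, month_first, iso_text, orthodox_christmas)
```

Note that a string like 25/12/2013 at least reveals its own format, since there is no month 25. A string like 07/01/2014 does not, so the format has to be known from the context.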

So, depending on what you celebrate and in which order:

  • Happy Holidays
  • Merry Christmas and Happy New Year
  • Happy New Year and Merry Christmas


What’s Different about MDM in France?

As told in the post about French MDM vendors yesterday, I have been at an MDM (Master Data Management) event in Paris today.

An interesting takeaway from the event’s presentations and the mingling is that there are some differences between how MDM is handled in France (and the rest of continental Europe as I know it) and how it is handled in the English speaking world. Some observations are:

People, process and technology

Many MDM gurus (and gurus in other disciplines) stress that you shouldn’t focus on technology (alone) but take people and process very seriously too. That’s not so important in France. Everyone knows that already.

Multi-Domain MDM

In France it’s common to start with product MDM and then continue with customer (party) MDM.

The Quadrant Magic

If you made a Gartner Magic Quadrant for MDM solutions in France you wouldn’t have a quadrant for customer data and another one for product data. There would be only one quadrant for (multi-domain) MDM and some of the local vendors would be leaders as discussed in the post MDM for Customer Data Quadrant: No challengers. No visionaries.


Hello Leading MDM Vendor

This morning I received messages from a leading MDM vendor about an upcoming webinar on the 12th September.


As today is the 3rd October, this is strange, and the vendor of course sent out a correction later today:


That’s OK. Shit happens. Even at the marketing departments of data quality and MDM vendors.

I am probably a kind of strange person, having lived in two countries lately, so I got both the original message and the correction sent to my Scandinavian identity from the vendor’s Scandinavian body:


As well as to my UK identity from the vendor’s UK body:


That’s OK. Getting a 360 degree view of migrating persons is difficult as discussed in the post 180 Degree Prospective Customer View isn’t Unusual.

Both (double) messages have a salutation.





Being Mr. Sorensen in the UK is OK. Using Mister and surname fits with an English stiff upper lip, and the letter ø could be o in the English alphabet.

I’m not sure if Dear Mr. Sørensen is OK in a Scandinavian context. Hello Henrik would be a better fit.
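A small sketch of what a culture-aware salutation could look like is shown below. Everything here is an assumption for illustration: the set of countries treated as informal, the function name and the tiny transliteration table. The table only covers ø, because Unicode normalisation does not reduce ø to o, so an explicit mapping is needed.

```python
# Hypothetical set of countries where a first-name greeting is assumed to fit.
INFORMAL_COUNTRIES = {"DK", "NO", "SE"}

# Minimal explicit mapping; ø has no Unicode decomposition into "o" plus a mark.
TRANSLITERATION = str.maketrans({"ø": "o", "Ø": "O"})


def salutation(given_name: str, surname: str, title: str, country_code: str) -> str:
    """Pick a greeting style from the country of the recipient's identity."""
    if country_code in INFORMAL_COUNTRIES:
        return f"Hello {given_name}"
    # Formal style, with the surname reduced to the English alphabet.
    return f"Dear {title} {surname.translate(TRANSLITERATION)}"


print(salutation("Henrik", "Sørensen", "Mr.", "GB"))
print(salutation("Henrik", "Sørensen", "Mr.", "DK"))
```

The point of the design is that the salutation follows the identity the message is sent to, not one global template.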


The Postal Address Hierarchy

Using postal addresses is a core element in many data quality improvement and master data management (MDM) activities.

As touched on many times on this blog, postal addresses are formatted very differently around the world. However, they may all be arranged in a sort of hierarchy with up to 6 general levels:

  • Country
  • Region
  • City or district
  • Thoroughfare (street) or block
  • Building number
  • Unit within building

In addition to that the postal code (postcode or zip code) is part of many address formats. Seen in the hierarchical light the postal code is a tricky concept as it may identify a city, district, thoroughfare, a single building or even a given unit within or section of a building. The latter is true for my company address in the United Kingdom, where we have a very granular postcode system.
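The hierarchy, with the postal code sitting beside it rather than at a fixed level, can be sketched as a simple data structure. This is an assumed shape for illustration only; the field names and the sample address are made up.

```python
from dataclasses import dataclass
from typing import Optional

# The six general levels, from the most general to the most specific.
LEVELS = (
    "country",
    "region",
    "city_or_district",
    "thoroughfare_or_block",
    "building_number",
    "unit",
)


@dataclass
class PostalAddress:
    country: str
    region: Optional[str] = None
    city_or_district: Optional[str] = None
    thoroughfare_or_block: Optional[str] = None
    building_number: Optional[str] = None  # a string, since "21 A" and "21-23" occur
    unit: Optional[str] = None
    postal_code: Optional[str] = None      # not tied to one level of the hierarchy

    def deepest_level(self) -> str:
        """Return the name of the most specific level that is filled in."""
        deepest = "country"
        for level in LEVELS:
            if getattr(self, level):
                deepest = level
        return deepest


# Made-up sample address without region and unit.
address = PostalAddress(
    country="DK",
    city_or_district="Copenhagen",
    thoroughfare_or_block="Example Street",
    building_number="21 A",
)
print(address.deepest_level())
```

Keeping every level below country optional reflects that address formats differ in which levels they actually use.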


Country

As discussed in the post The Country List even the top level of a postal address hierarchy isn’t a simple list fit for every purpose. Some issues are:

  • There are different sources with different perceptions of which are the countries on this planet
  • What we regard as countries comes in hierarchies
  • Several coding systems are available


Region

The region is an element in some address formats, like the states in the United States and the provinces in Canada. Other countries, like Germany, which is divided into quite independent Länder, do not have the region as a required part of the postal address. The same goes for Swiss cantons.

City or district

I once read that if you used the label city in a web form in Australia, you would get a lot of values like: “I do not live in a city”.

Anyway this level is often (but as mentioned certainly not always) where the postal code is applied. The postal code district may be a single town with surroundings, several villages or a district within a big city.

Thoroughfare (street) or block

Most countries use thoroughfares such as streets, roads, lanes, avenues, mews, boulevards or whatever they are called locally. Beware that the same street may have several spellings and even several names.

Japan is a counterexample of the use of thoroughfares, as here it’s the blocks between the thoroughfares that are part of the postal address.

Building number

Usually this element will be an integer. However formats with a letter behind the integer (example: 21 A) or a range of integers (example: 21-23) are most annoying. And then this British classic: One Main Grove. OMG.
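The three numeric formats mentioned above can be handled with one regular expression. The following is a minimal sketch with assumed names, not a complete parser: it accepts a plain integer, an integer with a single letter behind it, and a range of two integers, and it gives up on everything else, spelled-out numbers included.

```python
import re
from typing import NamedTuple, Optional


class BuildingNumber(NamedTuple):
    start: int
    end: Optional[int] = None     # set for ranges like "21-23"
    suffix: Optional[str] = None  # set for forms like "21 A"


# An integer, optionally followed by either "-<integer>" or a single letter.
_PATTERN = re.compile(r"^\s*(\d+)\s*(?:-\s*(\d+)|([A-Za-z]))?\s*$")


def parse_building_number(text: str) -> Optional[BuildingNumber]:
    """Parse a building number, or return None if the format is not recognised."""
    match = _PATTERN.match(text)
    if match is None:
        return None  # e.g. "One", which needs separate handling
    start, end, suffix = match.groups()
    return BuildingNumber(
        start=int(start),
        end=int(end) if end else None,
        suffix=suffix.upper() if suffix else None,
    )


for sample in ("21", "21 A", "21-23", "One"):
    print(sample, "->", parse_building_number(sample))
```

Returning None instead of guessing leaves the odd cases visible, so they can be reviewed rather than silently stored as wrong integers.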

Unit within a building

This element may or may not be present in a postal address. It depends on whether the building is a single family house or a single company site, whether the postal service sees it as such, and whether you actually want to indicate where within the building the delivery goes or you go. The ups and downs of this level are examined in the post A Universal Challenge.


Think global from day one

The title of this post is taken from a blog post by Hans Peter Bech. The post is called Entering a Foreign Market – The 9 Steps to Success for Software Companies.


In the post Hans Peter says:

“German software companies having access to 7% of world demand and US based companies with a domestic market representing 38% of world demand often ignore the global perspective until forced to face the challenge. That’s very fortunate for the smaller companies from the smaller countries!”

This observation from the software market in general certainly also applies to software for data quality improvement and master data management as examined in the post 255 Reasons for Data Quality Diversity.

If you are a software company in the data management space, thinking global may apply to various activities, such as:

  • How the product is designed in respect to handling data from all over the world. Here thinking global from day one is crucial.
  • How the product is marketed to a world-wide audience. Here the global approach could wait a bit.

On the latter matter I have teased one of the magic quadrant data quality tool vendors, Trillium Software, for having used a date format only used in the United States on their blog. Maybe it’s a small matter and it’s just me who is sensitive to this common glitch. Anyway, I’m pleased to congratulate Trillium Software on their new blog design with a date format that fits a world-wide audience. Check out the blog, which is a good one indeed, here.


Know Your Fan

A variant of the saying “Know Your Customer” for a football club will be “Know Your Fan” and indeed fans are customers when they buy tickets. If they can.

FC Copenhagen

FC Copenhagen cruised into stormy waters when they apparently cancelled all purchases for the upcoming Champions League (the paramount European soccer club tournament) clashes against Real Madrid, Juventus and Galatasaray if the purchaser didn’t have a Danish-sounding name. The reason was to prevent mixing fans of the different clubs, but surely this poorly thought-out screening method wasn’t well received among the FC Copenhagen fans not called Jensen, Nielsen or Sørensen.

The story is told in English here on Times of India.

Actually methods of verifying identities are available and cheap in Denmark so I’m surprised to see FC Copenhagen caught offside in this situation.


Know Your (Foreign Luxury Bag) Customer

A story featured a lot in the media in recent days is the incident where one of the richest women on the planet, Oprah Winfrey, was told that she couldn’t afford the handbag she wanted to look at in a Zürich shop. Was it racism or a misunderstanding because Oprah isn’t good at speaking German?

Either way it was for sure an example of bad things happening when you don’t know your customer. This story also highlights the issues we have with foreign customers, as Oprah may not be as famous in Zürich as in New York.

We have these challenges in customer master data management all over as described in the post Know Your Foreign Customer.

And oh: Maybe it’s time to start a sister blog called Liliendahl on Fashion. This is my second post on luxury handbags. The first post was called Data Quality Luxury.
