What’s in an Address (and a Product)?

Our company Product Data Lake has relocated again. Our new address, in local language and format, is:

Havnegade 39
1058 København K

If our address were spelled and formatted as in England, where the business plan was drafted, the address would have looked like this:

The Old Seed Office
39 Harbour Street
Copenhagen, 1058 K

Across the pond, a sunny address could look like this:

39 Harbor Drive
Copenhagen, CR 1058
U.S. Virgin Islands
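The formatting part of the difference above can be sketched in a few lines of Python. This is only an illustration: the `ADDRESS_TEMPLATES` dictionary, the `format_address` function and the two simplified templates are assumptions made for this example, not real postal rules and not how Product Data Lake handles addresses. Translating the street name is a separate problem that the sketch does not attempt.

```python
# Hypothetical, simplified line templates per country code.
# Real postal formats have many more rules and exceptions.
ADDRESS_TEMPLATES = {
    "DK": "{street} {number}\n{postcode} {city}",
    "GB": "{number} {street}\n{city}, {postcode}",
}


def format_address(parts, country):
    """Render address components using the template for the given country."""
    return ADDRESS_TEMPLATES[country].format(**parts)


# The same location, with the components spelled as in the examples above.
danish = {"street": "Havnegade", "number": "39",
          "postcode": "1058", "city": "København K"}
english = {"street": "Harbour Street", "number": "39",
           "postcode": "1058 K", "city": "Copenhagen"}

print(format_address(danish, "DK"))
print(format_address(english, "GB"))
```

Keeping the components apart from the template is the point: the order of house number and street, and of postal code and city, is a presentation choice that depends on the country.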

Now, the focal point of Product Data Lake is not the exciting world of address data quality, but product data quality.

However, the local and global issues of language and standardization – or should I say standardisation – are the same.

Our lovely city Copenhagen has many names. København in Danish. Köpenhamn in Swedish. Kopenhagen in German. Copenhague in French.

So have all the nice products in the world. Their classifications and related taxonomy are in many languages too. Their features can be spelled in many languages or depend on the country where the product is to be sold. The documents that should follow a product by regulation are subject to diversity too.

Handling all this diversity stuff is a core capability for product data exchange between trading partners in Product Data Lake.

Cultured Freshwater Pearls of Wisdom

One of my current engagements is within jewelry – or is it jewellery? The use of these two respectively US English and British English words is a constant data quality issue, when we try to standardize – or is it standardise? – to a common set of reference data and a business glossary within an international organization – or is it organisation?

Looking for international standards often does not solve the case. For example, a shop that sells this kind of bijouterie may be classified with the SIC code “Jewelry store” or the NACE code “Retail sale of watches and jewellery in specialised stores”.

A pearl is a popular gemstone. Natural pearls, meaning those that have occurred spontaneously in the wild, are very rare. Instead, most pearls are farmed in fresh water and therefore, under the regulations used in many countries, must be referred to as cultured freshwater pearls.

My pearls of wisdom, or rather cultured freshwater pearls of wisdom, for building a business glossary and finding commonly accepted wording for the reference data to be used within your company are:

  • Start looking at international standards and pick what makes sense for your organization. If you can live with only that, you are lucky.
  • If not, grow the rest of the content for your business glossary and reference data by imitating the international or national standards for your industry, and use your own better wording and additions that make the most sense across your company.
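As a minimal sketch of what such a glossary could look like in practice, the Python below maps spelling variants to one preferred term. The `GLOSSARY` structure and the `to_preferred` function are hypothetical names for this example, and picking the British spellings as the preferred ones is an arbitrary assumption; your organization may well choose the other way.

```python
# Hypothetical business glossary: preferred term -> accepted variants.
# Which spelling is "preferred" is an arbitrary choice for this sketch.
GLOSSARY = {
    "jewellery": {"jewelry"},
    "standardise": {"standardize"},
    "organisation": {"organization"},
}


def build_lookup(glossary):
    """Index every preferred term and variant, case-insensitively."""
    lookup = {}
    for preferred, variants in glossary.items():
        lookup[preferred.casefold()] = preferred
        for variant in variants:
            lookup[variant.casefold()] = preferred
    return lookup


LOOKUP = build_lookup(GLOSSARY)


def to_preferred(term):
    """Return the preferred term, or the input unchanged if it is unknown."""
    return LOOKUP.get(term.strip().casefold(), term)


print(to_preferred("Jewelry"))       # a known variant
print(to_preferred("Organisation"))  # already the preferred spelling
print(to_preferred("pearl"))         # not in the glossary, returned as is
```

Unknown terms are passed through untouched on purpose, so the glossary can grow over time without blocking data that it does not yet cover.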

And oh, I know that pearls of wisdom are often used to imply the opposite of wisdom 🙂


Choosing the Best Term to Use in MDM

Right now I am working with an MDM (Master Data Management) service for sharing product data in the business ecosystems of manufacturers, distributors, retailers and end users of product information.

One of the challenges in putting such a service to the market is choosing the best term for the entities handled by the service.

Below is the current selection with the chosen term and some recognized alternate terms that are frequently used and found in the various standards that exist for exchanging product data:


Please comment if you think there are other English (or variant of English) terms that deserve to be in here.

Takeaways from MDM Summit Europe 2016

Yesterday I popped in at the combined Master Data Management Summit Europe 2016 and Data Governance Conference Europe 2016.

This event takes place Monday to Thursday, but unfortunately I only had time and money for the Tuesday this year. Therefore, my report will only be takeaways from Tuesday’s events. On a side note, the difficulties in doing something pan-European must have troubled the organisers of this London event: avoiding the UK May bank holidays ended in starting on a Monday when most of the rest of Europe had a day off for Pentecost Monday.


Tuesday morning’s highlight for me was Henry Peyret of Forrester shocking the audience in his Data Governance keynote by busting a myth: the good old excuse for doing nothing, namely that top-level management support is imperative, is not true.

Back in 2013 I wondered if graph databases would become common in MDM. Certainly graph databases have become the talk of the town, and it was good to learn from Andreas Weber how the Germany-based figurine manufacturer Schleich has made a home-grown PIM / Product MDM solution based on graph database technology.

Ivo-Paul Tummers of Jibes presented the MDM (and beyond) roadmap for the Dutch food company Sligro. I liked the path of embracing multi-channel, then omnichannel, with self-service at the end of the road, and how connect will overtake collect during this journey. This is exactly the reason for being of the Product Data Lake venture I am working on right now.


Multilingual? Mais oui! Natürlich.

Is that piece of data wrong or right? This may very well be a question of which language we are talking about.

In an earlier double post on this blog I had a small quiz about the name of the Pope in the Catholic church. The point was that all possible answers were right, as explained in the post When Bad Data Quality isn’t Bad Data. The thing is that around the world the Pope has local variants of the English name Francis: François in French, Franziskus in German, Francesco in Italian, Francisco in Spanish, Franciszek in Polish, Frans in Danish and Norwegian and so on.

In today’s globalized, or should I say globalised, world, it is important that our data can be represented in different languages and that the systems we use to handle the data are built for that. The user interface may be in a certain flavor/flavour of English only, but the data model must cater for storing and presenting data in multiple languages and even variants of languages, such as English in its many forms. Add to that the capability of handling characters other than Latin, in script systems other than alphabets, as examined in the post called Script Systems.
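One way such a data model could look is sketched below in Python. The `LocalizedText` class, its `get` method and the fallback order (exact tag, then bare language, then a default) are assumptions made for this illustration and not a description of any particular product. Using "en" as the default tag, and letting it hold the British spelling, is likewise an arbitrary choice.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class LocalizedText:
    """One piece of data stored per language tag, e.g. 'en', 'en-US', 'da'."""
    values: Dict[str, str] = field(default_factory=dict)

    def get(self, tag: str, default_tag: str = "en") -> Optional[str]:
        # Try the exact tag, then the bare language, then the default tag.
        for candidate in (tag, tag.split("-")[0], default_tag):
            if candidate in self.values:
                return self.values[candidate]
        return None


pope_name = LocalizedText({
    "en": "Francis", "fr": "François", "de": "Franziskus", "it": "Francesco",
    "es": "Francisco", "pl": "Franciszek", "da": "Frans",
})
spelling = LocalizedText({"en": "globalised", "en-US": "globalized"})

print(pope_name.get("pl"))     # exact match
print(pope_name.get("de-AT"))  # falls back from 'de-AT' to 'de'
print(spelling.get("en-US"))   # a variant of English with its own value
print(spelling.get("en-GB"))   # falls back from 'en-GB' to 'en'
```

Since Python strings are Unicode, the same structure holds values in non-Latin scripts without any change.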

This challenge is very close to me right now, when we are building a service for sharing product information in business ecosystems. So will the Product Data Lake be multilingual? Mais oui! Natürlich. Jo da.

PDL Example

PS: The Product Data Lake will actually help with collecting product information in multiple languages through the supply chains of product manufacturers, distributors, retailers and end users.


The Data Quality Market Just Passed 1 Billion USD

The Data Quality Landscape – Q1 2015 from Information Difference is out. A bit ironically, the report states that the data quality market for the calendar year 2014 was worth a fraction over $1 billion. As the $ sign could mean a lot of different currencies, like CAD, AUD or FJD, this statement is very ambiguous, but I guess Andy Hayler means USD.

While there still is a market for standalone data quality tools, an increasing part of data quality tooling is actually delivered within other tools, be it a Master Data Management (MDM) tool, a Data Governance tool, an Extract, Transform and Load (ETL) tool, a Customer Relationship Management (CRM) tool or another kind of tool or software suite.

This topic was recently touched upon on this blog in the post called Informatica without Data Quality? That post examined the reasons why the new owners of Informatica did not mention data quality as a future goodie in the Informatica toolbox.

In a follow up mail an Informatica officer explained: “As you know Data Quality has become an integral part of multidomain MDM and of the MDM fueled Product Catalog App. We still serve pure DQ (Data Quality) use cases, but we see a lot growth in DQ as part of MDM initiatives”.

You can read the full DQ Landscape 2015 here.
