The world – Liliendahl on Data Quality

Annus Horribilis 2020, Annus Mirabilis 2021?

21st December 2020 | Henrik Gabs Liliendahl | 4 Comments

At this time of the year, it is customary to make a forecast about what will happen next year, usually within a specific area such as data management.

After 2020, one would think that any qualified guess about next year comes with a huge amount of uncertainty.

Well, let us have a go anyway.

The horrible year of the outbreak of the pandemic has also affected the data management scene. One often-mentioned theme is accelerated digitalization, which, all the bad things about the pandemic aside and seen in isolation (so to speak), is a positive development.

Digitalization also pushes globalization. Now you do not have to work with data management partners who are within a 5-mile reach; 5,000 kilometres will be the same.

In fact, the outlook for the data management industry is not bad at all. Digital transformation initiatives will require investments in data management consultancy, data management services and data management technology. The competition will intensify with many partners available at a global range. This will be an opportunity for smaller consultancies with broad visions, nimble service providers with scalable offerings and forward-looking tool vendors with doable solutions.

The chances of gaining market share in a developing market are good for those of you who sell data management stuff.

The chances of getting the best help are good for those of you who buy data management stuff.

A Merry Christmas to you who celebrate this and a Happy Calendar New Year to all of you.

Mapping MDM and PIM Solutions

14th July 2019 | Henrik Gabs Liliendahl

There are several parameters considered by organizations on the lookout for solutions that handle Master Data Management (MDM), Product Information Management (PIM) or both. One is how MDM’ish or PIM’ish the solution is, as examined in the post MDM, PIM or Both.

Another aspect is the geographical presence. This includes where the solution provider is based and, of course, its presence around the world through local offices and partner networks.

Here are some of the solution providers from North America and Europe on a map:

MDM World Map

Reltio is a Silicon Valley based MDM provider. Learn more about Reltio Cloud here.

Semarchy has moved its headquarters to Silicon Valley but has its origin and most of its operations still in Lyon, France. Learn more about Semarchy xDM here.

Riversand is coming out of Houston, Texas. Learn more about Riversand here.

EnterWorks is based in Sterling, Virginia. Learn about EnterWorks here.

CONTENTSERV is headquartered in Baar, Switzerland. Learn more about CONTENTSERV here.

SyncForce is located in Eindhoven, Netherlands. Learn about SyncForce here.

Dynamicweb PIM is from Aarhus, Denmark. Learn more about Dynamicweb PIM here.

Informatica is another Silicon Valley firm. Informatica has bought firms from around the world, most recently the Toronto, Canada based AllSight, now branded as Informatica Customer 360 Insights. Learn more about Informatica Customer 360 Insights here.

Magnitude and Agility® are now married. They are respectively located in Austin, Texas and York, UK. Learn more about Magnitude MDM here and learn more about Agility here.

Where is your (preferred) MDM / PIM solution located? – and what is the world reach?

Data Quality and the Climate Issue

20th June 2019 | Henrik Gabs Liliendahl

The similarities between raising awareness of data quality issues and of the climate issue were touched on 10 years ago here on this blog in the post Data Quality and Climate Politics.

The challenges are still the same.

There are many examples published where the results of climate change are pictured. A recent one is the image from Greenland showing huskies pulling sleds not over the usual ice, but through water.

(Image taken by Steffen Malskær Olsen, @SteffenMalskaer, here published on CNN)

We also see statistics showing a development towards melting ice masses with rising sea levels as the foreseeable result. However, statistics can always be questioned. Is the ice thickening somewhere else? Has this happened many times before?

These kinds of questions show the layers we must go through in getting from data quality to information quality, then decision quality and, on top, the wisdom in applying the right knowledge, whether that is to achieve business outcomes or to avoid climate change.

DIKW data quality

A Master Data Mind Map

3rd May 2019, updated 4th May 2019 | Henrik Gabs Liliendahl

Please find below a mind map with some of the data elements that are considered to be master data.

The map is in no way exhaustive, and if you feel some more very important and common data elements should be there, please comment.

The data elements are grouped within the most common master data domains: party master data, product master data and location master data.

Some of the data elements have previously been examined in posts on this blog. These include:

Product classifications such as GPC, ETIM, eClass, Harmonized System (HS) and UNSPSC in the post Five Product Classification Standards.
Product identification codes such as GTIN, EAN and UPC in the post Visiting the Product Information Castle.
Duns-Number, SIREN and registration numbers in the post Single Company View.
SIC and NACE codes in the post The World of Reference Data.
ZIP, PLZ and PIN in the post Some Kinds of Reference Data.

The mind map has a selection of flags around where master data are geographically dependent. Again, this is not exhaustive. If you have examples of diversities within master data, please also comment.

Looking at The Data Quality Tool World with Different Metrics

28th April 2019 | Henrik Gabs Liliendahl

The latest market report on data quality tools from Information Difference is out. In the introduction to the data quality landscape Q1 2019, this example of the consequences of a data quality issue is mentioned: “Christopher Columbus accidentally landed in America when he based his route on calculations using the shorter 4,856 foot Roman mile rather than the 7,091 foot Arabic mile of the Persian geographer that he was relying on.”

Information Difference has the vendors on the market plotted this way:

Information Difference DQ Landscape Q1 2019

As reported in the post Data Quality Tools are Vital for Digital Transformation, Gartner also recently published a market report with vendor positions. The two reports are, in terms of evaluating vendors, like Roman and Arabic miles. Same same but different, and they may bring you to a different place depending on which one you choose to use.

Vendors evaluated by Information Difference but not by Gartner are the veteran solution providers Melissa and Datactics. On the other hand, Gartner has evaluated, for example, Talend, Information Builders and Ataccama. Gartner has a more spread-out evaluation than Information Difference, where most vendors are equal.

PS: If you need any help in your journey across the data quality world, here are some Popular Offerings.

Diversities in Civil Registration

1st May 2018 | Henrik Gabs Liliendahl

Citizen Registry

The ways governments around the world have organized their Master Data Management (MDM) are quite different. When it comes to registering citizens, the practice varies a lot, as described in the post Citizen Master Data Management.

I have lived most of my years in Denmark, where our national ID is unique and used for everything by public agencies and also a lot by private companies. Some years ago I lived in the United Kingdom, where the public agencies (and my bank) had no clue about who I was, when I came, what I did and when I left.

Recently the World Economic Forum has circulated some videos on LinkedIn telling about how stuff is done differently around the world. The video below is about the Danish civil registry (which by the way is similar in other Scandinavian countries):

What do you think? Would this public MDM and data quality practice work in the USA, the UK, Germany or wherever else you live?

Where a Major Tool is Not So Cool

1st February 2018, updated 18th April 2019 | Henrik Gabs Liliendahl

During my engagements in selecting and working with the major data management tools on the market, I have from time to time experienced that they often lack support for specialized data management needs in minor markets.

Two such areas I have been involved with as a Denmark based consultant are:

Address verification
Data masking

Address verification:

The authorities in Denmark offer free-of-charge access to very up-to-date, granular and accurate address data that, besides the envelope form of an address, also comes with a data management friendly key (usually referred to as KVHX) on the unit level for each residential and business address within the country. Besides the existence of the address, you also have access to information about what activity takes place at the address, for example whether it is a single-family house, a nursing home or a campus, along with other useful information for verification, matching and other data management activities.

If you want to verify addresses with the major international data management tools I have come across, many of these goodies are gone, as for example:

Address reference data are refreshed only once per quarter
The key and the access to more information are not available
A price tag for data has been introduced

Data Masking:

In Denmark (and other Scandinavian countries) we have a national identification number (known as personnummer) that is used much more intensively than the national IDs known from most other countries, as told in the post Citizen ID within seconds.

The data masking capabilities in major data management solutions come with pre-built functions for national IDs, but only covering major markets, such as the United States Social Security Number, the United Kingdom NINO and the kind of national ID in use in a few other large western countries.

So, GDPR compliance is just a little bit harder here even when using a major tool.
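
As an illustration of the kind of custom function you may end up writing when a pre-built one is missing, here is a minimal Python sketch that masks Danish personal identification numbers in free text. It assumes the common written form of six digits (the birth date as DDMMYY), an optional hyphen and four more digits. The function name `mask_danish_ids` and the masking character are hypothetical, and this is not taken from any vendor tool. The match is purely pattern based: it does not validate the date part, so any other 10-digit number would be masked too.

```python
import re

# Assumed written form: DDMMYY, optional hyphen, four-digit serial.
# Pattern based only: the date part is NOT validated.
_ID_PATTERN = re.compile(r"\b(\d{6})(-?)(\d{4})\b")


def mask_danish_ids(text: str, mask_char: str = "X") -> str:
    """Replace every digit of a matched ID with mask_char, keeping the hyphen."""

    def _mask(match: re.Match) -> str:
        # group(2) is the hyphen if present, otherwise an empty string
        return mask_char * 6 + match.group(2) + mask_char * 4

    return _ID_PATTERN.sub(_mask, text)


if __name__ == "__main__":
    print(mask_danish_ids("Customer 010190-1234 called today."))
```

A production masking rule would typically also need format-preserving or consistent masking, so that the same ID maps to the same masked value across tables. That is outside the scope of this sketch.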

Data Masking National ID (from IBM Data Masking documentation)

Country of Origin: An Increasingly Complex Data Element

19th June 2017 | Henrik Gabs Liliendahl

When you buy stuff, one of the characteristics you may put emphasis on is where the stuff is made: the country of origin.

Buying domestic goods has always been both a political issue and something that in people’s minds may be an extra sign of quality. When I lived in the UK, I noticed that meat was promoted as British (maybe except for Danish bacon). Now that I am back in Denmark, all meat seems to be best when made in Denmark (maybe except for Argentinian beef). However, regulations have already affected the made-in marking for meat, so you have to state several countries of origin across the product lifecycle.

For some goods, a given country of origin seems to be a sign of quality. With luxury goods such as fine shoes, you can still get away with stating Italy or France as the country of origin while most of the work has been done elsewhere, as told in The Guardian article Revealed: the Romanian site where Louis Vuitton makes its Italian shoes.

Country of origin is a product data element that you need to handle for regulatory reasons, not least when moving goods across borders. Here it is connected with commodity codes telling what kind of product it is in the customs way of classifying products, as examined in the post Five Product Classification Standards.

When working with product data management for products that move across borders, you are increasingly asked to be more specific about the country of origin. For example, if you have a product consisting of several parts, you must specify the country of origin for each part.
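
To make the per-part requirement concrete, here is a minimal Python sketch of how such a product record could be modelled. All names (`Part`, `Product`, `countries_of_origin`) and the sample data are hypothetical, and the use of two-letter ISO 3166-1 country codes is an assumption, not something prescribed by a particular regulation or tool.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Part:
    name: str
    country_of_origin: str  # assumed to be an ISO 3166-1 alpha-2 code


@dataclass
class Product:
    name: str
    parts: list[Part] = field(default_factory=list)

    def countries_of_origin(self) -> list[str]:
        """Return the distinct countries of origin across all parts, sorted."""
        return sorted({part.country_of_origin for part in self.parts})


# Hypothetical sample data for illustration only
shoe = Product(
    name="Leather shoe",
    parts=[Part("Upper", "RO"), Part("Sole", "IT"), Part("Laces", "RO")],
)

if __name__ == "__main__":
    print(shoe.countries_of_origin())
```

The point of the design is that country of origin is no longer a single attribute on the product, but an attribute on each part, from which a product-level view can be derived.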

The Problem with English

14th June 2017 | Henrik Gabs Liliendahl | 4 Comments

– and many other languages

This blog is in English. However, as a citizen of a country where English is not the first language, I have a problem with English. Which flavour or flavor of English should I use? US English? British English? Or any of the many other kinds of English?

It is, in that context, more a theoretical question than a practical one. Despite what Grammar Nazis might think, I guess everyone understands the meaning in my blend of English variants and occasional other spelling mistakes.

The variants of English, spiced up with other cultural and administrative differences, do, however, create real data quality issues, as told in the post Cultured Freshwater Pearls of Wisdom.

When working with Product Data Lake, a service for sharing product information between trading partners, we also need to embrace languages. In doing that, we cannot just pick English. We must make it possible to pick any combination of English and a country where English is (one of) the official language(s). The same goes for Spanish, German, French, Portuguese, Russian and many other languages, to the extent that products can be named and described with different spelling (in a given alphabet or script type).
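
A minimal Python sketch of the idea of combining language and country could look like this. It is an illustration under stated assumptions, not how Product Data Lake is implemented: the `language-COUNTRY` tags (such as `en-GB`), the function name `pick_description`, the fallback rule and the sample texts are all hypothetical.

```python
from typing import Optional


def pick_description(descriptions: dict[str, str], wanted: str) -> Optional[str]:
    """Return the text for the wanted language-country tag.

    Falls back to another country variant of the same language
    (first in alphabetical tag order), or None if the language is missing.
    """
    if wanted in descriptions:
        return descriptions[wanted]
    language = wanted.split("-")[0]
    for tag in sorted(descriptions):
        if tag.split("-")[0] == language:
            return descriptions[tag]
    return None


# Hypothetical sample data: the same product described in two English variants
descriptions = {
    "en-GB": "Aluminium tray, grey colour",
    "en-US": "Aluminum tray, gray color",
}

if __name__ == "__main__":
    print(pick_description(descriptions, "en-US"))
    print(pick_description(descriptions, "en-AU"))
```

Whether a fallback to another variant of the same language is acceptable is a business decision. For some trading partners, a missing variant should rather be reported as a gap.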

You always must choose between standardization or standardisation.
