Data Quality and the Climate Issue

The similarities between raising awareness of data quality issues and of the climate issue were touched on 10 years ago here on this blog in the post Data Quality and Climate Politics.

The challenges are still the same.

There are many examples published where the results of climate change are pictured. A recent one is the image from Greenland showing huskies pulling sleds not over the usual ice, but through water.

Greenland melting ice sheet

(Image taken by Steffen Malskær Olsen, @SteffenMalskaer, here published on CNN)

We also see statistics showing a development towards melting ice masses with rising sea levels as the foreseeable result. However, statistics can always be questioned. Is the ice thickening somewhere else? Has this happened many times before?

These kinds of questions show the layers we must go through in getting from data quality to information quality, then decision quality and, on top, the wisdom in applying the right knowledge, whether that is to achieve business outcomes or to avoid climate change.

DIKW data quality

 

A Master Data Mind Map

Please find below a mind map with some of the data elements that are considered to be master data.

Master Data Mind Map

The map is in no way exhaustive and if you feel some more very important and common data elements should be there, please comment.

The data elements are grouped within the most common master data domains: party master data, product master data and location master data.

Some of the data elements have previously been examined in posts on this blog. These include:

The mind map has a selection of flags around where master data are geographically dependent. Again, this is not exhaustive. If you have examples of diversities within master data, please also comment.

Looking at The Data Quality Tool World with Different Metrics

The latest market report on data quality tools from Information Difference is out. In the introduction to the data quality landscape Q1 2019, this example of the consequences of a data quality issue is mentioned: “Christopher Columbus accidentally landed in America when he based his route on calculations using the shorter 4,856 foot Roman mile rather than the 7,091 foot Arabic mile of the Persian geographer that he was relying on.”
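The size of that unit mix-up can be put in numbers. Below is a minimal sketch in Python that uses only the two mile lengths given in the quote; the function name is made up for this illustration.

```python
# Mile lengths in feet, taken from the quote above.
ROMAN_MILE_FT = 4856   # the unit assumed in the calculation
ARABIC_MILE_FT = 7091  # the unit the source figures were actually stated in


def underestimate_factor(assumed_ft: float, actual_ft: float) -> float:
    """Return how many times longer the real distance is than the calculated one.

    Hypothetical helper: a figure stated in the longer unit but read
    as the shorter unit is too small by this factor.
    """
    return actual_ft / assumed_ft


factor = underestimate_factor(ROMAN_MILE_FT, ARABIC_MILE_FT)
print(f"Real distances are about {factor:.2f} times the calculated ones")
```

With these two lengths, every distance read in the wrong unit comes out roughly 46 percent too short, which is the kind of error no amount of careful arithmetic afterwards can repair.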

Information Difference has the vendors on the market plotted this way:

Information Difference DQ Landscape Q1 2019

As reported in the post Data Quality Tools are Vital for Digital Transformation, Gartner also recently published a market report with vendor positions. The two reports are, in terms of evaluating vendors, like Roman and Arabic miles: same same but different, and they may bring you to a different place depending on which one you choose to use.

Vendors evaluated by Information Difference but not by Gartner are the veteran solution providers Melissa and Datactics. On the other side, Gartner has evaluated, for example, Talend, Information Builders and Ataccama. Gartner has a more spread-out evaluation than Information Difference, where most vendors are rated about equally.

PS: If you need any help in your journey across the data quality world, here are some Popular Offerings.

Diversities in Civil Registration

Citizen Registry

The way governments around the world have organized their Master Data Management (MDM) differs considerably. When it comes to registering citizens, the practice varies a lot, as described in the post Citizen Master Data Management.

I have lived most of my years in Denmark, where our national ID is unique and used for everything by public agencies and also a lot by private companies. Some years ago I lived in the United Kingdom, where the public agencies (and my bank) had no clue about who I was, when I came, what I did and when I left.

Recently the World Economic Forum has circulated some videos on LinkedIn about how things are done differently around the world. The video below is about the Danish civil registry (which, by the way, is similar to those in other Scandinavian countries):

What do you think? Would this public MDM and data quality practice work in the USA, the UK, Germany or wherever else you live?

Where a Major Tool is Not So Cool

During my engagements in selecting and working with the major data management tools on the market, I have from time to time experienced that they lack support for specialized data management needs in minor markets.

Two such areas I have been involved with as a Denmark-based consultant are:

  • Address verification
  • Data masking

Address verification:

The authorities in Denmark offer free-of-charge access to very up-to-date, granular and accurate address data that, besides the envelope form of an address, also comes with a data management friendly key (usually referred to as KVHX) at the unit level for each residential and business address within the country. Besides the existence of the address, you also have access to what activity takes place at the address, for example whether it is a single-family house, a nursing home or a campus, and to other useful information for verification, matching and other data management activities.

If you want to verify addresses with the major international data management tools I have come across, many of these goodies are gone. For example:

  • Address reference data are refreshed only once per quarter
  • The key and the access to more information are not available
  • A price tag for data has been introduced

Data Masking:

In Denmark (and other Scandinavian countries) we have a national identification number (known as personnummer) that is used much more intensively than the national IDs known from most other countries, as told in the post Citizen ID within seconds.

The data masking capabilities in major data management solutions come with pre-built functions for national IDs, but only covering major markets, such as the United States Social Security Number, the United Kingdom NINO and the kinds of national ID in use in a few other large western countries.

So, GDPR compliance is just a little bit harder here even when using a major tool.
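Where a pre-built function is missing, you end up writing the masking rule yourself. Below is a minimal sketch in Python, assuming the Danish number is written as six birth date digits (DDMMYY) followed by four serial digits, with or without a hyphen. The function name and the sample value are invented for illustration and do not come from any vendor's product.

```python
import re

# Assumed layout: six birth date digits, an optional hyphen, four serial digits.
# The word boundaries keep longer digit runs (for example order numbers) untouched.
ID_PATTERN = re.compile(r"\b\d{6}-?\d{4}\b")


def mask_national_ids(text: str, mask_char: str = "X") -> str:
    """Replace anything matching the assumed ID layout with a fixed mask.

    Hypothetical helper: it hides every digit and normalizes the
    output to the hyphenated form, so the birth date is not leaked.
    """
    masked = mask_char * 6 + "-" + mask_char * 4
    return ID_PATTERN.sub(masked, text)


# The ID below is a made-up sample value, not a real person's number.
print(mask_national_ids("Customer 010190-1234 called about invoice 12345678"))
```

A pattern match like this is only a first line of defence. It will also mask any other ten-digit value with the same shape, so a real implementation would probably add a date plausibility check and be tuned against the actual data.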

Data Masking National ID.png
From IBM Data Masking documentation

Country of Origin: An Increasingly Complex Data Element

When you buy stuff, one of the characteristics you may emphasize is where the stuff is made: the country of origin.

Buying domestic goods has always been both a political issue and something that in people’s minds may be an extra quality sign. When I lived in the UK I noticed that meat was promoted as British (maybe except for Danish bacon). Now that I am back in Denmark, all meat seems to be best when made in Denmark (maybe except for an Argentinian beef). However, regulations have already affected the “made in” marking for meat, so you have to state several countries of origin in the product lifecycle.

Luxury shoes
Luxury shoes of multi-cultural origin

For some goods a given country of origin seems to be a quality sign. With luxury goods such as fine shoes you can still get away with stating Italy or France as country of origin while most of the work has been done elsewhere, as told in The Guardian article Revealed: the Romanian site where Louis Vuitton makes its Italian shoes.

Country of origin is a product data element that you need to handle for regulatory reasons, not least when moving goods across borders. Here it is connected with commodity codes telling what kind of product it is in the customs way of classifying products, as examined in the post Five Product Classification Standards.

When working with product data management for products that move across borders, you are increasingly asked to be more specific about the country of origin. For example, if you have a product consisting of several parts, you must specify the country of origin for each part.
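In data model terms this means country of origin moves from being one attribute on the product to being an attribute on each part. Below is a minimal sketch in Python. The class names, the use of two-letter ISO 3166-1 country codes and the sample values (including the placeholder commodity code) are assumptions for illustration and not taken from any specific regulation or tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Part:
    name: str
    origin: str  # assumed convention: ISO 3166-1 alpha-2 code, "" when unknown


@dataclass
class Product:
    name: str
    commodity_code: str  # placeholder value below, not a real classification
    parts: List[Part] = field(default_factory=list)

    def countries_of_origin(self) -> List[str]:
        """Distinct known origins across all parts, sorted alphabetically."""
        return sorted({p.origin for p in self.parts if p.origin})

    def parts_missing_origin(self) -> List[str]:
        """Names of the parts that still lack a country of origin."""
        return [p.name for p in self.parts if not p.origin]


# Invented sample data for illustration only.
shoe = Product(
    name="Leather shoe",
    commodity_code="0000.00",
    parts=[Part("sole", "RO"), Part("upper", "IT"), Part("laces", "")],
)
print(shoe.countries_of_origin())
print(shoe.parts_missing_origin())
```

Keeping the origin on the part makes the completeness check trivial: the product is ready for cross-border paperwork only when the list of parts missing an origin is empty.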

The Problem with English

– and many other languages

This blog is in English. However, as a citizen in a country where English is not the first language, I have a problem with English. Which flavour or flavor of English should I use? US English? British English? Or any of the many other kinds of English?

It is, in that context, more a theoretical question than a practical one. Despite what Grammar Nazis might think, I guess everyone understands the meaning in my blend of English variants and occasional other spelling mistakes.

The variants of English, spiced up with other cultural and administrative differences, do however create real data quality issues, as told in the post Cultured Freshwater Pearls of Wisdom.

When working with Product Data Lake, a service for sharing product information between trading partners, we also need to embrace languages. In doing that we cannot just pick English. We must make it possible to pick any combination of English and a country where English is (one of) the official language(s). The same goes for Spanish, German, French, Portuguese, Russian and many other languages, to the extent that products can be named and described with different spelling (in a given alphabet or script type).
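One common way of handling such combinations is to store texts under a language code plus an optional country code and fall back to the plain language when no country variant exists. Below is a minimal sketch in Python. The tag style (such as en-GB), the function name and the sample texts are assumptions for illustration and do not describe how Product Data Lake is actually built.

```python
from typing import Dict, Optional

# Invented sample texts keyed by language, or by language and country.
descriptions: Dict[str, str] = {
    "en": "Color standardization kit",
    "en-GB": "Colour standardisation kit",
}


def pick_description(
    texts: Dict[str, str], language: str, country: Optional[str] = None
) -> Optional[str]:
    """Return the text for language and country, else for the language alone.

    Hypothetical helper: returns None when the language is not present at all.
    """
    if country:
        tagged = f"{language}-{country}"
        if tagged in texts:
            return texts[tagged]
    return texts.get(language)


print(pick_description(descriptions, "en", "GB"))  # country variant is used
print(pick_description(descriptions, "en", "US"))  # falls back to plain "en"
```

The fallback is a design choice: it lets a trading partner publish one English text and only add country variants where the spelling or wording really differs.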

You must always choose between standardization and standardisation.

Aloha Facebook, Where am I Today?

Facebook is set to fight fake news by using artificial intelligence. A good way to practice may be by playing a bit more around with their geolocation intelligence.

Today I am, as far as I know, in the Canary Islands. This is a part of Spain, though a little bit away from the motherland, down in the Atlantic Ocean off the North African coast. A main town on the islands is called Las Palmas.

However, according to Facebook I seem to be in a place called Las Palmas Subdivision in Hawaii, in the Pacific Ocean on the other side of the globe, with Hawaii being a bit away from where it was last time I looked at a map.

Facebook Geolocation Hickup

 

Data Management, Never stop learning

Welcome to the classroom, Rick Buijserd from The Netherlands, the next guest blog post author:

Classroom

As a child you were happy when the bell rang and the school day ended. It was time to play with your friends and not think about learning anymore, just play! Most of us look back at this time as the best time of our lives: a time without any worries, enjoying every moment of it. Even though it wasn’t the main focus as a child, it was also the time when we learned new ideas and things every day. Are we still learning every day? Are you learning new things about data management every day? You should and here is why…

Gaining knowledge

Data is the new oil and many of us make a decent living by advising or consulting companies in this area of expertise. But as time goes by, so do the developments, and in the technology world this goes fast, very fast. In the last couple of years the data environment has become bigger and bigger. First there was just data in companies; now you have to combine sources of data to get a clear view. And the sources keep on changing. Big data used to be a term that was undefined and impossible to use. For many it still is, but others use big data to enrich their companies and enable growth. Just summing this up shows the changes that have happened in the last couple of years, and you have to keep up to stay relevant. Learning and gaining knowledge is the only key to success in the long term. Artificial Intelligence and Machine Learning, powered by optimal use of data and data management, will take over many tasks, but in the end human creativity and the ability to learn will provide success and the power to make the difference.

Data Management is never finished and neither is learning about it

If you have been in the world of data management, you know that data management is never finished, and neither is the possibility of gaining knowledge. New books about data management keep being published, and research firms keep on researching and making new discoveries. Many companies use the evolution of the technology to grow. Communities are also built around topics on many different platforms. The possibility to learn is everywhere! Use it to your benefit; data management is never finished…

Rick Buijserd is author and owner of the platform Data Management Experts and a young professional with experience in the world of data. He started his career at a well-known software vendor as channel manager, where he learned the skills of indirect sales and managing partners. Financial, HR, Logistics, Warehousing and PSA were the main elements of his software sales. Building relationships with experts and other vendors is part of his DNA.

After a couple of years he decided to make a switch and landed in the world of accountancy firms. In this period he became a trusted advisor to many accountancy firms in The Netherlands. The areas of finance, financial reporting, tax, auditing and other accountancy-related activities hold no secrets for him. Together with his clients he developed many solutions to solve their challenges. In this period his love for data management surfaced. Accountancy firms are the ultimate example of being data driven. It is all they know.

In the most recent period of his career he stepped into the world of multinationals, and as of today he is still active in this world, advising on data management and selling software solutions to multinationals that have challenges in the area of data management. He is also an expert in social selling via LinkedIn, and this knowledge has been put into practice in a LinkedIn Group for Dutch Data Management Experts, in which he gathers the top data management experts from the largest companies in The Netherlands to discuss all kinds of data-related topics.