Data Quality and Interenterprise Data Sharing

When working with data quality improvement, there are three kinds of data to consider:

First-party data is the data that is born and managed internally within the enterprise. This data has traditionally been the focus of data quality methodologies and tools, with the aim of ensuring that data is fit for the purpose of use and correctly reflects the real-world entity it describes.

Third-party data is data sourced from external providers who offer a set of data that can be utilized by many enterprises. Examples are location directories, business directories such as the Dun & Bradstreet Worldbase and public national directories, and product data pools such as the Global Data Synchronization Network (GDSN).

Enriching first-party data with third-party data is a means to achieve better data completeness, better data consistency, and better data uniqueness.
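As a minimal sketch, assuming a first-party customer record keyed by a DUNS number and a hypothetical third-party business directory (all names and values here are illustrative), enrichment could fill the gaps like this:

```python
# First-party record with a missing (incomplete) attribute:
first_party = {"name": "Acme A/S", "duns": "123456789", "industry": None}

# Hypothetical third-party business directory, keyed by DUNS number:
third_party = {
    "123456789": {"industry": "Manufacturing", "country": "DK"},
}

# Enrich: third-party values fill the gaps, first-party values take precedence
match = third_party.get(first_party["duns"], {})
enriched = {**match, **{k: v for k, v in first_party.items() if v is not None}}

print(enriched)
# {'industry': 'Manufacturing', 'country': 'DK', 'name': 'Acme A/S', 'duns': '123456789'}
```

The same lookup against a shared directory key also supports uniqueness, since two first-party records resolving to the same DUNS number can be flagged as duplicates.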

Second-party data is data sourced directly from a business partner. Examples are supplier self-registration, customer self-registration and inbound product data syndication. Exchange of this data is also called interenterprise data sharing.

The advantage of using second-party data from a data quality perspective is that you are closer to the source, which, all things being equal, means that the data better and more accurately reflects the real-world entity it describes.

In addition, compared to third-party data, you will have the opportunity to operate with data that exactly fits your operating model and makes you unique compared to your competitors.

Finally, second-party data obtained through interenterprise data sharing will reduce the costs of capturing data compared to first-party data, where the ever-increasing demand for more elaborate, high-quality data in the age of digital transformation would otherwise overwhelm your organization.

The Balancing Act

Getting optimal data quality with the least effort is about balancing the use of internal and external data, where you can exploit interenterprise data sharing by combining second-party and third-party data in the way that makes the most sense for your organization.

As always, I am ready to discuss your challenge. You can book a short online session for that here.

The Most Annoying Way of Presenting Data

Polls are popular on LinkedIn, and I have sinned by making a few recently too.

One was about which way of presenting data (data format) is the most annoying.

There were four formats to choose from: date formats, the 12-hour clock, imperial units of measure, and the Fahrenheit temperature scale.

The MM/DD/YYYY date format is in use practically only in the United States. In the rest of the world, either the DD/MM/YYYY format or the ISO recommended YYYY-MM-DD format is the chosen one. The data quality challenge appears when you see a date such as 03/02/2021 in an international context, because this can be either March 2nd or February 3rd.
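A quick Python sketch shows the ambiguity: the very same string parses to two different dates depending on which convention you assume, while the ISO format leaves no room for doubt:

```python
from datetime import datetime

ambiguous = "03/02/2021"

us = datetime.strptime(ambiguous, "%m/%d/%Y").date()  # read as MM/DD/YYYY
eu = datetime.strptime(ambiguous, "%d/%m/%Y").date()  # read as DD/MM/YYYY

print(us)  # 2021-03-02 (March 2nd)
print(eu)  # 2021-02-03 (February 3rd)

# The ISO 8601 format is unambiguous:
iso = datetime.strptime("2021-02-03", "%Y-%m-%d").date()
print(iso == eu)  # True
```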

The 12-hour clock, with its AM and PM suffixes, is more commonly in use around the world. But obviously the 12-hour clock is not as well thought out as the 24-hour clock. We need some digital transformation here.

Imperial units of measure like the inch, foot, yard, pound, and more are far less logical and structured compared to the metric system. Only three countries around the world – the United States, Myanmar, and Liberia – have not adopted the metric system. And then there is the United Kingdom, which has adopted the metric system in theory, but not in practice.

The Fahrenheit temperature scale is used practically only in the United States, as opposed to Celsius (centigrade), which is used everywhere else. When someone writes that it is 30 degrees outside, that could be quite cold or rather hot if there is no unit of measure applied.
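The conversion itself is simple, and a two-liner shows why a unit-less "30 degrees" is a data quality issue:

```python
def c_to_f(celsius: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

def f_to_c(fahrenheit: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (fahrenheit - 32) * 5 / 9

# The same number, two very different temperatures:
print(c_to_f(30))            # 86.0 -> 30 degrees Celsius is rather hot
print(round(f_to_c(30), 1))  # -1.1 -> 30 degrees Fahrenheit is quite cold
```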

Another example of international trouble mentioned in the comments to the poll is the decimal point. In English writing you will use a dot for the decimal point, while in many other cultures you use a comma as the decimal separator.

Most of the annoyances are handled by mature software having settings where you can set your preferences. The data quality issues arise when these data are part of a text, including when software must convert a text into a number, date, or time.
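As a small illustration of the conversion problem, here is a naive parser sketch where you must tell it which decimal separator the writer used; a real solution would use proper locale support, and the helper name is my own invention:

```python
def parse_decimal(text: str, decimal_sep: str = ".") -> float:
    """Parse a number string given the writer's decimal separator."""
    thousands_sep = "," if decimal_sep == "." else "."
    cleaned = text.replace(thousands_sep, "").replace(decimal_sep, ".")
    return float(cleaned)

print(parse_decimal("1,234.56", decimal_sep="."))  # 1234.56 (English style)
print(parse_decimal("1.234,56", decimal_sep=","))  # 1234.56 (continental style)

# Guessing the wrong separator silently corrupts the value:
print(parse_decimal("1.234", decimal_sep=","))  # 1234.0 -- not 1.234!
```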

If you spot some grey colour (or is it color) in my hair, I blame varying data formats in CSV files, SQL statements, emails and more.

From Data Quality to Business Outcome

Explaining how data quality improvement will lead to business outcomes has always been difficult. The challenge is that there is very seldom a case where you can say with confidence: “fix this data and you will earn x money within y days”.

Not that I have not seen such bold statements. However, they very rarely survive a reality check. On the other hand, we all know that data quality problems seriously affect the health of any business.

A reason why the world is not that simple is that there is a long stretch from data quality to business outcome. The stretch goes like this:

  • First, data quality must be translated into information quality. Raw data must be put into a business context where the impact of duplicates, incomplete records, inaccurate values and so on is quantified, qualified and related within affected business scenarios.
  • Next, the achieved information quality advancements must be actionable in order to cater for better business decisions. Here it is essential to look beyond the purpose for which the data was gathered in the first place and explore how a given piece of information can serve multiple purposes of action.
  • Finally, the decisions must enable positive business outcomes within growth, cost reductions, mitigation of risks and/or time to value. Often these goals are met through multiple chains of bringing data into context, making that information actionable and taking the right decisions based on the achieved and shared knowledge.

Stay tuned – and also look back – on this blog for observations and experiences with proven paths on how to improve data quality leading to positive business outcomes.

The Start of the History of Data and Information Quality Management

I am sad to hear that Larry English has passed away as I learned from this LinkedIn update by C. Lwanga Yonke.

As said in here: “When the story of Information Quality Management is written, the first sentence of the first paragraph will include the name Larry English”.

Larry pioneered the data quality – or information quality, as he preferred to call it – discipline.

He was an inspiration to many data and information quality practitioners back in the 90’s and 00’s, including me, and he paved the way for bringing this topic to the level of awareness that it has today.

In his teaching, Larry emphasized the simple but powerful concepts which are the foundation of data quality and information quality methodologies:

  • Quantify the costs and lost opportunities of bad information quality
  • Always look for the root cause of bad information quality
  • Observe the plan-do-check-act cycle when solving information quality issues

Let us roll up our sleeves and continue what Larry started.

B2B2C in Data Management

The Business-to-Business-to-Consumer (B2B2C) scenario is increasingly important in Master Data Management (MDM), Product Information Management (PIM) and Data Quality Management (DQM).

This scenario is usually seen in manufacturing including pharmaceuticals as examined in the post Six MDMographic Stereotypes.

One challenge here is how to extend the capabilities in MDM / PIM / DQM solutions that are built for Business-to-Business (B2B) and Business-to-Consumer (B2C) use cases. Doing B2B2C requires a Multidomain MDM approach with solid PIM and DQM elements, either as one solution, a suite of solutions or as a wisely assembled set of best-of-breed solutions.

B2B2C MDM PIM DQM

In the MDM sphere, a key challenge with B2B2C is that you probably must encompass more surrounding applications and ensure a 360-degree view of party, location and product entities, as they have varying roles with varying purposes at varying times tracked by these applications. You will also need to cover a broader range of data types that go beyond what is traditionally seen as master data.

In DQM you need data matching capabilities that can identify and compare real-world persons, organizations and the grey zone of persons in professional roles. You need DQM of a deep hierarchy of location data, and you need to profile product data completeness for both professional use cases and consumer use cases.
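As a toy sketch of such matching – using only Python's standard library and assumed example names – a normalized similarity score can surface candidate matches across the person/organization grey zone, though real DQM tools use far more sophisticated techniques:

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Crude similarity score between two normalized names (0.0 to 1.0)."""
    norm = lambda s: " ".join(s.lower().replace(".", " ").split())
    return SequenceMatcher(None, norm(a), norm(b)).ratio()

# An organization name in two legal-form spellings:
print(similarity("Smith Consulting Ltd", "Smith Consulting Limited"))

# A person in a professional role versus the bare person name:
print(similarity("Dr. John Smith", "John Smith"))
```

Both pairs score high enough to be flagged for review, which is the point: matching must tolerate variations in legal forms, titles, punctuation and casing.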

In PIM the content must be suitable for both the professional audience and the end consumers. The issues in achieving this stretch from having a flexible in-house PIM solution to having a comprehensive outbound Product Data Syndication (PDS) setup.

As the middle B in B2B2C supply chains, you must have a strategic partnership with your suppliers/vendors with a comprehensive inbound Product Data Syndication (PDS) setup and, increasingly, also a framework for sharing customer master data, taking into account the privacy and confidentiality aspects of this.

This emerging MDM / PIM / DQM scope is also referred to as Multienterprise MDM.

TCO, ROI and Business Case for Your MDM / PIM / DQM Solution

Any implementation of a Master Data Management (MDM), Product Information Management (PIM) and/or Data Quality Management (DQM) solution will need a business case to tell whether the intended solution has a positive business outcome.

Prior to the solution selection you will typically have:

  • Identified the vision and mission for the intended solution
  • Nailed the pain points the solution is going to solve
  • Framed the scope in terms of the organizational coverage and the data domain coverage
  • Gathered the high-level requirements for a possible solution
  • Estimated the financial results achieved if the solution removes the pain points within the scope and adhering to the requirements

The solution selection (jump-starting with the Disruptive MDM / PIM / DQM Select Your Solution service) will then inform you about the Total Cost of Ownership (TCO) of the best fit solution(s).

From here you can, put very simply, calculate the Return on Investment (ROI) by subtracting the TCO from the estimated financial results.
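Put into numbers – purely illustrative figures, not benchmarks from any real project – the calculation looks like this:

```python
# Assumed example figures for a planned MDM / PIM / DQM solution:
estimated_results = 500_000  # estimated financial results over the period
tco = 400_000                # Total Cost of Ownership over the same period

roi = estimated_results - tco  # simple ROI in absolute terms
roi_pct = roi / tco * 100      # ROI relative to the investment

print(roi)                # 100000
print(f"{roi_pct:.0f}%")  # 25%
```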

MDM PIM DQM TCO ROI Business Case

You can check out more inspiration about ROI and other business case considerations on The Disruptive MDM / PIM / DQM Resource List.

A Tricky Thing with Data Quality Evangelism

One of the major players on the data quality market, Experian, does a yearly survey of the current data management trends. This year is no exception, and I just had the chance to read through the 2020 report.

This year’s report revolves around trusted data, data debt and the skills gap in the light of data literacy. As always, the report holds some good percentage takeaways you can use in your data quality evangelism.

My favourite this year is a bit tricky:

Experian 2020 Data Survey
Source: Experian

I think this one shows a challenging side of data quality evangelism. While operational efficiency is a bit ahead of the other reasons to improve data quality, there are many good reasons. And advocating for every kind of goodness is often harder than being able to pinpoint one absolutely good reason.

Well, see for yourself. Get the 2020 Global data management research from Experian Data Quality here.

Scaling Up The Disruptive MDM / PIM / DQM List

The Disruptive MDM / PIM / DQM List was launched in late 2017.

Here the first innovative Master Data Management (MDM) and Product Information Management (PIM) tool vendors joined the list with a presentation page showcasing the unique capabilities offered to the market.

The blog was launched at the same time. Since then, a lot of blog posts – including guest blog posts – have been posted. The topics covered have been about the list, the analysts and their market reports as well as the capabilities that are essential in solutions and their implementation.

In 2019 the MDM and PIM tool vendors were joined by some of the forward-looking best-of-breed Data Quality Management (DQM) tool vendors.

The Select Your Solution service was launched at the same time. Here organizations – and their consultants – who are on the lookout for an MDM / PIM / DQM solution can jump-start the selection process by getting a list of the best solutions based on their individual context, scope and requirements. More than one hundred end user organizations or their consultants have received such a list.

MDMlist timeline

Going into the ’20s, the list is ready to be scaled up. The new sections being launched are:

  • The Service List: In parallel with the solution providers, it is possible for service providers – like implementation partners – to register on The Service List. This list will run beside The Solution List. For an organization on the lookout for an MDM / PIM / DQM solution, it is equally important to select the right solution and the right implementation partner.
  • The Resource List: This is a list – going live soon – with white papers, webinars and other content from potentially all the registered tool vendors and service providers, divided into sections by topic. Here end user organizations can get a quick overview of the content available within the themes that matter right now.
  • The Case Study List: The next planned list is a list of case studies from potentially all the registered tool vendors and service providers. The list will be divided into industry sectors. Here end user organizations can get a quick overview of studies from similar organizations.

If you have questions and/or suggestions for valuable online content on the list, make a comment or get in contact here:

Analyst MDM / PIM / DQM Solution Reports Update March 2020

Analyst firms occasionally publish market reports with solution overview for Master Data Management (MDM), Product Information Management (PIM) and Data Quality Management (DQM).

The publication schedule from the analyst firms can be unpredictable.

Information Difference is an exception. Every year there has been a Data Quality landscape named Q1, published shortly after that quarter, and an MDM landscape named Q2, published shortly after that quarter. However, these reports rely on participation from relevant vendors, and not all vendors prioritize this scheme.

Forrester is quite unpredictable both with timing and which market segments (MDM, PIM, DQM) to be covered.

Gartner is a bit steadier. However, for example, the MDM solution reports have been coming at varying intervals in recent years.

Here is an overview of the latest major reports:

Stay tuned on this blog to get the latest on analyst reports and news on market movements.

MDM PIM DQM analysts and solutions