Real world alignment – Liliendahl on Data Quality

The Two Data Quality Definitions

4th December 2019 | 14th February 2020 | Henrik Gabs Liliendahl | 2 Comments

If you search on Google for “data quality” you will find the ever-recurring discussion on how we can define data quality.

This is also true for the top ranked non-sponsored articles, such as the Wikipedia page on data quality and an article from Profisee called Data Quality – What, Why, How, 10 Best Practices & More!

The two predominant definitions are that data is of high quality if the data:

Is fit for the intended purpose of use.
Correctly represents the real-world construct that the data describes.

Personally, I think it is a balance.

[Image: Data Quality Definition]

In theory I am on the right side, the one about real-world representation. This is probably because I most often work with master data, where the same data have multiple purposes.

However, as a consultant helping organizations get the funding in place and get the data quality improvement done within time and budget, I do end up on the other side.

What about you? Where do you stand on this question?

Cultured Freshwater Pearls of Wisdom

4th July 2016 | Henrik Gabs Liliendahl

One of my current engagements is within jewelry – or is it jewellery? Having both a US English and a British English spelling in use is a constant data quality issue when we try to standardize – or is it standardise? – on a common set of reference data and a business glossary within an international organization – or is it organisation?

Looking for international standards often does not settle the matter. For example, a shop that sells this kind of bijouterie may be classified with a SIC code of “Jewelry store” or a NACE code of “Retail sale of watches and jewellery in specialised stores”.

A pearl is a popular gemstone. Natural pearls, meaning pearls that have occurred spontaneously in the wild, are very rare. Instead, most are farmed in fresh water and therefore, under regulations used in many countries, must be referred to as cultured freshwater pearls.

My pearls of wisdom, or rather cultured freshwater pearls of wisdom, for building a business glossary and finding commonly accepted wording for the reference data to be used within your company are:

Start looking at international standards and pick what makes sense for your organization. If you can live with only that, you are lucky.
If not, grow the rest of the content for your business glossary and reference data by imitating the international or national standards for your industry, and use your own better wording and additions that make the most sense across your company.
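
As a rough illustration of a business glossary that settles on one preferred wording, the Python sketch below maps spelling variants to a single preferred term. The `PREFERRED_TERMS` table and the `standardize_term` function are hypothetical names made up for this example, and preferring the US English spellings is an arbitrary assumption.

```python
# Hypothetical glossary: each known variant maps to the preferred term.
# Preferring the US English spellings is an arbitrary choice for this sketch.
PREFERRED_TERMS = {
    "jewellery": "jewelry",
    "standardise": "standardize",
    "organisation": "organization",
}


def standardize_term(term: str) -> str:
    """Return the glossary's preferred wording for a term.

    Terms that are not in the glossary are returned unchanged
    (trimmed and lower-cased) so that gaps stay visible.
    """
    key = term.strip().lower()
    return PREFERRED_TERMS.get(key, key)


# Both spellings end up as the same reference data value.
examples = [standardize_term(t) for t in ("Jewellery", "jewelry", "Organisation")]
```

In practice the variant table would be grown from the standards you imitate plus your own additions; the lookup itself stays this simple.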

And oh, I know that pearls of wisdom are often used to imply the opposite of wisdom 🙂

Using a Business Entity Identifier from Day One

24th May 2016 | Henrik Gabs Liliendahl

One of the ways to ensure data quality for customer – or rather party – master data when operating in a business-to-business (B2B) environment is to on-board new entries using an externally defined business entity identifier.

By doing that, you tackle some of the most challenging data quality dimensions, such as:

Uniqueness, by checking if a business with that identifier already exists in your internal master data. This approach is superior to using data matching as explained in the post The Good, Better and Best Way of Avoiding Duplicates.
Accuracy, by having names, addresses and other information defaulted from a business directory and thus avoiding the spelling mistakes that are usually all over party master data.
Conformity, by inheriting additional data such as line-of-business codes and descriptions from a business directory.
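
The three dimensions above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not how any particular service works: `Party`, `PartyMaster`, `onboard` and the `directory` dictionary are hypothetical stand-ins for internal master data and an external business directory, and the identifier in the example is made up.

```python
from dataclasses import dataclass


@dataclass
class Party:
    identifier: str        # external business entity identifier, e.g. a DUNS Number
    name: str
    address: str
    line_of_business: str


class PartyMaster:
    """In-memory stand-in for internal party master data (hypothetical)."""

    def __init__(self, directory: dict):
        # `directory` stands in for an external business directory:
        # identifier -> {"name": ..., "address": ..., "line_of_business": ...}
        self._directory = directory
        self._parties = {}

    def onboard(self, identifier: str):
        """Return (party, created); `created` is False for an existing party."""
        key = identifier.strip()
        # Uniqueness: an exact identifier lookup replaces fuzzy name matching.
        if key in self._parties:
            return self._parties[key], False
        entry = self._directory.get(key)
        if entry is None:
            raise LookupError(f"identifier {key!r} not found in the directory")
        # Accuracy and conformity: default the values from the directory
        # instead of keying them in by hand.
        party = Party(key, entry["name"], entry["address"], entry["line_of_business"])
        self._parties[key] = party
        return party, True


# Made-up directory entry; the identifier is not a real DUNS Number.
directory = {
    "123456789": {
        "name": "Example Pearls Ltd",
        "address": "1 Example Street, Exampletown",
        "line_of_business": "Retail sale of watches and jewellery",
    }
}
master = PartyMaster(directory)
first, created_first = master.onboard("123456789")
second, created_second = master.onboard(" 123456789 ")
```

The second call returns the record created by the first one instead of adding a duplicate, which is the point of keying on the external identifier.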

Having an external business identifier stored with your party master data helps a lot with maintaining data quality as pondered in the post Ongoing Data Maintenance.

When selecting an identifier there are different options, such as national IDs, LEI, DUNS Number and others, as explained in the post Business Entity Identifiers.

At the Product Data Lake service I am working on right now, we have decided to use an external business identifier from day one. I know this may be something a typical start-up will consider much later, if and when the party master data population has grown. But, besides being optimistic about our service, I think it will be a win not to have to fight data quality issues later at guaranteed higher costs.

For the identifier to use we have chosen the DUNS Number from Dun & Bradstreet. The reason is that this currently is the only business identifier with worldwide coverage. Also, Dun & Bradstreet offers some additional data that fits our business model. This includes consistent line-of-business information and worldwide company family trees.

Multi-Side MDM

22nd March 2016 | Henrik Gabs Liliendahl | 3 Comments

As reported in the post Gravitational Waves in the MDM World there is a tendency in the MDM (Master Data Management) market and in MDM programmes around to encompass both the party domain and the product domain.

The party domain is still often treated as two separate domains, being the vendor (or supplier) domain and the customer domain. However, there are good reasons for seeing the intersection of vendor master data and customer master data as party master data. These reasons are most obvious when we look at the B2B (business-to-business) part of our master data, because:

You will always find that many real world entities have a vendor role as well as a customer role to you.
The basic master data has the same structure (identification, names, addresses and contact data).
You need the same third party validation and enrichment capabilities for customer roles and vendor roles.

These reasons also apply to other party roles as examined in the post 360° Business Partner View.
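
One way to picture the party view is a single record that carries its roles, rather than one record per role. The Python sketch below is only an illustration; `PartyRecord`, its fields and the role names are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class PartyRecord:
    """One real world entity, stored once, with any number of roles."""
    identifier: str
    name: str
    address: str
    roles: set = field(default_factory=set)  # e.g. {"vendor", "customer"}

    def add_role(self, role: str) -> None:
        # Adding a role never duplicates the basic master data.
        self.roles.add(role)

    def has_role(self, role: str) -> bool:
        return role in self.roles


# Made-up entity that both sells to us and buys from us.
partner = PartyRecord("ID-001", "Example Trading Ltd", "1 Example Street")
partner.add_role("vendor")
partner.add_role("customer")
```

Validation and enrichment then run once against the shared name and address data, whichever role triggered the on-boarding.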

When we look at the product domain we also have a huge need to connect the buy side and the sell side of our business – and the make side for that matter where we have in-house production.


Multi-Domain MDM has a side effect, so to speak, of bringing the sell-side together with the buy-side and make-side. PIM (Product Information Management), which we often see as the ancestor of product MDM, has the same challenge. Here we also need to bring the sell-side and the buy-side together – on three frontiers:

Bringing the internal buy-side and sell-side together, not least when looking at product hierarchies
Bringing our buy-side in synchronization with our upstream vendors'/suppliers' sell-side when it comes to product data
Bringing our sell-side in synchronization with our downstream customers' buy-side when it comes to product data

Gravitational Waves in the MDM World

14th February 2016 | 15th February 2016 | Henrik Gabs Liliendahl

One of the big news stories this week was the detection of gravitational waves. The big thing about this huge step in science is that we will now be able to see things in space that we could not see before. These are things we have plenty of clues about, but we cannot measure them because they do not emit electromagnetic radiation, or because the light from them is absorbed or reflected by cosmic bodies or dust before it reaches our telescopes.

We have kind of the same in the MDM (Master Data Management) world. We know that there is such a thing as multi-domain Master Data Management, but our biggest telescope, the Gartner magic quadrants, has until now only clearly identified customer Master Data Management and product Master Data Management, as most recently touched on in the post The Perhaps Second Most Important MDM Quadrant 2015 is Out.

Indeed, many MDM programmes that actually do encompass all MDM domains split the efforts into traditional domains such as customer, vendor and product, with separate teams observing their part of the sky. It takes a lot to advocate that, although vendors belong to the buy side and customers belong to the sell side of the organization, there are strong ties between these objects. We can detect gravity in that a vendor and a customer can be the same real world entity, and in that vendors and customers share the same basic structure as parties.


Products do behave differently depending on the industry where your organization belongs. You may make products utilizing raw materials you buy and transform into finished products you sell, and/or you may buy and sell the same physical product as a distributor, retailer or other value adding node in the supply chain. To handle the drastically increased demand for product data related to eCommerce, PIM (Product Information Management) has long been known, and many organizations throughout supply chains have already established PIM capabilities, with or without, and inside or outside, product Master Data Management.

What we still need to detect is a good system for connecting the PIM portion of sell sides upstream and buy sides downstream in supply chains. Right now we only see a blurred galaxy of spreadsheets as examined in the post Excellence vs Excel.

To-Be Business Rules and MDM

10th March 2015 | Henrik Gabs Liliendahl

An important part of implementing Master Data Management (MDM) is to capture the business rules that exist within the implementing organization and build those rules into the solution. In addition, and maybe even more important, is the quest of crafting new business rules that help make master data more valuable to the implementing organization.

Examples of such new business rules that may come along with MDM implementations are:

In order to open a business account you must supply a valid Legal Entity Identifier (like Company Registration Number, VAT number or whatever applies to the business and geography in question)
A delivery address must be verified against an address directory (valid for the geography in question)
In order to bring a product into business there is a minimum requirement for completeness of product information.

Creating new business rules to be part of the to-be master data regime highlights the interdependency of people, process and technology. New technology can often be the driver for taking on board such new business rules. Building on the above examples, such possibilities may be:

The ability to support real time pick and check of external identifiers
The ability to support real time auto completion and check of postal addresses
The ability to support complex completeness checks of a range of data elements
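
The completeness rule can be sketched as a small check in Python. This is a sketch only: the attribute names in `REQUIRED_PRODUCT_ATTRIBUTES` and the function names are assumptions for illustration, and a real minimum requirement would come from the business rule agreed in your organization.

```python
# Hypothetical minimum requirement for bringing a product into business.
REQUIRED_PRODUCT_ATTRIBUTES = ("name", "description", "brand", "net_weight")


def is_blank(value) -> bool:
    """Treat None and empty or whitespace-only strings as missing."""
    return value is None or (isinstance(value, str) and not value.strip())


def missing_attributes(product: dict, required=REQUIRED_PRODUCT_ATTRIBUTES) -> list:
    """List the required attributes that are absent or blank, in rule order."""
    return [name for name in required if is_blank(product.get(name))]


def may_enter_business(product: dict) -> bool:
    """The to-be rule: a product is released only when nothing is missing."""
    return not missing_attributes(product)


# Made-up draft record with a blank description and no net weight.
draft = {
    "name": "Cultured freshwater pearl necklace",
    "description": " ",
    "brand": "Example",
}
gaps = missing_attributes(draft)
```

Returning the list of gaps, rather than only a yes or no, lets the person entering the data see what is still needed.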

The Countryside Data Quality Journey Through 2015

12th December 2014 | 12th December 2014 | Henrik Gabs Liliendahl

I guess this is the time for blog posts about big things that are going to happen in 2015. But you see, we could also take a route away from the motorways and highways and see how the traditional way of life is still unfolding in the data quality landscape.

While the innovators and early adopters are fighting with big data quality, the late majority are still trying to get their heads around how to manage small data. And that is a good thing, because you cannot utilize big data without solving small data quality problems, not least around master data, as told in the post How important is big data quality?

Solving data quality problems is not just about fixing data. It is very much also about fixing the structures around data as explained in a post, featuring the pope, called When Bad Data Quality isn’t Bad Data.

A common roadblock on the way to solving data quality issues is that things that are everybody’s problem tend to be no one’s problem. Implementing a data governance programme is evolving as the answer to that conundrum. As with many things in life, data governance is about thinking big and starting small, as told in the post Business Glossary to Full-Blown Metadata Management or Vice Versa.

Data governance revolves a lot around people’s roles and there are also some specific roles within data governance. Data owners have been known for a long time, data stewards have been around some time and now we also see Chief Data Officers emerge as examined in the post The Good, the Bad, and the Ugly Data Governance Role.

As experienced recently, somewhere in the countryside, while discussing how to get going with a big and shiny data governance programme, there is however still a lot to do about trivial data quality issues such as fields being too short to capture the real world, as reported in the post Everyday Year 2000 Problems.


Evergreen Data Quality and MDM

23rd November 2014 | 2nd December 2014 | Henrik Gabs Liliendahl

The term evergreen is known from botany as plants staying green all year and from music as songs not just being a hit for a few months but capable of generating royalties for years and years.

Data should also stay evergreen. I am a believer in the “first time right” principle as explained in the post instant Single Customer View. However, you must also keep your data quality fresh as examined in the post Ongoing Data Maintenance.

If we look at customer, or rather party, Master Data Management (MDM), it is much about real world alignment. In party master data management you describe entities such as persons and legal entities in the real world, and you should have descriptions that reflect the current state (and sometimes historical states) of these entities. One such reflection is covered in the post The Relocation Event. And as even evergreen trees go away, and “My Way” hopefully will go away someday, you must also be able to perform Undertaking in MDM.

With product MDM it is much about data being fit for multiple future purposes of use as reported in the post Customer Friendly Product Master Data.

Post No. 666

26th October 2014 | 27th October 2014 | Henrik Gabs Liliendahl | 2 Comments

This is post number 666 on this blog. 666 is the number of the beast. Something diabolic.

The first post on my blog came out in June 2009 and was called Qualities in Data Architecture. This post was about how we should talk a bit less about bad data quality and instead focus a bit more on success stories around data quality. I haven’t been able to stick to that all the time. There are so many good data quality train wrecks out there, such as the one told in the post called Sticky Data Quality Flaws.

Some of my favorite subjects around data quality were lined up in Post No. 100. They are:

The role of technology in data quality improvement. This subject was discussed not long ago in the post Reading the right Reading.
Fit for purpose versus real world alignment, a subject revisited recently in the post called The “Fit for Purpose” Trap.
Diversity in data quality, most recently touched on in the post American Exceptionalism in Data Management.

The biggest thing that has happened in the data quality realm during the five years this blog has been live is probably the rise of big data. Or rather the rise of the term big data. This proves to me that changes usually start with technology. Then, after some time, we start thinking about processes and finally people’s roles and responsibilities.
