Self-service Ready Product Data

The increased use of self-service sales approaches, as in ecommerce, has put a lot of pressure on cross-company supply chains. Besides handling the logistics and controlling the pricing, you also have to take care of a huge amount of product data and digital assets describing the goods.

You may divide product information into these five levels:

Product Information Levels

Please learn more about the five levels of product information, including how hierarchies, pricing and logistics fit in, by visiting the product information castle.

Level 4 in this model is self-service product data, being:

  • Product attributes, also sometimes called product properties or product features. These are up to thousands of different data elements that describe a product. Some are common to most products, like height, length, weight and colour. Some are very specific to the product category. This challenge is actually the raison d’être of dedicated Product Information Management (PIM) solutions.
  • Basic product relations are the links between a product and other products, like a product that has several different accessories that go with it, or a product being the successor of another, now decommissioned, product.
  • Standard digital assets are documents like installation guides, line drawings and data sheets.
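
The three kinds of self-service product data listed above could be modelled as in this minimal sketch (the class and field names are illustrative assumptions, not a reference to any particular PIM solution):

```python
from dataclasses import dataclass, field

@dataclass
class Product:
    """Minimal sketch of level 4 (self-service) product data."""
    sku: str
    # Product attributes: category-specific data elements as key/value pairs
    attributes: dict = field(default_factory=dict)
    # Basic product relations: links to other products, keyed by relation type
    relations: dict = field(default_factory=dict)
    # Standard digital assets: documents such as guides and data sheets
    assets: dict = field(default_factory=dict)

seat = Product(
    sku="SKU-1",
    attributes={"height_mm": 45, "colour": "white"},
    relations={"accessory": ["SKU-2"], "successor_of": ["SKU-0"]},
    assets={"installation_guide": "guide.pdf", "data_sheet": "sheet.pdf"},
)
```

In practice the attribute set varies per product category, which is exactly the challenge dedicated PIM solutions address.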

These are the product data that help the end customer compare products and make an objective choice when buying a product for a specific purpose of use. These data are also helpful in answering the questions a buyer may have when making a purchase.

Every piece of data belonging to any level of product information may be forwarded through the cross-company supply chain from the manufacturer to the end seller. Self-service product data are, however, the data that most obviously will be.

In order to support end customer self-service when producing, distributing and selling goods, you must establish a process-driven service that automates the introduction of new products with extensive product data, the inclusion of new kinds of product data and updates to those data. You must be a digitalized member of your business ecosystem. The modern solution for that is the Product Data Lake.


Toilet Seats and Data Quality

When working with data quality in the product master data management domain you are very dependent on your business partners. Product master data are shared along with the physical products in the ecosystem of manufacturers, distributors, retailers and end users.

In a current role, I have worked a lot with sourcing product data from suppliers. One of our recurring examples is about one of our product categories: toilet seats. In that context, we have three different kinds of suppliers:

  • Those who use the term “toilet seat” in their product descriptions. That is marvelous; then we can use that part of the product description directly as it is. Wonderful data quality.
  • Those who only use the term “seat” in their product descriptions. Well, it is not really bad data quality for a dedicated manufacturer of bathroom stuff, because what else could a seat be in that context? However, for consistency reasons we have to correct “seat” to “toilet seat”.
  • Those who use the term “WC seat”. Actually, “WC seat” could be more accurate than “toilet seat”, because we are talking about seats for a room with water, as opposed to older solutions. Nevertheless, for consistency reasons we have to correct “WC seat” to “toilet seat”.
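
The consistency corrections described above can be sketched as simple substitution rules (the function and rule names are illustrative; a real PIM pipeline would likely use a maintained synonym list per product category):

```python
import re

def normalize_description(description: str) -> str:
    """Correct supplier term variants to the preferred term "toilet seat"."""
    # Rule 1: "WC seat" becomes "toilet seat"
    text = re.sub(r"\bwc seat\b", "toilet seat", description, flags=re.IGNORECASE)
    # Rule 2: a bare "seat" becomes "toilet seat", but a negative lookbehind
    # keeps descriptions that already say "toilet seat" unchanged
    text = re.sub(r"\b(?<!toilet )seat\b", "toilet seat", text, flags=re.IGNORECASE)
    return text

print(normalize_description("White WC seat with soft close"))
# -> White toilet seat with soft close
```

Applying the more specific rule first, and guarding the generic rule with a lookbehind, avoids double corrections such as “toilet toilet seat”.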

Manufacturers, distributors and retailers have to work together in order to create win-win situations by sharing product data of an optimal data quality. This is however not straightforward, as you will always be part of an ecosystem where your competitors operate too, and often you are not prepared to share the same seat as your competitor.


Who Discovered the Americas?

Today I read a strange story about who discovered the Americas. Turkish President Recep Tayyip Erdoğan said that Muslims, not Columbus, discovered the Americas. The assumed discovery should have happened in the year 1178 in the Gregorian calendar.

Well, in my history book it goes like this:

1st, the indigenous peoples of the Americas, sometimes called Indians (as opposed to cowboys), found that land by crossing the Bering Strait thousands of years ago.

2nd, there is much speculation about whether someone else crossed the oceans. The only archaeological evidence (so far) is that the Vikings were on Newfoundland, off the coast of Canada, at a place today called L’Anse aux Meadows. That happened around the year 1000 in the Gregorian calendar. (By the way, they came from Greenland, which geographically is a part of the Americas.)

3rd, Christopher Columbus and his crew arrived in the Americas in the year 1492 in the Gregorian calendar.

That is the data quality part of the story. The rest is information quality.


Falsus in Uno, Falsus in Omnibus

The title of this blog post is a Latin legal phrase meaning “false in one thing, false in everything”. It refers to a principle of regarding everything a witness says as not credible if one thing said by the witness is proven not to be true. This has been part of the plot in plenty of courtroom films and TV shows.

This principle has meaning related to data quality too. An example from direct marketing would be a recipient of a direct mail saying: “If you can’t get my name right, how can I trust you to get anything right during a purchase?”

Some data quality dimensions

An example from the multi-channel world, or should we say omni-channel today, would be a shopper saying: “If you say one thing about the product in the shop and another thing on the website, how can I trust any of your product information?” Falsehood in omni-channel so to speak.

Measuring the impact of such attitudes, and thereby the Return on Investment (ROI) of data quality improvement based on this principle, is very hard. We usually only have random anecdotal evidence that this happens.

But what we can say is: Don’t lie in court and don’t neglect your data quality. It will hurt your credibility and, in the end, your creditworthiness.


The Scary Data Lake

The concept of the data lake seems to have a revival these days. Perhaps it reemerged about a year ago as told in the post Do You Like the Lake?

The idea of having a data lake scares the hell out of data quality people, as seen in the title used by Garry Allemann in the post Data Lake vs Data Cesspool.

The data lake is mostly promoted as a data source for analytics, as opposed to something being part of daily operations. That is horrifying enough. Imagine Joe last month spending 80 % of his time fixing data quality issues when doing one batch of analytics. This month Sue spends 80 % of her time fixing data quality issues in the same data lake in her analytic quest, and 50 % of Sue’s data quality issues are in fact the same as Joe’s challenges from last month.

As Halloween is just around the corner, it is time to ask: What is your data lake horror story?

Hadooween


Post No. 666

This is post number 666 on this blog. 666 is the number of the beast. Something diabolic.

The first post on my blog came out in June 2009 and was called Qualities in Data Architecture. It was about how we should talk a bit less about bad data quality and instead focus a bit more on success stories around data quality. I haven’t been able to stick to that all the time. There are so many good data quality train wrecks out there, such as the one told in the post called Sticky Data Quality Flaws.

Some of my favorite subjects around data quality were lined up in Post No. 100.

The biggest thing that has happened in the data quality realm during the five years this blog has been live is probably the rise of big data. Or rather, the rise of the term big data. This proves to me that change usually starts with technology. Then, after some time, we start thinking about processes and finally about people’s roles and responsibilities.


The “Fit for Purpose” Trap

Gartner (the analyst firm), represented by Saul Judah, takes data quality back to basics in the recent post called Data Quality Improvement.

While I agree with the sentiment around measuring the facts as expressed in the post, I have reservations about assuming that everything is good when data are fit for the purpose of business operations.

Some clues lie in the data quality dimensions mentioned in the post:

Accuracy (for now):

As said in the Gartner post, data are indeed temporal. The real world changes and so do business operations. When you have got your data fit for the purpose of use, the business operations have changed. And when you have got your data re-fit for the new purpose of use, the business operations have changed again.

Furthermore, most organizations can’t take all business operations into account at the same time. If you go down the fit-for-purpose track, you will typically address a single business objective and make data fit for that purpose. Not least when dealing with master data, there are many business objectives and derived purposes of use. In my experience, that leads to this conclusion:

“While we value that data are of high quality if they are fit for the intended use we value more that data correctly represent the real-world construct to which they refer in order to be fit for current and future multiple purposes”

Existence – an aspect of completeness:

The Gartner post mentions a data quality dimension called existence. I tend to see this as an aspect of the more broadly used term completeness.

For example, having fit-for-purpose completeness related to product master data has been a huge challenge for many organizations within retail and distribution during the last years, as explained in the post Customer Friendly Product Master Data.



Data Quality 3.0 Revisited

Back in 2010 I played around with the term Data Quality 3.0. This concept is about how we increasingly use external data within data management, as opposed to the traditional use of internal data, which are data that have been typed into our databases by employees or have been collected internally in other ways.


The rise of big data has definitely fueled the thinking around using external data as reported in the post Adding 180 Degrees to MDM.

There are other internal and external aspects for example internal and external business rules as examined in the post Two Kinds of Business Rules within Data Governance. This post has been discussed in the Data Governance Know How group on LinkedIn.

In a comment Thomas Tong says:

“It’s really fun when the internal components of governance are running smooth, giving the opportunity to focus on external connections to your data governance program. Finding the right balance between internal and external influences is key, as external governance partners can reduce the load/complexity of your overall governance program. It also helps clarify the difference between a “external standard” vs “internal standard”, as well as what is “reference data” vs “master data”… and a little preview of your probable integration strategy with external.”

This resonates very much with my mindset. Since 2010 my own data quality journey has increasingly embraced Master Data Management (MDM) and Data Governance as told in the recent blog post called Data Governance, Data Quality and MDM.

So, in my quest to coin these 3 disciplines into one term, I may, besides the word information, also put 3.0 into the naming: “Information Quality 3.0”, hmmm …


The Unruly Information Quality Community

Yesterday Daragh O Brien posted an Open Letter to my Information Quality Peers. The essence is that Daragh isn’t completely satisfied with how things are in The International Association for Information and Data Quality (IAIDQ).

That reminds me that I was a charter member of IAIDQ.

IAIDQ Membership

But checking now, I probably haven’t renewed the membership. This is not deliberate. It just may have slipped. Maybe, as one of Daragh’s critique points goes, because broadcasting from IAIDQ has decreased over the last years.

> Correction: Double-checking, I am actually still a member. I renewed for 2 years last time (usually I’m not that careless with money). I just lost my Charter Mbr designation in the process.

Another critique point raised by Daragh is the failed mission to make the organization truly international, as the organization has had difficulties maintaining chapters around the world.

Forming and maintaining regional chapters is about getting and upholding a critical mass of active members. An example that this is possible is the German Information Quality Society – Deutsche Gesellschaft für Informations- und Datenqualität e. V. However, this organization doesn’t seem to be an IAIDQ chapter, but rather another church obeying the same god.

The current unrest in IAIDQ is not the first of its kind. I remember that some years ago one of the founding members, Larry English, sent a strange email to members telling that he had quit the organization, not being satisfied with something.

It is ironic that information quality practitioners are preaching communication and collaboration, but we don’t seem to get it when it comes to organizing our own little world.
