The Good, the Better and the Best Kinds of Data Quality Technology

If I look at my journey in data quality, I think you can say that I started out working with the good way of implementing data quality tools, then turned to some better ways and, until now at least, am working with the best way of implementing data quality technology.

This does not mean that the good old kind of tools are obsolete. They are just relieved of some of the repetitive hard work of cleaning up dirty data.

The good (old) kind of tools are data cleansing and data matching tools. These tools are good at finding errors in postal addresses, duplicate party records and other nasty stuff in master data. The bad thing about finding the flaws a long time after the bad master data has entered the databases is that it is often very hard to make the corrections once transactions have been related to these master data and that, if you do not fix the root cause, you will have to do this periodically. However, there are still reasons to use these tools, as reported in the post Top 5 Reasons for Downstream Cleansing.

The better way is real-time validation and correction at data entry where possible. Here a single data element or a range of data elements is checked when entered. For example, the address may be checked against reference data, the phone number may be checked for the right format for the country in question, and product master data may be checked for the right format and against a value list. The hard thing with this is to do it at all entry points. A possible approach is discussed in the post Service Oriented MDM.
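The checks described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the reference data, the phone patterns, the SKU format and the unit value list are small hypothetical samples, and the function names are made up for this example. A real implementation would look up addresses and value lists in proper reference data services.

```python
import re

# Hypothetical sample reference data (assumption for illustration only).
KNOWN_POSTAL_CODES = {"DK": {"2100", "8000"}, "US": {"10001", "94105"}}

# Simplified phone formats per country: country code plus national digits.
PHONE_PATTERNS = {
    "DK": re.compile(r"^\+45\d{8}$"),
    "US": re.compile(r"^\+1\d{10}$"),
}

# Hypothetical product rules: a SKU format and a value list for units.
SKU_PATTERN = re.compile(r"^[A-Z]{3}-\d{4}$")
ALLOWED_UNITS = {"EA", "KG", "M"}


def validate_party(record):
    """Return a list of problems found in a party record at entry time."""
    problems = []
    country = record.get("country", "")
    # Address check against reference data.
    if record.get("postal_code") not in KNOWN_POSTAL_CODES.get(country, set()):
        problems.append("postal_code not found in reference data")
    # Phone check against the format for the country in question.
    pattern = PHONE_PATTERNS.get(country)
    if pattern is None or not pattern.match(record.get("phone", "")):
        problems.append("phone does not match the format for the country")
    return problems


def validate_product(record):
    """Return a list of problems found in a product record at entry time."""
    problems = []
    # Format check.
    if not SKU_PATTERN.match(record.get("sku", "")):
        problems.append("sku has the wrong format")
    # Value list check.
    if record.get("unit") not in ALLOWED_UNITS:
        problems.append("unit is not in the value list")
    return problems
```

An entry form or service would call these functions before saving and show the returned problems to the user, so that the record is corrected while the person who knows the facts is still at the keyboard.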

The best tools emphasize assisting data capture, thus preventing data quality issues while also making the data capture process more effective by connecting as opposed to collecting. Two such tools I have worked with are:

  • IDQ™, a tool for mashing up internal party master data and 3rd party big reference data sources, as explained further in the post instant Single Customer View.

  • Product Data Lake, a cloud service for sharing product data in the business ecosystems of manufacturers, distributors, retailers and end users of product information. This service is described in detail here.


Sell more. Reduce costs.

Business outcome is the end goal of any data management activity, be it data governance, data quality management, Master Data Management (MDM) or Product Information Management (PIM).

Business outcome comes from selling more and reducing costs.

At Product Data Lake we have a simple scheme for achieving business outcome through selling more goods and reducing costs of sharing product information between trading partners in business ecosystems:

Sell more Reduce costs

Using Pull or Push to Get to the Next Level in Product Information Management

The importance of having a viable Product Information Management (PIM) solution has become well understood by companies that participate in supply chains.

The next step towards excellence in PIM is to handle product information in close collaboration with your trading partners. Product Data Lake is the solution for that. Here upstream providers of product information (manufacturers and upstream distributors) and downstream receivers of product information (downstream distributors and retailers) connect their choice of in-house PIM solution or other product master data solution, such as PLM (Product Lifecycle Management) or ERP.

Read more about that in the post What a PIM-2-PIM Solution Looks Like.

The principle behind Product Data Lake is inspired by how a data lake differs from a traditional data warehouse. In a data lake the linking and transformation take place late, when the data is consumed by the receiver.
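The idea of late linking and transformation can be illustrated with a small Python sketch. It is not the actual Product Data Lake API. The class `ProductLake`, its `push` and `pull` methods and the sample attribute names are all hypothetical, chosen only to show that records are stored in the provider's own structure and are mapped to the receiver's structure at the moment they are consumed.

```python
class ProductLake:
    """Toy in-memory lake: stores records as provided, transforms on read."""

    def __init__(self):
        self._entries = []

    def push(self, provider, record):
        # Store the record untouched, in the provider's own structure.
        self._entries.append({"provider": provider, "data": dict(record)})

    def pull(self, provider, mapping, converters=None):
        """Return the provider's records reshaped for the receiver.

        mapping:    receiver attribute name -> provider attribute name
        converters: receiver attribute name -> function applied to the value
        """
        converters = converters or {}
        results = []
        for entry in self._entries:
            if entry["provider"] != provider:
                continue
            reshaped = {}
            for target, source in mapping.items():
                if source in entry["data"]:
                    value = entry["data"][source]
                    convert = converters.get(target)
                    reshaped[target] = convert(value) if convert else value
            results.append(reshaped)
        return results


lake = ProductLake()
# The manufacturer pushes data using its own attribute names and units.
lake.push("acme", {"ArtNo": "A-1", "Weight_g": 1500})

# The retailer decides on linking and transformation only when pulling.
retailer_view = lake.pull(
    "acme",
    mapping={"sku": "ArtNo", "weight_kg": "Weight_g"},
    converters={"weight_kg": lambda grams: grams / 1000},
)
```

In a warehouse-style design the transformation would instead happen once at load time against one agreed target structure. Here each receiver can apply its own mapping without the provider having to change anything.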


Product Data Lake resembles a social network, as you connect with your real-world trading partners in order to collaborate on getting complete and accurate product data from the manufacturer to the point of sale:

  • As a downstream receiver, you can be on the winning side by utilizing our Product Data Pull service
  • As an upstream provider, you can be on the winning side by utilizing our Product Data Push service

To the Cloud and Beyond

Over at the Informatica blog Joe McKendrick recently wrote about When It’s Time to Give Data Warehouse a Digital Makeover.

In it, Joe examines how data warehouses can be modernized to augment architectures supporting data lakes and Master Data Management, and the case for moving data warehouses to the cloud.

In my view, a lot of data management disciplines will eventually move to the cloud as one follows the other. By adding “beyond” I suggest that cloud solutions will not only be something that is supported company by company. Eventually you will be able to get business outcome by sharing data management burdens within your business ecosystem.

My current venture, called Product Data Lake, is an example of such a solution. It modernizes the data warehouse thinking within product information sharing by using a data lake concept in the cloud, ready to use by trading partners within business ecosystems:

  • If you are a provider of product information, typically as a manufacturer of goods, you can harvest your business outcome by using us for Product Data Push
  • If you are a receiver of product information, you can harvest your business outcome by using us for Product Data Pull


MDM, Reltio, Gartner and Business Outcome

A recent well commented blog post by Andrew White of Gartner, the analyst firm, debates What’s Happening in Master Data Management (MDM) Land?

The post is an answer to a much liked and commented LinkedIn status post by Ramon Chen, Chief Product Officer of Reltio.

In his post Andrew connects the classic dots: How does technology lead to business outcome? The use of cloud solutions and the multi-tenant aspect are especially in focus. Andrew asks: What do you see “out there”?

My view is that multi-tenant is not just about offering the same subscription-based cloud solutions to a range of clients. It is about letting clients that share the same business ecosystem work in the same MDM realm. This is the platform described in Master Data Share.

Gartner Digital Platforms 2
Source: Gartner

Oh, and what does that have to do with business outcome? A lot. Organizations will not win the race of the future by optimizing their in-house MDM capabilities alone. With the rise of digitalization, they need to connect with and understand their customers, which I believe is something Reltio is good at. Furthermore, organizations need to be much better at working with their business partners in a modern way, including at the master data level. The business outcome of this is:

  • Having complete, accurate and timely data assets needed for understanding and connecting with customers. You will sell more.
  • Having a fast and seamless flow of data assets, not least product information, to and from your trading partners. You will reduce costs.
  • Having a holistic view of internal and external data needed for decision making. You will mitigate risks.

Merchants vs Manufacturers in the Information Age

Merchants sell the goods produced by manufacturers. In that game merchants and manufacturers are basically allies. Then of course the merchant’s profit may depend on the margin he can get between the manufacturer’s price to him and the merchant’s price to his customer. In that game, merchants and manufacturers are kind of enemies.

When it comes to providing product information to the end customers, merchants and manufacturers are allies too. The more complete the product information placed in front of the end customer, the better. This is increasingly important today, with more and more goods sold in self-service scenarios such as ecommerce.

But again, there seems to be an enemy angle here too. Who should carry the burden of lifting product information from the form the manufacturers have it in to the way it is presented at the point of sale provided by the merchant? Often this seems to be stalled in a standoff, as described in the post Passive vs Active Product Information Exchange.

At Product Data Lake we offer merchants and manufacturers an honorable way out of this standoff:

MDM Will Go Cloud

How cloud is changing MDM (Master Data Management) is a subject examined in a recently published article by Julie Hunt that is well worth reading. The article is called How Does Technology Enable Effective MDM?

In it, Julie says: “Adoption of cloud-based MDM or MDM-as-a-Service is on the rise, opening up new dimensions for how organizations take advantage of MDM and data governance.”

Julie’s article is part 3 of a six-part series on the “New Age of Master Data Management”, so I may touch on a dimension that is covered in the upcoming articles. This dimension is how business ecosystems must be a part of your organization’s MDM roadmap, and that dimension, according to Gartner, the analyst firm, covers 8 underlying dimensions, as told in the post From Business Ecosystem Strategy to PIM Technology.

Working with MDM in a business ecosystem context does require MDM in the cloud of some sort. In-house Master Data Management and Product Information Management (PIM), which may be on premise, in the cloud or perhaps a hybrid, are only the beginning. Collaboration with business partners in a sophisticated environment will be controlled by a cloud solution.

More on this concept is explained in this piece about Master Data Share.
