MDM, PIM, CX and User Types

Customer Experience (CX) is a trendy driver for Master Data Management (MDM) and Product Information Management (PIM).

When talking about CX, we may have to distinguish between three main user types:

  • Consumers (in households)
  • Small Office, Home Office (SOHO) users
  • Corporate users

CX user types

The critical differences between pleasing consumers in B2C and pleasing business users in B2B were discussed in the post B2C vs B2B in Product Information Management. A crucial distinction is the use of data, as told in the post Where to Buy a Magic Wand?

Business users can be divided into those in small self-owned businesses (craftsmen, farmers, small shop owners, freelance consultants and many more) and corporate users, who buy on behalf of a legal entity, typically within a team of users.

There are intersections of customer experience preference patterns between these groups, and in the end we are all humans regardless of our role at a given time. Earlier this year I presented a webinar, hosted by Riversand, on this topic. Find the link and the introduction in the post The relation between CX and MDM.

IoT and Business Ecosystem Wide MDM

Two of the disruptive trends in Master Data Management (MDM) are the intersection of Internet of Things (IoT) and MDM and business ecosystem wide MDM (aka multienterprise MDM).

These two trends will go hand in hand.

IoT and Ecosystem Wide MDM

The latest MDM market report from Forrester (the other analyst firm) was mentioned in the post Toward the Third Generation of MDM.

In it, Forrester says: “As first-generation MDM technologies become outdated and less effective, improved second generation and third-generation features will dictate which providers lead the pack. Vendors that can provide internet-of-things (IoT) capabilities, ecosystem capabilities, and data context position themselves to successfully deliver added business value to their customers.”

This resonates with me in my current job as co-founder and CTO at Product Data Lake, as told in the post Adding Things to Product Data Lake.

In business ecosystem wide MDM, business partners collaborate around master data. This is a prerequisite for handling the asset master data involved in IoT, as there are many parties involved, including manufacturers of smart devices, operators of these devices, maintainers of the devices, owners of the devices and the data subjects these devices gather data about.

In the same way, forward-looking solution providers involved with MDM must collaborate, as pondered in the post Linked Product Data Quality.

CDP: Is that part of CRM or MDM?

The notion of a data-centred application type called a Customer Data Platform (CDP) seems to be trending these days. A CDP solution is a centralized registry of all data related to parties regarded as (prospective) customers at an enterprise.

This kind of solution comes from two solution markets:

  • Customer Relationship Management (CRM)
  • Master Data Management (MDM)

The CRM track was recently covered in a VentureBeat article reporting that Salesforce has announced a Customer Data Platform to unify all marketing data. The article also states that Oracle just announced a similar solution named CX Unity and that Adobe announced triggered journeys based on a rich pool of centralized data.

Add to that last year’s announcement from Microsoft, Adobe and SAP on their Open Data Initiative, as told in the LinkedIn article Using a Data Lake for Data Sharing.

Some MDM solution providers are also on that track. Reltio Cloud embraces all customer data and Informatica Customer 360 Insights, formerly known as Allsight, is also going there as reported in the post Extended MDM Platforms.

It will be interesting to follow how CDP solutions evolve and whether it is CRM or MDM vendors who will do best in this discipline. One guess could be that MDM vendors will provide “the best” solutions but CRM vendors will sell the most licenses. We will see.

CDP CRM MDM

10 Years

This blog has now been online for 10 years.

Pont du Gard

Looking back at the first blog posts, I think the themes touched upon are still valid.

The first post from June 2009 was about data architecture. 2000 years ago, the Roman writer, architect and engineer Marcus Vitruvius Pollio wrote that a structure must exhibit the three qualities of firmitas, utilitas, venustas — that is, it must be strong or durable, useful, and beautiful. This is true today – both in architecture and data architecture – as told in the post Qualities in Data Architecture.

A recurring topic on this blog has been a discussion around the common definition of data quality as being that the data is fit for the intended purpose of use. The opening of this topic was made in the post Fit for what purpose?

Tower of Babel by Brueghel

Diversity in data quality has been another repeating topic. Several old tales, including in the Genesis and the Qur’an, tell stories about a great tower built by mankind at a time when all people had a single language. Since then, mankind has been confused by having multiple languages. And indeed, we still are, as pondered in the post The Tower of Babel.

Thanks to all who are reading this blog and not least to all who from time to time take time to make a comment, like and share.

Great Belt Bridge

The Future of Disruptive MDM is in the Cloud

Two recent posts on the Gartner blog are about databases in the cloud. The Future of Database Management Systems Is Cloud by Merv Adrian ponders why cloud is now the default platform for managing data, and The Future of Database Management Systems Is Cloud by Donald Feinberg does the same. Well, the two posts are identical.

This also means that the default platform for Master Data Management (MDM) will be in the cloud. Add to that, the other disruptive MDM trends will also work best in the cloud.

Disruptive MDM in the Cloud

  • We increasingly see Extended MDM Platforms that also handle reference data and big data. Both these data types are predominantly external in nature and are therefore better collected, or even better connected, in the cloud.
  • Services for Artificial Intelligence (AI) and Master Data Management (MDM) are delivered by vendors as cloud solutions.
  • Encompassing IoT and MDM means collaboration between many parties and this is, with all the relationships to take care of, only possible with cloud platforms.
  • We will see several other use cases for business ecosystem wide cross company sharing of master data in what Gartner coins as Multienterprise MDM.

Modern Data Management, Paella, Herodotus, Darwin and Einstein

Reltio has a blog series with the tag #moderndatamasters. The posts are interviews with people in the data management world. The other day it was my turn to share my story.

Kate Tickner from Reltio walked me through some serious questions such as:

  • How would you define “modern” data management and what does it /should it mean for organisations that adopt it?
  • What are your top 3 tips or resources to share for aspiring modern data masters?
  • Can you tell us a little more about the concepts behind Product Data Lake and your vision for how it could be used in the future?
  • What trends or changes do you predict to the data management arena in the next few years?

You can read the interview here on the Reltio blog.

At the end we touched on:

  • What do you like to do outside of work?
  • Which 3 people – living or dead, real or fictional – would you invite to a dinner party and why?
  • What are you cooking?

For the dinner party I would make paella and, based on my interest in history, picked three historical persons who have also been featured on this blog: Herodotus, Darwin and Einstein.

Reltio moderndatamasters

Data Quality and the Climate Issue

The similarities between creating awareness of data quality issues and of the climate issue were touched upon 10 years ago here on this blog in the post Data Quality and Climate Politics.

The challenges are still the same.

There are many examples published where the results of climate change are pictured. A recent one is the image from Greenland showing huskies pulling sleds not over the usual ice, but through water.


(Image taken by Steffen Malskær Olsen, @SteffenMalskaer, here published on CNN)

We also see statistics showing a development towards melting ice masses with rising sea levels as the foreseeable result. However, statistics can always be questioned. Is the ice thickening somewhere else? Has this happened many times before?

These kinds of questions show the layers we must go through to get from data quality to information quality, then decision quality, and on top the wisdom in applying the right knowledge, whether that is to achieve business outcomes or to avoid climate change.

DIKW data quality


RDM: A Small but Important Extension to MDM

Reference Data Management (RDM) is a small but important extension to Master Data Management (MDM). Together with a large extension, being big data and data lakes, mastering reference data is increasingly being part of the offerings from MDM solution vendors as told in the post Extended MDM Platforms.

RDM

Reference Data

Reference data are the smaller lists of values that give context to master data and ensure that we use the same (or linkable) codes for describing master data entities. Examples are country codes, currency codes, language codes and units of measure.

Reference data tend to be externally defined and maintained, typically by international standardization bodies or industry organizations, but reference data can also be internally defined to meet your specific business model.
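As a minimal sketch of the idea, reference data can be handled as managed code lists with validation and crosswalks between them. The country codes below are a small illustrative subset of ISO 3166-1, not a complete list:

```python
# Externally defined reference list: a subset of ISO 3166-1 alpha-2 country codes.
ISO_ALPHA2 = {"DK": "Denmark", "DE": "Germany", "US": "United States"}

# A crosswalk linking the alpha-2 list to the corresponding alpha-3 codes.
ALPHA2_TO_ALPHA3 = {"DK": "DNK", "DE": "DEU", "US": "USA"}

def validate_country(code: str) -> bool:
    """Accept only values present in the reference list."""
    return code in ISO_ALPHA2

def to_alpha3(code: str) -> str:
    """Translate a code via the crosswalk; fail loudly on unknown codes."""
    if not validate_country(code):
        raise ValueError(f"Unknown country code: {code}")
    return ALPHA2_TO_ALPHA3[code]

print(to_alpha3("DK"))  # DNK
```

Keeping the lists and crosswalks as data rather than hard-coded logic is what an RDM solution does at enterprise scale, adding governance, approval workflows and audit trails on top.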

3 RDM Solutions from MDM Vendors

Informatica has recently released the first version of a new RDM solution: MDM – Reference 360. This is, by the way, the first true Software as a Service (SaaS) solution from Informatica in the MDM space. The solution emphasizes building a hierarchy of reference data lists, the ability to make crosswalks between the lists, workflow (approval) around updates and audit trails.

Reltio has embraced RDM as an integral part of their Reltio Cloud solution, where the “RDM capabilities improves data governance and operational excellence with an easy to use application that creates, manages and provisions reference data for better reporting and analytics”.

Semarchy has a solution called Semarchy xDM. The x indicates that this solution encompasses all kinds of enterprise-grade data, and thus both master data and reference data, while “xDM extends the agile development concept to its implementation paradigm”.

MDM Trend: Data as a Service

A recent post on this blog was called Five Disruptive MDM Trends. One of the trends mentioned therein is MDM in the cloud, and one form of Master Data Management in the cloud in the picture is Data as a Service (DaaS).

DaaS within MDM

Using Data as a Service in the cloud within MDM solutions is a great way of ensuring data quality. You have access to real-time validation and enrichment of master data, and you can also use third-party and second-party services in the onboarding processes, thereby avoiding manually typed data with the unavoidable human errors that otherwise are the most common root cause of data quality issues.
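The onboarding pattern can be sketched as follows. Note that verify_address() here is a hypothetical stand-in for a real DaaS call (in practice an HTTP request to an address verification service); the response fields are illustrative assumptions:

```python
def verify_address(raw_address: dict) -> dict:
    """Placeholder for a real-time DaaS call.
    A real implementation would call a cloud address verification service;
    here we simulate a response that standardizes the input."""
    return {
        "street": raw_address["street"].title(),
        "city": raw_address["city"].title(),
        "verified": True,
    }

def onboard_customer(raw: dict) -> dict:
    """Validate and enrich at the point of entry instead of accepting raw typed data."""
    result = verify_address(raw["address"])
    if not result["verified"]:
        raise ValueError("Address could not be verified; reject or route to manual review")
    return {**raw, "address": result}

customer = onboard_customer(
    {"name": "Anna", "address": {"street": "main st 1", "city": "copenhagen"}}
)
print(customer["address"]["city"])  # Copenhagen
```

The design point is that the validation happens synchronously during onboarding, so bad data is stopped before it enters the master data store rather than cleansed downstream.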

Some of the most common data services useful in MDM are:

Address Verification and Geocoding

When handling location data, having a valid and standardized description of postal addresses, and in many cases also a code that gives the geographic position, is crucial in MDM.

Postal address verification can be handled either by a global service such as Loqate from GB Group or AddressDoctor, which is part of the Informatica offering. Alternatively, you can use national services that are better (but also more narrowly) aligned with a given country’s address format and the specific extra services available in some countries.

Geocodes can be either latitude and longitude (typically in the WGS84 datum) or flat-map-friendly coordinate systems such as UTM.
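A small sketch of working with such geocodes: validating latitude/longitude ranges and deriving the standard UTM longitudinal zone from a longitude. This deliberately ignores the special UTM zones around Norway and Svalbard, and the coordinates for Copenhagen are approximate:

```python
def valid_latlon(lat: float, lon: float) -> bool:
    """Sanity check for latitude/longitude in decimal degrees."""
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0

def utm_zone(lon: float) -> int:
    """Standard UTM longitudinal zone (1-60) for a given longitude.
    Ignores the exception zones around Norway and Svalbard for simplicity."""
    if lon == 180.0:
        return 60
    return int((lon + 180.0) // 6) + 1

# Copenhagen, roughly 55.68 N, 12.57 E
assert valid_latlon(55.68, 12.57)
print(utm_zone(12.57))  # 33
```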

Business Directory Services

When handling party master data such as B2B customers, suppliers and other business partners, it is useful to validate and enrich the data with third-party reference data and in some cases even onboard through these sources.

Again, there are global and local options. The most commonly used global one is Dun & Bradstreet, who operate a database called WorldBase that holds business entities from all over the world in a uniform format and also provides data about company family trees on a global basis. Alternatively, many countries have a national service provided by the government, with formats and data elements specific to that country.

Citizen Directory Services

When handling party master data such as B2C customers, employees and other personal data, the third-party possibilities are generally sparser, naturally because of privacy concerns.

In Scandinavia, where I live, these data are available from public sources based on either our national ID or a correct name and address.

Data pools and Product Data Lake

When handling product master data and product information, there are data pools available for some product groups and product attributes in some geographies. The most commonly used global service is GDSN from GS1.

Alternatively (or as a supplement), for all other product groups, product attributes and digital assets, and in all other geographies, you can use a service like the one I am working with, called Product Data Lake.

Data Modelling and Data Quality

There are intersections between data modelling and data quality. In examining those, we can use a data quality mind map published recently on this blog:

Data modelling and data quality

Data Modelling and Data Quality Dimensions:

Some data quality dimensions are closely related to data modelling and a given data model can impact these data quality dimensions. This is the case for:

  • Data integrity, as the relationship rules in a traditional entity-relationship-based data model foster the integrity of the data controlled in databases. The weak sides are that these rules sometimes are too rigid to describe actual real-world entities and that integrity across several databases is not covered. To discover the latter, we may use data profiling methods.
  • Data validity, as field definitions and relationship rules control that only data considered valid can enter the database.
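The cross-database integrity point can be illustrated with a minimal data profiling sketch: within one database a foreign key enforces referential integrity, but across systems we have to profile for it. The two lists of dicts below are illustrative stand-ins for extracts from two separate systems:

```python
# Extract from a CRM system: the known customers.
crm_customers = [
    {"customer_id": "C1"}, {"customer_id": "C2"}, {"customer_id": "C3"},
]

# Extract from an ERP system: invoices referencing customers.
erp_invoices = [
    {"invoice_id": 1, "customer_id": "C1"},
    {"invoice_id": 2, "customer_id": "C9"},  # refers to a customer the CRM does not know
]

# Profile: find invoices whose customer reference has no match in the CRM.
known_ids = {c["customer_id"] for c in crm_customers}
orphans = [i for i in erp_invoices if i["customer_id"] not in known_ids]

print(orphans)
```

No single database's constraints would catch this orphan; only profiling across both extracts reveals it.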

Some other data quality dimensions must be solved with either extended data models and/or alternative methodologies. This is the case for:

  • Data completeness:
    • A common scenario is that a data model born in the United States will set the state field within an address as mandatory and probably accept only a value from a reference list of 50 states. This will not work in the rest of the world. So, in order not to get garbage data, or no data at all, you will either need to extend the model or loosen the model and control completeness otherwise.
    • With data about products, the big pain is that different groups of products require different data elements. This can be solved with a very granular data model, with possible performance issues, or a very customized data model, with scalability and other issues as a result.
  • Data uniqueness: A common scenario here is that names and addresses can be spelled in many ways even though they reflect the same real-world entity. We can use identity resolution (and data matching) to detect this and then model how we link data records with real-world duplicates together in a looser or tighter way.
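The state field scenario above can be sketched as a loosened model where completeness is controlled by a rule instead of an unconditional mandatory column. The rule and the (deliberately incomplete) state list are illustrative:

```python
US_STATES = {"CA", "NY", "TX"}  # illustrative subset, not all 50 states

def completeness_issues(address: dict) -> list:
    """State is only mandatory when the country actually has states in this sense."""
    issues = []
    if address.get("country") == "US":
        state = address.get("state")
        if not state:
            issues.append("state is mandatory for US addresses")
        elif state not in US_STATES:
            issues.append(f"unknown US state: {state}")
    return issues

print(completeness_issues({"country": "US", "city": "Austin"}))
print(completeness_issues({"country": "DK", "city": "Copenhagen"}))  # []
```

A Danish address passes without a state, while a US address without one is flagged, so the model neither forces garbage values nor silently accepts incomplete data.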

Emerging technologies:

Some of the emerging technologies in the data storing realm are presenting new ways of solving the challenges we have with data quality and traditional entity-relationship based data models.

Graph databases and document databases allow for describing and operating data models better aligned with the real world. This topic was examined in the post Encompassing Relational, Document and Graph the Best Way.

In the Product Data Lake venture I am working with right now, we are also aiming to solve the data integrity, data validity and data completeness issues with product data (or product information, if you like) using these emerging technologies. This includes solving issues with geographical diversity and varying completeness requirements through a granular data model that is scalable, not only within a given company but also across a whole business ecosystem encompassing many enterprises belonging to the same (data) supply chain.
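One way to picture such a granular model is to treat attribute requirements as data keyed by product group, rather than as fixed columns, so new product groups scale without schema changes. The groups and attribute names below are illustrative, not Product Data Lake's actual model:

```python
# Attribute requirements per product group, held as data rather than schema.
REQUIRED_ATTRIBUTES = {
    "power_tools": {"voltage", "wattage", "weight_kg"},
    "paint": {"colour", "volume_litre", "drying_time_hours"},
}

def missing_attributes(product: dict) -> set:
    """Report completeness gaps against the product group's own requirements."""
    required = REQUIRED_ATTRIBUTES.get(product["group"], set())
    return required - set(product.get("attributes", {}))

drill = {"group": "power_tools", "attributes": {"voltage": "230V", "wattage": "700W"}}
print(missing_attributes(drill))  # {'weight_kg'}
```

Because the requirements are data, a receiving trading partner in the ecosystem can apply its own requirement sets to the same product records without either party changing its database schema.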