CDP: Is that part of CRM or MDM?

The notion of a data-centred application type called a Customer Data Platform (CDP) seems to be trending these days. A CDP solution is a centralized registry of all data related to parties regarded as (prospective) customers at an enterprise.

This kind of solution comes from two solution markets:

  • Customer Relationship Management (CRM)
  • Master Data Management (MDM)

The CRM track was recently covered in a VentureBeat article reporting that Salesforce has announced a Customer Data Platform to unify all marketing data. The article also states that Oracle just announced a similar solution named CX Unity and that Adobe announced triggered journeys based on a rich pool of centralized data.

Add to that last year's announcement from Microsoft, Adobe and SAP on their Open Data Initiative, as told in the LinkedIn article Using a Data Lake for Data Sharing.

Some MDM solution providers are also on that track. Reltio Cloud embraces all customer data and Informatica Customer 360 Insights, formerly known as Allsight, is also going there as reported in the post Extended MDM Platforms.

It will be interesting to follow how CDP solutions evolve and whether it is CRM or MDM vendors who will do best in this discipline. One guess could be that MDM vendors will provide “the best” solutions while CRM vendors will sell the most licenses. We will see.


10 Years

This blog has now been online for 10 years.

Pont du Gard

Looking back at the first blog posts, I think the themes touched upon are still valid.

The first post from June 2009 was about data architecture. 2000 years ago, the Roman writer, architect and engineer Marcus Vitruvius Pollio wrote that a structure must exhibit the three qualities of firmitas, utilitas, venustas — that is, it must be strong or durable, useful, and beautiful. This is true today – both in architecture and data architecture – as told in the post Qualities in Data Architecture.

A recurring topic on this blog has been a discussion around the common definition of data quality as being that the data is fit for the intended purpose of use. The opening of this topic was made in the post Fit for what purpose?

Tower of Babel by Brueghel

Diversity in data quality has been another recurring topic. Several old scriptures, including Genesis and the Qur’an, tell of a great tower built by mankind at a time when all people spoke a single language. Since then, mankind has been confused by having multiple languages. And indeed, we still are, as pondered in the post The Tower of Babel.

Thanks to all who are reading this blog and not least to all who from time to time take the time to comment, like and share.

Great Belt Bridge

RDM: A Small but Important Extension to MDM

Reference Data Management (RDM) is a small but important extension to Master Data Management (MDM). Together with a larger extension, namely big data and data lakes, mastering reference data is increasingly part of the offerings from MDM solution vendors, as told in the post Extended MDM Platforms.


Reference Data

Reference data are those smaller lists of values that give context to master data and ensure that we use the same (or linkable) codes for describing master data entities. Examples are:

  • Country codes (e.g. ISO 3166)
  • Currency codes (e.g. ISO 4217)
  • Language codes (e.g. ISO 639)
  • Units of measure
  • Industry classification codes

Reference data tend to be externally defined and maintained typically by international standardization bodies or industry organizations, but reference data can also be internally defined to meet your specific business model.
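To illustrate, here is a minimal Python sketch of how reference data can be used to normalize master data codes via a crosswalk. The ISO 3166-1 alpha-2 codes are real; the internal legacy codes and the crosswalk are hypothetical examples:

```python
# A reference list: ISO 3166-1 alpha-2 country codes (excerpt).
ISO_COUNTRIES = {"DK": "Denmark", "DE": "Germany", "US": "United States"}

# A hypothetical internal legacy code list and its crosswalk to ISO.
LEGACY_TO_ISO = {"DEN": "DK", "GER": "DE", "USA": "US"}

def to_iso_country(code: str) -> str:
    """Normalize an internal or ISO country code to ISO 3166-1 alpha-2."""
    code = code.strip().upper()
    if code in ISO_COUNTRIES:
        return code
    if code in LEGACY_TO_ISO:
        return LEGACY_TO_ISO[code]
    raise ValueError(f"Unknown country code: {code}")

print(to_iso_country("den"))  # DK
```

In an RDM solution this kind of crosswalk is of course managed as governed data with workflow and audit trails, not hardcoded as above.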

3 RDM Solutions from MDM Vendors

Informatica has recently released the first version of a new RDM solution: MDM – Reference 360. This is, by the way, the first true Software as a Service (SaaS) solution from Informatica in the MDM space. This solution emphasizes building a hierarchy of reference data lists, the ability to make crosswalks between the lists, workflow (approval) around updates and audit trails.

Reltio has embraced RDM as an integral part of their Reltio Cloud solution, where the “RDM capabilities improves data governance and operational excellence with an easy to use application that creates, manages and provisions reference data for better reporting and analytics”.

Semarchy has a solution called Semarchy xDM. The x indicates that this solution encompasses all kinds of enterprise grade data and thus both Master data and Reference data while “xDM extends the agile development concept to its implementation paradigm”.

Data Modelling and Data Quality

There are intersections between data modelling and data quality. In examining those we can use a data quality mind map published recently on this blog:

Data modelling and data quality

Data Modelling and Data Quality Dimensions:

Some data quality dimensions are closely related to data modelling and a given data model can impact these data quality dimensions. This is the case for:

  • Data integrity, as the relationship rules in a traditional entity-relationship based data model foster the integrity of the data controlled in databases. The weak sides are that these rules are sometimes too rigid to describe actual real-world entities and that integrity across several databases is not covered. To discover the latter, we may use data profiling methods.
  • Data validity, as field definitions and relationship rules control that only data considered valid can enter the database.
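As a small sketch of how a data model can foster integrity and validity, the following Python snippet uses SQLite, where a foreign key enforces the relationship rule and a CHECK constraint enforces validity. The tables and rules are illustrative, not from any particular solution:

```python
import sqlite3

# In-memory database with a relationship (foreign key) rule and a validity (CHECK) rule.
con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite disables FK enforcement by default
con.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
con.execute("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        quantity INTEGER NOT NULL CHECK (quantity > 0)
    )
""")
con.execute("INSERT INTO customer (id, name) VALUES (1, 'Acme')")
con.execute("INSERT INTO orders (id, customer_id, quantity) VALUES (1, 1, 5)")  # accepted

try:
    # Violates integrity: customer 99 does not exist in the customer table.
    con.execute("INSERT INTO orders (id, customer_id, quantity) VALUES (2, 99, 5)")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

Note that these rules only protect one database; integrity across several databases still has to be checked with data profiling, as mentioned above.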

Some other data quality dimensions must be solved with either extended data models and/or alternative methodologies. This is the case for:

  • Data completeness:
    • A common scenario is that a data model born in the United States will set the state field within an address as mandatory and probably accept only a value from a reference list of the 50 states. This will not work in the rest of the world. So, in order not to get crap, or no data at all, you will either need to extend the model or loosen the model and control completeness in another way.
    • With data about products the big pain is that different groups of products require different data elements. This can be solved with a very granular data model – with possible performance issues, or a very customized data model – with scalability and other issues as a result.
  • Data uniqueness: A common scenario here is that names and addresses can be spelled in many ways even though they reflect the same real-world entity. We can use identity resolution (and data matching) to detect this and then model how we link data records that are real-world duplicates in a looser or tighter way.
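A very simplified sketch of identity resolution, using only standard library string similarity. Real data matching engines are far more sophisticated; the records and the 0.8 threshold here are purely illustrative:

```python
from difflib import SequenceMatcher

def normalize(record: dict) -> str:
    """Crude normalization: lowercase, strip punctuation noise, collapse whitespace."""
    text = " ".join(record[k] for k in ("name", "street", "city"))
    return " ".join(text.lower().replace(".", "").replace(",", "").split())

def similarity(a: dict, b: dict) -> float:
    """Similarity score between two party records, 0.0 to 1.0."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

a = {"name": "John Smith", "street": "12 Main St.", "city": "Springfield"}
b = {"name": "Jon Smith",  "street": "12 Main Street", "city": "Springfield"}

score = similarity(a, b)
# An arbitrary threshold decides whether to link the records as duplicates.
print("duplicate candidate" if score > 0.8 else "distinct", round(score, 2))
```

How tightly such candidate pairs are then linked (merged record versus loose linkage) is exactly the modelling decision mentioned above.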

Emerging technologies:

Some of the emerging technologies in the data storage realm are presenting new ways of solving the challenges we have with data quality and traditional entity-relationship based data models.

Graph databases and document databases allow for describing and operating data models better aligned with the real world. This topic was examined in the post Encompassing Relational, Document and Graph the Best Way.

In the Product Data Lake venture I am working on right now, we are also aiming to solve the data integrity, data validity and data completeness issues with product data (or product information, if you like) using these emerging technologies. This includes solving issues with geographical diversity and varying completeness requirements through a granular data model that is scalable, not only within a given company but also across a whole business ecosystem encompassing many enterprises belonging to the same (data) supply chain.

Connecting Silos

The building next to my home office was originally two cement silos standing in an industrial harbor area among other silos. These two silos are now transformed into a connected office building as this area has been developed into a modern residence and commercial quarter.

Master Data Management (MDM) is on a similar route.

The first quest for MDM has been to be a core discipline in transforming siloed data stores within a given company into a shared view of the core entities that must be described in the same way across different departmental views. Going from the departmental stage to the enterprise wide stage is examined in the post Three Stages of MDM Maturity.

But as told in this post, it does not stop there. The next transformation is to provide a shared view with trading partners in the business ecosystem(s) where your company operates. Because the shared data in your organization is also a silo when digital transformation puts pressure on each company to become a data integrated part of a business ecosystem.

A concept for doing that is described on the blog page called Master Data Share.

Connected silos in Copenhagen North Harbor – and connecting data silos enterprise wide and then business ecosystem wide

Artificial Intelligence (AI) and Multienterprise MDM

The previous post on this blog was called Machine Learning, Artificial Intelligence and Data Quality. In it, it was examined how Artificial Intelligence (AI) is impacted by data quality and how data quality can impact AI.

Master Data Management (MDM) will play a crucial role in sustaining the needed data quality for AI and with the rise of digital transformation encompassing business ecosystems we will also see an increasing need for ecosystem wide MDM – also called multienterprise MDM.

Right now, I am working with a service called Product Data Lake where we strive to utilize AI including using Machine Learning (ML) to understand and map data standards and exchange formats used within product information exchange between trading partners.

The challenge in this area is that we have many different classification systems in play as told in the post Five Product Classification Standards. Besides the industry and cross sector standards we still have many homegrown standards as well.

Some of these standards (such as eClass and ETIM) also cover standards for the attributes needed for a given product classification, but still we have plenty of homegrown standards (or no standards) for attribute requirements as well.

Add to that the different preferences for exchange methods, and we have a chaotic system where human intervention makes Sisyphus look like a lucky man. Therefore, we have great expectations about introducing machine learning and artificial intelligence in this space.
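As a naive illustration of the kind of mapping we expect machine learning to improve upon, here is a Python sketch that proposes attribute mappings between two standards using simple token overlap. The attribute names are hypothetical, not actual eClass or ETIM attributes:

```python
def tokens(name: str) -> set:
    """Split an attribute name into lowercase word tokens."""
    return set(name.lower().replace("_", " ").split())

def jaccard(a: str, b: str) -> float:
    """Token overlap score between two attribute names, 0.0 to 1.0."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb)

# Hypothetical attribute names from two trading partners' standards.
supplier_attrs = ["nominal voltage", "cable length", "housing colour"]
retailer_attrs = ["voltage_nominal", "length_of_cable", "colour"]

# Propose the best-scoring mapping for each supplier attribute.
for s in supplier_attrs:
    best = max(retailer_attrs, key=lambda r: jaccard(s, r))
    print(f"{s!r} -> {best!r} (score {jaccard(s, best):.2f})")
```

A real solution would learn from confirmed mappings and also use attribute values and units, not just names; this sketch only shows why the problem is tractable for machines at all.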


Next week, I will elaborate on the multienterprise MDM and artificial intelligence theme at the Master Data Management Summit Europe in London.

Solutions for Handling Product Master Data and Digital Assets

There are three kinds of solutions for handling product master data and related digital assets:

  • Master Data Management (MDM) solutions that are either focussed on product master data or being a multi-domain MDM solution covering the product domain as well as the party domain, the location domain, the asset domain and more.
  • Product Information Management (PIM) solutions.
  • Digital Asset Management (DAM) solutions.

According to Gartner Analyst Simon Walker a short distinction is:

  • MDM of product master data solutions help manage structured product data for enterprise operational and analytical use cases
  • PIM solutions help extend structured product data through the addition of rich product content for sales and marketing use cases
  • DAM solutions help users create and manage digital multimedia files for enterprise, sales and marketing use cases

The figure below shows what kind of data is typically included in an MDM solution, a PIM solution and/or a DAM solution respectively.

MDM PIM DAM

This is further elaborated in the post How MDM, PIM and DAM Stick Together.

The solution vendors have varying offerings, going from being best-of-breed in one of the three categories to offering a one-stop-shopping solution for all disciplines.

If you are to compile a list of suitable and forward-looking solutions for MDM, PIM and/or DAM for your required mix, you can start looking at The Disruptive List of MDM/PIM/DAM solutions.

To use Excel or not to use Excel in Product Information Management?

Excel is used heavily throughout data management and this is true for Product Information Management (PIM) too.

The raison d'être of PIM solutions is often said to be the elimination of spreadsheets. However, PIM solutions have functionality to co-exist with spreadsheets, because spreadsheets are still a fact of life.

This is close to me as I have been working on a solution to connect PIM solutions (and other solutions for handling product data) between trading partners. This solution is called Product Data Lake.

Our goal is certainly also to eliminate the use of spreadsheets in exchanging product information between trading partners. However, as an intermediate state we must accept that spreadsheets exist, either as the replacement of PIM solutions or because PIM solutions do not (yet) fulfill all purposes around product information.

So, consequently, we have added a little co-existence with Excel spreadsheets in today's public online release of Product Data Lake version 1.10.

Product Data Lake version 1.10

The challenge is that product information is multi-dimensional as we for example have products and their attributes typically represented in multiple languages. Also, each product group has its collection of attributes that are relevant for that group of products.

Spreadsheets are basically two dimensional – rows and columns.

In Product Data Lake version 1.10 we have included a data entry sheet that mirrors spreadsheets. You can upload a two-dimensional spreadsheet into a given product group and language, and you can download that selection into a spreadsheet.
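A minimal Python sketch of the idea, flattening one product group in one language into two-dimensional rows. The products and attributes are hypothetical, and this is illustrative only, not the actual Product Data Lake implementation:

```python
import csv
import io

# Hypothetical product data for one product group in one language:
# each product carries the attributes relevant for that group.
products = [
    {"sku": "P-001", "attributes": {"colour": "red", "length_cm": "120"}},
    {"sku": "P-002", "attributes": {"colour": "blue", "length_cm": "90"}},
]

def to_sheet(products: list, attribute_names: list) -> str:
    """Flatten one product group / one language into two-dimensional CSV rows."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["sku"] + attribute_names)  # header row: one column per attribute
    for p in products:                          # one row per product
        writer.writerow([p["sku"]] + [p["attributes"].get(a, "") for a in attribute_names])
    return out.getvalue()

print(to_sheet(products, ["colour", "length_cm"]))
```

The extra dimensions, other languages and other product groups with their own attribute sets, then each become a separate two-dimensional sheet, which is exactly why the spreadsheet is a selection rather than the full picture.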

This functionality can typically be used by the original supplier of product information – the manufacturer. This simple representation of data will then be part of the data lake organisation of varieties of product information supplemented by digital assets, product relationships and much more.

1,000 Blog Posts and More to Come

I just realized that this post will be number 1,000 published on this blog. So, let me not say anything new but just recap a little on what it has all been about in the nearly 10 years of running a blog on some nerdy stuff.

Data quality has been the main theme. When writing about data quality, one cannot avoid touching Master Data Management (MDM). In fact, the most applied category on this site, with 464 entries and counting, is Master Data.

The second most applied category on this blog is, with 219 entries, Data Architecture.

The most applied data quality activity around is data matching. As this is also where I started my data quality venture, there have been 192 posts about Data Matching.

The newest category relates to Product Information Management (PIM) and is, with 20 posts at the moment, about Product Data Syndication.

Even though data quality is a serious subject, you must not forget to have fun. 66 posts, including a yearly April Fools post, have been categorized as Supposed to be a Joke.

Thanks to all who are reading this blog and not least to all who from time to time take the time to comment, like and share.