Data discovery is emerging as an essential discipline in the data management space as explained in the post The Role of Data Discovery in Data Management.
In a data hub encompassing master data, reference data, critical application data and more, data discovery can play a significant role in the continuous improvement of data quality and how data is governed, managed and measured along with an ever evolving business model and new data driven services.
Data discovery serves as the weapon used when exploring the as-is data landscape at your organization with the aim of building a data hub that reflects your data model and data portfolio. As data maturity continuously improves through step-by-step maturing to-be states, data discovery can be used when increasing the data hub scope by encompassing more data sources, when new data-driven services are introduced and when the business model is enhanced as part of a digital transformation.
In that way data discovery is an indispensable node in maturing the data supply chain and the continuous data quality improvement cycle that must underpin your digital transformation course.
Learn more about the data discovery capability in a data hub context in the Semarchy whitepaper authored by me and titled Intelligent Data Hub: Taking MDM to the Next Level.
Towards the end of the last century data quality management started to gain traction as organizations realized that the many different applications and related data stores in operation needed some form of hygiene. Data cleansing and data matching (aka deduplication) tools were introduced.
In the 00’s Master Data Management (MDM) arose as a discipline encompassing the processes and the technology platforms you need to ensure a sustainable level of data quality in the master data used across many applications and data stores. The first MDM implementations were focused on a single master data domain, typically customer or product. Since then multidomain MDM (embracing customer and other party master data, location, product and assets) has become mainstream, and we see multienterprise MDM on the horizon, where master data will be shared in business ecosystems.
MDM also has some side disciplines such as Product Information Management (PIM), Digital Asset Management (DAM) and Reference Data Management (RDM). Sharing of product information and related digital assets in business ecosystems is here supported by Product Data Syndication.
Lately data governance has become a household term. We see multiple varying data governance frameworks addressing data stewardship, data policies, standards and business glossaries. In my eyes data governance and data governance frameworks are very much about adding the people side to the processes and technology we have matured in MDM and Data Quality Management (DQM). And we need to combine those themes, because it is not all about People or Processes or Technology. It is about unifying all this.
In my daily work I help both tool providers and end user organisations with all this as shown on the page Popular Offerings.
A Request for Proposal (RFP) process for a Master Data Management (MDM) and/or Product Information Management (PIM) solution has a hard fact side as well as the soft sides covered in the post The Soft Sides of MDM and PIM RFPs.
The hard fact side is the detailed requirements a potential vendor has to answer, in most cases in an Excel sheet the buying organization has prepared, often with extensive help from a consultancy.
Here is what I have seen as the most frequently included topics for the hard facts in such RFPs:
- MDM and PIM: Does the solution have functionality for hierarchy management?
- MDM and PIM: Does the solution have workflow management included?
- MDM and PIM: Does the solution support versioning of master data / product information?
- MDM and PIM: Does the solution allow you to tailor the data model in a flexible way?
- MDM and PIM: Does the solution handle master data / product information in multiple languages / character sets / script systems?
- MDM and PIM: Does the solution have capabilities for (high speed) batch import / export and real-time integration (APIs)?
- MDM and PIM: Does the solution have capabilities within data governance / data stewardship?
- MDM and PIM: Does the solution integrate with “a specific application” (most commonly SAP, MS CRM/ERPs or Salesforce)?
- MDM: Does the solution handle multiple domains, for example customer, vendor/supplier, employee, product and asset?
- MDM: Does the solution provide data matching / deduplication functionality and formation of golden records?
- MDM: Does the solution have integration with third-party data providers for example business directories (Dun & Bradstreet / National registries) and address verification services?
- MDM: Does the solution underpin compliance rules, for example data privacy and data protection regulations such as GDPR / other regimes?
- PIM: Does the solution support product classification and attribution standards such as eClass and ETIM (or other industry specific / national standards)?
- PIM: Does the solution support publishing to popular marketplaces (a form of outgoing Product Data Syndication)?
- PIM: Does the solution have functionality to ease collection of product information from suppliers (incoming Product Data Syndication)?
Learn more about how I can help on the blog page about MDM / PIM Tool Selection Consultancy.
Data discovery is a term probably most mentioned in relation to business intelligence and data science. In this context data discovery can be seen as a more experimental and preliminary activity that can lead to a more continuous and integrated form of reporting and predictive analysis when hidden data sources, relationships and patterns are identified.
However, data discovery is useful in other data management disciplines as well.
With the increasing awareness of data security, data protection and data privacy (and the regulatory compliance enforced in this space) it is crucial for organisations to know what kind of data flows and is stored within the organization. While you may argue that this should be available in already existing documentation, I have yet to meet an organization where this is the case. And I get around a lot.
Data discovery is also a component of test data management, and tool vendors package their offerings in this space with capabilities for data masking, data subsetting and data discovery in order to answer questions such as:
- Where are the data elements that should be masked when using production data in test scenarios without violating data privacy regulations?
- How can you subset (minimize) test data sets derived from production (covering several databases) and still have proper relationships covered?
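As a rough illustration of the first question, the sketch below scans sampled column values and flags columns that look like they hold personal contact data and therefore are candidates for masking. It is a minimal sketch only: the patterns, the `discover_sensitive_columns` function name, the 80% threshold and the sample table are all assumptions made for illustration, and commercial tools use far richer rules.

```python
import re

# Hypothetical, deliberately simple patterns (assumptions for illustration).
PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "phone": re.compile(r"^\+?[\d\s\-()]{7,}$"),
}


def discover_sensitive_columns(table, threshold=0.8):
    """Return {column: pattern_name} for columns where at least
    `threshold` of the non-empty sampled values match a pattern."""
    findings = {}
    for column, values in table.items():
        non_empty = [v for v in values if v]
        if not non_empty:
            continue
        for name, pattern in PATTERNS.items():
            hits = sum(1 for v in non_empty if pattern.match(v))
            if hits / len(non_empty) >= threshold:
                findings[column] = name
                break  # the first matching pattern wins
    return findings


# Made-up sample rows standing in for values pulled from production.
sample = {
    "contact": ["anna@example.com", "bob@example.org", ""],
    "notes": ["called twice", "prefers email"],
    "mobile": ["+45 12 34 56 78", "(020) 7946 0018"],
}

print(discover_sensitive_columns(sample))
```

Note that the column names play no part in the decision. Only the values are inspected, which is the point of discovery: a column called contact may be the one that holds email addresses.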
Within Data Quality Management, Data Governance and Master Data Management (MDM) data discovery also plays a role similar to the role in data reporting. We can use data discovery to map data lineage and to find potential data relationships where data matching, data cleansing and/or data stewardship might help with ensuring data quality and business process improvement. We can also use it to explore where the same data has different labels (metadata) attached or where the same labels are used for different data types.
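Finding the same data under different labels can be sketched as a comparison of value overlap between columns from different sources. The example below is an illustrative assumption, not a description of any particular tool: the `find_label_mismatches` function, the Jaccard similarity measure, the 0.5 cut-off and the sample columns are all made up for the purpose.

```python
from itertools import combinations


def jaccard(a, b):
    """Share of distinct values that two columns have in common."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_label_mismatches(columns, min_overlap=0.5):
    """columns maps (source, label) to a list of values.
    Return pairs of differently labelled columns with strong value overlap."""
    pairs = []
    for (key_a, vals_a), (key_b, vals_b) in combinations(columns.items(), 2):
        different_label = key_a[1] != key_b[1]
        if different_label and jaccard(vals_a, vals_b) >= min_overlap:
            pairs.append((key_a, key_b))
    return pairs


# Made-up sample columns from two hypothetical applications.
columns = {
    ("crm", "country"): ["DK", "GB", "US", "DE"],
    ("erp", "land_code"): ["DK", "GB", "US"],
    ("erp", "currency"): ["DKK", "GBP", "USD"],
}

print(find_label_mismatches(columns))
```

The reverse check, the same label used for different data, could be sketched the same way by looking for equal labels with low overlap.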
Yesterday I had the pleasure of attending the Informatica MDM 360 and Data Governance Summit in London, where I also took part in a panel discussing best practices for your MDM 360 journey. The rise of Artificial Intelligence (AI) in Master Data Management (MDM) was a main theme at this event.
Informatica has a track record of innovating in new technologies in the data management space while also acquiring promising newcomers in order to fast track their market offering. So it is with AI and MDM at Informatica too. Informatica currently has two tracks:
- clAIre – the clairvoyant component in the Informatica portfolio that “using machine learning and other AI techniques leverages the industry-leading metadata capabilities of the Informatica Intelligent Data Platform to accelerate and automate core data management and governance processes”.
- Informatica Customer 360 Insights which is the new branding of the recent AllSight acquisition. You can learn about that over at The Disruptive Master Data Management Solutions List in the entry about Informatica Customer 360 Insights.
At the Informatica event the synergy between these two tracks was presented as the Intelligent 360 View. Naturally, marketing synergies are the first results of an acquisition. Later we will, hopefully, see actual synergies when the technologies are aligned, positioned and delivered to customers who want to be an intelligent enterprise of the future.
The title of this blog post is also the title of a presentation I will give at the 2019 Data Governance and Information Quality Conference in San Diego, US in June.
There is a little difference between how we can exercise data governance and information quality management when we are handling data about products versus handling the most common data domain, party data (customer, vendor/supplier, employee and other roles).
This topic was touched on here on the blog in the post called Data Quality for the Product Domain vs the Party Domain.
The conference session will go through these topics:
- Product master data vs. product information
- How Master Data Management (MDM), Product Information Management (PIM) and Digital Asset Management (DAM) stick together
- The roles of 1st party data, 2nd party data and 3rd party data in MDM, PIM and DAM
- Business ecosystem wide product data management
- Cross company data governance and information quality alignment
You can have a look at the full agenda for the DGIQ 2019 Conference here.
20 years ago, when I started working as a contractor and entrepreneur in the data management space, data was not on the top agenda at many enterprises. Fortunately, that has changed.
An example is displayed by Schneider Electric CEO Jean-Pascal Tricoire in his recent blog post on how digitization and data can enable companies to be more sustainable. You can read it on the Schneider Electric Blog in the post 3 Myths About Sustainability and Business.
Manufacturers in the building material sector naturally emphasize sustainability. In his post Jean-Pascal Tricoire says: “The digital revolution helps answering several of the major sustainability challenges, dispelling some of the lingering myths regarding sustainability and business growth”.
One of the three myths dispelled is: Sustainability data is still too costly and time-consuming to manage.
From my work with Master Data Management (MDM) and Product Information Management (PIM) at manufacturers and merchants in the building material sector I know that managing the basic product data, trading data and customer self-service ready product data is hard enough. Taking on sustainability data will only make that harder. So, we need to be smarter in our product data management. Smart and sustainable homes and smart sustainable cities need smart product data management.
In his post Jean-Pascal Tricoire mentions that Schneider Electric has worked with other enterprises in their ecosystem in order to be smarter about product data related to sustainability. In my eyes the business ecosystem theme is key in the product data smartness quest as pondered in the post about How Manufacturers of Building Materials Can Improve Product Information Efficiency.
The term data monetization is trending in the data management world.
Data monetization is about harvesting direct financial results from having access to data that is stored, maintained, categorized and made accessible in an optimal manner. Traditionally data management & analytics have contributed indirectly to financial outcomes by aiming at keeping data fit for purpose in the various business processes that produce value for the business. Today the best performers are using data much more directly to create new services and business models.
In my view there are three flavors of data monetization:
- Selling data: This is something that has been known to the data management world for years. Notable examples are the likes of Dun & Bradstreet, who sell business directory data as touched on in the post What is a Business Directory? Other examples are postal services around the world selling their address directories. This is the kind of data we know as third party data.
- Wrapping data around products: If you have a product (or a service) you can add tremendous value to these products and services and make them more sellable by wrapping data, potentially including third party data, around those products and services. These data will thus become second party data as touched on in the post Infonomics and Second Party Data.
- Advanced analytics and decision making: You can combine third party data, second party data and first party data (your own data) to perform advanced analytics and make fast operational decisions in order to sell more, reduce costs and mitigate risks.
Please learn more about data monetization by downloading a recent webinar hosted by Information Builders, their expert Rado Kotorov and yours truly here.
Sometimes you may get the impression that sales, including online sales, is driven by extremely smart sales and marketing people targeting simple-minded customers.
Let us look at an example with selling a product online. Below are two approaches:
Bigger picture is available here.
My take is that the data rich approach is much more effective than the (sadly often used) alternative. Some proof is delivered in the post Ecommerce Su…ffers without Data Quality.
In many industries, the merchant who will cash in on the sale will be the one having the best and most stringent data, because this serves the overwhelming majority of buying power, who do not want to be told what to buy, but what they are buying.
So, pretending to be an extremely smart data management expert, I will argue that you can monetize product data by having the most complete, timely, consistent, conformant and accurate product information in front of your customers. This approach is further explained in the piece about Product Data Lake.
This week I attended the Master Data Management Summit Europe 2018 and Data Governance Conference Europe 2018 in London.
Among the sessions recurring year by year at this conference and the sister conferences around the world is Aaron Zornes presenting the top MDM Vendors as he (that is the MDM Institute) sees them, and the top System Integrators as well.
Managing an ongoing list of such entities can be hard, and doing it in PowerPoint does not make the task easier, as visualized in the two different shots captured via Twitter below, covering the Top 19 to 22 European MDM / DG System Integrators:
Bigger picture available here.
Now, the variations between these two versions of the truth and the real world are (at least):
- Red circles: Is number 17 (in alphabetical order) Deloitte, who bought Platon in Denmark 5 years ago, or is it KPMG?
- Blue arrow and circles: Is SAP Professional Services in there or not? If they are, there must be 21 Top 20 players, with two sharing number 11: Edifixio and Entity Group.
- Green arrow: Number 1 (in alphabetical order) Affecto has been bought by number 8 CGI during this year.
PS: Recently I started a disruptive list of MDM vendors maintained by the vendors themselves. Perhaps the analysts can be helped by a similar list for System Integrators?