The next vendor to be included on The Disruptive MDM / PIM / DQM List 2022 is Informatica.
Informatica has been a leader in the Master Data Management (MDM), Product Information Management (PIM) and Data Quality Management (DQM) space for many years, most recently shown by taking the front position on the Gartner MDM quadrant for the sixth time.
Data quality dimensions are some of the most used terms when explaining why data quality is important, what data quality issues can be and how you can measure data quality. Ironically, we sometimes use the same data quality dimension term for two different things or use two different data quality dimension terms for the same thing. Some of the troubling terms are:
Validity / Conformity – same same but different
Validity is most often used to describe whether the data entered in a field obeys a required format or is among a list of accepted values. Databases are usually good at enforcing this, for example by ensuring that an entered date follows the required day-month-year sequence and is a real calendar date, or by cross-checking a value against another table to see if the value exists there.
The problems arise when data is moved between databases with different rules and when data is captured in textual forms before being loaded into a database.
Conformity is often used to describe whether data adheres to a given standard, like an industry or international standard. Due to complexity and other circumstances, such a standard may not be implemented, or may only partly be implemented, as database constraints or by other means. Therefore, a given piece of data may seem to be a valid database value without being in compliance with the standard.
Sometimes conformity is linked to the geography in question. For example, whether a postal code conforms depends on the country where the address is located. The postal code 12345 conforms in Germany, but not in the United Kingdom.
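The country-dependent nature of conformity can be sketched in a few lines of Python. The patterns below are simplified assumptions for illustration, not the official postal code rules of either country, and the function name is hypothetical.

```python
import re

# Simplified, illustrative patterns (assumptions); real postal code
# standards have more rules and exceptions than shown here.
POSTAL_CODE_PATTERNS = {
    "DE": re.compile(r"\d{5}"),
    "GB": re.compile(r"[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}"),
}


def conforms_to_postal_format(postal_code: str, country: str) -> bool:
    """Return True if the postal code matches the pattern for the country."""
    pattern = POSTAL_CODE_PATTERNS.get(country)
    if pattern is None:
        raise ValueError(f"No postal code pattern defined for {country!r}")
    return pattern.fullmatch(postal_code.strip().upper()) is not None


print(conforms_to_postal_format("12345", "DE"))  # True
print(conforms_to_postal_format("12345", "GB"))  # False
```

The same string passes or fails depending on the country context, which is why a plain "five digits" database constraint is not enough to establish conformity.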
Accuracy / Precision
In the data quality realm, accuracy is most often used to describe whether the data value corresponds correctly to a real-world entity. If, for example, we have the postal address of the person “Robert Smith” as “123 Main Street in Anytown”, this data value may be accurate because this person (for the moment) lives at that address.
But if “123 Main Street in Anytown” has 3 different apartments each having its own mailbox, the value does not, for a given purpose, have the required precision.
If we work with geocoordinates, we have the same challenge. A given accurate geocode may be precise enough to give directions to the nearest supermarket, but not precise enough to tell in which apartment the out-of-milk smart refrigerator is.
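As a rough illustration of precision, the number of decimal places stored for a latitude limits how finely a location can be resolved. The sketch below assumes that one degree of latitude is roughly 111 km, which is an approximation; the function name is hypothetical.

```python
# Approximation (assumption): one degree of latitude is roughly 111 km.
METERS_PER_DEGREE_LATITUDE = 111_000


def latitude_resolution_m(decimal_places: int) -> float:
    """Approximate ground distance covered by the last stored decimal place."""
    return METERS_PER_DEGREE_LATITUDE / (10 ** decimal_places)


for places in (2, 3, 5):
    print(places, "decimals ->", latitude_resolution_m(places), "metres")
```

With three decimals the stored value can only distinguish points about a hundred metres apart, which may be enough to find a supermarket but not a specific apartment.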
Timeliness / Currency – when time matters
Timeliness is most often used to state if a given data value is present when it is needed. For example, you need the postal address of “Robert Smith” when you want to send a paper invoice or when you want to establish his demographic stereotype for a campaign.
Currency is most often used to state if the data value is accurate at a given time – for example if “123 Main Street in Anytown” is the current postal address of “Robert Smith”.
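A currency check can be sketched as a simple validity-period test. The record layout below, with `valid_from` and `valid_to` fields and invented dates, is an assumption for illustration and is not taken from any specific MDM product.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class AddressRecord:
    # Hypothetical record layout for illustration only.
    person: str
    address: str
    valid_from: date
    valid_to: Optional[date] = None  # None means "still valid"


def is_current(record: AddressRecord, as_of: date) -> bool:
    """Return True if the address was the valid one on the given date."""
    if as_of < record.valid_from:
        return False
    return record.valid_to is None or as_of <= record.valid_to


record = AddressRecord(
    "Robert Smith", "123 Main Street in Anytown",
    valid_from=date(2020, 1, 1), valid_to=date(2021, 12, 31),
)
print(is_current(record, date(2021, 6, 1)))  # True
print(is_current(record, date(2022, 1, 1)))  # False
```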
Uniqueness / Duplication – positive or negative
Uniqueness is the positive term, while duplication is the negative term for the same issue.
We strive to have uniqueness by avoiding duplicates. In data quality lingo duplicates are two (or more) data values describing the same real-world entity. For example, we may assume that
“Robert Smith at 123 Main Street, Suite 2 in Anytown”
is the same person as
“Bob Smith at 123 Main Str in Anytown”
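A minimal sketch of how such a match could be scored is shown below. It uses a token-overlap (Jaccard) score after a small normalization step; the synonym table, the threshold and the function names are assumptions for illustration, and real matching engines use far richer rules.

```python
import re

# Hypothetical, tiny synonym table; real tools use large nickname and
# abbreviation dictionaries.
TOKEN_SYNONYMS = {"bob": "robert", "str": "street"}


def normalize(text: str) -> set:
    """Lowercase, split into alphanumeric tokens and apply synonyms."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {TOKEN_SYNONYMS.get(token, token) for token in tokens}


def similarity(a: str, b: str) -> float:
    """Jaccard similarity: shared tokens divided by all distinct tokens."""
    tokens_a, tokens_b = normalize(a), normalize(b)
    union = tokens_a | tokens_b
    if not union:
        return 0.0
    return len(tokens_a & tokens_b) / len(union)


def likely_duplicates(a: str, b: str, threshold: float = 0.75) -> bool:
    """Flag two records as probable duplicates above an assumed threshold."""
    return similarity(a, b) >= threshold


first = "Robert Smith at 123 Main Street, Suite 2 in Anytown"
second = "Bob Smith at 123 Main Str in Anytown"
print(similarity(first, second))         # 0.8
print(likely_duplicates(first, second))  # True
```

The two records share 8 of 10 distinct tokens once "Bob" and "Str" are normalized, so they score 0.8. Whether that is enough to treat them as the same person remains an assumption, as the text above says.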
Completeness / Existence – to be, or not to be
Completeness is most often used to tell to what degree all required data elements are populated.
Existence can be used to tell if a given dataset has all the needed data elements for a given purpose defined.
So “Bob Smith at 123 Main Str in Anytown” is complete if we need name, street address and city, but only 75 % complete if we need name, street address, city and preferred colour and preferred colour is an existent data element in the dataset.
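The arithmetic behind the 75 % figure can be sketched as the share of required fields that are populated. The field names and the function below are hypothetical and only mirror the example above.

```python
from typing import Any, Mapping, Sequence


def is_populated(value: Any) -> bool:
    """Treat None and blank strings as missing."""
    if value is None:
        return False
    if isinstance(value, str) and value.strip() == "":
        return False
    return True


def completeness(record: Mapping[str, Any], required_fields: Sequence[str]) -> float:
    """Share of required fields that hold a value, between 0.0 and 1.0."""
    if not required_fields:
        raise ValueError("At least one required field must be given")
    populated = sum(1 for field in required_fields if is_populated(record.get(field)))
    return populated / len(required_fields)


# Hypothetical field names for the example record.
record = {
    "name": "Bob Smith",
    "street_address": "123 Main Str",
    "city": "Anytown",
    "preferred_colour": None,  # element exists in the dataset but is empty
}

print(completeness(record, ["name", "street_address", "city"]))  # 1.0
print(completeness(record, ["name", "street_address", "city", "preferred_colour"]))  # 0.75
```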
Data Quality Management
Master Data Management (MDM) solutions and specialized Data Quality Management (DQM) tools have capabilities to assess data quality dimensions and improve data quality within the different data quality dimensions.
Also, Tibco has acquired Information Builders and thus taken their position.
Again, this year Informatica is the most top-right positioned vendor. Good to know, as I am right now involved in some digital transformation programs where Informatica Data Quality (iDQ) is part of the technology stack.
You can get a free copy of the report from Ataccama here.
The previous acquisitions have strengthened the Precisely offerings around data quality for the customer master data domain and the adjacent location domain.
The Winshuttle takeover will make Precisely a multidomain vendor, adding cross-domain capabilities and specific product domain capabilities.
The original Winshuttle capabilities revolve around process automation for predominantly SAP environments, covering all master data domains and, further, Application Data Management (ADM).
As Winshuttle recently took over the Product Information Management (PIM) solution provider Enterworks, this will bring capabilities around product master data management and thus make Precisely a provider for a broad spectrum of master data domains.
The interesting question will be to what degree Precisely, over time, will be willing and able to integrate these different solutions so that a one-stop-shopping experience becomes a one-stop digital experience for their clients.
Infogix is a four-decade-old provider of solutions for data quality and adjacent disciplines such as data governance, data catalog and data analytics.
Precisely was recently renamed from Syncsort. Under the Syncsort brand they nabbed Pitney Bowes software two years ago, as told in the post Syncsort Nabs Pitney Bowes Software Solutions. Further back in time, Pitney Bowes nabbed veteran data quality solution provider Group1.
Before being Syncsort their data quality software solution was known as Trillium. This solution also goes a long way back.
So, it is worth noticing that the M&A activity revolves around data quality software that was born in the previous millennium.
There are no significant changes in the relative positioning of the vendors. The only change is that Syncsort has been renamed to Precisely.
As stated in the report, much of the data quality industry is focused on name and address validation. However, there are many opportunities for data quality vendors to spread their wings and better tackle problems in other data domains, such as product, asset and inventory data.
One explanation of why this is not happening is probably the interwoven structure of the joint Master Data Management (MDM), Product Information Management (PIM) and Data Quality Management (DQM) markets and disciplines. For example, a predominant data quality issue such as completeness of product information is addressed in PIM solutions and even better in Product Data Syndication (PDS) solutions.
Here, there are opportunities for pure-play vendors within each speciality to work together, and for the larger vendors to offer both a truly integrated overall solution and contextual solutions for each issue at a reasonable cost/benefit ratio.
Many analyst market reports in the Master Data Management (MDM), Product Information Management (PIM) and Data Quality Management (DQM) space have a generic ranking of the vendors.
The trouble with generic ranking is that one size does not fit all.
On the sister site to this blog, The Disruptive MDM / PIM / DQM List, there is no generic ranking. Instead there is a service where you can provide your organization’s context, scope and requirements and within 2 to 48 hours get Your Solution List.
The selection model includes these elements:
Your context in terms of geographical reach and industry sector.
Your scope in terms of data domains to be covered and organizational scale stretching from specific business units over enterprise-wide to business ecosystem wide (interenterprise).
Your specific requirements covering the main capabilities that differentiate the vendors on the market.
A model that combines those facts into a rectangle where you can choose to:
Go ahead with a Proof of Concept with the best fit vendor
Make an RFP with the best fit vendors in a shortlist
Examine a longlist of best fit vendors and other alternatives like combining more than one solution.
I am running a service where organizations on the lookout for a Master Data Management (MDM), Product Information Management (PIM) and/or Data Quality Management (DQM) solution can get a list of the best fit solutions for their context, scope and requirements. The service is explained in more detail in the post Get Your Free Bespoke MDM / PIM / DQM Solution Ranking.
2020 was a busy year for this service. There were 176 requests for a list. About half of them came, as far as I can tell, from end user organizations and the other half came from consultancies who are helping end user organizations with finding the right tool vendor. Requests came from all continents (except Antarctica) with North America and Europe as the big chunks. There were requests from most industries thus representing a huge span in context.
Also, there were requests from a variety of organization sizes, which has given insights beyond what the prominent analyst firms obtain.
It has also been a pleasure to receive feedback from requesters, which has helped calibrate the selection model and verify the insights derived from the context, scope and requirements given.
The variety in context, scope and requirements resulted in having 8 different vendor logos in top-right position and 25 different logos in all included in the 7 to 9 sized best fit extended longlists in the dispatched Your Solution Lists during 2020.
If you are on the lookout for a solution, you can use the service here.
If you are a vendor in the MDM / PIM / DQM space, you can register your solution here.