In Master Data Management (MDM) we strive to describe the core entities that are essential to running a business. Most of these entities exist in the real world. We can organize them in groups such as parties, things and locations, or by their relation to the business: buy-side, sell-side and make-side (production).
The challenge in MDM is, as in life in general, that we use the same term for different concepts and different terms for the same concept.
Here are some of the classic issues:
An employee is someone who works within an organization. Sometimes this term must mean someone who is on the payroll. But sometimes it is also someone who works beside people on the payroll but is contracting and therefore is more like a vendor. Sometimes employees buy stuff from the organization and therefore act as customers.
Is it called vendor or supplier? The common perception is that a vendor brings the invoice and the supplier brings the goods and/or services. This is often the same legal entity, but not infrequently two different legal entities.
What is a customer? There are numerous challenges in this question. It is about when a party starts being a customer and when the relationship ends. It is about whether it is a direct or an indirect customer. And also: Is it a business-to-consumer (B2C) customer, a business-to-business (B2B) or a B2B2C customer?
Besides employees, vendors and customers (and similar terms) we also care about other parties being business partners. We care about those entities that we must engage with in order to influence our sales. If you manufacture or resell building materials, for example, you build relationships with the architects and engineers who choose the materials to be used for a building.
Traditionally, product master data management has revolved around describing a product model which can be produced and sold in many instances over time. With the rise of intelligent things and individually configured complex products, we increasingly must describe each instance of a product as an asset. This adds to the traditional asset domain, where only a few valuable assets have been handled, with a focus on their financial value.
Each party and each thing has one, and most often several, relationships with a geographic location (besides digital locations such as websites).
The difference between doing Business-to-Consumer (B2C) and Business-to-Business (B2B) shows up in many IT-enabled disciplines.
When it comes to Product Information Management (PIM) this is true as well. As PIM has become essential with the rise of eCommerce, some of the differences are inherited from the eCommerce discipline. There is a discussion on this in a post on the Shopify blog by Ross Simmonds. The post is called B2B vs B2C Ecommerce: What’s The Difference?
Some significant observations to bring into the PIM realm are that for B2B, compared to B2C:
The audience is (on average) narrower
The price is (on average) higher
The decision process is (on average) more thoughtful
To sum up the differences I would say that some of the technology you need, for example PIM solutions, is basically the same but the data to go into these solutions must be more elaborate and stringent for B2B. This means that for B2B, compared to B2C, you (on average) need:
More complete and more consistent attributes (specifications, features, properties) for each product and these should be more tailored to each product group.
More complete and consistent product relations (accessories, replacements, spare parts) for each product.
More complete and consistent digital assets (images, line drawings, certificates) for each product.
Within the upcoming EU General Data Protection Regulation (GDPR) the term data subject is used for the persons whose privacy we must protect.
These are the persons we handle as entities within party Master Data Management (MDM).
In the figure below the blue area covers the entity types and roles that are data subjects in the eyes of GDPR.
While GDPR is of very high importance in business-to-consumer (B2C) and government-to-citizen (G2C) activities, GDPR is also of importance for business-to-business (B2B) and government-to-business (G2B) activities.
GDPR does not cover unborn persons, which may be of interest in a very few industries such as healthcare. When it comes to minors, there are special considerations within GDPR to be aware of. GDPR does not apply to deceased persons. In some industries, like financial services and utilities, handling the estate after the death of a person is essential, and knowing about that sad event is important in general, as touched on in the post External Events, MDM and Data Stewardship.
One tough master data challenge in the light of GDPR will be to know the status of your registered party master data entities. This also means knowing whether an entity is a private individual, a contact at an organization, or an organization or department thereof as such. From my data matching days, I know that heaps of databases do not hold that clarity, as reported in the post So, how about SOHO homes.
When working with Party Master Data Management one approach to ensure accuracy, completeness and other data quality dimensions is to onboard new business-to-business (B2B) entities and enrich such current entities via a business directory.
While this could seem to be a straightforward mechanism, unfortunately it usually is not that easy peasy.
Let us take an example featuring the most widely used business directory around the world: The Dun & Bradstreet Worldbase. And let us take my latest registered company: Product Data Lake.
On this screen showing the basic data elements, there are a few obstacles:
The address is not formatted well
The country code system is not a widely used one
The industry sector code system shown is one among others
In our address D&B has put the word “sal”, which is Danish for floor. This is not incorrect, but addresses in Denmark are usually not written with that word, as the number following a house number in the addressing standard is the floor.
D&B has its own 3-digit country code. You may convert it to the more widely used ISO 2-character country code. I do however remember a lot of fun from my data matching days when dealing with the United Kingdom, where D&B uses 4 different codes for England, Wales, Scotland and Northern Ireland, as well as mapping back and forth with the United States and Puerto Rico. Had to be made very despacito.
Industry Sector Codes
The screen shows a SIC code: 7374 = Computer Processing and Data Preparation and Processing Services
This must have been converted from the NACE code by which the company has been registered: 63.11:(00) = Data processing, hosting and related activities.
By the way, the two codes correspond to NAICS code 518210 = Data Processing, Hosting, and Related Services.
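To make the conversion between industry code systems a bit more tangible, here is a minimal sketch of a crosswalk lookup in Python. It holds only the single correspondence mentioned above (NACE 63.11, SIC 7374, NAICS 518210). The table layout and the `convert` function are my own assumptions for illustration, not anything a business directory provides. The lookup returns a list because real crosswalks are often many-to-many.

```python
# Hypothetical hand-maintained crosswalk; it contains only the one
# correspondence from the example above.
CROSSWALK = [
    {"nace": "63.11", "sic": "7374", "naics": "518210"},
]


def convert(code, source, target):
    """Return all codes in the target system mapped from a source code.

    A sorted list is returned because one code may map to several codes
    in another system, and to none at all if the crosswalk has a gap.
    """
    return sorted({row[target] for row in CROSSWALK if row[source] == code})


sic_codes = convert("63.11", "nace", "sic")
naics_codes = convert("7374", "sic", "naics")
unknown = convert("0000", "sic", "nace")
```

An empty result, as for the unknown code here, is a signal that a data steward has to look at the mapping rather than something to silently default.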
Within Product Information Management (PIM) there is a growing awareness that sharing product information between trading partners is a very important issue.
So, how do we do that? We could do that, on a global scale, by using:
2,345,678 customer data portals
901,234 supplier data portals
Spreadsheets are the most common means of exchanging product information between trading partners today. The typical scenario is that a receiver of product information, being a downstream distributor, retailer or large end user, will have a spreadsheet for each product group that is sent to be filled by each supplier each time a new range of products is to be on-boarded (and potentially each time you need a new piece of information). As a provider of product information, being a manufacturer or upstream distributor, you will receive a different spreadsheet to be filled from each trading partner each time you are to deliver a new range of products (and potentially each time they need a new piece of information).
A customer data portal is a concept a provider of product information may have, plan to have or dream about. The idea is that each downstream trading partner can go to your customer data portal, structured in your way and format, when they need product information from you. Your trading partner will then only have to deal with your customer data portal – and the 1,234 other customer data portals in their supplier range.
A supplier data portal is a concept a receiver of product information may have, plan to have or dream about. The idea is that each upstream trading partner can go to your supplier data portal, structured in your way and format, when they have to deliver product information to you. Your trading partner will then only have to deal with your supplier data portal – and the 567 other supplier data portals in their business-to-business customer range.
Product Data Lake is the sound alternative to the above options. Hailstorms of spreadsheets do not work. If everyone has either a passive customer data portal or a passive supplier data portal, no one will exchange anything. The solution is that you, as a provider of product information, push your data in your structure and format into Product Data Lake each time you have a new product or a new piece of product information. As a receiver you set up pull requests that give you data in your structure and format each time you have a new range of products, need a new piece of information, or your trading partner has a new piece of information.
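The push and pull idea can be sketched in a few lines of Python. This is a toy in-memory illustration under my own assumptions, not the actual Product Data Lake API: the class `ProductExchange`, its `push` and `pull` methods, the provider name and all attribute names are hypothetical. The point it shows is that the provider pushes in its own attribute names and the receiver pulls through a mapping into its own names.

```python
from collections import defaultdict


class ProductExchange:
    """Toy in-memory stand-in for a shared product data service."""

    def __init__(self):
        # (provider, product_id) -> {provider attribute name: value}
        self._records = defaultdict(dict)

    def push(self, provider, product_id, attributes):
        """Provider pushes data using its own attribute names."""
        self._records[(provider, product_id)].update(attributes)

    def pull(self, provider, mapping):
        """Receiver pulls a provider's products in the receiver's format.

        mapping: provider attribute name -> receiver attribute name.
        Attributes the receiver has not mapped are left out.
        """
        result = {}
        for (owner, product_id), attributes in self._records.items():
            if owner != provider:
                continue
            result[product_id] = {
                mapping[name]: value
                for name, value in attributes.items()
                if name in mapping
            }
        return result


exchange = ProductExchange()
# The provider pushes whenever it has a new product or a new piece of information.
exchange.push("acme", "P-100", {"Weight (kg)": 2.5, "Colour": "red"})
exchange.push("acme", "P-100", {"Voltage": 230})

# The receiver pulls with its own mapping and gets data in its own structure.
pulled = exchange.pull("acme", {"Weight (kg)": "net_weight_kg", "Voltage": "voltage_v"})
print(pulled)  # {'P-100': {'net_weight_kg': 2.5, 'voltage_v': 230}}
```

Neither side has to fill in the other side's spreadsheet or visit the other side's portal; the mapping is set up once and reused for every new product or attribute.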
Master Data Management (MDM) is increasingly about supporting systems of engagement in addition to the traditional role of supporting systems of record. This topic was first examined on this blog back in 2012 in the post called Social MDM and Systems of Engagement.
The best known systems of engagement are social networks where the leaders are Facebook for engagement with persons in the private sphere and LinkedIn for engagement with people working in or for one or several companies.
But what about engagement between companies? Though you can argue that all (soft) engagement is neither business-to-consumer (B2C) nor business-to-business (B2B) but human-to-human (H2H), there is some hard engagement going on between companies.
One of the most important ones is exchange of product information between manufacturers, distributors, resellers and large end users of product information. And that is not going very well today. Either it is based on fluffy emailing of spreadsheets or on rigid data pools and portals. So there is definitely room for improvement here.
One of the ways to ensure data quality for customer – or rather party – master data when operating in a business-to-business (B2B) environment is to on-board new entries using an externally defined business entity identifier.
By doing that, you tackle some of the most challenging data quality dimensions, such as:
Accuracy, by having names, addresses and other information defaulted from a business directory and thus avoiding those spelling mistakes that usually are all over in party master data.
Conformity, by inheriting additional data such as line-of-business codes and descriptions from a business directory.
Having an external business identifier stored with your party master data helps a lot with maintaining data quality as pondered in the post Ongoing Data Maintenance.
When selecting an identifier there are different options, such as national IDs, LEI, DUNS Number and others, as explained in the post Business Entity Identifiers.
At the Product Data Lake service I am working on right now, we have decided to use an external business identifier from day one. I know this may be something a typical start-up will consider much later, if and when the party master data population has grown. But, besides being optimistic about our service, I think it will be a win not to have to fight data quality issues later with guaranteed increased costs.
For the identifier to use we have chosen the DUNS Number from Dun & Bradstreet. The reason is that this currently is the only business identifier with worldwide coverage. Also, Dun & Bradstreet offers some additional data that fits our business model. This includes consistent line-of-business information and worldwide company family trees.
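As a rough illustration of on-boarding by an external identifier, the Python sketch below uses the DUNS Number (a nine-digit number) as the key for party records, so that a second attempt to on-board the same company finds the existing entry instead of creating a duplicate. The `PartyRegistry` class, the function names and the sample number and company name are made up for the example, and a real implementation would default the name and address from a business directory lookup that is not shown here.

```python
import re


def normalize_duns(raw):
    """Return the nine-digit form of a DUNS Number, or None if malformed.

    Spaces and hyphens are stripped, since the number is often written
    in groups such as 12-345-6789.
    """
    digits = re.sub(r"[\s-]", "", raw)
    return digits if re.fullmatch(r"[0-9]{9}", digits) else None


class PartyRegistry:
    """Hypothetical party store keyed by the external identifier."""

    def __init__(self):
        self._by_duns = {}

    def onboard(self, raw_duns, directory_record):
        """Add a party with data taken from a business directory record.

        If the identifier is already known, the existing party is returned
        unchanged, which avoids a duplicate with a differently spelled name.
        """
        duns = normalize_duns(raw_duns)
        if duns is None:
            raise ValueError(f"not a valid DUNS Number: {raw_duns!r}")
        if duns not in self._by_duns:
            self._by_duns[duns] = {"duns": duns, **directory_record}
        return self._by_duns[duns]


registry = PartyRegistry()
first = registry.onboard("12-345-6789", {"name": "Example Trading A/S"})
second = registry.onboard("123456789", {"name": "Exampel Trading AS"})
```

Here `second` is the very same record as `first`: the misspelled name in the second attempt never makes it into the party master data.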
As reported in the post Gravitational Waves in the MDM World there is a tendency in the MDM (Master Data Management) market and in MDM programmes to encompass both the party domain and the product domain.
The party domain is still often treated as two separate domains, being the vendor (or supplier) domain and the customer domain. However, there are good reasons for seeing the intersection of vendor master data and customer master data as party master data. These reasons are most obvious when we look at the B2B (business-to-business) part of our master data, because:
You will always find that many real world entities have a vendor role as well as a customer role to you.
The basic master data has the same structure (identification, names, addresses and contact data).
You need the same third party validation and enrichment capabilities for customer roles and vendor roles.
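The three reasons above suggest a data structure where the party is stored once and vendor and customer are roles attached to it. A minimal Python sketch of that idea follows; the `Party` class, the `register` function and the sample identifier are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Party:
    """One real world entity, stored once regardless of its roles."""
    identifier: str
    name: str
    address: str
    roles: set = field(default_factory=set)


def register(parties, identifier, name, address, role):
    """Attach a role to a party, creating the party only if it is new."""
    if identifier not in parties:
        parties[identifier] = Party(identifier, name, address)
    parties[identifier].roles.add(role)
    return parties[identifier]


parties = {}
register(parties, "123456789", "Example Trading A/S", "Main Street 1", "vendor")
register(parties, "123456789", "Example Trading A/S", "Main Street 1", "customer")
```

After both calls there is still only one party record, now carrying both the vendor and the customer role, so validation and enrichment only have to happen once.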
When we look at the product domain we also have a huge need to connect the buy side and the sell side of our business – and the make side for that matter where we have in-house production.
Multi-Domain MDM has a side effect, so to speak, of bringing the sell-side together with the buy- and make-side. PIM (Product Information Management), which we often see as the ancestor to product MDM, has the same challenge. Here we also need to bring the sell-side and the buy-side together – on three frontiers:
Bringing the internal buy-side and sell-side together, not least when looking at product hierarchies
Bringing our buy-side in synchronization with our upstream vendors'/suppliers' sell-side when it comes to product data
Bringing our sell-side in synchronization with our downstream customers' buy-side when it comes to product data
Reference Data Management (RDM) is an evolving discipline within data management. When organizations mature in the reference data management realm we often see a shift from relying on internally defined reference data to relying on externally defined reference data. This is based on the good old saying about not reinventing the wheel, and also on the fact that externally defined reference data usually is better at fulfilling multiple purposes of use, where internally defined reference data tends to cater only for the most important purpose of use within your organization.
Then, what standard to use tends to be a matter of where in the world you are. Let’s look at three examples from the location domain, the party domain and the product domain.
Location reference data
If you read articles in English about reference data and ensuring accuracy and other data quality dimensions for location data you often meet remarks such as “be sure to check validity against US Postal Services” or “make sure to check against the Royal Mail PAF File”. This is all great if all your addresses are in the United States or the United Kingdom. If all your addresses are in another country, there will in many cases be similar services for the given country. If your addresses are spread around the world, you have to look further.
There are some Data-as-a-Service offerings for international addresses out there. When it comes to having your own copy of location reference data, the Universal Postal Union has an offering called the Universal POST*CODE® DataBase. You may also look into open data solutions such as GeoNames.
Party reference data
Within party master data management for Business-to-Business (B2B) activities you want to classify your customers, prospects, suppliers and other business partners according to what they do. For that there are some frequently used coding systems in areas where I have been:
Standard Industrial Classification (SIC) codes, the four-digit numerical codes assigned by the U.S. government to business establishments.
The North American Industry Classification System (NAICS).
NACE (Nomenclature of Economic Activities), the European statistical classification of economic activities.
As important economic activities change over time, these systems change to reflect the real world. As an example, my Danish company registration has changed NACE code three times since 1998 while I have been doing the same thing.
This doesn’t make conversion services between these systems any easier.
Product reference data
There is also a good choice of standardized and standardised classification systems for product data out there. To name a few:
The United Nations Standard Products and Services Code® (UNSPSC®), managed by GS1 US™ for the UN Development Programme (UNDP).
eCl@ss, which presents itself as: “THE cross-industry product data standard for classification and clear description of products and services that has established itself as the only ISO/IEC compliant industry standard nationally and internationally”. eCl@ss has its main support in Germany (the home of the Mercedes E-Class).
In addition to cross-industry standards there are heaps of industry-specific international, regional and national standards for product classification.