When working with Party Master Data Management, one approach to ensuring accuracy, completeness and other data quality dimensions is to onboard new business-to-business (B2B) entities, and enrich existing ones, via a business directory.
While this may seem to be a straightforward mechanism, unfortunately it is usually not that easy peasy.
Let us take an example featuring the most widely used business directory around the world: the Dun & Bradstreet Worldbase. And let us take my latest registered company: Product Data Lake.
On this screen showing the basic data elements, there are a few obstacles:
- The address is not formatted well
- The country code system is not a widely used one
- The industry sector code system shown is one among others
Address Formatting
In our address D&B has put the word “sal”, which is Danish for floor. This is not incorrect, but addresses in Denmark are usually not written with that word, as the number following a house number in the addressing standard is the floor.
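A minimal sketch of how such a conformity rule could be applied when enriching from the directory. The function name and the sample address lines are hypothetical, and the rule only covers the simple case of the word "sal" following a floor number:

```python
import re

# Matches a floor number (optionally followed by a dot) and the word "sal".
# Requiring the digit first avoids touching street names that merely
# start with "Sal".
_FLOOR_WORD = re.compile(r"(\d+\.?)\s*sal\b", flags=re.IGNORECASE)


def drop_floor_word(address_line: str) -> str:
    """Remove the redundant word 'sal' after a floor number (hypothetical helper)."""
    cleaned = _FLOOR_WORD.sub(r"\1", address_line)
    # Collapse any double spaces left behind by the removal.
    return re.sub(r"\s{2,}", " ", cleaned).strip()


# Made-up example address, not the one from the directory record.
print(drop_floor_word("Eksempelvej 10, 2. sal"))
```

The directory value is not wrong, so it is probably wise to keep the original string as well and store the reformatted one alongside it.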
Country Codes
D&B has its own 3-digit country code system. You may convert to the more widely used ISO 2-character country codes. I do, however, remember a lot of fun from my data matching days when dealing with the United Kingdom, where D&B uses 4 different codes for England, Wales, Scotland and Northern Ireland, as well as when mapping back and forth with the United States and Puerto Rico. That had to be done very despacito.
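The United Kingdom case can be sketched as a lookup where several directory codes collapse into one ISO code, which makes the way back ambiguous. The 3-digit codes below are placeholders invented for illustration, not actual D&B values, and the function names are hypothetical:

```python
from collections import defaultdict

# HYPOTHETICAL directory codes: placeholders only, not real D&B values.
DIRECTORY_TO_ISO = {
    "901": "GB",  # England
    "902": "GB",  # Wales
    "903": "GB",  # Scotland
    "904": "GB",  # Northern Ireland
    "907": "DK",  # Denmark
}


def to_iso(directory_code: str) -> str:
    """Directory code to ISO 2-character code; fails loudly on unmapped codes."""
    try:
        return DIRECTORY_TO_ISO[directory_code]
    except KeyError:
        raise ValueError(f"Unmapped directory country code: {directory_code!r}")


def candidates_for_iso(iso_code: str) -> list[str]:
    """ISO code back to directory codes; may return several candidates."""
    reverse = defaultdict(list)
    for code, iso in DIRECTORY_TO_ISO.items():
        reverse[iso].append(code)
    return sorted(reverse.get(iso_code, []))


print(to_iso("903"))
print(candidates_for_iso("GB"))
```

Going from the directory to ISO is a plain lookup. Going the other way needs more than the country code, for example the postal code, to pick among the candidates.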
Industry Sector Codes
The screen shows a SIC code: 7374 = Computer Processing and Data Preparation and Processing Services
This must have been converted from the NACE code by which the company has been registered: 63.11:(00) = Data processing, hosting and related activities.
The two codes do by the way correspond to the NAICS Code 518210 = Data processing, hosting and related activities.
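The correspondence between the three code systems can be sketched as a small crosswalk table. Only the single row mentioned above is included; the structure and the function name are assumptions for illustration, and a real crosswalk would hold many rows, often with more than one match per code:

```python
# One crosswalk row, using only the codes mentioned in this post.
CROSSWALK = [
    {"sic": "7374", "nace": "63.11", "naics": "518210"},
]


def translate(code: str, source: str, target: str) -> list[str]:
    """Return all target-system codes matching a source-system code.

    A list is returned because industry code systems do not always map
    one-to-one; an empty list means no mapping is known.
    """
    return [row[target] for row in CROSSWALK if row[source] == code]


print(translate("63.11", "nace", "sic"))
print(translate("7374", "sic", "naics"))
```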
The challenges in embracing the many standards for reference data were examined in the post The World of Reference Data.
This business directory example shown here in this blog is very informative. I love how you address the issues with the code errors.
Thanks for commenting, Melissa. The issues with the formats and codes are what I would categorize as conformity challenges. The data is accurate and valid but may not meet a certain conformity rule, as organisations and organizations use different kinds of standardization and standardisation.
The good thing about standards: there are so many of them 😉
btw… One time I did a data profiling project, investigating vendor data (of which I was one). My “own” record was stored in three applications: none of them showed the right company name, only one had the right address, and the phone number formatting was awkward and erroneous… I assume that my customer tried to challenge me.
Indeed Matthias – and in some areas we have too many standards and in some other areas we miss one.
Checking data against something you really know about is always a good exercise. And when you realize the trouble with that, you have a hunch about the state of all the rest of the data.