While we are waiting for the LEI

As told in the post Business Entity Identifiers, a new global numbering system for business entities has been on the way for some time. The wonder is called the LEI (Legal Entity Identifier).

The implementation work has been adopted by the Financial Stability Board. The latest developments are reported in a publication called Fifth progress note on the Global LEI Initiative.

Surely, while the implementation may be in good hands, the setup doesn’t give hope for a speedy process in which every legal entity in the world will have an LEI within a short time.

And then the next question will be how long it will take before organizations have enriched their existing databases with that LEI and implemented on-boarding processes where an LEI is captured with every new insertion of party master data describing a legal entity.

A good way to start preparing is to implement features in on-boarding business processes that capture available external reference data when new party entities are added to your databases. Having the best available information about names, addresses and business entity identifiers today, along with a culture of capturing such information, will be a great starting point.

And oh, the instant Data Quality concept is precisely all about doing that.
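
As a minimal sketch of what capturing an identifier at on-boarding could involve, the Python below checks the format and check digits of an LEI before it is stored. It assumes the ISO 17442 layout (20 alphanumeric characters, the last two being check digits computed with ISO 7064 MOD 97-10). The function names are hypothetical and not part of any particular product.

```python
import re

# Assumed layout: 18 alphanumeric characters followed by 2 numeric check digits.
_LEI_PATTERN = re.compile(r"[0-9A-Z]{18}[0-9]{2}")


def _to_digits(text: str) -> str:
    # Map 0-9 to themselves and A-Z to 10-35 before the modulo calculation.
    return "".join(str(int(ch, 36)) for ch in text)


def lei_check_digits(base18: str) -> str:
    """Return the two check digits for an 18-character LEI base."""
    remainder = int(_to_digits(base18.upper() + "00")) % 97
    return f"{98 - remainder:02d}"


def is_valid_lei(candidate: str) -> bool:
    """Return True if the candidate has the LEI format and a correct checksum."""
    lei = candidate.strip().upper()
    if not _LEI_PATTERN.fullmatch(lei):
        return False
    return int(_to_digits(lei)) % 97 == 1
```

A check like this only catches typing errors. Whether the LEI actually belongs to the party being on-boarded still has to be confirmed against an external reference source.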


Business Entity Identifiers

The least cumbersome way of uniquely identifying a business partner, whether a company, government body or other form of organization, is to use an externally provided number.

However, there are quite a lot of different numbers to choose from.

All-Purpose National Identification Numbers

In some countries, like those in Scandinavia, the public sector assigns a unique number to every company. The number is used in every relation to the public sector and is open for the private sector to use for identification purposes as well.

As reported in the post Single Company View I worked with the early implementation of such a number in Denmark way back in time.

Single-Purpose National Identification Numbers

In most countries there are multiple systems of numbers for companies each with an original special purpose. Examples are registration numbers, VAT numbers and employer identification numbers.

My current UK company has both a registration number and a VAT number and, very embarrassingly for a data quality and master data geek, these two numbers have different names and addresses attached.

Other Numbering Systems

The best known business entity numbering system around the world is probably the DUNS number used by Dun & Bradstreet. As examined in the post Select Company_ID from External_Source Where Possible, the use of DUNS numbers and similar business directory IDs is a very common way of uniquely identifying business partners.

In the manufacturing and retail world legal entities may, as part of the Global Data Synchronization Network, be identified with a Global Location Number (GLN).

There has been a lot of talk in the financial sector lately about implementing yet another numbering system for legal entities, with an identifier usually abbreviated as LEI. Wikipedia has the details about a Legal Entity Identification for Financial Contracts.

These are only some of the most used numbering systems for business entities.

So, the trend doesn’t seem to be a single source of truth but multiple sources making up some kind of the truth.


“Fitness for Use” is Dead

The definition of data quality as being “fitness for use” is challenged. “Real world alignment” or similar expressions are gaining traction.

Back in May, Malcolm Chisholm tweeted about the shortcomings of the “fitness for use” definition, as reported here on the blog in the post The Problem with Multiple Purposes of Use.

Last week the tweet was elaborated on in the Information Management article called Data Quality is Not Fitness for Use. Today Jim Harris has a follow-up post called Data and its Relationships with Quality.

When working with data quality in the domain with by far the most data quality issues, namely the quality of contact data (customer, supplier, employee and other party master data), I have many times experienced that making data fit for more than a single purpose of use is almost always about better real world alignment. Having data that actually represents what it purports to represent always helps with making data fit for use, even with more than one purpose of use.

In practice, in the contact data realm, that means for example:

  • Getting a standardized address at contact data entry makes it possible for you to easily link to sources with geo codes, property information and other location data for multiple purposes.
  • Obtaining a company registration number or other legal entity identifier (LEI) at data entry makes it possible to enrich with a wealth of available data held in public and commercial sources making data fit for many use cases.
  • Having a person’s name spelled according to available sources for the country in question helps a lot with typical data quality issues such as uniqueness and consistency.

Also, making data real world aligned from the start is a big help when maintaining data, as the real world will change over time.

Data quality tools will in my eyes also have to adapt to this trend, as discussed with Gartner in the post Quality of Data behind the Data Quality Magic Quadrant.


Pulling Data Quality from the Cloud

In a recent post here on the blog the benefits of instant data enrichment were discussed.

In the contact data capture context these are some examples:

  • Getting a standardized address at contact data entry makes it possible for you to easily link to sources with geo codes, property information and other location data.
  • Obtaining a company registration number or other legal entity identifier (LEI) at data entry makes it possible to enrich with a wealth of available data held in public and commercial sources.
  • Having a person’s name spelled according to available sources for the country in question helps a lot with typical data quality issues such as uniqueness and consistency.

However, if you are doing business in many countries it is a daunting task to connect with the best of breed sources of big reference data. Add to that the fact that many enterprises are doing both business-to-business (B2B) and business-to-consumer (B2C) activities, including interacting with small business owners. This means you have to link to the best sources available for addresses, companies and individuals.

A solution to this challenge is using Cloud Service Brokerage (CSB).

An example of a Cloud Service Brokerage suite for contact data quality is the instant Data Quality (iDQ™) service I’m working with right now.

This service can connect to big reference data cloud services from all over the world. Some are open data services in the contact data realm, some are international commercial directories, and some belong to the wealth of national reference data services for addresses, companies and individuals. Even social network profiles are on the radar.


Instant Data Enrichment

Data enrichment is one of the core activities within data quality improvement. Data enrichment is about updating your data in order to be more real world aligned by correcting and completing with data from external reference data sources.

Traditionally, data enrichment has been a follow-up activity to data matching. Doing data matching as a prerequisite for data enrichment has been a good part of my data quality endeavor during the last 15 years, as reported in the post The GlobalMatchBox.

During the last couple of years I have tried to be part of the quest for doing something about poor data quality by moving the activities upstream. Upstream prevention of data quality issues is better than downstream data cleansing wherever applicable. Doing the data enrichment at data capture is the fast track to improving data quality, for example by avoiding contact data entry flaws.

It’s not that you have to enrich with all the possible data available from external sources at once. The most important thing is that you are able to link back to external sources without having to do (too much) fuzzy data matching later. Some examples:

  • Getting a standardized address at contact data entry makes it possible for you to easily link to sources with geo codes, property information and other location data at a later point.
  • Obtaining a company registration number or other legal entity identifier (LEI) at data entry makes it possible to enrich with a wealth of available data held in public and commercial sources.
  • Having a person’s name spelled according to available sources for the country in question helps a lot when you later have to match with other sources.

In that way your data will be fit for current and future multiple purposes.
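
The difference between linking by a captured identifier and matching later can be sketched as below. This is only an illustration under stated assumptions: the reference data, the `registration_no` field, the `link` function and the 0.85 threshold are all hypothetical, and Python’s standard `difflib` stands in for whatever fuzzy matching technique would be used in practice.

```python
from difflib import SequenceMatcher

# Hypothetical external reference data, keyed by an external identifier.
REFERENCE = {
    "REG-001": {"name": "Acme Trading Ltd", "city": "London"},
    "REG-002": {"name": "Nordic Widgets A/S", "city": "Copenhagen"},
}


def link(record: dict, threshold: float = 0.85):
    """Return (identifier, method) for a party record, or (None, 'unmatched')."""
    # If the identifier was captured at data entry, linking is a plain lookup.
    key = record.get("registration_no")
    if key in REFERENCE:
        return key, "exact key"
    # Otherwise fall back to fuzzy comparison of names across all references.
    best_id, best_score = None, 0.0
    for ref_id, ref in REFERENCE.items():
        score = SequenceMatcher(
            None, record["name"].lower(), ref["name"].lower()
        ).ratio()
        if score > best_score:
            best_id, best_score = ref_id, score
    if best_score >= threshold:
        return best_id, "fuzzy name"
    return None, "unmatched"
```

With the key in place, a misspelled name does no harm to the link. Without it, the result depends on a similarity threshold that has to be tuned and will sometimes be wrong.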


The Big Search Opportunity

The other day Bloomberg Businessweek had an article titled Facebook Delves Deeper Into Search.

I have always been advocating for having better search functionality in order to get more business value from your data. That certainly also applies to big data.

In a recent post called Big Reference Data Musings here on the blog, the challenge of utilizing large external data sources for getting better master data quality was discussed. In a comment Greg Leman pointed out that there often isn’t a single source of the truth, as you could for example expect from a huge reference data source such as the Dun & Bradstreet WorldBase, which holds information about business entities from all over the world.

Indeed, our search capabilities must optimally span several sources. In the business directory search realm you may include several sources at a time. You could for example supplement the D&B WorldBase with EuroContactPool if you do business in Europe, or with the source called Wiki-Data (being renamed to AvoxData) if you are in financial services and want to utilize the new Legal Entity Identifier (LEI) for counterparty uniqueness in conjunction with other more complete sources.

As examined in Search and if you are lucky you will find, combining search on external reference data sources and internal master data sources is a big opportunity too. In doing that you must, as described in the follow-up piece named Wildcard Search versus Fuzzy Search, get the search technology right.
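
To illustrate the point about spanning several sources and picking the right search technique, here is a small sketch that runs both a wildcard search and a fuzzy search over two sources at once. The source names and contents are invented, the function names and the 0.6 threshold are assumptions, and the standard library modules `fnmatch` and `difflib` are used only as simple stand-ins for real search technology.

```python
from difflib import SequenceMatcher
from fnmatch import fnmatchcase

# Hypothetical sources; names and contents are invented for illustration.
SOURCES = {
    "internal master data": ["Acme Trading Ltd", "Nordic Widgets A/S"],
    "external directory": ["ACME Trading Limited", "Acme Holdings plc"],
}


def wildcard_search(pattern: str):
    """Return (source, name) pairs whose name matches a shell-style pattern."""
    p = pattern.lower()
    return [
        (src, name)
        for src, names in SOURCES.items()
        for name in names
        if fnmatchcase(name.lower(), p)
    ]


def fuzzy_search(query: str, threshold: float = 0.6):
    """Return (score, source, name) tuples, best match first."""
    q = query.lower()
    hits = []
    for src, names in SOURCES.items():
        for name in names:
            score = SequenceMatcher(None, q, name.lower()).ratio()
            if score >= threshold:
                hits.append((round(score, 2), src, name))
    return sorted(hits, reverse=True)
```

A wildcard pattern such as `acme*` finds entries in both sources, but a misspelled query such as `akme*` finds nothing, whereas the fuzzy search still ranks the Acme entries for the query `Akme Trading`.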

I see in the Bloomberg article that Facebook don’t intend to completely reinvent the wheel for searching big data, as they have hired a Google veteran, the Danish computer scientist Lars Rasmussen, for the job.
