The Art of Business Directory Matching

A business directory is a list of companies in a given area and perhaps a given industry. One type of directory that is very useful for data quality work is a list of all companies in a given country. In many countries the authorities maintain such a list; elsewhere it is a matter of assembling local lists or using other forms of data capture. Many private service providers offer such lists, often with added information of various kinds.

If you take the customer/prospect master table from an enterprise doing B2B in a given country, you would expect the rows in that table to match the business directory of that country 100%. I am not saying that all data must be spelled exactly as in the directory, “only” that each row reflects the same real-world object as a directory entry.

During many years of providing and tuning solutions for business directory matching, as well as handling such match services from colleagues in the business, I have very, very seldom seen a 100% match; even 90% matches are very rare.

Why is that so? Some of the reasons I have stumbled over, related to the classic data quality dimensions, have been:

Completeness of business directories varies from country to country and between the lists provided by vendors. Some countries, such as those of the former Czechoslovakia, some English-speaking countries in the Pacific, the Nordics and others, have tight registration, while it is less tight in North America, other European countries and the rest of the world.

Actuality (timeliness) in business directories also differs a lot. It also matters whether the business directory covers dissolved entities and includes history tracking such as former names and addresses. Then consider the actuality of the customer/prospect table to be matched, and once again the time dimension has a lot to say.

Validity, accuracy and consistency, in both the directory and the table to be matched, are a natural cause of mismatch. Also, many B2B customer/prospect tables hold a lot of entities that are not formal business entities but other types of party master data.

Uniqueness may be defined differently in the directory and in the table to be matched. This includes the perception of hierarchies of legal entities and branches; not least, governmental and local authority bodies are a fuzzy crowd. Different roles, such as those of a small business owner, also pose challenges. The same is true of roles such as franchise takers and the use of trading styles.

Then, of course, the automated match technique applied and the human interaction involved are factors in the resulting match rate and in the quality of the match, measured as the frequency of false positives.
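To make the idea of a match rate concrete, here is a minimal sketch of one possible automated technique: normalize company names, score each customer row against the directory with a simple string similarity, and accept the best candidate above a threshold. Everything in it is an assumption made for illustration, including the sample names, the legal-form list, the 0.85 threshold and the choice of `difflib.SequenceMatcher`; real match engines typically also use addresses, registration numbers and country-specific rules.

```python
import re
from difflib import SequenceMatcher

# Hypothetical legal-form tokens to ignore; a real list would be country specific.
LEGAL_FORMS = {"ltd", "limited", "inc", "llc", "corp", "gmbh", "aps"}


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and drop legal-form tokens."""
    tokens = re.sub(r"[^\w\s]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_FORMS)


def best_match(name, directory, threshold=0.85):
    """Return (directory_id, score); directory_id is None below the threshold."""
    target = normalize(name)
    best_id, best_score = None, 0.0
    for dir_id, dir_name in directory.items():
        score = SequenceMatcher(None, target, normalize(dir_name)).ratio()
        if score > best_score:
            best_id, best_score = dir_id, score
    if best_score >= threshold:
        return best_id, best_score
    return None, best_score


def match_rate(customers, directory, threshold=0.85):
    """Share of customer rows that found a directory entry above the threshold."""
    if not customers:
        return 0.0
    matched = sum(
        1 for c in customers if best_match(c, directory, threshold)[0] is not None
    )
    return matched / len(customers)


# Invented sample data: a tiny directory and a customer/prospect table.
directory = {
    "D1": "Acme Trading Ltd.",
    "D2": "Nordic Data Quality ApS",
    "D3": "Pacific Freight Inc",
}
customers = [
    "ACME Trading Limited",     # same entity, different legal-form spelling
    "Nordic Data Qualty",       # same entity, with a typo
    "John Smith (consultant)",  # a party that is not a registered business
]

rate = match_rate(customers, directory)
```

With this sample data two of the three rows find a directory entry, so the match rate is two thirds. Lowering the threshold raises the match rate but also the risk of false positives, which is why the accepted matches still deserve human review.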
