Legal Forms from Hell

When matching data on company names, a basic challenge is that a proper company name, in most cultures and in most cases, has two elements:

  • The actual company name
  • The legal form

Some worldwide examples:

  • Informatica Corporation
  • Talend SA
  • SAP Deutschland AG & Co. KG
  • Sony Kabushiki Kaisha
  • LEGO A/S

There are hundreds of different legal forms, each with full and abbreviated spellings. Wikipedia has a list (there called types of business entity).

However, when company names are typed into databases, the legal form is often omitted. Even where legal forms are present, they may be written in full or abbreviated, with varying spelling, punctuation and so on. As the actual company names suffer from the same fuzziness, the complexity is overwhelming.

A common way of handling this issue in data matching is to separate the legal form and then focus on comparing the remaining part, the actual company name. This has to be done per country, or else you may remove the entire name of a company, as with an Italian company called Société Anonyme, which is a French legal form.
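As a rough illustration of the country-specific separation described above, here is a minimal Python sketch. The lookup table `LEGAL_FORMS` and the helpers `normalize` and `split_legal_form` are hypothetical names made up for this example, and the table holds only a handful of forms; a real list would contain hundreds per the variety mentioned earlier. The sketch also assumes the legal form comes last in the name, which is not true everywhere.

```python
import unicodedata

# Hypothetical, deliberately tiny lookup: country code -> legal forms,
# written in the normalized spelling produced by normalize() below.
LEGAL_FORMS = {
    "FR": ["societe anonyme", "sa", "sarl"],
    "DE": ["ag & co kg", "gmbh", "ag", "kg"],
    "DK": ["a/s", "aps"],
    "IT": ["spa", "srl"],
    "JP": ["kabushiki kaisha", "kk"],
    "US": ["corporation", "corp", "inc", "llc"],
}


def normalize(text):
    """Lowercase, strip accents and dots, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    no_dots = no_accents.replace(".", "")
    return " ".join(no_dots.lower().split())


def split_legal_form(name, country):
    """Return (actual_name, legal_form) using only the forms of `country`.

    legal_form is None when no form of that country is found at the end.
    """
    cleaned = normalize(name)
    # Try longer forms first so "ag & co kg" wins over plain "kg".
    forms = sorted(LEGAL_FORMS.get(country, []), key=len, reverse=True)
    for form in forms:
        suffix = " " + form
        # Requiring a leading space means something must remain as the name.
        if cleaned.endswith(suffix):
            return cleaned[: -len(suffix)], form
    return cleaned, None


print(split_legal_form("SAP Deutschland AG & Co. KG", "DE"))
print(split_legal_form("Talend S.A.", "FR"))
print(split_legal_form("Société Anonyme", "IT"))
```

Because only the Italian forms are consulted for the Italian company, the name Société Anonyme is left intact, while the French and German examples are split into name and legal form. The remaining name part would then go on to whatever fuzzy comparison the matching process uses.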

While the practice of having legal forms in company names may serve the original purpose of knowing the risk of doing business with that entity, it certainly does not help with the uniqueness dimension of data quality.

One would think it is time to change the bad (legally required) practice of mixing legal forms with company names and to serve the original purpose in another, more data-quality-friendly way.

2 thoughts on “Legal Forms from Hell”

  1. biztrend 9th February 2011 / 16:25

    I know this is an older post .. but I just came across this.

    Can’t this problem be easily solved by keeping a match code in each record? Then, whether a record is entered under the legal name or the common name, both would link to the same match code?

    • Henrik Liliendahl Sørensen 9th February 2011 / 17:46

      Hi there. That’s a way and certainly variants over the techniques are used when making golden records in master data hubs.
