Entity resolution is the discipline of uniquely identifying your master data records, typically those holding data about customers, products and locations. Entity resolution is closely related to the concept of a single version of the truth.
Questions to ask during entity resolution include:
- Does a given customer master data record represent a real-world person or organization?
- Is a person acting as both a private customer and a small business owner going to be seen as the same entity?
- Is a product coming from supplier A going to be identified as the same as the equivalent product coming from supplier B?
- Is the geocode for the center of a parcel the same place as the geocode of the point where the parcel borders a public road?
We may come a long way in automating entity resolution by using advanced data matching and exploiting rich sources of external reference data. We may also be able to handle the complex structures of the real world by using sophisticated hierarchy management, and thereby make an entity revolution in our databases.
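To make the data matching part a little more concrete, here is a minimal sketch in Python of deciding whether two customer records describe the same entity. It is an illustration only and not how any particular matching tool works: the record fields (`name`, `address`), the helper names and the 0.85 threshold are all assumptions of mine, and the similarity measure is simply the one in Python's standard `difflib` module. Real matching engines add reference data lookups, blocking and far more careful scoring.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and collapse repeated whitespace."""
    cleaned = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_same_entity(rec_a: dict, rec_b: dict, threshold: float = 0.85) -> bool:
    """Treat two records as one entity if the average of their name and
    address similarity reaches the threshold (an assumed, tunable value)."""
    name_score = similarity(rec_a["name"], rec_b["name"])
    address_score = similarity(rec_a["address"], rec_b["address"])
    return (name_score + address_score) / 2 >= threshold


# Hypothetical records: the first two differ only in case, spacing and punctuation.
record_1 = {"name": "John  Smith", "address": "12 Main St."}
record_2 = {"name": "john smith", "address": "12 main st"}
record_3 = {"name": "Acme Corporation", "address": "99 Harbor Road"}

candidate_pairs = [(record_1, record_2), (record_1, record_3)]
decisions = [is_same_entity(a, b) for a, b in candidate_pairs]
```

Lowering the threshold merges more records and raises the risk of false matches, which is exactly the kind of choice the questions above force on us.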
But I am often faced with the fact that most organizations don’t want an entity revolution. There are always plenty of good reasons why various frequently run business processes don’t require full entity resolution and would only be complicated by having it (unless drastically reengineered). The tangible, immediate negative business impact of an entity revolution trumps the softer positive improvement in business insight from such a revolution.
Therefore we are mostly making entity evolutions, balancing the current business requirements against the distant ideal of a single version of the truth.