The Wikipedia article on Identity Resolution has this take on the difference between good old data matching and Entity Resolution:
“Here are four factors that distinguish entity resolution from data matching, according to John Talburt, director of the UALR Laboratory for Advanced Research in Entity Resolution and Information Quality:
- Works with both structured and unstructured records, and it entails the process of extracting references when the sources are unstructured or semi-structured
- Uses elaborate business rules and concept models to deal with missing, conflicting, and corrupted information
- Utilizes non-matching, asserted linking (associate) information in addition to direct matching
- Uncovers non-obvious relationships and association networks (i.e. who’s associated with whom)”
If you look at the above-mentioned factors that distinguish entity resolution from data matching, some of the often-mentioned features of the new big data technologies shine through:
- Working with unstructured and semi-structured data is probably the most frequently mentioned difference between working with small data and working with big data.
- Working with associations is a feature of graph databases and similar technologies, as mentioned in the post Will Graph Databases become Common in MDM?
So, in the quest to expand small-data matching into Entity (or Identity) Resolution, we will be helped by general developments in working with big data.
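
To make the idea of combining direct matching with asserted links a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the toy records, the field names, the `direct_match` rule (a simple case-insensitive email comparison), and the `asserted_links` list are hypothetical, and the union-find grouping is just one simple way to build clusters, not the method described by Talburt or used by any particular product.

```python
from itertools import combinations

# Hypothetical toy records; the field names are illustrative only.
records = {
    "r1": {"name": "John Smith", "email": "john@example.com"},
    "r2": {"name": "J. Smith", "email": "JOHN@example.com"},
    "r3": {"name": "Jon Smyth", "email": "jsmyth@example.org"},
    "r4": {"name": "Mary Jones", "email": "mary@example.com"},
}

# Asserted links: associations known from elsewhere that no
# field-by-field comparison would find (assumed for this example).
asserted_links = [("r2", "r3")]


def direct_match(a, b):
    """Classic data matching: here, a case-insensitive email comparison."""
    return a["email"].strip().lower() == b["email"].strip().lower()


def resolve(records, asserted_links):
    """Group record ids into entities using direct matches plus asserted links."""
    parent = {rid: rid for rid in records}

    def find(x):
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    # Step 1: link every pair of records that matches directly.
    for a, b in combinations(records, 2):
        if direct_match(records[a], records[b]):
            union(a, b)

    # Step 2: add the asserted links on top of the direct matches.
    for a, b in asserted_links:
        union(a, b)

    # Collect the connected components as sorted lists of record ids.
    clusters = {}
    for rid in records:
        clusters.setdefault(find(rid), set()).add(rid)
    return sorted(sorted(c) for c in clusters.values())


entities = resolve(records, asserted_links)
print(entities)
```

With direct matching alone, r1 and r2 end up together and r3 stands alone. The asserted link between r2 and r3 pulls r3 into the same entity, which is the kind of result pure field comparison would miss.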