When matching customer master data, whether to find duplicates or to find the corresponding real-world entities in a business or consumer directory, you may use a data quality deduplication tool to do the hard work.
Depending on its capabilities and on the nature and purpose of the data, the tool will typically find:
A: The positive automated matches. Ideally you will take samples for manual inspection.
C: The negative automated matches.
B: The dubious part selected for manual inspection.
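The three pots can be pictured as a simple threshold rule on a match score. The sketch below is only an illustration: the score range of 0 to 1, the `triage` function and both threshold values are assumptions of mine, and a real tool will tune such limits to the data and the purpose.

```python
# Hypothetical thresholds; real tools tune these per data set and purpose.
UPPER = 0.90  # at or above: automated positive match (pot A)
LOWER = 0.60  # below: automated negative match (pot C)


def triage(score: float, upper: float = UPPER, lower: float = LOWER) -> str:
    """Assign a candidate pair to pot 'A', 'B' or 'C' from its match score (0 to 1)."""
    if score >= upper:
        return "A"  # positive automated match
    if score < lower:
        return "C"  # negative automated match
    return "B"      # dubious: goes to manual inspection


# Example: sort a few scored candidate pairs into the three pots.
scored_pairs = [("pair-1", 0.97), ("pair-2", 0.72), ("pair-3", 0.15)]
pots = {"A": [], "B": [], "C": []}
for pair_id, score in scored_pairs:
    pots[triage(score)].append(pair_id)
```

Moving the two limits closer together shrinks the B pot and saves manual work, at the price of more wrong automated decisions.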
Humans are costly resources. Therefore the manual inspection of the B pot (and the A sample) may be supported by a user interface that helps get the job done quickly and accurately.
I have worked with the following features for such functionality:
- Random sampling for quality assurance – both from the A pot and from the manually settled matches from the B pot
- Check-out and check-in for multiuser environments
- Presenting a ranked range of computer selected candidates
- Color coding elements in matched candidates – like:
  - green for (near) exact name,
  - blue for a close name and
  - red for a far from similar name
- Possibility for marking:
  - as a manual positive match,
  - as a manual negative match (with reason) or
  - as questionable for later or supervisor inspection (with comments)
- Entering a match found by other methods
- Removing one or several members from a duplicate group
- Splitting a duplicate group into two groups
- Selecting survivorship
- Applying hierarchy linkage
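Two of the features above, color coding and splitting a duplicate group, could be sketched roughly as follows. This is a minimal illustration and not how any particular tool does it: the function names, the similarity limits and the choice of Python's `difflib.SequenceMatcher` as the name comparison are all assumptions.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity between two names, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def color_for(similarity: float, near_exact: float = 0.95, close: float = 0.75) -> str:
    """Map a similarity to a display color (the limits are hypothetical)."""
    if similarity >= near_exact:
        return "green"  # (near) exact name
    if similarity >= close:
        return "blue"   # close name
    return "red"        # far from similar name


def split_group(group: list, members_to_move: set) -> tuple:
    """Split one duplicate group into two: the members kept and the members moved out."""
    kept = [m for m in group if m not in members_to_move]
    moved = [m for m in group if m in members_to_move]
    return kept, moved


# Example: color a candidate name and split a three-member group.
color = color_for(name_similarity("Acme Corp", "ACME CORP"))
kept, moved = split_group(["rec-1", "rec-2", "rec-3"], {"rec-2"})
```

Removing one or several members from a group is the same operation where the moved members are released instead of forming a new group.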
Has anyone else out there worked with making or using a man-machine dialogue for this?