When evaluating the results of automated data matching, your goal is typically to find false positives and false negatives: entities that are matched but shouldn’t be (false positives), and entities that are not matched but should have been (false negatives).
However, the fuzziness often used in the data matching process also applies to the evaluation of the results. Many dubious results are not a question of whether the matched database rows reflect the same real-world entity, but rather of whether the matched (or not matched) database rows reflect different members of a real-world hierarchy.
Example 1:
John Smith on 1 Main Street in Anytown
Mary & John Smith on 1 Main Str in Anytown

Example 2:
Anytown Municipality, Technical Dept
Municipality of Anytown

Example 3:
Acme Corporation, Anytown
Acme Corporation, Anywhere
All three examples above may be considered a false positive if matched and a false negative if not matched.
You may say that it depends on the purpose of use, which is true.
But in master data management we will probably have to meet multiple requirements at once, where we simultaneously need the match and don’t want the match. That is why we need to be able to resolve and store the results of fuzzy data matching in hierarchies.
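
To make the idea concrete, here is a minimal sketch in Python of how matched rows could be stored in a hierarchy, so that "is this a match?" is always answered at a chosen level. Everything in it is an assumption made for illustration: the Node class, the level names ("household", "person", "record") and the same_entity helper are hypothetical and not taken from any particular MDM product.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """One member of a real-world hierarchy (hypothetical structure)."""
    name: str
    level: str  # assumed level labels, e.g. "household", "person", "record"
    parent: "Node | None" = field(default=None, repr=False)
    children: list = field(default_factory=list, repr=False)

    def add(self, child: "Node") -> "Node":
        """Attach a child node and return it for convenient chaining."""
        child.parent = self
        self.children.append(child)
        return child


def ancestor_at(node, level):
    """Walk upwards until a node with the requested level is found."""
    while node is not None:
        if node.level == level:
            return node
        node = node.parent
    return None


def same_entity(a, b, level):
    """Two nodes match at a level if they share the same ancestor there."""
    x, y = ancestor_at(a, level), ancestor_at(b, level)
    return x is not None and x is y


# Example 1 stored as a hierarchy instead of a flat match / no-match flag.
household = Node("Smith household, 1 Main Street, Anytown", "household")
john = household.add(Node("John Smith", "person"))

# The first row describes one person; the second describes the household.
row_1 = john.add(Node("John Smith on 1 Main Street in Anytown", "record"))
row_2 = household.add(Node("Mary & John Smith on 1 Main Str in Anytown", "record"))

match_for_mailing = same_entity(row_1, row_2, "household")  # one letter per household
match_for_person = same_entity(row_1, row_2, "person")      # individual-level view
```

With this layout the two rows are a match for a purpose that works per household and not a match for a purpose that works per person, so both requirements can be served from the same stored result. Examples 2 and 3 could be handled the same way with levels such as organisation and department, or company and location.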