My last post about search functionality in Master Data Management (MDM) solutions was called “Search and if you are lucky you will find”.
In the comments, the use of wildcards versus fuzzy search was touched upon.
The problem with wildcards
I have a company called “Liliendahl Limited” as this is the spelling of the name as it is registered with the Companies House for England and Wales.
But say someone is searching using one of the following strings:
- “Liliendahl Ltd”,
- “Liliendal Limited” or
- “Liljendahl Limited”
In these situations, search functionality should return the hit “Liliendahl Limited”.
The problem, however, is that most users don’t have the time, patience and skills to construct search strings with wildcard characters. And the registered name may differ slightly from what the user expects, in a way that the chosen wildcard characters don’t cover.
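To make the wildcard problem concrete, here is a minimal sketch using shell-style wildcards from Python's standard library. The patterns are hypothetical examples of what a user might type by hand, and `wildcard_hits` is an illustrative helper, not part of any MDM product.

```python
from fnmatch import fnmatchcase

REGISTERED = "Liliendahl Limited"

# Hypothetical patterns a user might construct by hand.
PATTERNS = [
    "Liliendahl L*",       # anticipates "Ltd" versus "Limited"
    "Lil?endahl Limited",  # anticipates "i" versus "j" as the fourth letter
    "Liliendal*",          # wildcard in the wrong place: the "h" comes before the "l"
]


def wildcard_hits(name, patterns):
    """Return a dict mapping each pattern to whether it matches the name."""
    return {pattern: fnmatchcase(name, pattern) for pattern in patterns}


if __name__ == "__main__":
    for pattern, hit in wildcard_hits(REGISTERED, PATTERNS).items():
        print(f"{pattern!r}: {'hit' if hit else 'no hit'}")
```

The first two patterns only work because the user guessed exactly where the spelling would vary. The third pattern misses the registered name even though it looks like a reasonable guess.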
Tools for batch matching of name strings have been around for many years. When doing a batch match you can’t practically use wildcard characters. Instead matching algorithms typically rely on one, or in the best case a combination, of these techniques:
The same techniques can be used for interactive search, thus reaching a hit in one fast search.
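As a rough illustration of error-tolerant interactive search, the sketch below scores the three example search strings against a small list of names. It uses `difflib.SequenceMatcher` from Python's standard library as a stand-in similarity measure; this is an assumption made for illustration and not the algorithm of any product mentioned in this post. The candidate names other than “Liliendahl Limited”, the `fuzzy_search` helper and the 0.8 threshold are likewise hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical master data; only the first name comes from the post.
NAMES = [
    "Liliendahl Limited",
    "Lindahl Logistics Ltd",
    "Acme Trading Limited",
]


def similarity(query, name):
    """Case-insensitive similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, query.lower(), name.lower()).ratio()


def fuzzy_search(query, names, threshold=0.8):
    """Return (name, score) pairs at or above the threshold, best first."""
    scored = [(name, similarity(query, name)) for name in names]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for query in ("Liliendahl Ltd", "Liliendal Limited", "Liljendahl Limited"):
        print(query, "->", fuzzy_search(query, NAMES))
```

The user types the name as they remember it, with no wildcard characters, and the registered name still comes back as the best hit. A real matching engine would typically combine several techniques and tune the threshold against actual data.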
I have worked with the Omikron FACT algorithm for batch matching. This algorithm has since been implemented as a fuzzy search algorithm as well.
One area of use for this is when webshop users are searching for a product or service within your online shop. This feature is, along with other eCommerce capabilities, branded as FACT-Finder.
The fuzzy search capabilities are also used in a tool I’m involved with called iDQ. Here external reference data sources, in combination with internal master data sources, are searched in an error-tolerant way, making data available to the user despite heaps of possible spellings.