Browse Definitions :

fuzzy search

What is a fuzzy search?

A fuzzy search is a technique that uses search algorithms to find strings that match patterns approximately. It's particularly useful for helping users find webpages without having to know exactly what they're looking for or how a word is spelled. Fuzzy searches are also used for Structured Query Language lookups to help database users find records without having to be sure of the exact spelling of the value they're looking for.

A fuzzy search is performed using a fuzzy matching algorithm, which returns a list of results based on likely relevance even though search argument words and spellings may not be an exact match. For web lookups, exact and highly relevant matches appear near the top of the list. Subjective relevance ratings may be given, usually as percentages.

How do fuzzy searches work?

A fuzzy matching algorithm can operate like a spell checker and spelling-error corrector. For example, if a user types "Misissippi" into Yahoo or Google -- both of which use fuzzy matching -- a list of hits is returned along with the question, "Did you mean Mississippi?" Alternative spellings and words that sound the same but are spelled differently are given.

A fuzzy matching algorithm can compensate for common input typing errors, as well as errors introduced by optical character recognition scanning of printed documents. The algorithm can return hits with content that contains a specified base word along with prefixes and suffixes. For example, if planet is entered as a search word, hits occur for sites containing words such as protoplanet or planetary. The program can also find synonyms and related terms, working like an online thesaurus or encyclopedic cross-reference tool.

Fuzzy matching algorithms usually return irrelevant hits, as well as relevant ones. Superfluous results are likely to occur for terms with multiple meanings, only one of which is the meaning the user intends. If the user has only a vague or general idea of the topic or doesn't know exactly what to look for, the ratio of relevant hits to irrelevant hits tends to be low. However, the ratio is even lower when an exact matching program is used in this situation.

Fuzzy searching is much more powerful than exact searching when used for research and investigation and is especially useful when researching unfamiliar, foreign-language or sophisticated terms for which the proper spellings aren't widely known. Fuzzy searching can also be used to locate individuals based on incomplete or partially accurate identifying information.

Many search engines enable users to specifically request a fuzzy search in the search query by using a tilde (~) at the end of the word or term they want to search with fuzziness.

What is the Levenshtein distance?

How close the search term is to the exact match is measured in terms of edit distance -- a metric that represents the cost of converting one string to another. The most widely used method for computing the edit distance is to take the minimum number of single-character changes needed to convert the search term to the exact match. This computation results in the Levenshtein distance, which is named after the Russian scientist who invented the method, Vladimir Levenshtein.

The following single-character changes are used to compute edit distance. Some algorithms use weights to place a higher value on one or more of the primitive operations:

  • Insertion: fuzzy serch → fuzzy search
  • Deletion: fuzzey search → fuzzy search
  • Substitution: fuzzy seerch → fuzzy search
  • Transposition: fuzzy saerch → fuzzy search

What are the pros and cons of fuzzy matching?

While fuzzy matching can be helpful to researchers who don't know the exact spelling of what they're looking for, it also can be a source of frustration, as it may return irrelevant results that the user must then filter.

Fuzzy matching offers the following benefits:

  • Online visitors can find products or product categories without knowing the correct spellings.
  • Tourists can locate cities without knowing exact spellings or by using an approximate spelling in cases where the city name is listed in a different language.
  • Scientists can find articles without having to know the exact titles.
  • Users can find the names of films or books without knowing the exact titles.
  • Users can search for people without knowing the exact spelling of their names.

Fuzzy matching also has the following drawbacks:

  • It sometimes returns too many results, requiring the user to sift through all the suggestions to find the one that's most appropriate. Sometimes, the most appropriate result is toward the end of the list, several pages down; and sometimes, none of the results are appropriate.
  • It only compensates for typos or misspellings and must be complemented with a semantic algorithm to also consider synonyms and semantics.

What is the difference between a fuzzy search and a wildcard search?

A fuzzy search is either applied by default or specifically requested by the user -- frequently by typing a tilde at the end of the search term. The fuzzy search returns matches within a certain edit distance by assuming that any letters in the search term can be inserted, deleted, substituted or transposed.

A wildcard search can only be specifically requested by the user by using wildcards, which are characters that indicate different kinds of matches. The search engine only considers certain letters can be altered, assuming all the other letters are to be matched exactly. The following are two of the most commonly used wildcards along with their meanings:

  1. A question mark (?) is used for single-character replacement. A search on f?zzy might yield fuzzy and fizzy.
  2. An asterisk (*) is used for multiple-character matching. A search on fuzz* might yield fuzzie and fuzzy.

Learn how focused specialty search engines differ from all-purpose search engines, such as Google.

This was last updated in August 2022

Continue Reading About fuzzy search

  • local area network (LAN)

    A local area network (LAN) is a group of computers and peripheral devices that are connected together within a distinct ...

  • TCP/IP

    TCP/IP stands for Transmission Control Protocol/Internet Protocol and is a suite of communication protocols used to interconnect ...

  • firewall as a service (FWaaS)

    Firewall as a service (FWaaS), also known as a cloud firewall, is a service that provides cloud-based network traffic analysis ...

  • identity management (ID management)

    Identity management (ID management) is the organizational process for ensuring individuals have the appropriate access to ...

  • single sign-on (SSO)

    Single sign-on (SSO) is a session and user authentication service that permits a user to use one set of login credentials -- for ...

  • fraud detection

    Fraud detection is a set of activities undertaken to prevent money or property from being obtained through false pretenses.

  • IT budget

    IT budget is the amount of money spent on an organization's information technology systems and services. It includes compensation...

  • project scope

    Project scope is the part of project planning that involves determining and documenting a list of specific project goals, ...

  • core competencies

    For any organization, its core competencies refer to the capabilities, knowledge, skills and resources that constitute its '...

  • recruitment management system (RMS)

    A recruitment management system (RMS) is a set of tools designed to manage the employee recruiting and hiring process. It might ...

  • core HR (core human resources)

    Core HR (core human resources) is an umbrella term that refers to the basic tasks and functions of an HR department as it manages...

  • HR service delivery

    HR service delivery is a term used to explain how an organization's human resources department offers services to and interacts ...

Customer Experience
  • martech (marketing technology)

    Martech (marketing technology) refers to the integration of software tools, platforms, and applications designed to streamline ...

  • transactional marketing

    Transactional marketing is a business strategy that focuses on single, point-of-sale transactions.

  • customer profiling

    Customer profiling is the detailed and systematic process of constructing a clear portrait of a company's ideal customer by ...