What is a search string?
A search string is the combination of text, numbers and sometimes special characters that a user enters into an application's search form to find specific types of information. The application submits the search string to a search engine, which interprets the request and then searches the target data. The data might come from a file server, database, large distributed collection of unstructured data or some other type of data store. The results of that search are then returned to the calling application and displayed to the user.
The exact format of a search string depends on the type of search being conducted, the capabilities of the application and search engine, and the nature of the target data. In some cases, a user types the search string directly into a simple text box, while at other times, the user completes a search form that includes text boxes, drop-down lists or other options for defining more complex search criteria. When this type of form is used, the application formats the actual search string behind the scenes and then submits it to the search engine.
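As a rough illustration of how an application might assemble a search string from a form, the following Python sketch combines three form fields into one string and a request URL. Everything named here is an assumption made for the example: the function build_search_url, the field names, the https://search.example.com/find address and the q parameter are hypothetical, and the quotation mark and minus sign conventions are only one common way of encoding such criteria.

```python
from urllib.parse import urlencode


def build_search_url(base_url, all_words="", exact_phrase="", excluded_words=""):
    """Combine hypothetical form fields into one search string and a request URL."""
    parts = []
    if all_words:
        parts.append(all_words.strip())
    if exact_phrase:
        # Assumed convention: quotation marks mark an exact phrase.
        parts.append(f'"{exact_phrase.strip()}"')
    for word in excluded_words.split():
        # Assumed convention: a leading minus sign excludes a term.
        parts.append(f"-{word}")
    search_string = " ".join(parts)
    # urlencode escapes spaces and quotation marks so the string is URL-safe.
    return search_string, f"{base_url}?{urlencode({'q': search_string})}"


search_string, url = build_search_url(
    "https://search.example.com/find",  # hypothetical search endpoint
    all_words="tallest mountain",
    exact_phrase="sea level",
    excluded_words="hiking",
)
print(search_string)
print(url)
```

The user never sees the assembled string in this arrangement; the form collects the criteria and the application does the formatting.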
Google provides a good example of the differences in search forms. The most common approach to using the Google search engine is to go to https://www.google.com and enter a search string in the text box. The search string might be nothing more than two or three words, such as "world's tallest mountain." However, Google also offers an advanced search form at https://www.google.com/advanced_search. The form includes text boxes and drop-down lists, making it possible to perform a much more detailed search.
The most basic search string contains only alphanumeric characters, usually in the form of words or phrases, sometimes combined with numbers, as in "books with over 500 pages." Some search engines ignore the stop words in a search string to improve search efficiency. Stop words are frequently used words such as an, by, the, with or on.
In addition, some search engines use a process called stemming, which reduces words to their base form when conducting a search. For example, if a user searches on the term "managers," the search results might also include manage, managed, manages and management.
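The two ideas above, stop-word removal and stemming, can be sketched in a few lines of Python. This is a deliberately naive illustration, not how any particular search engine works: the stop-word set contains only the examples mentioned above, and the suffix list is a hypothetical one chosen to handle the "managers" example. Production systems generally rely on established stemming algorithms instead.

```python
STOP_WORDS = {"an", "by", "the", "with", "on"}  # only the examples listed above

# Illustrative suffix list, longest first; real stemmers use far richer rules.
SUFFIXES = ("ement", "ers", "er", "ed", "es", "e", "s")


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(search_string):
    """Lowercase the string, drop stop words and stem what remains."""
    words = search_string.lower().split()
    return [naive_stem(w) for w in words if w not in STOP_WORDS]


for word in ["managers", "manage", "managed", "manages", "management"]:
    print(word, "->", naive_stem(word))
print(normalize("the managers on the team"))
```

Because all five word forms reduce to the same stem in this sketch, a search on any one of them could match documents containing the others.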
Using special characters in search strings
Many search engines also support the use of special characters in their search strings. Special characters are wildcard characters and search operators that can help better refine a search or make it possible to conduct a much broader search. Following are some of the more common special characters and their typical uses:
- Asterisk (*). An asterisk is typically used to represent zero or more letters. For example, the search term "south*" would match south, southwest, southeast, southerly and southern.
- Question mark (?) or hash sign (#). A question mark or hash sign can sometimes be used in place of a single character, as in "s?ear," which can return shear, smear, spear and swear.
- Plus sign (+). When placed at the beginning of a search term, the plus sign indicates that the term must be included in the results. For example, if "+laundry" is in the search string, all returned documents must include the word laundry.
- Minus sign (-). When placed at the beginning of a search term, the minus sign tells the search engine that the search results should not include this term. For example, if "-nutbutter" is included in the search string, none of the returned documents should include the word nutbutter.
- Single or double quotation marks (' or "). Enclosing a search term or phrase in quotation marks tells the search engine to return only those results that contain the quoted word or phrase. For example, a search string that includes "Madame Marie Curie" will return files that contain the specified phrase but will not return files that include only Madame Curie or Marie Curie.
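The asterisk and question mark wildcards described in the list above can be approximated by translating them into a regular expression. The following Python sketch assumes the character-level meaning given in the list (an asterisk stands for zero or more characters within a word, a question mark for exactly one); the function names and the sample vocabulary are hypothetical.

```python
import re


def wildcard_to_regex(pattern):
    """Translate * (zero or more word characters) and ? (exactly one) into a regex."""
    pieces = []
    for ch in pattern:
        if ch == "*":
            pieces.append(r"\w*")
        elif ch == "?":
            pieces.append(r"\w")
        else:
            pieces.append(re.escape(ch))  # treat every other character literally
    return re.compile("".join(pieces), re.IGNORECASE)


def matching_words(pattern, words):
    """Return the words that match the wildcard pattern in full."""
    regex = wildcard_to_regex(pattern)
    return [w for w in words if regex.fullmatch(w)]


vocabulary = ["south", "southwest", "southern", "north",
              "shear", "spear", "smear", "sear", "swear"]
print(matching_words("south*", vocabulary))
print(matching_words("s?ear", vocabulary))
```

Note that "sear" does not match "s?ear" in this sketch, because the question mark must stand for exactly one character.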
Not all search engines handle special characters in the same way (if they support them at all). For example, Google treats the asterisk wildcard as a placeholder for one or more words, rather than individual characters. Users should familiarize themselves with the system they're using to ensure that they're getting the most out of their searches.
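To make one possible interpretation concrete, the following Python sketch parses the plus sign, minus sign and double quotation mark operators and checks a document against them. It is an assumption-laden simplification: the functions parse and matches are hypothetical, quoted phrases are treated as required, terms without an operator are collected but ignored when matching, and matching is a plain case-insensitive substring test.

```python
import re

# An optional + or - sign, followed by either a quoted phrase or a single word.
TOKEN = re.compile(r'([+-]?)(?:"([^"]*)"|(\S+))')


def parse(search_string):
    """Split a search string into required, excluded and optional terms."""
    required, excluded, optional = [], [], []
    for sign, phrase, word in TOKEN.findall(search_string):
        term = (phrase or word).lower()
        if sign == "-":
            excluded.append(term)
        elif sign == "+" or phrase:
            required.append(term)  # assumption: quoted phrases are mandatory
        else:
            optional.append(term)
    return required, excluded, optional


def matches(search_string, document):
    """True if the document has every required term and no excluded term."""
    text = document.lower()
    required, excluded, _optional = parse(search_string)
    return all(t in text for t in required) and not any(t in text for t in excluded)


query = '+laundry -nutbutter "Madame Marie Curie"'
print(parse(query))
print(matches(query, "Madame Marie Curie did her own laundry"))
print(matches(query, "Madame Curie did her own laundry"))
```

A different engine could reasonably make other choices, such as treating unmarked terms as required or matching whole words only.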
Some search engines also support the use of proximity operators, which specify the maximum distance between terms, based on the number of words that separate them. Proximity operators are often represented by a letter such as w (for within) or n (for near), followed by a number that specifies the distance in words. For example, the search string "apples n5 oranges" indicates that the two terms should be within five words of each other for a document to be included in the search results.
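A proximity search such as "apples n5 oranges" can be sketched in Python as follows. The sketch assumes that distance means the number of words separating the two terms, in either order, and it handles only the n form; the function names are hypothetical, and real systems may count distance or treat word order differently.

```python
import re


def parse_proximity(search_string):
    """Parse a string of the form 'term1 nN term2' into (term1, N, term2)."""
    m = re.fullmatch(r"(\w+)\s+n(\d+)\s+(\w+)", search_string.strip(), re.IGNORECASE)
    if m is None:
        raise ValueError(f"not a proximity search string: {search_string!r}")
    return m.group(1).lower(), int(m.group(2)), m.group(3).lower()


def within(text, first, second, max_separating):
    """True if some occurrence of each term has at most max_separating words between them."""
    words = re.findall(r"\w+", text.lower())
    first_positions = [i for i, w in enumerate(words) if w == first]
    second_positions = [i for i, w in enumerate(words) if w == second]
    return any(
        abs(i - j) - 1 <= max_separating
        for i in first_positions
        for j in second_positions
    )


first, distance, second = parse_proximity("apples n5 oranges")
near = "She packed apples, two pears and some oranges for lunch"
far = "Apples were on sale, but the store had run out of fresh oranges"
print(within(near, first, second, distance))
print(within(far, first, second, distance))
```

In the first sentence, four words separate the two terms, so it qualifies; in the second, eleven words separate them, so it does not.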
In addition, some search engines support the use of special terms for refining a search even further. With Google, for instance, you can use "site:" to specify which domain to search. An example of this is a search string that contains "site:irs.com," which limits the search results to data within the irs.com domain. Google also looks for terms that have been inadvertently truncated or misspelled and then provides results based on what appears to be the correct search string. For example, Google will replace "apples and orangs" with "apples and oranges." Other systems, such as database search engines, are usually much less forgiving.
Using logical operators in a search string
Logical operators, also called Boolean operators, enable users to link two search conditions to define more complex search logic. In this way, users can more precisely control what is included in and excluded from their search results. There are three basic logical operators, which are often entered in all capital letters:
- AND. The two search conditions must both evaluate to true for a resource to be included in the search results. For example, the search string "apples AND oranges" will return only documents that contain both search terms.
- OR. At least one of the two search conditions must evaluate to true for a resource to be included in the search results. For example, the search string "apples OR oranges" might return documents that contain apples, other documents that contain oranges and some documents that contain both words.
- NOT. If the NOT operator precedes a search term, that term is excluded from search results. For example, the search string "apples NOT oranges" would return documents that contain apples, but not documents that contain oranges.
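One way to picture the three operators is as set operations on the documents that contain each term. The Python sketch below uses a small hypothetical document collection and a helper function, docs_containing, invented for this example; it is not drawn from any particular search engine.

```python
# Hypothetical collection: document ID -> text.
documents = {
    1: "apples and oranges",
    2: "apples only",
    3: "oranges only",
    4: "peaches and cream",
}


def docs_containing(term):
    """Return the IDs of the documents that contain the term as a whole word."""
    return {doc_id for doc_id, text in documents.items() if term in text.split()}


apples = docs_containing("apples")
oranges = docs_containing("oranges")

print(sorted(apples & oranges))  # apples AND oranges: set intersection
print(sorted(apples | oranges))  # apples OR oranges: set union
print(sorted(apples - oranges))  # apples NOT oranges: set difference
```

With this collection, AND returns only document 1, OR returns documents 1 through 3, and NOT returns only document 2.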
Search engines that support logical operators also typically permit the use of parentheses, so that multiple search conditions can be linked together to define even more complex logic. Parentheses isolate specific elements of the search string to ensure that they're treated as a unit. Consider the following search string:
(apples AND oranges) OR (peaches AND cream)
For a file to be included in the search results, it must contain apples and oranges, or it must contain peaches and cream. A file that contains apples and peaches but not oranges or cream will not be returned. A file that contains any three of these terms, or all four, necessarily contains at least one complete pair, so it meets the search conditions specified by the logical operators and will be included in the results.
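The logic of this search string can be checked by brute force. The following Python sketch writes the expression as a hypothetical function, satisfies, and evaluates it against every possible combination of the four terms; it models only which terms a file contains, not how a search engine would actually evaluate the string.

```python
from itertools import combinations

TERMS = ("apples", "oranges", "peaches", "cream")


def satisfies(present):
    """Evaluate (apples AND oranges) OR (peaches AND cream) for a set of terms."""
    return ("apples" in present and "oranges" in present) or (
        "peaches" in present and "cream" in present
    )


# List every combination of terms a file might contain and whether it matches.
for size in range(len(TERMS) + 1):
    for combo in combinations(TERMS, size):
        label = ", ".join(combo) or "(none)"
        print(f"{label:40} {satisfies(set(combo))}")
```

Running the loop shows that no single term matches, that only the two named pairs match among the two-term combinations, and that every combination of three or four terms matches.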