Today we assume that a search engine can locate all of the matching pages (recall) and return only relevant results (precision). Google, considered the de facto engine by many techies, has rocketed into a billion-dollar business, with over a dozen contenders (Magellan, Alta Vista, MSN, Mama, etc.) nipping at its heels.
From an academic perspective, the “best” search engine is the one that returns the “right” answer: the one that derives the “meaning” of the query and returns on-point results. Earlier research attempted to quantify the quality of search engines using hard-to-measure metrics such as precision and recall.
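To make these two metrics concrete, here is a minimal sketch in Python of how precision and recall are conventionally computed for a single query. The page identifiers and the function name are made up for illustration, and the sketch assumes we already know which pages are truly relevant, which is exactly the part that is hard to obtain in practice.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for one query.

    returned: set of page ids the engine gave back
    relevant: set of page ids judged relevant (assumed to be known)
    """
    hits = returned & relevant
    # Precision: what fraction of the returned pages are relevant?
    precision = len(hits) / len(returned) if returned else 0.0
    # Recall: what fraction of the relevant pages were returned?
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: the engine returns 4 pages, 3 of them relevant,
# out of 6 relevant pages that exist.
returned = {"p1", "p2", "p3", "p9"}
relevant = {"p1", "p2", "p3", "p4", "p5", "p6"}
print(precision_recall(returned, relevant))  # (0.75, 0.5)
```

An engine can score well on one metric and poorly on the other, which is why the two are usually reported together.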
A central part of quantifying the “relevance” of any query is to “expand” the query into a more complex query. For example, consider the query:
cheap condo Los Angeles no credit check
Word Stemming
“Word stemming” is defined as the ability to include word variations. For example, any noun would include its variations, whose importance falls as they move further from the root word. With word stemming, we apply quantified rules of grammar to add word stems and rank them according to their degree of separation from the root word. For example, we might see stems identified for “cheap”, “condo” and “check”:
(cheap or cheaper)
AND
(condo or condos)
AND
(check or checked or checking)
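The stem groups above could be generated along the following lines. This is only a sketch: the suffix table, the part-of-speech labels and the distance values are assumptions chosen to reproduce the example, and a real stemmer would rely on far richer grammar rules.

```python
# Hypothetical suffix rules: (suffix, degree of separation from the root).
# They cover only the three example words.
SUFFIX_RULES = {
    "adjective": [("er", 1)],
    "noun": [("s", 1)],
    "verb": [("ed", 1), ("ing", 1)],
}


def word_variants(root, part_of_speech):
    """Return (variant, distance) pairs, closest to the root first."""
    variants = [(root, 0)]  # the root itself has distance 0
    for suffix, distance in SUFFIX_RULES.get(part_of_speech, []):
        variants.append((root + suffix, distance))
    # sorted() is stable, so equal distances keep their rule order
    return sorted(variants, key=lambda pair: pair[1])


def or_group(words):
    """Join alternatives into one parenthesized OR clause."""
    return "(" + " or ".join(words) + ")"


# The part of speech for each word is assumed here, not detected.
for root, pos in [("cheap", "adjective"), ("condo", "noun"), ("check", "verb")]:
    print(or_group([word for word, _ in word_variants(root, pos)]))
```

The distance value is carried along with each variant so that a later ranking step can give less credit to forms further from the root.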
Synonym Expansion
Synonym expansion is where we take words with a similar meaning and add them to the search engine query. Returning to our example, the term “cheap” might indicate that the searcher is also interested in similar terms for a low cost:
cheaper
or
inexpensive
or
“low cost”
or
bargain
Similarly, the term “condo” might indicate that the searcher is also interested in similar types of housing:
condo
or
apartment
or
flat
or
“rental property”
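A synonym lookup like the two lists above can be sketched with a small dictionary. The thesaurus here is hand-built and hypothetical, holding only the example terms; a production system would draw on a much larger lexical resource.

```python
# Hand-built, hypothetical thesaurus covering only the example terms.
SYNONYMS = {
    "cheap": ["inexpensive", "low cost", "bargain"],
    "condo": ["apartment", "flat", "rental property"],
}


def quote_phrase(term):
    """Wrap multi-word terms in quotes so they match as a phrase."""
    return f'"{term}"' if " " in term else term


def expand_synonyms(word):
    """Return the word followed by its synonyms (just the word if none)."""
    return [word] + SYNONYMS.get(word, [])


def synonym_clause(word):
    """Build one OR clause from a word and its synonyms."""
    terms = [quote_phrase(term) for term in expand_synonyms(word)]
    return "(" + " or ".join(terms) + ")"


print(synonym_clause("condo"))
# (condo or apartment or flat or "rental property")
```

Words with no entry in the thesaurus pass through unchanged, so the query never loses a term the searcher typed.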
When we expand a query, we develop a complex word search expression for the base engine. In our case, the simple “cheap condo Los Angeles no credit check” is transformed into a far more complex Boolean form:
(cheap or cheaper)
AND
(condo or condos)
AND
(check or checked or checking)
AND
(cheaper or inexpensive or “low cost” or bargain)
AND
(condo or apartment or flat or “rental property”)
Oh, but what about adding stems of the synonyms:
AND
(apartment or apartments)
AND
(bargains or bargain or bargaining)
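Assembling the final expression might look like the sketch below. The function names and the input layout are assumptions made for illustration. The sketch also departs from the listing above in one respect: it places every alternative for a term (stems, synonyms, and stems of synonyms) into a single OR group. ANDing a stem group with a separate synonym group for the same term would require a page to match both groups, which narrows the results rather than widening them.

```python
def or_group(terms):
    """Join alternatives into one OR clause, quoting multi-word phrases."""
    quoted = [f'"{term}"' if " " in term else term for term in terms]
    return "(" + " or ".join(quoted) + ")"


def build_query(concepts):
    """AND together one OR group per concept.

    concepts: list of lists; each inner list holds the original word
    plus its stems, synonyms, and stems of synonyms.
    """
    groups = []
    for alternatives in concepts:
        unique = list(dict.fromkeys(alternatives))  # drop duplicates, keep order
        groups.append(or_group(unique))
    return " AND ".join(groups)


# Alternatives taken from the example lists in the text.
concepts = [
    ["cheap", "cheaper", "inexpensive", "low cost", "bargain", "bargains"],
    ["condo", "condos", "apartment", "apartments", "flat", "rental property"],
    ["check", "checked", "checking"],
]
print(build_query(concepts))
```

Deduplicating inside each group matters once stems and synonyms overlap, as “cheaper” does in the example.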
Of course, we have not yet assigned weights to the synonyms in the query. For example, the word “flat” is an obscure term for housing, and it would carry far less weight than the original “condo”.
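One simple way to picture such weighting is sketched below. The weight values are invented for illustration, and taking the highest matching weight as a page's score is just one possible choice, not an established ranking formula.

```python
import re

# Hypothetical weights: 1.0 for the searcher's own word, less for
# stems and synonyms, least for an obscure synonym such as "flat".
WEIGHTED_TERMS = {
    "condo": 1.0,
    "condos": 0.9,
    "apartment": 0.6,
    "rental property": 0.5,
    "flat": 0.2,
}


def score(page_text, weighted_terms):
    """Return the highest weight among the terms found in the page."""
    text = page_text.lower()
    found = [
        weight
        for term, weight in weighted_terms.items()
        # \b keeps "flat" from matching inside words like "inflation"
        if re.search(r"\b" + re.escape(term) + r"\b", text)
    ]
    return max(found, default=0.0)


print(score("Sunny flat near downtown Los Angeles", WEIGHTED_TERMS))  # 0.2
print(score("Condo for rent, no credit check", WEIGHTED_TERMS))       # 1.0
```

With weights like these, a page that mentions only “flat” still matches the query but ranks well below a page that uses the searcher's original word.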