When you search a Verity collection, you use the cfsearch tag in a ColdFusion application page. Use the criteria attribute to specify the query expression you want to pass to the search engine.
You can build two types of query expressions: simple and explicit. A simple query expression is typically a word or words. An explicit query expression can employ a number of operators and modifiers to refine the search, and you must invoke all aspects of the search explicitly. A simple query expression employs operators by default. You can assemble an explicit query expression programmatically, or you can pass a simple query expression to the search engine directly from an HTML input form.
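As a minimal sketch of passing a simple query expression straight from an HTML form, the following page fragment hands the user's text to cfsearch unchanged. The collection name "docs" and the form field "searchTerms" are assumptions made for illustration; substitute the names used in your application.

```cfml
<!--- Hypothetical names: the "docs" collection and the form field "searchTerms". --->
<cfparam name="form.searchTerms" default="">

<cfif Len(Trim(form.searchTerms))>
    <!--- Pass the user's text to the search engine unchanged, as a simple query. --->
    <cfsearch name="SimpleResults"
              collection="docs"
              type="Simple"
              criteria="#form.searchTerms#">

    <cfoutput>#SimpleResults.RecordCount# document(s) matched.</cfoutput>
</cfif>
```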
The Verity query language provides many operators and modifiers for composing queries. You can use the following search techniques to search a Verity collection:
Simple queries let end users enter simple, comma-delimited strings and use wildcard characters. Users can enter multiple words separated by commas, in which case the comma is treated like a logical OR. If a user omits the commas, the query expression is treated as a phrase.
Ordinarily, operators are employed in explicit query expressions. Operators are normally surrounded by angle brackets (< >). However, a simple query expression can include the AND, OR, and NOT operators without angle brackets.
A simple query automatically employs the STEM operator and the MANY modifier. STEM searches for words that derive from those entered in the query expression, so entering "find" returns documents that contain "find," "finding," "finds," and so on. The MANY modifier presents the documents returned in the search as a list based on a relevancy score.
You can construct explicit queries using a variety of operators, which are described later in this section. Most operators in an explicit query expression must be surrounded by angle brackets < >. You can use the AND, OR, and NOT operators without angle brackets.
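The simple query behaviors described above can be sketched as follows. The collection name "docs" and the search words are assumptions chosen for illustration.

```cfml
<!--- Commas act as a logical OR: documents containing "server" or "database". --->
<cfsearch name="EitherWord" collection="docs" type="Simple"
          criteria="server, database">

<!--- No commas: the words are treated as the phrase "application server". --->
<cfsearch name="AsPhrase" collection="docs" type="Simple"
          criteria="application server">

<!--- AND, OR, and NOT need no angle brackets in a simple query. --->
<cfsearch name="BothWords" collection="docs" type="Simple"
          criteria="server AND database">
```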
You can state a query expression using either simple or explicit syntax. The syntax you use determines whether the search words you enter are stemmed, and whether the words that are found contribute to relevance-ranked scoring.
When you use simple syntax, the search engine implicitly interprets single words as if they were modified by the STEM operator and the MANY modifier. By implicitly applying the MANY modifier, the search engine calculates each document's score based on the density of the search term in that document. The more often a word occurs in a document, the higher the document's score.
As a result, the search engine ranks documents according to word density as it searches for the word you specify, as well as words that have the same stem. For example, "films", "filmed," and "filming" are stemmed variations of the word "film." To search for documents containing the word "film" and its stem words, you can enter the word "film" without modification. When documents are ranked by relevance, they appear in a list with the most relevant documents at the top.
When you use explicit syntax, the search engine interprets the search terms you enter as literals. For example, if you enter the word "film" (including quotation marks) using explicit syntax, the search engine ignores the stemmed variations "films," "filmed," and "filming."
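The difference between the two syntaxes can be sketched with the same search word. The collection name "docs" is hypothetical, and the third search assumes that spelling out the MANY modifier and STEM operator in an explicit query reproduces what simple syntax applies implicitly.

```cfml
<!--- Simple syntax: "film" is implicitly stemmed and scored by word density. --->
<cfsearch name="StemmedResults" collection="docs" type="Simple"
          criteria="film">

<!--- Explicit syntax: the quoted word is taken literally, so "films" or "filming" alone do not match. --->
<cfsearch name="LiteralResults" collection="docs" type="Explicit"
          criteria='"film"'>

<!--- Assumption: explicit syntax with the modifier and operator written out behaves like the simple query. --->
<cfsearch name="ExplicitStem" collection="docs" type="Explicit"
          criteria="<MANY><STEM>film">
```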
The following table shows all operators available for conducting searches of ColdFusion Verity collections.
Verity Search Operators | | |
---|---|---|
< | CONTAINS | PHRASE |
<= | ENDS | SENTENCE |
= | MATCHES | STARTS |
> | NEAR | STEM |
>= | NEAR/N | SUBSTRING |
ACCRUE | OR | WILDCARD |
AND | PARAGRAPH | WORD |
The search engine handles a number of characters in particular ways as described in the following table:
A backslash (\) removes special meaning from whatever character follows it. To enter a literal backslash in a query, use two in succession; for example:
<FREETEXT>("\"Hello\", said Packard.") "backslash (\\)"
The following rules apply to the composition of search expressions.
Expressions are read from left to right. The AND operator takes precedence over the OR operator. However, terms enclosed in parentheses are evaluated first. When the search engine encounters nested parentheses, it starts with the innermost term.
You can use either prefix notation or infix notation to define search strings that use any operator other than an evidence operator. As a result, either of the following expressions is valid:
AND (a,b)
This is prefix notation
a AND b
This is infix notation
When you use prefix notation, the expression specifies precedence explicitly. The following example means: Look for documents that contain b and c first, then documents that contain a:
OR (a, AND (b,c))
When you use infix notation, precedence is implicit in the expression. For example, the AND operator takes precedence over the OR operator.
If an expression includes two or more search terms within parentheses, a comma is required as a separator between the elements. The following example means: Look for documents that contain a, b, or both.
<OR> (a, b)
Note that in this example, angle brackets are used with the OR operator.
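The precedence rules above can be sketched as two searches that ask for the same thing, once in infix and once in prefix notation. The collection name "docs" and the placeholder terms a, b, and c are assumptions carried over from the examples above.

```cfml
<!--- Infix notation: AND takes precedence over OR, so this reads a OR (b AND c). --->
<cfsearch name="InfixResults" collection="docs" type="Explicit"
          criteria="a OR b AND c">

<!--- Prefix notation: the same grouping, written out explicitly. --->
<cfsearch name="PrefixResults" collection="docs" type="Explicit"
          criteria="OR (a, AND (b, c))">
```

Prefix notation is the safer choice when you assemble criteria programmatically, because the grouping never depends on the reader remembering operator precedence.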
You use angle brackets (< >), double quotation marks ("), and backslashes (\) to delimit various elements in a query expression, as described in the following table:
The following table shows the wildcard characters that you can use to search Verity collections:
To search for a wildcard character in your collection, you need to escape the character with a backslash (\); for example:
You must precede the following nonalphanumeric characters with a backslash character (\) in a search string:
In addition to the backslash character, you can use paired backquotes (` `) to interpret special characters as literals. For example, to search for the wildcard string "a{b" you can surround the string with backquotes, as follows:
`a{b`
To search for a wildcard string that includes the literal backquote character (`) you must use two backquotes together and surround the whole string in backquotes:
`*n``t`
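When search terms come from user input, the backquote rule can be applied in code. The following is a minimal sketch, assuming a hypothetical user-defined function named verityBackquote; it is not part of ColdFusion or Verity. It doubles any backquote inside the term and then surrounds the whole term with backquotes.

```cfml
<cfscript>
// Hypothetical helper, not a built-in function: wrap a term in backquotes
// so that its special characters are read literally, doubling any
// backquotes already inside the term.
function verityBackquote(term) {
    return "`" & Replace(term, "`", "``", "ALL") & "`";
}
</cfscript>

<cfoutput>
#verityBackquote("a{b")#<br>
#verityBackquote("*n`t")#<br>
</cfoutput>
```

Called with the two terms used in the examples above, the function returns the same backquoted strings shown there.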
You can use paired backquotes or backslashes to escape special characters. There is no functional difference between the two. For example, you can query for the term: <DDA> in the following ways:
The power of the cfsearch tag is in the control it provides over the Verity search engine. The engine offers users a high degree of specificity in setting search parameters.
An operator represents logic to be applied to a search element. This logic defines the qualifications that a document must meet to be retrieved. You can use operators to refine your search or to influence the results in other ways. For example, you could construct an HTML form for conducting searches. In the form, a user could perform a search for a single term: server. You can refine your search by limiting the search scope in a number of ways. Operators are available for limiting a query to a sentence or paragraph, and you can search words based on proximity.
Ordinarily, you use operators in explicit searches, as shown here:
"<operator>search_string"
The following operator types are available:
Evidence operators let you specify a basic word search or an intelligent word search. A basic word search finds documents that contain only the word or words specified in the query. An intelligent word search expands the query terms to create an expanded word list so that the search returns documents that contain variations of the query terms.
Documents retrieved using evidence operators are not ranked by relevance unless you use the MANY modifier.
The following table describes the evidence operators:
Proximity operators specify the relative location of specific words in the document. Specified words must be in the same phrase, paragraph, or sentence for a document to be retrieved. In the case of NEAR and NEAR/N operators, retrieved documents are ranked by relevance based on the proximity of the specified words. Proximity operators can be nested; phrases or words can appear within SENTENCE or PARAGRAPH operators, and SENTENCE operators can appear within PARAGRAPH operators.
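As a brief sketch of proximity searching, the following searches use infix notation with the NEAR/N and SENTENCE operators. The collection name "docs", the search words, and the distance of 5 are assumptions chosen for illustration.

```cfml
<!--- Documents in which "server" and "database" occur within 5 words of each other. --->
<cfsearch name="NearResults" collection="docs" type="Explicit"
          criteria="server <NEAR/5> database">

<!--- Documents in which both words occur in the same sentence. --->
<cfsearch name="SentenceResults" collection="docs" type="Explicit"
          criteria="server <SENTENCE> database">
```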
The following table describes the proximity operators:
Relational operators search document fields that you defined in the collection. Documents containing specified field values are returned. Documents retrieved using relational operators are not ranked by relevance, and you cannot use the MANY modifier with relational operators.
You use the following operators for numeric and date comparisons:
Operator | Description |
---|---|
= | Equals |
> | Greater than |
>= | Greater than or equal to |
< | Less than |
<= | Less than or equal to |
The following relational operators compare text and match words and parts of words:
You can specify the values for the cfindex attributes TITLE, KEY, URL, and CUSTOM as document fields for use with relational operators in the criteria attribute. Document fields are referenced in text comparison operators. They are identified as:
For more information on this topic, see the Knowledge Base article, "Verity: Using Document Fields To Narrow Down Searches" (ID# 1082) on our Web site at http://www.coldfusion.com/Support/KnowledgeBase/SearchForm.cfm.
You can use the SUBSTRING operator to match a character string with data stored in a specified data source. In the example described in this section, a data source called TEST1 contains the table YearPlaceText, which itself contains three columns: Year, Place, and Text. Year and Place make up the primary key. The following table shows the contents of the YearPlaceText table:
Year | Place | Text |
---|---|---|
1990 | Utah | Text about Utah 1990 |
1990 | Oregon | Text about Oregon 1990 |
1991 | Utah | Text about Utah 1991 |
1991 | Oregon | Text about Oregon 1991 |
1992 | Utah | Text about Utah 1992 |
The following application page matches records that have 1990 in the TEXT column and Utah in the Place column. The search is performed against the collection that contains the TEXT column and then is narrowed further by searching for the string "Utah" in the CF_TITLE document field. Recall that document fields are defaults defined in every collection, corresponding to the values you define for URL, TITLE, and KEY in the cfindex tag.
<cfquery name="GetText" datasource="TEST1">
  SELECT Year+Place AS Identifier, text
  FROM YearPlaceText
</cfquery>
<cfindex collection="testcollection"
  action="Update"
  type="Custom"
  title="Identifier"
  key="Identifier"
  body="TEXT"
  query="GetText">
<cfsearch name="GetText_Search"
  collection="testcollection"
  type="Explicit"
  criteria="1990 and CF_TITLE <SUBSTRING> Utah">
<cfoutput>
  Record Counts: <br>
  #GetText.RecordCount# <br>
  #GetText_Search.RecordCount# <br>
</cfoutput>
Query Results --- Should be 5 rows <br>
<cfoutput query="Gettext">
  #Identifier# <br>
</cfoutput>
Search Results -- should be 1 row <br>
<cfoutput query="GetText_Search">
  #GetText_Search.TITLE# <br>
</cfoutput>
Concept operators combine the meaning of search elements to identify a concept in a document. Documents retrieved using concept operators are ranked by relevance. The following table describes each concept operator:
Score operators govern how the search engine calculates scores for retrieved documents. The maximum score that a returned search element can have is 1.000. You can set the score percentage display to as many as four decimal places.
When you use a score operator, the search engine first calculates a separate score for each search element found in a document, and then performs a mathematical operation on the individual element scores to arrive at the final score for each document.
Note that the document's score is available as a result column. You can use the SCORE result column to get the relevancy score of any document retrieved. For example:
<cfoutput>
  <a href="#Search1.URL#">#Search1.Title#</a><br>
  Document Score=#Search1.SCORE#<BR>
</cfoutput>
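To list the score of every retrieved document rather than a single one, you can loop over the search results with the query attribute of cfoutput. The following is a minimal sketch; the collection name "docs" and the search word are assumptions, and NumberFormat is used only to show the score to four decimal places.

```cfml
<cfsearch name="Search1" collection="docs" type="Simple" criteria="server">

<!--- One line per retrieved document, with its relevancy score. --->
<cfoutput query="Search1">
    #Search1.CurrentRow#. <a href="#Search1.URL#">#Search1.Title#</a>
    (score: #NumberFormat(Search1.SCORE, "0.0000")#)<br>
</cfoutput>
```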
The following table describes the score operators:
You combine modifiers with operators to change the standard behavior of an operator in some way. For example, you can use the CASE modifier with an operator to specify that you want to match the case of the search word.
The following table describes the available modifiers.