SQL search_items Syntax
Basic index search_items can contain the following syntax:
Word Search:
Argument | Description |
word1 word2 word3 | Specifies that these exact words must appear (in any order) somewhere in the text (logical AND). You can specify a single word, or any number of words separated by spaces. |
word1 OR word2 NOT word3; word1 OR (word2 AND word3) | search_items can contain the AND, OR, and NOT logical operators. AND is the same as separating words with spaces (implicit AND). NOT is logically equivalent to AND NOT. Parentheses group logical operators. An explicit AND is required between words inside grouping parentheses, as in (word2 AND word3); without it, (word2 word3) is interpreted as a positional phrase. Use the \ escape character to specify AND, OR, or NOT as a literal word rather than a logical operator: \and |
?word; word?; w?rd; w??d | A question mark wildcard matches exactly one non-space character of any type. One or more ? wildcards can be used as a prefix, as a suffix, or within a word. You can combine ? and * wildcards. Use the \ escape character to specify ? as a literal: \? |
*word; word*; *word*; w*d | An asterisk wildcard matches zero or more non-space characters of any type. An asterisk can be used as a prefix, as a suffix, or within a word. Use the \ escape character to specify * as a literal: \* |
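The ? and * wildcard rules above can be modeled as a small translation to regular expressions. The following Python sketch is illustrative only and is not how SQL Search is implemented. The function names wildcard_to_regex and word_matches are hypothetical, and the sketch assumes the text has already been split into words; case handling is not modeled.

```python
import re

def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a search word with ? and * wildcards into a regex (illustrative model)."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # Backslash escape: the next character is taken literally.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "?":
            parts.append(r"\S")      # exactly one non-space character
        elif ch == "*":
            parts.append(r"\S*")     # zero or more non-space characters
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts))

def word_matches(pattern: str, word: str) -> bool:
    """True if the whole word matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(word) is not None
```

For instance, word_matches("w?rd", "word") is true, while word_matches("word?", "word") is false because ? requires exactly one character.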
Co-occurrence Word Search:
Argument | Description |
[word1,word2,...,range] | Co-occurrence search. Specifies that these exact words must appear (in any order) within the proximity window specified by range. You can specify any number of words or multi-word phrases. A multi-word phrase is specified as words separated by spaces with no delimiting punctuation. Words (or positional phrases) are separated by commas; the last comma-separated element is an optional numeric range. Words can include asterisk wildcards. A range can be specified as min-max, or simply as max with a default min of 1: for example, 1-5 or 5. If range is omitted, it defaults to 1-20. The range count includes all of the specified words. Co-occurrence search cannot be used with search_option=4 (Regular Expressions). |
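One way to read the co-occurrence rule is as a check on the span of text that covers all of the listed words. The Python sketch below is a rough model under stated assumptions, not the SQL Search implementation: parse_range and cooccurs are hypothetical names, only single words are handled (no phrases or wildcards), and treating min as a lower bound on the span is an interpretation rather than something the text above spells out.

```python
from itertools import product

def parse_range(text=None):
    """Return (min, max) for a range such as '1-5' or '5'; an omitted range is 1-20."""
    if text is None:
        return (1, 20)
    # Accept an en dash as well as an ASCII hyphen between min and max.
    lo, sep, hi = text.replace("\u2013", "-").partition("-")
    return (int(lo), int(hi)) if sep else (1, int(lo))

def cooccurs(tokens, words, range_text=None):
    """True if every word occurs, in any order, within a span allowed by the range.

    The span counts the listed words themselves (the range is inclusive).
    Assumption: min is read as a lower bound on that span.
    """
    if not words:
        return False
    lo, hi = parse_range(range_text)
    positions = [[i for i, t in enumerate(tokens) if t == w] for w in words]
    for combo in product(*positions):   # one candidate position per word
        span = max(combo) - min(combo) + 1
        if lo <= span <= hi:
            return True
    return False
```

With the tokens of "the pilot reported low visibility near the runway", the words pilot and visibility cover a span of four words, so a range of 4 matches and a range of 3 does not.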
Positional Phrase Search:
Note:
You can use double quotes "word1 word2 word3" or parentheses (word1 word2 word3) to delimit a positional phrase. Because parentheses are also used to group logical operators, the use of double quotes is preferred.
Argument | Description |
"word1 word2 word3" | These exact words must appear sequentially in the specified order. Words are separated by spaces. No semantic analysis is performed; for example, a “phrase” may span the final word of one sentence and the first words of the next. Asterisk wildcards can be applied to individual words in a phrase. A literal parenthesis character in search_items must be enclosed in quotes. |
"word1 ? word3"; "word1 ? ? ? word5" | A question mark indicates that exactly one word appears between the specified words in a phrase. You can specify multiple question marks, each separated by spaces; each one stands for exactly one word. |
"word1 ?? word6" | A double question mark (with no space between) indicates that from 0 to 6 words appear between the specified words in a phrase. |
"word1 [1-3] word5" | Square brackets specify a variable-length gap between the specified words in a phrase, written as min-max. In this example, from 1 to 3 words may appear between word1 and word5. |
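The three gap forms in a positional phrase (?, ??, and [min-max]) can be modeled with a short recursive matcher over a list of words. This Python sketch is only an illustration of the rules as described above: gap_bounds, match_at, and phrase_matches are hypothetical names, the text is split on whitespace, and wildcards inside words are not handled.

```python
import re

def gap_bounds(token):
    """Return (min, max) words skipped by a gap token, or None for a plain word."""
    if token == "?":
        return (1, 1)            # exactly one word
    if token == "??":
        return (0, 6)            # from 0 to 6 words
    m = re.fullmatch(r"\[(\d+)[-\u2013](\d+)\]", token)
    if m:
        return (int(m.group(1)), int(m.group(2)))
    return None

def match_at(pattern, tokens, start):
    """True if the pattern matches the tokens beginning at index start."""
    if not pattern:
        return True
    head, rest = pattern[0], pattern[1:]
    bounds = gap_bounds(head)
    if bounds is None:
        # Plain word: must equal the token at this position.
        return (start < len(tokens) and tokens[start] == head
                and match_at(rest, tokens, start + 1))
    lo, hi = bounds
    # Gap: try every allowed number of skipped words.
    return any(start + skip <= len(tokens) and match_at(rest, tokens, start + skip)
               for skip in range(lo, hi + 1))

def phrase_matches(phrase, text):
    """True if the phrase pattern occurs anywhere in the text."""
    pattern, tokens = phrase.split(), text.split()
    return any(match_at(pattern, tokens, i) for i in range(len(tokens)))
```

For the text "cleared for the visual approach to runway two seven", the phrase "cleared ? ? visual" matches, and "cleared [1-3] approach" matches because exactly three words separate the two.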
Semantic index search_items can contain the following NLP entity search syntax in addition to the Basic index syntax:
Full Entity and Partial Entity Search:
Argument | Description |
{entity} | Specifies the exact wording of an NLP entity. Asterisk wildcards can be applied to individual words in an entity. |
<{entity} | A less-than sign prefix specifies an NLP entity ending with the specified word(s). One or more words must appear in the entity before the specified word(s). |
{entity}> | A greater-than sign suffix specifies an NLP entity beginning with the specified word(s). One or more words must appear in the entity after the specified word(s). |
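The difference between the full and partial entity forms comes down to whether extra words are required before or after the specified words. The Python sketch below models just that rule. It is an assumption-laden illustration, not the NLP engine: entity_matches is a hypothetical name, an entity is treated as a plain space-separated string, and wildcards are not handled.

```python
def entity_matches(item: str, entity: str) -> bool:
    """Check one entity against {entity}, <{entity}, or {entity}> (illustrative model)."""
    words = entity.split()
    if item.startswith("<{") and item.endswith("}"):
        # <{...}: entity ends with the target and has at least one word before it.
        target = item[2:-1].split()
        return len(words) > len(target) and words[-len(target):] == target
    if item.startswith("{") and item.endswith("}>"):
        # {...}>: entity begins with the target and has at least one word after it.
        target = item[1:-2].split()
        return len(words) > len(target) and words[:len(target)] == target
    if item.startswith("{") and item.endswith("}"):
        # {...}: exact wording of the entity.
        return words == item[1:-1].split()
    raise ValueError(f"not an entity search item: {item!r}")
```

Under this model, <{plug electrode} matches the entity "spark plug electrode" but not the entity "plug electrode" on its own, which only the exact form {plug electrode} matches.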
Multiple search_items can be specified, separated by spaces. This is an implicit AND test. For example:
SELECT Narrative FROM Aviation.TestSQLSrch WHERE %ID %FIND
search_index(NarrSemanticIdx,'<{plug electrode} "flight plan" instruct*',0,'en')
means that a Narrative text must include one or more SQL Search entities that end with “plug electrode”, AND the positional phrase “flight plan”, AND the word “instruct” with a wildcard suffix, allowing for “instructor”, “instructors”, “instruction”, “instructed”, etc. These items can appear in any order in the text.
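To see how a search_items string such as the one above breaks down into separately tested items, the following Python sketch splits it on spaces while keeping entities, quoted phrases, and bracketed co-occurrence lists intact. It is a simplified, hypothetical tokenizer (split_items is not part of SQL Search) and does not handle \ escapes, grouping parentheses, or the AND, OR, and NOT operators.

```python
import re

# Order matters: try the delimited forms before falling back to a bare word.
ITEM = re.compile(
    r'<\{[^}]*\}'      # <{entity}   partial entity, suffix match
    r'|\{[^}]*\}>?'    # {entity} or {entity}>
    r'|"[^"]*"'        # "positional phrase"
    r'|\[[^\]]*\]'     # [word1,word2,...,range] co-occurrence list
    r'|\S+'            # plain word, possibly with wildcards
)

def split_items(search_items: str) -> list:
    """Split a search_items string into its space-separated items."""
    return ITEM.findall(search_items)

print(split_items('<{plug electrode} "flight plan" instruct*'))
```

Each element of the returned list corresponds to one condition in the implicit AND test.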