Class Reference
IRIS for UNIX 2019.2

index class %iFind.Index.Basic extends %Library.FunctionalIndex, %Compiler.Type.Embedded

This index class provides text search capabilities to perform word-level searches through text in the %String or %Stream properties being indexed, for persistent classes using default storage.

Defining an iFind index

An iFind index can be defined in the class as follows:

Class ThePackage.MyClass Extends %Persistent
{
	Property MyStringProperty As %String;
	
	Index MyBasicIndex On (MyStringProperty) As %iFind.Index.Basic;
}

A number of parameters can be configured in order to refine the indexing behavior, such as whether to support case-sensitive search (LOWER), which language to use when indexing the text (LANGUAGE) or whether to enable stemming or decompounding (INDEXOPTION).
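
As an illustration, index parameters can be supplied in parentheses after the index class name. The sketch below reuses the example class above and sets the three parameters just mentioned; the values are chosen for illustration only and are not a recommendation.

```objectscript
Class ThePackage.MyClass Extends %Persistent
{
	Property MyStringProperty As %String;

	/// Case-insensitive English index that also stores word-level stems.
	/// The parameter values are illustrative; see the Parameters section for their meaning.
	Index MyBasicIndex On (MyStringProperty) As %iFind.Index.Basic(INDEXOPTION = 1, LANGUAGE = "en", LOWER = 1);
}
```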

Querying an iFind index

Classes with an iFind index can subsequently be queried in SQL using the following syntax:

SELECT * FROM ThePackage.MyClass WHERE %ID %FIND search_index(MyBasicIndex, 'interesting')

This will return all the records containing the word "interesting". The following table lists a few sample search strings illustrating more advanced iFind search syntax.

Search string                       What will be retrieved
structure                           All records containing the word "structure"
logical structure                   All records containing both the words "logical" and "structure" (implicit AND)
logical structure*                  Same, but matching any word that starts with "structure" (wildcard search)
"logical structure"                 All records containing the word "structure" immediately after "logical" (positional search)
"logical ? structure"               All records containing the words "logical" and "structure" with exactly one word in between (positional search)
"logical [0-5] structure"           Positional again, but with up to 5 words in between
[logical, structure, 5]             All records containing the words "logical" and "structure" with up to 5 words in between
[logical structure, diagram, 3-6]   All records containing the phrase "logical structure" and the word "diagram" with between 3 and 6 words in between
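
For example, the search strings in this table are passed to search_index() in the same way as a single word. The sketch below assumes the ThePackage.MyClass class and MyBasicIndex index from the earlier example. Because the search string is an SQL literal in single quotes, the double quotes of a positional search can be embedded unchanged.

```sql
-- Positional search: "logical" and "structure" with up to 5 words in between
SELECT %ID, MyStringProperty
FROM ThePackage.MyClass
WHERE %ID %FIND search_index(MyBasicIndex, '"logical [0-5] structure"')

-- Wildcard search: "logical" plus any word starting with "structure"
SELECT %ID, MyStringProperty
FROM ThePackage.MyClass
WHERE %ID %FIND search_index(MyBasicIndex, 'logical structure*')
```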

Besides the implicit AND, which is the default behavior for multi-word searches, you can use AND, OR and NOT, as well as parentheses, to combine words into more complex search strings. To search for one of these operators as a literal word, escape it with a backslash:

Search string                                            What will be retrieved
Fixed                                                    All records containing the word "fixed"
Fixed and stored                                         All records containing "fixed" and "stored"
Fixed and not stored                                     All records containing "fixed" but not "stored"
Fixed and not "stored procedure"                         All records containing "fixed" but not the positional string "stored procedure"
fixed and ("stored procedure" or "default parameters")   All records containing "fixed" and either "stored procedure" or "default parameters"
Fixed and \not                                           All records containing the words "fixed" and "not"
Fixed \and \not                                          All records containing "fixed", "and" and "not"
not generated                                            All records not containing "generated"
\not generated                                           Implicit AND of "not" and "generated"
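
Two of these rows are shown below as complete queries, as a sketch that again assumes the ThePackage.MyClass class and MyBasicIndex index from the earlier example.

```sql
-- "fixed" combined with either of two positional phrases
SELECT %ID, MyStringProperty
FROM ThePackage.MyClass
WHERE %ID %FIND search_index(MyBasicIndex, 'fixed and ("stored procedure" or "default parameters")')

-- Backslash-escaped operator: matches the literal words "fixed" and "not"
SELECT %ID, MyStringProperty
FROM ThePackage.MyClass
WHERE %ID %FIND search_index(MyBasicIndex, 'Fixed and \not')
```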

Besides the name of the iFind index and the search string, the search_index() function supports two more optional parameters:

search_index(index_name, search_string [, search_option [, search_language]])

The search_option argument defines whether to search for exact occurrences of the words in the search string (search_option=0, the default) or to look for words that correspond to the same "normalized" form, based on a particular transformation. For example, stemming normalizes conjugated words to their base form, so that a search matches any conjugated form sharing that base form. Similarly, decompounding normalizes words even further by splitting compound words into the atomic words they consist of (see also %iKnow.Stemming.DecompoundUtils).

Which values are available for a given index depends on the values of the INDEXOPTION or TRANSFORMATIONSPEC parameters.

The search_language argument restricts the results to records in a particular language, for cases where the indexed property contains text in multiple languages (LANGUAGE = "*"). This language is also passed on to the word transformation method, if any, when search_option != 0.
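
A minimal sketch of a call using both optional arguments follows. It assumes MyBasicIndex was defined with LANGUAGE = "*" and INDEXOPTION = 1. The value 1 for search_option, taken here to mean stem-level matching, is an assumption made for illustration; check the values available for your index before relying on it.

```sql
-- Assumes the index was defined with LANGUAGE = "*" and INDEXOPTION = 1.
-- search_option = 1 (stem-level matching) is an assumption made for this sketch.
-- 'en' limits the results to records detected as English.
SELECT %ID, MyStringProperty
FROM ThePackage.MyClass
WHERE %ID %FIND search_index(MyBasicIndex, 'structure', 1, 'en')
```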

Inventory

Parameters Properties Methods Queries Indices ForeignKeys Triggers
13 16


Summary

Methods
DeleteIndex Embedded Find Highlight
InsertIndex Normalize PurgeIndex Rank
SegmentFinalize SegmentInitialize SegmentInsert SortBeginIndex
SortEndIndex StripCharacters StrippedEntityId StrippedWordId
UpdateIndex

Subclasses
%iFind.Index.Semantic

Parameters

• parameter IFINDADVANCEDSQLPREFIX;

When generating SQL projections of iFind index data using the IFINDMAPPINGS parameter, this parameter overrides the naming of those classes, using its value instead of the default [class_name]_[index_name] prefix. The projections will still be created in the [package_name]_[class_name] package.

• parameter IFINDMAPPINGS = 0;

When this parameter is set to 1, additional SQL projections will be created upon compiling the class. These are accessible as read-only tables in a package named [package_name]_[class_name] and have names starting with [class_name]_[index_name] (which can be overridden through IFINDADVANCEDSQLPREFIX).

By default, the following mappings are generated for an %iFind.Index.Basic index:

  • [class_name]_[index_name]_WordRec: stores which words appear in each record in this index. See also %iFind.Index.AbstractWordRec.
  • [class_name]_[index_name]_WordSpread: stores the total number of records in which each word appears in this index. See also %iFind.Index.AbstractWordSpread.
  • [class_name]_[index_name]_WordPos: stores which word occurs at which position in a record, so it can be joined to the AttributePos table. See also %iFind.Index.AbstractWordPos.

Additional classes will be generated automatically, based on your index class and parameters. See the class reference for subclasses for more details.
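
As a sketch of how these projections can be reached, suppose the MyBasicIndex index on ThePackage.MyClass was defined with IFINDMAPPINGS = 1 and without IFINDADVANCEDSQLPREFIX. Following the naming rule above, the WordRec mapping would then be queryable as shown below. The table name is derived from that rule rather than taken from a running system, and the column list is left as * because the columns are described in %iFind.Index.AbstractWordRec rather than here.

```sql
-- Table name derived from the naming rule above:
--   schema [package_name]_[class_name], table [class_name]_[index_name]_WordRec
SELECT TOP 10 *
FROM ThePackage_MyClass.MyClass_MyBasicIndex_WordRec
```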

• parameter IFINDSHAREDDATALOCATION;

This parameter specifies whether words, entities and similar data should be written to the shared %iFind.Word, %iFind.Entity and similar tables (IFINDSHAREDDATALOCATION="NAMESPACE", the default), or stored in index-specific tables (IFINDSHAREDDATALOCATION="INDEX"). In the latter case, mappings will be generated for these tables if IFINDMAPPINGS is set to 1.

• parameter INDEXOPTION = 0;

Specific indexing options to use when processing records:

  • 0 = Do not store compounds or stems
  • 1 = Store word-level stems
  • 2 = Store word-level compounds and stems

See also %iKnow.Stemmer and %iKnow.Stemming.DecompoundUtils for more details on stemming or decompounding, or TRANSFORMATIONSPEC for advanced options to use custom transformations.

• parameter KEEPCHARS;

This parameter controls which characters are retained at the start and end of a word when calculating the "stripped" version of a word that will be indexed along with the original word as it appeared in the text.

• parameter LANGUAGE = "en";

Language to use when indexing records. Use "*" to enable automatic language detection.

• parameter LOWER = 1;

Whether or not to convert content to lowercase before indexing. When set to 1 (default), searches are always case-insensitive. When set to 0, searching will be case-sensitive.

• parameter RANKERCLASS;

The %iFind.Rank.Abstract implementation to use for ranking search results through the auto-generated rank SQL procedure "[package name].[class name]_[index name]Rank".

• parameter STEMMINGCONFIG;

This parameter can be used to override the default stemming implementation when INDEXOPTION > 0. To do so, set this parameter to a saved %iKnow.Stemming.Configuration instance. This parameter has no effect if INDEXOPTION = 0.

This parameter is for advanced use only and empty by default.

• parameter TRANSFORMATIONSPEC;

This parameter defines the word transformation(s) to apply to input text, such as stemming, decompounding and other operations for "normalizing" words, so searches can scan these normalized forms rather than the literal forms.
This parameter cannot be set in conjunction with the INDEXOPTION and/or STEMMINGCONFIG parameters, which are shorthands for configuring stemming and decompounding options and overriding the default configurations for those.
This parameter also allows using custom transformations by specifying the name of a class that inherits from %iFind.Transformation.Abstract, optionally followed by a colon and a string that will be passed on to the Transform method of the transformation class if it accepts any parameters.

• parameter USERDICTIONARY;

This parameter controls which user dictionary should be used to rewrite or annotate text before it is processed by the iKnow engine. See also the section on User Dictionaries in the iKnow documentation.

This parameter is for advanced use only and empty by default.


Methods

• classmethod DeleteIndex(pID As %RawString, pArg... As %Binary)
Deletes the iFind index for the row.
• classmethod Embedded() as %RegisteredObject
Returns an instance of the embedded Find class, initialized with the index's parameters.
• classmethod Find(pSearch As %Library.Binary, pOption As %Integer = 0, pLanguage As %String = "", pSynonymOption As %String = "") as %Library.Binary [ SQLProc = ]
Searches for matches based on the iFind index. This function can be accessed more conveniently through the following syntax:
SELECT * FROM MyPackage.Table WHERE %ID %FIND search_index(index_name, pSearch [, pOption [, pLanguage]])
• classmethod Highlight(pRecordID As %RawString, pSearchString As %String, pSearchOption As %String = $$$IFSEARCHNORMAL, pTags As %String = $$$IFDEFAULTHLTAGS, pLimit As %Integer = 0, Output pSC As %Status) as %String [ SQLProc = ]

This SQL procedure returns the text indexed by pRecordID, in which all matches of the supplied pSearchString are highlighted using pTags.

SELECT %ID, 
	Title,
	SomePackage.TheTable_MyIndexHighlight(%ID, 'cocktail* OR (hammock AND NOT bees)')
FROM SomePackage.TheTable
WHERE %ID %FIND search_index(MyIndex, 'cocktail* OR (hammock AND NOT bees)')

pTags is a comma-separated list of tags to use for highlighting. If only a single one is supplied, it will be used to highlight all matches of search terms. If a second one is supplied, it will be used for all terms in a NOT node of the search tree (such as 'bees' in the above example), while the first will be used for all other terms.

pLimit can be used to limit the text to a maximum number of hits rather than returning the entire, highlighted text. pSearchOption can be used as in other iFind search operations, for example to also mark fuzzy matches or stem matches.
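
The optional arguments can be sketched as follows, passed in the order given in the method signature. Two values here are assumptions rather than documented facts: 0 is taken to be the default search option, and '<b>,<i>' is an illustrative pTags value in which the first tag marks regular matches and the second marks terms under a NOT node.

```sql
-- Assumptions: 0 stands for the default search option, and '<b>,<i>' is an
-- illustrative comma-separated tag list (first tag: regular matches; second: NOT terms).
-- The final argument limits the returned text to at most 3 hits.
SELECT %ID,
	Title,
	SomePackage.TheTable_MyIndexHighlight(%ID, 'cocktail* OR (hammock AND NOT bees)', 0, '<b>,<i>', 3)
FROM SomePackage.TheTable
WHERE %ID %FIND search_index(MyIndex, 'cocktail* OR (hammock AND NOT bees)')
```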

• classmethod InsertIndex(pID As %RawString, pArg... As %Binary)
Inserts the iFind index for the row. This method throws exceptions.
• classmethod Normalize(pQuery As %String = "", pLanguage As %String = "") as %String
Normalizes a %iFind.Find.Basic query, based on the dictionary defined for this %iFind.Index.Basic index.
• classmethod PurgeIndex()
Purges the iFind index.
• classmethod Rank(pRecordID As %RawString, pSearchString As %String, pSearchOption As %String = $$$IFSEARCHNORMAL) as %Float [ SQLProc = ]

This SQL procedure returns the score expressing how well the record identified by pRecordID matches pSearchString, based on the ranking algorithm defined by RANKERCLASS.

SELECT %ID, 
	Title,
	FullText,
	SomePackage.TheTable_MyIndexRank(%ID, 'cocktail* OR (hammock AND NOT bees)')
FROM SomePackage.TheTable
WHERE %ID %FIND search_index(MyIndex, 'cocktail* OR (hammock AND NOT bees)')
ORDER BY 4 DESC

pSearchOption can be used as in other iFind search operations, for example to also accept fuzzy matches or stem matches when calculating the rank score.

• classmethod SortBeginIndex()
• classmethod SortEndIndex()
• classmethod StripCharacters(pWord As %String) as %String
Utility method stripping punctuation characters from the start and end of a word, according to the value of the KEEPCHARS index parameter for this index.
• classmethod StrippedEntityId(pEntity As %String) as %String
Returns the Entity ID for pEntity, after stripping off any punctuation at the start and end of the words making up the entity, according to the value of KEEPCHARS for this index.
• classmethod StrippedWordId(pWord As %String) as %String
Returns the Word ID for pWord, after stripping off any punctuation at the start and end of the word, according to the value of KEEPCHARS for this index.
• classmethod UpdateIndex(pID As %RawString, pArg... As %Binary)
Updates the iFind index for the row.


Copyright (c) 2019 by InterSystems Corporation. Cambridge, Massachusetts, U.S.A. All rights reserved. Confidential property of InterSystems Corporation.