Class Reference
IRIS for UNIX 2019.2
Class %iKnow.Stemmer (namespace %SYS)

This class represents an object responsible for stemming user input into a base form, either through some internal algorithm or through an external library. Use GetDefault to instantiate the default stemmer for a particular language or GetCustom to retrieve one configured with custom settings saved as a %iKnow.Stemming.Configuration object.

See also %iKnow.Stemming.Utils for defining exception rules.

Inventory

Properties: 2
Methods: 9


Summary

Properties
DefaultLanguage

Methods
%AddToSaveSet %ClassIsLatestVersion %ClassName %ConstructClone
%DispatchClassMethod %DispatchGetModified %DispatchGetProperty %DispatchMethod
%DispatchSetModified %DispatchSetMultidimProperty %DispatchSetProperty %Extends
%GetParameter %IsA %IsModified %New
%NormalizeObject %ObjectModified %OriginalNamespace %PackageName
%RemoveFromSaveSet %SerializeObject %SetModified %ValidateObject
Decompound GetByDomain GetCustom GetDefault
Reload Stem StemAny

Subclasses
%iKnow.Stemming.DefaultStemmer

Properties

• property DefaultLanguage as %String;
The default language to use when calling any Stem() method without supplying a language explicitly.

Methods

• method Decompound(pWord As %String, Output pCompounds, pLanguage As %String = "", pStemCompounds As %Boolean = 1) as %Status

Attempts to split the supplied pWord into its constituent elements and returns the stems of those elements in pCompounds:

pCompounds(n) = $lb([stem], [start pos in string], [score])

Note that most punctuation encountered in pWord (including spaces) is treated as an explicit compound boundary, in the same way as a hyphen.

See also %iKnow.Stemming.DecompoundUtils.
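
The note on explicit compound boundaries can be illustrated with a small Python model. This is only a sketch, not the ObjectScript implementation: the function name `split_on_explicit_boundaries` is hypothetical, it treats every non-alphanumeric character as a boundary (the class reference says "most" punctuation), it assumes start positions are 1-based, and it performs no dictionary-based decompounding, stemming, or scoring.

```python
import re


def split_on_explicit_boundaries(word):
    """Split a word at punctuation and spaces, keeping each element's position.

    Returns a list of (element, start_pos) pairs. Positions are 1-based,
    which is an assumption made for this illustration.
    """
    # [^\W_]+ matches runs of letters and digits, so anything else
    # (hyphens, spaces, underscores, other punctuation) acts as a boundary.
    return [(m.group(), m.start() + 1) for m in re.finditer(r"[^\W_]+", word)]


if __name__ == "__main__":
    for element, start in split_on_explicit_boundaries("rail road-crossing"):
        print(element, start)
```

In the real method, each element found this way could still be split further and would be reported with its stem and a score, as in the pCompounds(n) layout shown above.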

• final classmethod GetByDomain(pDomainId As %Integer, Output pStemmer As %iKnow.Stemmer) as %Status
Returns the stemmer configured according to the domain settings of the specified domain.
• final classmethod GetCustom(pConfigName As %String, Output pStemmer As %iKnow.Stemmer) as %Status
Instantiates a custom stemmer object based on the stemmer configuration named pConfigName. See also %iKnow.Stemming.Configuration.
• final classmethod GetDefault(pLanguage As %String, Output pStemmer As %iKnow.Stemmer) as %Status

Returns the default stemmer object for language pLanguage, which is resolved as follows:

  1. Look for a pLanguage*.aff file in INSTALL_DIR/dev/hunspell/ and use the first match found for the requested language to instantiate a hunspell-based stemmer object. Most libraries are named after a more detailed locale, such as en_US.aff, which this check also matches. Note that pLanguage should be the two-letter ISO code for the language.
  2. If no such file is found, look for a directory named INSTALL_DIR/dev/hunspell/pLanguage that contains a *.aff file. If one is found, use it to instantiate a hunspell-based stemmer object.
  3. If no hunspell library is found, check whether a %Text.Text implementation exists for the requested language and, if so, use its Standardize() method (through a %iKnow.Stemming.TextStemmer instance).
  4. If none is found, fall back to the default in %Text.Text:Standardize(), which will mostly just lowercase the string.
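
The lookup order above can be modeled in a few lines of Python. This is a sketch of the documented order only, not the actual implementation: `resolve_default_stemmer`, the `text_languages` argument (standing in for "a %Text.Text implementation exists for this language"), and the returned labels are all hypothetical. The sort order used to pick "the first" match is also an assumption, since the class reference does not specify one.

```python
from pathlib import Path
import tempfile


def resolve_default_stemmer(hunspell_dir, language, text_languages=()):
    """Model the four documented resolution steps.

    Returns a (kind, detail) pair where kind is "hunspell", "text",
    or "text-default". All names here are illustrative.
    """
    root = Path(hunspell_dir)

    # Step 1: a <language>*.aff file directly in the hunspell directory,
    # for example en_US.aff when language is "en".
    matches = sorted(root.glob(f"{language}*.aff"))
    if matches:
        return ("hunspell", matches[0])

    # Step 2: a subdirectory named after the language containing any *.aff file.
    subdir = root / language
    if subdir.is_dir():
        matches = sorted(subdir.glob("*.aff"))
        if matches:
            return ("hunspell", matches[0])

    # Step 3: a language-specific text implementation (modeled as a set lookup).
    if language in text_languages:
        return ("text", language)

    # Step 4: the generic fallback, which mostly just lowercases the string.
    return ("text-default", None)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "en_US.aff").touch()
        (Path(tmp) / "nl").mkdir()
        (Path(tmp) / "nl" / "dutch.aff").touch()
        for lang in ("en", "nl", "fr", "xx"):
            kind, _ = resolve_default_stemmer(tmp, lang, text_languages={"fr"})
            print(lang, kind)
```

Because step 1 matches on a prefix, a two-letter code such as "en" picks up locale-specific files like en_US.aff without further configuration.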
• method Reload() as %Status
Resets any properties, rules, or other cached information this object was using.
• method Stem(pString As %String, pLanguage As %String = "", pLexType As %Integer = $$$ENTTYPECONCEPT) as %String
This method will return the fully stemmed version of any string (1 or more words). If the stemmed version is equal to the original string, it will return the empty string.
• final method StemAny(pString As %String, pLanguage As %String = "", pLexType As %Integer = $$$ENTTYPECONCEPT) as %String
Convenience method that returns the fully stemmed version of any string (one or more words). Unlike Stem(), it returns a nonempty value whether or not a stemmed version was found and whether or not pString consists of more than one word.
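
The difference between the return contracts of Stem() and StemAny() can be sketched in Python. This is a model under stated assumptions, not the real stemmer: `toy_stem` is a hypothetical stand-in that only strips a trailing "s", and the fallback to the original string in `stem_any` is an assumption, since the class reference only states that the result is nonempty.

```python
def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: strip a trailing 's'."""
    return word[:-1] if len(word) > 3 and word.endswith("s") else word


def stem(text):
    """Model of Stem(): return "" when stemming leaves the string unchanged."""
    stemmed = " ".join(toy_stem(w) for w in text.split())
    return "" if stemmed == text else stemmed


def stem_any(text):
    """Model of StemAny(): return a nonempty value for nonempty input.

    Falling back to the original string is an assumption of this sketch.
    """
    return stem(text) or text


if __name__ == "__main__":
    for sample in ("cats", "cat", "black cats"):
        print(repr(sample), repr(stem(sample)), repr(stem_any(sample)))
```

Callers that need to know whether stemming changed anything can test the result of Stem() for the empty string, while callers that simply want a usable value can call StemAny().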


Copyright (c) 2019 by InterSystems Corporation. Cambridge, Massachusetts, U.S.A. All rights reserved. Confidential property of InterSystems Corporation.