Assigning to a Domain

Once you have created a domain and (optionally) specified its domain parameters, you can assign various components to that domain:

  • Source Data: After creating a domain, you typically load text sources (usually a large number of them) into that domain; this generates InterSystems NLP indexed data within the domain. Loading text sources is a required precondition for most InterSystems NLP operations. A variety of text sources are supported, including files, SQL fields, and text strings. You can specify SQL fields of data type %String or %Stream.GlobalCharacter (character stream data). After InterSystems NLP has indexed a data source, the original data source can be removed without affecting further processing. Likewise, changing the original data source has no effect on InterSystems NLP processing unless you reload that data source to update the indexed data in the domain.

  • Filters: After creating a domain, you can optionally create one or more filters for that domain. A filter specifies criteria used to exclude some of the loaded sources from a query. Thus a filter allows you to perform InterSystems NLP operations on a subset of the data loaded in the domain.

  • Metadata: After creating a domain, you can optionally specify one or more metadata fields that you can use as criteria for filtering sources. A metadata field holds data that is associated with a source but is not part of the InterSystems NLP indexed data. For example, the date and time that a text source was loaded is a metadata field for that source. Metadata fields must be defined before loading text sources into a domain.

  • Skiplists: After creating a domain, you can optionally create one or more skiplists for that domain. A skiplist is a list of entities (such as words or phrases) that you do not want a query to return. Thus a skiplist allows you to perform InterSystems NLP operations that ignore specific data entities in data sources loaded in the domain.

  • Smart Matching Dictionaries: After creating a domain, you can optionally create one or more Smart Matching dictionaries for that domain. A dictionary contains entities that are matched against the indexed data.

These components are defined using various InterSystems NLP classes and methods. You can also use the InterSystems IRIS Domain Architect to define metadata fields, load sources, and define skiplists and dictionaries.

Metadata fields must be defined before loading sources. Filters, skiplists, and dictionaries can be defined or modified at any time.
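
The ordering rule and the roles of filters and skiplists can be illustrated with a small conceptual model. The following Python sketch is not the InterSystems NLP API: ToyDomain, Source, and all method names are hypothetical, and the "indexing" is reduced to splitting text into lowercase words. It only shows that metadata fields are fixed before sources are loaded, while a skiplist or filter can be introduced afterwards.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    text: str
    metadata: dict


@dataclass
class ToyDomain:
    """Hypothetical in-memory stand-in for a domain (not the InterSystems NLP API)."""
    metadata_fields: set = field(default_factory=set)
    sources: list = field(default_factory=list)
    skiplist: set = field(default_factory=set)

    def add_metadata_field(self, name):
        # Mirrors the rule that metadata fields are defined before loading sources.
        if self.sources:
            raise RuntimeError("define metadata fields before loading sources")
        self.metadata_fields.add(name)

    def load_source(self, text, **metadata):
        # A source may only carry metadata for fields that were already defined.
        unknown = set(metadata) - self.metadata_fields
        if unknown:
            raise KeyError(f"undefined metadata fields: {sorted(unknown)}")
        self.sources.append(Source(text, metadata))

    def add_to_skiplist(self, entity):
        # Skiplists can be changed at any time, including after loading.
        self.skiplist.add(entity.lower())

    def query_words(self, source_filter=None):
        """Return distinct words from sources that pass the filter, minus skiplisted words."""
        words = set()
        for src in self.sources:
            if source_filter is not None and not source_filter(src):
                continue  # the filter restricts the query to a subset of sources
            for word in src.text.lower().split():
                if word not in self.skiplist:
                    words.add(word)
        return sorted(words)


domain = ToyDomain()
domain.add_metadata_field("year")            # must happen before any load
domain.load_source("engine failure reported", year=2020)
domain.load_source("pilot reported turbulence", year=2022)
domain.add_to_skiplist("reported")           # allowed after loading

# A filter based on a metadata field selects only the 2022 source.
recent = domain.query_words(source_filter=lambda s: s.metadata["year"] >= 2021)
print(recent)
```

In this toy model, calling add_metadata_field after a source has been loaded raises an error, whereas add_to_skiplist and the filter passed to query_words can be supplied at any point.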