Organizations are amassing ever larger quantities of unstructured text data, far in excess of their ability to read or catalog these texts. Frequently, an organization has little or no idea what these text documents contain. Conventional “top-down” text analysis based on pure search technologies makes assumptions about the contents of these texts and, as a result, may miss important content.
InterSystems IRIS® Natural Language Processing (NLP) allows you to perform text analysis on unstructured data sources in a variety of natural languages without any prior knowledge of their content. It does this by applying language-specific rules that identify semantic entities. Because these rules are specific to the language, not the content, NLP can provide insight into the contents of texts without the use of a dictionary or ontology.
InterSystems IRIS SQL Search allows you to use SQL queries to search for semantic entities across multiple texts, as well as for single words, regular expressions, and other constructs.
Unstructured Information Management Architecture (UIMA)