class %iKnow.Source.Converter.TextTransformation extends %iKnow.Source.Converter
This %iKnow.Source.Converter implementation wraps a Text Transformation model and extracts sections and key-value pairs as defined in the model. Selected sections are concatenated and used as text input for indexing by the iKnow engine, while selected key-value pairs can be saved as metadata values.
- Model class name (%String): name of the %iKnow.TextTransformation.Definition class containing the TT model definition. This parameter is required.
- Section headers to index (%String, default = ""): comma-separated list of section headers whose contents are to be indexed. Leaving this parameter blank (default) will cause all sections to be indexed. Header names are case-insensitive.
- Include headers in sections (%Boolean, default = 0): whether the header itself should be indexed along with the section contents. Setting this value to 1 ensures the section contents are always prepended with the header.
- Keys to extract for metadata (%String, default = ""): comma-separated list of keys the model extracts that need to be saved as metadata values. Leaving this parameter blank (default) will result in no key-value pairs being saved as metadata. Key names are case-insensitive.
- Metadata field names (%String, default = ""): comma-separated list of metadata field names corresponding, in order, to the key names in the fourth parameter ("Keys to extract for metadata"). If left blank, it is assumed the key names themselves are valid metadata field names.
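The way these parameters interact can be illustrated with a small Python sketch. It is not the converter's ObjectScript implementation: the function names (parse_list, select_sections, map_metadata), the blank-line separator between sections, and the way a header is prepended are all assumptions made for illustration. Only the case-insensitive matching, the "blank means all sections" rule, and the fallback from field names to key names come from the descriptions above.

```python
def parse_list(value):
    """Split a comma-separated parameter into trimmed, non-empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def select_sections(sections, headers_param="", include_headers=False):
    """Concatenate the sections chosen for indexing.

    sections: list of (header, content) pairs, standing in for model output.
    A blank headers_param selects every section; matching ignores case.
    """
    wanted = {header.lower() for header in parse_list(headers_param)}
    parts = []
    for header, content in sections:
        if wanted and header.lower() not in wanted:
            continue
        # Assumption: the header is prepended on its own line.
        parts.append(f"{header}\n{content}" if include_headers else content)
    # Assumption: sections are joined with a blank line.
    return "\n\n".join(parts)


def map_metadata(key_values, keys_param="", fields_param=""):
    """Map extracted keys to metadata field names, position by position.

    A blank keys_param saves nothing; a blank fields_param reuses the
    key names as field names. Key matching ignores case.
    """
    keys = parse_list(keys_param)
    fields = parse_list(fields_param) or keys
    lookup = {key.lower(): value for key, value in key_values.items()}
    result = {}
    for key, field in zip(keys, fields):
        if key.lower() in lookup:
            result[field] = lookup[key.lower()]
    return result
```

For example, passing "diagnosis" as the header list with headers included keeps only the "Diagnosis" section and prefixes it with its title, while passing "patient id" and "PatientId" as keys and field names stores the value extracted for "Patient ID" under the field "PatientId".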
This method takes the raw input text and buffers it internally in the converter. The text is provided in chunks of 32k. Every custom converter will need to implement this method so that it can take in the raw data.
This method is called after all data has been buffered. In this method the converter needs to parse the raw data and extract or convert it into plain text. If any metadata is present within the document, the converter can extract it here and provide it to the system. Metadata can be reported by calling the SetCurrentMetadataValues() method.
When conversion is done, this method is called to fetch the converted data back from the converter. The method should return the converted text in chunks of at most 32k in size. When no more data is available, the method should return the empty string ("") to signal that all data has been transferred.
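The buffer, convert, and fetch cycle described by these three methods can be sketched in Python as follows. This is an illustrative model only, not the ObjectScript API: the class name, the method names, the CHUNK_SIZE value used for "32k", and the upper-casing "conversion" are all hypothetical placeholders.

```python
CHUNK_SIZE = 32000  # assumed stand-in for the "32k" limit


class UppercaseConverter:
    """Toy converter following the buffer -> convert -> fetch protocol."""

    def __init__(self):
        self._buffer = []      # raw chunks as they arrive
        self._converted = ""   # plain text produced by convert()
        self._position = 0     # read offset into the converted text

    def buffer_string(self, data):
        """Store one chunk of raw input."""
        self._buffer.append(data)

    def convert(self):
        """Called once all input is buffered; a real converter would
        parse the raw data and report metadata here."""
        self._converted = "".join(self._buffer).upper()
        self._position = 0

    def next_converted_part(self):
        """Return the next chunk, or "" when everything has been read."""
        end = self._position + CHUNK_SIZE
        part = self._converted[self._position:end]
        self._position += len(part)
        return part
```

A caller would feed every raw chunk to buffer_string(), call convert() once, and then call next_converted_part() in a loop until it returns the empty string.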