Storing and Using Stream Data (BLOBs and CLOBs)
InterSystems SQL supports the ability to store stream data as either BLOBs (Binary Large Objects) or CLOBs (Character Large Objects) within an InterSystems IRIS® data platform database.
Stream Fields and SQL
InterSystems SQL supports two kinds of stream fields:
-
Character streams, used for large quantities of text.
-
Binary streams, used for images, audio, or video.
BLOBs and CLOBs
InterSystems SQL supports the ability to store BLOBs (Binary Large Objects) and CLOBs (Character Large Objects) within the database as stream objects. BLOBs are used to store binary information, such as images, while CLOBs are used to store character information. BLOBs and CLOBs can store up to 4 Gigabytes of data (the limit imposed by the JDBC and ODBC specifications).
The operation of the BLOBs and CLOBs is identical in every respect except how they handle character encoding conversion (such as Unicode to multibyte) when accessed via an ODBC or JDBC client: the data in a BLOB is treated as binary data and is never converted to another encoding, while the data in a CLOB is treated as character data and is converted as necessary.
If a binary stream file (BLOB) contains the single non-printing character $CHAR(0), it is considered to be an empty binary stream. It is equivalent to the "" empty binary stream value: it exists (is not null), but has a length of 0.
From the object point of view, BLOBs and CLOBs are represented as stream objects. For more information, see Working with Streams.
Defining Stream Data Fields
InterSystems SQL supports a variety of data type names for stream fields. These InterSystems data type names are synonyms that correspond to the following:
-
Character streams: data type LONGVARCHAR, which maps to the %Stream.GlobalCharacterOpens in a new tab class and the ODBC/JDBC data type -1.
-
Character streams: data type LONGVARBINARY, which maps to the %Stream.GlobalBinaryOpens in a new tab class and the ODBC/JDBC data type -4.
Some InterSystems stream data types allow you to specify a data precision value. This value is a no-op and has no effect on the permitted size of the stream data. It is provided to allow the user to document the anticipated size of future data.
For data type mappings of stream data types, refer to the SQL Data Types reference page.
For how to define fields (properties) of a table (persistent class), refer to Defining a Table by Creating a Persistent Class and Defining a Table by Using DDL. When defining a stream property of a persistent class, you can optionally specify the LOCATION parameter; see Declaring Stream Properties.
The following example defines a table containing two stream fields:
CREATE TABLE Sample.MyTable (
Name VARCHAR(50) NOT NULL,
Notes LONGVARCHAR,
Photo LONGVARBINARY)
Stream Field Constraints
The definition of a stream field is subject to the following field data constraints:
A stream field can be defined as NOT NULL.
A stream field can take a DEFAULT value, an ON UPDATE value, or a COMPUTECODE value.
A stream field cannot be defined as UNIQUE, a primary key field, or an IdKey. Attempting to do so results in an SQLCODE -400 fatal error with a %msg such as the following: ERROR #5414: Invalid index attribute: Sample.MyTable::MYTABLEUNIQUE2::Notes, Stream property is not allowed in a unique/primary key/idkey index > ERROR #5030: An error occurred while compiling class 'Sample.MyTable'.
A stream field cannot be defined with a specified COLLATE value. Attempting to do so results in an SQLCODE -400 fatal error with a %msg such as the following: ERROR #5480: Property parameter not declared: Sample.MyTable:Photo:COLLATION > ERROR #5030: An error occurred while compiling class 'Sample.MyTable'.
Inserting Data into Stream Data Fields
There are three ways to INSERT data into stream fields:
-
%Stream.GlobalCharacterOpens in a new tab fields: you can insert character stream data directly. For example,
INSERT INTO Sample.MyTable (Name,Notes) VALUES ('Fred','These are extensive notes about Fred')
-
%Stream.GlobalCharacterOpens in a new tab and %Stream.GlobalBinaryOpens in a new tab fields: you can insert stream data using an OREF. You can use the Write()Opens in a new tab method to append a string to the character stream, or the WriteLine()Opens in a new tab method to append a string with a line terminator to the character stream. By default, the line terminator is $CHAR(13,10) (carriage return/line feed); you can change the line terminator by setting the LineTerminator property. In the following example, the first part of the example creates a character stream consisting of two strings and their terminators, then inserts it into a stream field using Embedded SQL. The second part of the example returns the character stream length and displays the character stream data showing the terminators:
CreateAndInsertCharacterStream SET gcoref=##class(%Stream.GlobalCharacter).%New() DO gcoref.WriteLine("First Line") DO gcoref.WriteLine("Second Line") &sql(INSERT INTO Sample.MyTable (Name,Notes) VALUES ('Fred',:gcoref)) IF SQLCODE<0 {WRITE "SQLCODE ERROR:"_SQLCODE_" "_%msg QUIT} ELSE {WRITE "Insert successful",!} DisplayTheCharacterStream KILL ^CacheStream WRITE gcoref.%Save(),! ZWRITE ^CacheStream
-
%Stream.GlobalCharacterOpens in a new tab and %Stream.GlobalBinaryOpens in a new tab fields: you can insert stream data by reading it from a file. For example,
SET myf="C:\InterSystems\IRIS\mgr\temp\IMG_0190.JPG" OPEN myf:("RF"):10 USE myf:0 READ x(1):10 &sql(INSERT INTO Sample.MyTable (Name,Photo) VALUES ('George',:x(1))) IF SQLCODE<0 {WRITE "INSERT Failed:"_SQLCODE_" "_%msg QUIT} ELSE {WRITE "INSERT successful",!} CLOSE myf
For further details, refer to Sequential File I/O.
String data that is inserted as a DEFAULT value or a computed value is stored in the format appropriate for the stream field.
Querying Stream Field Data
A query select-item that selects a stream field returns the fully formed OID (object ID) value of the stream object, as shown in the following example:
SELECT Name,Photo,Notes
FROM Sample.MyTable WHERE Photo IS NOT NULL
An OID is a %List formatted data address such as the following: $lb("1","%Stream.GlobalCharacter","^EW3K.Cn9X.S").
-
The first element of the OID is a sequential positive integer (starting with 1) that is assigned to each inserted stream data value in a table. For example, if Row 1 is inserted with values for the stream fields Photo and Notes, these are assigned 1 and 2. If Row 2 is inserted with a value for Notes, that is assigned 3. If Row 3 is inserted with a value for Photo and Notes, those are assigned 4 and 5. The assignment sequence is the order that the fields are listed in the table definition, not the order they are specified in the INSERT command. By default, a single integer sequence is used which corresponds to the stream location global counter. However, a table may have multiple stream counters, as described below.
An UPDATE operation does not change the initial integer value. A DELETE operation may create gaps in the integer sequence, but does not change these integer values. Using DELETE to delete all records does not reset this integer counter. Using TRUNCATE TABLE to delete all records resets this integer counter if all of the table stream fields use the default StreamLocation value. TRUNCATE TABLE cannot be used to reset the stream integer counter for a embedded object (%SerialObject) class.
-
The second element of the OID is the stream data type, either %Stream.GlobalCharacter or %Stream.GlobalBinary.
-
The third element of the OID is a global variable. By default, its name is generated from the package name and the persistent class name that correspond to the table. An “S” (for Stream) is appended.
-
If the table was created using the SQL CREATE TABLE command, these package and persistent class names are hashed to four characters each (for example, ^EW3K.Cn9X.S). This global contains the most recently assigned value of the stream data inserts counter. If no stream field data has been inserted, or TRUNCATE TABLE has been used to delete all table data, this global is undefined.
-
If the table was created as a persistent class, these package and persistent class names are not hashed (for example, ^Sample.MyTableS). By default, this is the StreamLocation storage keyword <StreamLocation>^Sample.MyTableS</StreamLocation> value.
The default stream location is a global such as ^Sample.MyTableS. This global is used to count the inserts to all stream properties (fields) that do not have a custom LOCATION. For example, if all stream properties in Sample.MyTable use the default stream location, when ten stream data values have been inserted into stream properties of Sample.MyTable, the ^Sample.MyTableS global contains the value 10. This global contains the most recently assigned value of the stream data inserts counter. If no stream field data has been inserted, or TRUNCATE TABLE has been used to delete all table data, this global is undefined.
When defining a stream field property, you can define a custom LOCATION, such as the following: Property Note2 As %Stream.GlobalCharacter (LOCATION="^MyCustomGlobalS");. In this situation, the ^MyCustomGlobalS global serves as the stream data inserts counter for the stream property (or properties) that specify this LOCATION; stream properties that do not specify a LOCATION use the default stream location global (^Sample.MyTableS) as the stream data inserts counter. Each global counts the inserts for the stream properties associated with that location. If no stream field data has been inserted the location global is undefined. TRUNCATE TABLE does not reset stream counters if one or more stream properties defined a LOCATION.
Subscripts of these stream location global variables contain the data for each stream field. For example, ^EW3K.Cn9X.S(3) represents the third inserted stream data item. ^EW3K.Cn9X.S(3,0) is the length of the data. ^EW3K.Cn9X.S(3,1) is the actual stream data value.
-
The OID for a stream field is not the same as the OID returned for a RowID or a reference field. The %OID function returns the OID for a RowID or a reference field; %OID cannot be used with a stream field. Attempting to use a stream field as an argument to %OID results in an SQLCODE -37 error.
Use of a stream field in the WHERE clause or HAVING clause of a query is highly restricted. You cannot use an equality condition or other relational operator (=, !=, <, >), or a Contains operator ( ] ) or Follows operator ( [ ) with a stream field. Attempting to use these operators with a stream field results in an SQLCODE -313 error. Refer to Predicate Conditions and Streams for valid predicates using a stream field.
Result Set Display
-
Dynamic SQL executed from a program returns the OID in the format $lb("6","%Stream.GlobalCharacter","^EW3K.Cn9X.S").
-
The SQL Shell executes as Dynamic SQL and returns the OID in the format $lb("6","%Stream.GlobalCharacter","^EW3K.Cn9X.S").
-
Embedded SQL returns the same OID, but as an encoded %List. You can use the $LISTTOSTRING function to display the OID as a string with its elements separated by commas: 6,%Stream.GlobalBinary,^EW3K.Cn9X.S.
When a query is run from the Management Portal SQL Execute interface, the OID is not returned. Instead:
-
A character stream field returns the first 100 characters of character stream data. If the character stream data is longer than 100 characters, this is indicated by an ellipsis (...) following the 100th character. This is equivalent to SUBSTRING(cstreamfield,1,100).
-
A binary stream field returns the string <binary>.
The same values are shown in the Management Portal SQL interface Open Table display of table data.
To display the OID value from the Management Portal SQL Execute interface, concatenate an empty string to a stream value, as shown in the following: SELECT Name, ''||Photo, ''||Notes FROM Sample.MyTable.
DISTINCT, GROUP BY, and ORDER BY
Every stream data field OID value is unique, even when the data itself contains duplicates. These SELECT clauses operate on the stream OID value, not the data value. Therefore, when applied to a stream field in a query:
-
A DISTINCT clause has no effect on duplicate stream data values. A DISTINCT clause reduces the number records where the stream field is NULL to one NULL record.
-
A GROUP BY clause has no effect on duplicate stream data values. A GROUP BY clause reduces the number records where the stream field is NULL to one NULL record.
-
An ORDER BY clause orders stream data values by their OID value, not their data value. An ORDER BY clause lists records where the stream field is NULL before listing records with a stream field data value.
Predicate Conditions and Streams
The IS [NOT] NULL predicate can be applied to the data value of a stream field, as shown in the following example:
SELECT Name,Notes
FROM Sample.MyTable WHERE Notes IS NOT NULL
The BETWEEN, EXISTS, IN, %INLIST, LIKE, %MATCHES, and %PATTERN predicates can be applied to the OID value of the stream object, as shown in the following example:
SELECT Name,Notes
FROM Sample.MyTable WHERE Notes %MATCHES '*1[0-9]*GlobalChar*'
Attempting to use any other predicate condition on a stream field results in an SQLCODE -313 error.
Aggregate Functions and Streams
The COUNT aggregate function takes a stream field and counts the rows containing non-null values for the field, as shown in the following example:
SELECT COUNT(Photo) AS PicRows,COUNT(Notes) AS NoteRows
FROM Sample.MyTable
However, COUNT(DISTINCT) is not supported for stream fields.
No other aggregate functions are supported for stream fields. Attempting to use a stream field with any other aggregate function results in an SQLCODE -37 error.
Scalar Functions and Streams
InterSystems SQL cannot apply any function to a stream field, except the %OBJECT, CHARACTER_LENGTH (or CHAR_LENGTH or DATALENGTH), SUBSTRING, CONVERT, XMLCONCAT, XMLELEMENT, XMLFOREST, and %INTERNAL functions. Attempting to use a stream field as an argument to any other SQL function results in an SQLCODE -37 error.
-
The %OBJECT function opens a stream object (takes an OID) and returns the OREF (object reference), as shown in the following example:
SELECT Name,Notes,%OBJECT(Notes) AS NotesOref FROM Sample.MyTable WHERE Notes IS NOT NULL
-
The CHARACTER_LENGTH, CHAR_LENGTH, and DATALENGTH functions take a stream field and return the actual data length, as shown in the following example:
SELECT Name,DATALENGTH(Notes) AS NotesNumChars FROM Sample.MyTable WHERE Notes IS NOT NULL
-
The SUBSTRING function takes a stream field and returns the specified substring of the stream field’s actual data value, as shown in the following example:
SELECT Name,SUBSTRING(Notes,1,10) AS Notes1st10Chars FROM Sample.MyTable WHERE Notes IS NOT NULL
When issued from the Management Portal SQL Execute interface, a SUBSTRING function returns up to a 100 character substring of the stream field data. If the specified substring of the stream data is longer than 100 characters, this is indicated by an ellipsis (...) following the 100th character.
-
The CONVERT function can be used to convert a stream data type to VARCHAR, as shown in the following example:
SELECT Name,CONVERT(VARCHAR(100),Notes) AS NotesTextAsStr FROM Sample.MyTable WHERE Notes IS NOT NULL
CONVERT(datatype,expression) syntax supports stream data conversion. If the VARCHAR precision is less than the length of the actual stream data, it truncates the returned value to the VARCHAR precision. If the VARCHAR precision is greater than the length of the actual stream data, the returned value has the length of the actual stream data. No padding is performed.
{fn CONVERT(expression,datatype)} syntax does not support stream data conversion; it issues an SQLCODE -37 error.
-
The %INTERNAL function can be used on a stream field, but performs no operation.
Stream Field Concurrency Locking
InterSystems IRIS protects stream data values from concurrent operations by another process by taking out a lock on the stream data.
The system takes an exclusive lock before performing a write operation; the exclusive lock is released immediately after the writer operation completes.
Additionally, the system takes a shared lock when the first read operation occurs. A shared lock is only acquired if the stream is actually read, and is released immediately after the entire stream has been read from disk into the internal temporary input buffer.
Using Stream Fields within InterSystems IRIS Methods
You cannot use a BLOB or CLOB value using Embedded SQL or Dynamic SQL directly within an InterSystems IRIS method; instead you use SQL to find the stream identifier for a BLOB or CLOB and then create an instance of the %AbstractStreamOpens in a new tab object to access the data.
Using Stream Fields from ODBC
The ODBC specification does not provide for any recognition or special handling for BLOB and CLOB fields. InterSystems SQL represents CLOB fields within ODBC as having type LONGVARCHAR (-1). BLOB fields are represented as having type LONGVARBINARY (-4). For ODBC/JDBC data type mappings of stream data types, refer to Integer Codes for Data Types in the SQL Data Types reference page.
The ODBC driver/server uses a special protocol to access BLOB and CLOB fields. Typically you have to write special code within ODBC application to use CLOB and BLOB fields; the standard reporting tools typically do not support them.
Using Stream Fields from JDBC
Within a Java program you can retrieve or set data from a BLOB or CLOB using the standard JDBC BLOB and CLOB interfaces. For example:
Statement st = conn.createStatement();
ResultSet rs = st.executeQuery("SELECT MyCLOB,MyBLOB FROM MyTable");
rs.next(); // fetch the Blob/Clob
java.sql.Clob clob = rs.getClob(1);
java.sql.Blob blob = rs.getBlob(2);
// Length
System.out.println("Clob length = " + clob.length());
System.out.println("Blob length = " + blob.length());
// ...
When finished with a BLOB or CLOB, you must explicitly call the free() method to close the object in Java and send a message to the server to release stream resources (objects and locks). Just letting the Java object go out of scope does not send a message to clean up the server resources.