Strings

A string is a sequence of characters (letters, digits, punctuation, and so on) delimited by a matched pair of quotation marks ("):

 SET string = "This is a string"
 WRITE string

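Topics about strings include:

  • Null String / $CHAR(0)
  • Escaping Quotation Marks
  • Concatenating Strings
  • String Comparisons
  • Bit Strings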

Also see String Length Limit.

Null String / $CHAR(0)

  • SET mystr="": sets a null or empty string. The string is defined, is of zero length, and contains no data:

      SET mystr=""
      WRITE "defined:",$DATA(mystr),!
      WRITE "length: ",$LENGTH(mystr),!
      ZZDUMP mystr
  • SET mystr=$CHAR(0): sets a string to the null character. The string is defined, is of length 1, and contains a single character with the hexadecimal value of 00:

      SET mystr=$CHAR(0)
      WRITE "defined:",$DATA(mystr),!
      WRITE "length: ",$LENGTH(mystr),!
      ZZDUMP mystr

Note that these two values are not the same. However, the bit string functions treat these values as identical.

Note that InterSystems SQL has its own interpretation of these values; see NULL and the Empty String.

Escaping Quotation Marks

You can include a " (double quote) character as a literal within a string by preceding it with another double quote character:

 SET string = "This string has ""quotes"" in it."
 WRITE string

There are no other escape character sequences within ObjectScript string literals.
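As a small illustration of this point, the following sketch shows that a backslash sequence such as \n is stored as two ordinary characters rather than as a linefeed. The string value is an arbitrary example:

```objectscript
 SET string = "a\nb"
 WRITE string,!            // a\nb (the backslash and the n are literal characters)
 WRITE $LENGTH(string),!   // 4
```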

Note that literal quotation marks are specified using other escape sequences in other InterSystems software. Refer to the $ZCONVERT function for a table of these escape sequences.

Concatenating Strings

You can concatenate two strings into a single string using the concatenate operator:

 SET a = "Inter"
 SET b = "Systems"
 SET string = a_b
 WRITE string

By using the concatenate operator you can include non-printing characters in a string. The following string includes the linefeed ($CHAR(10)) character:

 SET lf = $CHAR(10)
 SET string = "This"_lf_"is"_lf_"a string"
 WRITE string

Note:

How non-printing characters display is determined by the display device. For example, the Terminal and a browser display the linefeed character and other positioning characters differently. In addition, different browsers display the positioning characters $CHAR(11) and $CHAR(12) differently.

InterSystems IRIS encoded strings — bit strings, List structure strings, and JSON strings — have limitations on their use of the concatenate operator. For further details, see Concatenate Encoded Strings.

Some additional considerations apply when concatenating numbers. For further details, see “Concatenating Numbers”.

String Comparisons

You can use the equals (=) and does not equal ('=) operators to compare two strings. String equality comparisons are case-sensitive. Exercise caution when using these operators to compare a string to a number, because this comparison is a string comparison, not a numeric comparison. Therefore only a string containing a number in canonical form is equal to its corresponding number. ("-0" is not a canonical number.) This is shown in the following example:

  WRITE "Fred" = "Fred",!  // TRUE
  WRITE "Fred" = "FRED",!  // FALSE
  WRITE "-7" = -007.0,!    // TRUE
  WRITE "-007.0" = -7,!    // FALSE
  WRITE "0" = -0,!         // TRUE
  WRITE "-0" = 0,!         // FALSE
  WRITE "-0" = -0,!        // FALSE

The <, >, <=, or >= operators cannot be used to perform a string comparison. These operators treat strings as numbers and always perform a numeric comparison. Any non-numeric string is assigned a numeric value of 0 when compared using these operators.
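The following sketch illustrates this numeric behavior. The operand values are arbitrary examples chosen for illustration:

```objectscript
 WRITE "apple" < "banana",!    // 0: both strings evaluate to 0, and 0 < 0 is false
 WRITE "apple" >= "banana",!   // 1: both strings evaluate to 0, and 0 >= 0 is true
 WRITE "10" > "9",!            // 1: compared as the numbers 10 and 9
```

Neither of the first two results reflects the alphabetical order of the strings.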

Lettercase and String Comparisons

String equality comparisons are case-sensitive. You can use the $ZCONVERT function to convert the letters in the strings to be compared to all uppercase letters or all lowercase letters. Non-letter characters are unchanged.

A few letters only have a lowercase letter form. For example, the German eszett ($CHAR(223)) is only defined as a lowercase letter. Converting it to an uppercase letter results in the same lowercase letter. For this reason, when converting alphanumeric strings to a single letter case it is always preferable to convert to lowercase.
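A minimal sketch of a case-insensitive comparison along these lines, converting both operands to lowercase with $ZCONVERT before comparing. The variable names and values are illustrative only:

```objectscript
 SET a = "Fred"
 SET b = "FRED"
 WRITE a = b,!                                  // 0: the comparison is case-sensitive
 WRITE $ZCONVERT(a,"L") = $ZCONVERT(b,"L"),!    // 1: both operands become "fred"
```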

Bit Strings

A bit string represents a logical set of numbered bits with boolean values. Bits in a string are numbered starting with bit number 1. Any numbered bit that has not been explicitly set to boolean value 1 evaluates as 0. Therefore, referencing any numbered bit beyond those explicitly set returns a bit value of 0.

A bit string has a logical length, which is the highest bit position explicitly set to either 0 or 1. This logical length is only accessible using the $BITCOUNT function, and usually should not be used in application logic. To the bit string functions, an undefined global or local variable is equivalent to a bit string in which any specified numbered bit returns a bit value of 0, and which has a $BITCOUNT value of 0.
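The numbering and default-value rules above can be sketched as follows. The variable name bstr and the bit positions are arbitrary choices for illustration:

```objectscript
 KILL bstr                   // start from an undefined variable
 WRITE $BIT(bstr,5),!        // 0: an undefined variable reads as all 0 bits
 SET $BIT(bstr,3) = 1        // set bit 3; bits 1 and 2 are never explicitly set
 WRITE $BIT(bstr,1),!        // 0
 WRITE $BIT(bstr,3),!        // 1
 WRITE $BIT(bstr,100),!      // 0: beyond the highest bit explicitly set
 WRITE $BITCOUNT(bstr,1),!   // 1: number of bits set to 1
```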

A bit string is stored as a normal ObjectScript string with an internal format. This internal string representation is not accessible with the bit string functions. Because of this internal format, the string length of a bit string is not meaningful in determining anything about the number of bits in the string.

Because of the bit string internal format, you cannot use the concatenate operator with bit strings. Attempting to do so results in an <INVALID BIT STRING> error.

Two bit strings in the same state (with the same boolean values) may have different internal string representations, and therefore string representations should not be inspected or compared in application logic.

To the bit string functions, a bit string specified as an undefined variable is equivalent to a bit string with all bits 0, and a length of 0.

Unlike an ordinary string, a bit string treats the empty string and the character $CHAR(0) as equivalent to each other, and both represent a 0 bit. This is because $BIT treats any non-numeric string as 0. Therefore:

  SET $BIT(bstr1,1)=""
  SET $BIT(bstr2,1)=$CHAR(0)
  SET $BIT(bstr3,1)=0
  IF $BIT(bstr1,1)=$BIT(bstr2,1) {WRITE "bitstrings are the same",!} ELSE {WRITE "bitstrings different",!}
  WRITE $BITCOUNT(bstr1),$BITCOUNT(bstr2),$BITCOUNT(bstr3)

A bit set in a global variable during a transaction will be reverted to its previous value following transaction rollback. However, rollback does not return the global variable bit string to its previous string length or previous internal string representation. Local variables are not reverted by a rollback operation.

A logical bitmap structure can be represented by an array of bit strings, where each element of the array represents a "chunk" with a fixed number of bits. Since undefined is equivalent to a chunk with all 0 bits, the array can be sparse, where array elements representing a chunk of all 0 bits need not exist at all. For this reason, and due to the rollback behavior above, application logic should avoid depending on the length of a bit string or the count of 0-valued bits accessible using $BITCOUNT(str) or $BITCOUNT(str,0).
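One possible way to lay out such a sparse bitmap is sketched below. The chunk size, the array name bitmap, and the id value are hypothetical choices for illustration, not values prescribed by InterSystems IRIS:

```objectscript
 KILL bitmap
 SET chunkSize = 64000                  // hypothetical fixed number of bits per chunk
 SET id = 150000                        // hypothetical logical bit number to set
 SET chunk = (id - 1) \ chunkSize + 1   // array element that holds the bit (3 here)
 SET pos = (id - 1) # chunkSize + 1     // bit position within that chunk (22000 here)
 SET $BIT(bitmap(chunk),pos) = 1
 WRITE $BIT(bitmap(chunk),pos),!        // 1
 WRITE $BIT(bitmap(1),5),!              // 0: chunk 1 was never created and reads as all 0 bits
 WRITE $BITCOUNT(bitmap(chunk),1),!     // 1: number of 1 bits in the chunk
```

Only chunk 3 exists after this code runs. The sketch counts 1-valued bits with $BITCOUNT(str,1) rather than relying on the bit string length or the count of 0-valued bits.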
