Formal Rules about Globals
This topic describes formal rules governing globals and global references (apart from extended global references, discussed separately).
Global Naming Conventions and Limits
The name of a global specifies its purpose and use. There are two types of globals:
-
A global — This is a persistent, multidimensional array that resides in the current namespace. Note that you can refer to a global in another namespace via an extended global reference.
-
A process-private global — This is an array variable that is only accessible to the process that created it.
The naming conventions for globals are:
-
A global name begins with a caret character (^) prefix. This caret distinguishes a global from a local variable.
-
The first character after the caret (^) prefix in a global name can be:
-
A letter or the percent character (%): for standard globals only. For global names, a letter is an alphabetic character with a character code in the range 65 through 255 (the 8-bit extended ASCII range). If a global's name begins with % (but not %Z or %z), the global is reserved for InterSystems IRIS system use. % globals are typically stored in either the IRISSYS or the IRISLIB database. For more details on the % character and InterSystems naming, see Rules and Guidelines for Identifiers, including Global Variable Names to Avoid.
Global names cannot contain Unicode characters (that is, characters with codes above 255).
-
Also see extended global reference for additional variations.
-
The other characters of a global name may be letters, numbers, or the period (.) character. The percent (%) character cannot be used, except as the first character of a global name. The period (.) character cannot be used as the last character of a global name.
-
Only the first 31 characters of a global name (not counting the caret prefix) are significant. You can specify a longer name, but InterSystems IRIS ignores every character after the 31st, so two names that share their first 31 characters refer to the same global.
-
Global names are case-sensitive.
-
InterSystems IRIS imposes a limit on the total length of a global reference, and this limit, in turn, imposes limits on the length of any subscript values. See Maximum Length of a Global Reference for details.
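The naming rules above can be approximated in a small checker. The following Python sketch is illustrative only and is not part of InterSystems IRIS. The function names is_valid_global_name and significant_name are hypothetical, the check covers standard globals only (not process-private globals or extended global references), and "letter" is approximated with Python's str.isalpha() restricted to character codes 65 through 255, which may not match the IRIS definition exactly.

```python
MAX_SIGNIFICANT = 31  # characters after the caret that are treated as significant


def _is_letter(ch: str) -> bool:
    # Assumption: approximates "alphabetic character with code 65 through 255".
    return 65 <= ord(ch) <= 255 and ch.isalpha()


def _is_ascii_digit(ch: str) -> bool:
    return "0" <= ch <= "9"


def is_valid_global_name(ref: str) -> bool:
    """Check a standard global name such as '^Demo' against the rules above."""
    if len(ref) < 2 or ref[0] != "^":
        return False  # a global name begins with a caret prefix
    name = ref[1:]
    first, rest = name[0], name[1:]
    if not (first == "%" or _is_letter(first)):
        return False  # first character: a letter or the percent character
    if name.endswith("."):
        return False  # a period cannot be the last character
    # Remaining characters: letters, digits, or periods (no percent character).
    return all(_is_letter(ch) or _is_ascii_digit(ch) or ch == "." for ch in rest)


def significant_name(ref: str) -> str:
    """Return the caret plus the first 31 characters, the significant part."""
    return ref[: 1 + MAX_SIGNIFICANT]


print(is_valid_global_name("^My.Global"))  # True
print(is_valid_global_name("^Demo."))      # False
print(is_valid_global_name("^De%mo"))      # False
```

The checker does not reject names longer than 31 characters, because such names are permitted. Compare significant_name() results instead to see whether two long names would refer to the same global.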
Introduction to Global Nodes and Subscripts
A global typically has multiple nodes, generally identified by a subscript or set of subscripts. For a basic example:
set ^Demo(1)="Cleopatra"
This statement refers to the global node ^Demo(1), which is a node within the ^Demo global. This node is identified by one subscript.
For another example:
set ^Demo("subscript1","subscript2","subscript3")=12
This statement refers to the global node ^Demo("subscript1","subscript2","subscript3"), which is another node within the same global. This node is identified by three subscripts.
For yet another example:
set ^Demo="hello world"
This statement refers to the global node ^Demo, which does not use any subscripts.
The nodes of a global form a hierarchical structure. ObjectScript provides commands that take advantage of this structure. You can, for example, remove a node or remove a node and all its children; see Using Multidimensional Storage (Globals).
Note that no global node can contain a string longer than the string length limit, which is very large. See General System Limits.
Rules for Global Subscripts
Subscripts have the following rules:
-
Subscript values are case-sensitive.
-
A subscript value can be any ObjectScript expression, provided that the expression does not evaluate to the null string ("").
The value can include characters of all types, including blank spaces, non-printing characters, and Unicode characters. (Note that non-printing characters are less practical in subscript values.)
-
Before resolving a global reference, InterSystems IRIS evaluates each subscript in the same way it evaluates any other expression. In the following example, we set one node of the ^Demo global, and then we refer to that node in several equivalent ways:
SAMPLES>s ^Demo(1+2+3)="a value"
SAMPLES>w ^Demo(3+3)
a value
SAMPLES>w ^Demo(03+03)
a value
SAMPLES>w ^Demo(03.0+03.0)
a value
SAMPLES>set x=6
SAMPLES>w ^Demo(x)
a value
-
InterSystems IRIS imposes a limit on the total length of a global reference, and this limit, in turn, imposes limits on the length of any subscript values. See Maximum Length of a Global Reference for details.
The preceding rules apply for all InterSystems IRIS supported collations. For older collations still in use for compatibility reasons, such as “pre-ISM-6.1”, the rules for subscripts are more restrictive. For example, a character subscript cannot begin with a control character, and the number of digits allowed in an integer subscript is limited.
Collation of Globals
Within a global, nodes are stored in a collated (sorted) order.
Applications typically control the order in which nodes are sorted by applying a conversion to the values used as subscripts. For example, when the SQL engine creates an index on string values, it converts each value to uppercase and prepends a space character. This makes the index case-insensitive and ensures that every entry collates as text (even when numeric values are stored as strings).
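The conversion described above can be sketched in a few lines. This Python example is an illustration, not the SQL engine's actual code. The function name sql_style_index_key is hypothetical, and Python's plain code-point string ordering stands in for the way IRIS collates string subscripts, which is an assumption.

```python
def sql_style_index_key(value) -> str:
    """Uppercase the value and prepend a space, as described above."""
    return " " + str(value).upper()


values = ["banana", "Apple", "10", "9", "apple"]

# "Apple" and "apple" collapse to one key, so the index is case-insensitive.
keys = sorted({sql_style_index_key(v) for v in values})
print(keys)  # [' 10', ' 9', ' APPLE', ' BANANA']
```

Because every key starts with a space, none of them is a pure number, so "10" sorts before "9" as text rather than after it as a number.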
Maximum Length of a Global Reference
The total length of a global reference — that is, the reference to a specific global node or subtree — is limited to 511 encoded characters (which may be fewer than 511 typed characters).
For a conservative determination of the size of a given global reference, use the following guidelines:
-
For the global name: add 1 for each character.
-
For a purely numeric subscript: add 1 for each digit, sign, or decimal point.
-
For a subscript that includes nonnumeric characters: add 3 for each character.
If a subscript is not purely numeric, the actual length of the subscript varies depending on the character set used to encode the string. A multibyte character can take up to 3 bytes.
Note that an ASCII character can take 1 or 2 bytes. If the collation does not perform case folding, an ASCII character takes 1 byte. If the collation does perform case folding, an ASCII character can take 2 bytes: 1 for the character and 1 for the disambiguation byte.
-
For each subscript, add 1.
If the sum of these numbers is greater than 511, the reference may be too long.
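The guidelines above amount to simple arithmetic, sketched here in Python. This is a conservative estimate based only on the rules listed in this section, not the encoding that InterSystems IRIS actually performs. The function names are hypothetical, the global name is passed without its caret (the guidelines do not say whether the caret counts), and "purely numeric" is assumed to mean an optional sign, digits, and at most one decimal point.

```python
import re

LIMIT = 511  # maximum encoded length of a global reference

# Assumption: optional sign, digits, and at most one decimal point.
_NUMERIC = re.compile(r"[+-]?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)")


def estimate_reference_length(name: str, subscripts) -> int:
    """Conservatively estimate the encoded length of a global reference."""
    total = len(name)  # global name: 1 per character
    for sub in subscripts:
        text = str(sub)
        if _NUMERIC.fullmatch(text):
            total += len(text)      # numeric: 1 per digit, sign, or decimal point
        else:
            total += 3 * len(text)  # nonnumeric: 3 per character
        total += 1                  # plus 1 for each subscript
    return total


def may_be_too_long(name: str, subscripts) -> bool:
    """True if the conservative estimate exceeds the 511 limit."""
    return estimate_reference_length(name, subscripts) > LIMIT


print(estimate_reference_length("Demo", [1]))  # 6
print(estimate_reference_length("Demo", ["subscript1", "subscript2", "subscript3"]))  # 97
print(may_be_too_long("Demo", ["x" * 170]))  # True
```

An estimate above 511 does not prove that the reference is too long, only that it may be, because the factor of 3 per nonnumeric character is a worst case.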
Because of the way the limit is determined, if you must have long subscripts or long global names, avoid a large number of subscript levels. Conversely, if you use many subscript levels, avoid long global names and long subscripts. Because you may not be able to control the character sets in use, it is safest to keep global names and subscripts short.
When there are doubts about particular references, it is useful to create test versions of global references that are of equivalent length to the longest expected global reference (or even a little longer). Data from these tests provides guidance on possible revisions to your naming conventions prior to building your application.