Numeric Computing in InterSystems Applications
This page provides details on the numeric formats supported by InterSystems IRIS®.
Introduction
InterSystems IRIS has two different ways of representing numbers:

The first of these has its roots in the original implementation of InterSystems IRIS. This representation will be referred to as decimal format.
In class definitions, you use the %Library.Decimal datatype class when you want a property to contain a decimal format number.

The second, more recently supported, form adheres to the IEEE Binary Floating-Point Arithmetic standard (#754-2019). This latter format is referred to as $DOUBLE format, after the ObjectScript function ($DOUBLE) that is used to convert numbers into this form.
In class definitions, you use the %Library.Double datatype class when you want a property to contain a $DOUBLE format number.
SQL Representations
The InterSystems SQL data types DOUBLE and DOUBLE PRECISION represent IEEE floating-point numbers, that is, $DOUBLE. The SQL FLOAT data type represents standard InterSystems IRIS decimal numbers.
Decimal Format
InterSystems IRIS represents decimal numbers internally in two parts. The first is called the significand, and the second is called the exponent:

The significand contains the significant digits of the number. It is stored as a signed 64-bit integer with the decimal point assumed to be to the right of the value. The largest positive integer with an exponent of 0 that can be represented without loss of precision is 9,223,372,036,854,775,807; the largest negative integer is -9,223,372,036,854,775,808.

The exponent is stored internally as a signed byte. Its values range from -128 to 127.
This is the base-10 exponent of the value. That is, the value of the number is the significand multiplied by 10 raised to the power of the exponent.
For example, for the ObjectScript literal value 1.23, the significand is 123 and the exponent is -2.
Thus, the range of numbers that can be represented in InterSystems IRIS native format approximately covers the range 1.0E-128 to 9.22E145. (The first value is the smallest integer with the smallest exponent. The second value is the largest integer with the decimal point moved to the left and the exponent increased correspondingly in the displayed representation.)
All numbers with 18 digits of precision can be represented exactly; numbers that are within the representation bounds of the significand can be accurately represented as 19-digit values.
InterSystems IRIS does not normalize the significand unless necessary to fit the number in decimal format. So numbers with a significand of 123 and an exponent of 1, and a significand of 1230 and an exponent of zero compare as equal.
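The significand/exponent model described above can be sketched outside of IRIS. The following Python fragment is only an illustration of the arithmetic, not of the actual storage layout; the function names decimal_value and fits_decimal_format are hypothetical, and the bounds are the ones stated in this section.

```python
from fractions import Fraction

# Bounds taken from the text: signed 64-bit significand, signed-byte exponent.
SIGNIFICAND_MIN, SIGNIFICAND_MAX = -2**63, 2**63 - 1
EXPONENT_MIN, EXPONENT_MAX = -128, 127


def fits_decimal_format(significand: int, exponent: int) -> bool:
    """Return True if the pair lies within the stated bounds."""
    return (SIGNIFICAND_MIN <= significand <= SIGNIFICAND_MAX
            and EXPONENT_MIN <= exponent <= EXPONENT_MAX)


def decimal_value(significand: int, exponent: int) -> Fraction:
    """Exact value of significand * 10**exponent, as a rational number."""
    return significand * Fraction(10) ** exponent


# The literal 1.23 corresponds to significand 123 and exponent -2.
print(decimal_value(123, -2) == Fraction(123, 100))      # True
# Pairs that are not normalized still denote the same number.
print(decimal_value(123, 1) == decimal_value(1230, 0))   # True
```

Using exact rationals here avoids introducing any binary rounding into the illustration.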
$DOUBLE Format
The InterSystems $DOUBLE format conforms to IEEE 754-2019, specifically, the 64-bit binary (double-precision) representation. This means it consists of three parts:

A sign bit

An 11-bit power-of-two exponent. The exponent value is biased by 1023, so the internal value of the exponent for the number $DOUBLE(1.0) is 1023 rather than 0.

A positive 52-bit fractional significand. Because the significand is always treated as a positive value and normalized, a 1-bit is assumed as the lead binary digit even though it is not present in the significand. Thus, the significand is numerically 53 bits long: the value 1, followed by the implied binary point, followed by the fractional significand. This can be thought of as an integer implicitly divided by 2**52.
As an integer, all values between 0 and 9,007,199,254,740,992 can be represented exactly. Larger integers may or may not have exact representations depending on their pattern of bits.
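The three fields can be inspected in any language that exposes IEEE 754 binary64 values. The sketch below uses Python, whose float type is a binary64 value, as a stand-in for $DOUBLE; decode_double is a hypothetical helper name.

```python
import struct


def decode_double(x: float):
    """Split a binary64 value into (sign, biased exponent, 52-bit fraction)."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    biased_exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    return sign, biased_exponent, fraction


# 1.0 has a true exponent of 0, stored with the bias of 1023 added.
print(decode_double(1.0))                  # (0, 1023, 0)

# Every integer up to 2**53 is exact; 2**53 + 1 is not representable
# and rounds to its even neighbor, 2**53.
print(float(2**53) == 2**53)               # True
print(float(2**53 + 1) == float(2**53))    # True
```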
This representation has three optional features that are not available with InterSystems IRIS native format:

The ability to represent the results of invalid computations (such as taking the square root of a negative number) as a NaN (Not a Number).

The ability to represent both a +0 and a -0.

The ability to represent infinity.

The standard provides for representation of numbers smaller than 2 ** -1022. This is done by a technique referred to as a gradual loss of precision. Please refer to the standard for details.
These features are under program control via the IEEEError() method of the %SYSTEM.Process class for an individual process or the IEEEError() method of the Config.Miscellaneous class for the system as a whole.
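The special values listed above behave the same way in any IEEE 754 binary64 implementation. The following Python sketch (Python floats are binary64) shows them; it does not model the IEEEError() setting, which is specific to InterSystems IRIS.

```python
import math

nan = float("nan")
inf = float("inf")
neg_zero = -0.0

# A NaN is not equal to anything, including itself.
print(nan == nan)                       # False

# +0 and -0 compare equal, but the sign is still stored.
print(neg_zero == 0.0)                  # True
print(math.copysign(1.0, neg_zero))     # -1.0

# Infinity is larger than every finite value.
print(inf > 1.7e308)                    # True

# Below 2**-1022, values lose precision gradually instead of
# dropping straight to zero.
smallest_normal = 2.0 ** -1022
smallest_subnormal = 5e-324             # equals 2**-1074
print(0.0 < smallest_subnormal < smallest_normal)   # True
```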
Calculations using IEEE binary floating-point representations can give different results for the same IEEE operation. InterSystems has written its own implementations for:

Conversions between $DOUBLE binary floating-point and decimal;

Conversion between $DOUBLE and numeric strings;

Comparisons between $DOUBLE and other numeric types.
This guarantees that when a $DOUBLE value is inserted into, or fetched from, an InterSystems IRIS database, the result is the same across all hardware platforms.
However, for all other calculations involving the $DOUBLE type, InterSystems IRIS uses the vendor-supplied floating-point library subroutines. This means that there can be minor differences between platforms for the same set of operations. In all cases, however, InterSystems $DOUBLE calculations equal the local calculations performed on the C double type; that is, the differences between platforms for InterSystems $DOUBLE computations are never worse than the differences exhibited by C programs computing IEEE values running on those same platforms.
Choosing a Numeric Format
The choice of which format to use is largely determined by the requirements of the computation. InterSystems IRIS decimal format permits over 18 decimal digits of accuracy while $DOUBLE guarantees only 15.
In most cases, decimal format is simpler to use and provides more precise results. It is usually preferred for computations involving decimal values (such as currency calculations) because it gives the expected results. Decimal fractions often cannot be represented exactly as binary fractions.
On the other hand, the range of numbers in $DOUBLE is significantly larger than permitted by native format: 1.0E308 versus 1.0E145. Those applications where the range is a significant factor should use $DOUBLE.
Applications that will share data externally may also consider maintaining data in $DOUBLE format because it will not be subject to implicit conversion. Most other systems use the IEEE standard as their representation of binary floating-point numbers because it is supported directly by the underlying hardware architecture. So values in decimal format must be converted before they can be exchanged, for example, via ODBC/JDBC, SQL, or language binding interfaces.
If a $DOUBLE value is within the bounds defined for InterSystems IRIS decimal numbers, then converting it to decimal and then converting back to a $DOUBLE value will always yield the same number. The reverse is not true because $DOUBLE values have less precision than decimal values.
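The asymmetry between the two directions can be demonstrated with any binary64 implementation. The sketch below uses Python floats in place of $DOUBLE and Python's decimal module in place of InterSystems IRIS decimal; it formats with 17 significant digits, which is enough to round-trip any binary64 value, whereas IRIS itself keeps more digits. The helper name double_to_decimal_text is hypothetical.

```python
from decimal import Decimal


def double_to_decimal_text(x: float) -> str:
    """Decimal text with 17 significant digits, enough for any binary64."""
    return format(x, ".17g")


# Binary to decimal and back: the original binary value is recovered.
x = 0.1 + 0.2
print(float(double_to_decimal_text(x)) == x)     # True

# Decimal to binary and back: a 19-digit decimal value does not survive,
# because binary64 carries fewer significant digits.
d = Decimal("1.234567890123456789")
back = Decimal(repr(float(d)))
print(back == d)                                 # False
```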
For this reason, InterSystems recommends that computation be done in one representation or the other, when possible. Converting values back and forth between representations may cause loss of accuracy. Most applications can use InterSystems IRIS decimal format for all their computations. The $DOUBLE format is intended to support those applications that exchange data with systems that use IEEE formats.
The reasons for preferring InterSystems IRIS decimal over $DOUBLE are:

InterSystems IRIS decimal has more precision, almost 19 decimal digits compared to less than 16 decimal digits for $DOUBLE.

InterSystems IRIS decimal can exactly represent decimal fractions. The value 0.1 is an exact value in InterSystems IRIS decimal; but there is no exact equivalent in binary floating point, so 0.1 must be approximated in $DOUBLE format.
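The 0.1 case is easy to reproduce with any binary64 type. In this Python sketch, float plays the role of $DOUBLE and the decimal module plays the role of a decimal format; it is an analogy, not InterSystems IRIS code.

```python
from decimal import Decimal

# Decimal(0.1) shows the exact value of the binary64 number nearest to 0.1;
# it is slightly larger than one tenth.
print(Decimal(0.1) == Decimal("0.1"))                       # False
print(Decimal(0.1) > Decimal("0.1"))                        # True

# The approximation becomes visible in ordinary arithmetic.
print(0.1 + 0.2 == 0.3)                                     # False

# A decimal representation holds these fractions exactly.
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))    # True
```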
The advantages of InterSystems $DOUBLE over InterSystems decimal for scientific numbers are:

$DOUBLE uses exactly the same representation as the IEEE double precision binary floating point used by most computing hardware.

$DOUBLE has a greater range: 1.7E308 maximum for $DOUBLE and 9.2E145 maximum for InterSystems decimal.
Conversions: Strings
When converting values from string to number, or when processing written constants when a program is compiled, only the first 38 significant digits can influence the value of the significand. All digits following that will be treated as if they were zero; that is, they will be used in determining the value of the exponent but they will have no additional effect on the significand value.
Strings as Numbers
In InterSystems IRIS, if a string is used in an expression, the value of the string is the value of the longest numeric literal contained in the string starting at the first character. If there is no such literal present, the computed value of the string is zero.
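The "longest leading numeric literal" rule can be modeled as follows. This Python sketch assumes a simplified literal grammar (one optional sign, digits with an optional decimal point, and an optional uppercase E exponent); the actual ObjectScript grammar accepts more forms, and string_value is a hypothetical name.

```python
import re
from decimal import Decimal

# Simplified numeric-literal grammar; an assumption made for illustration.
_NUMERIC_PREFIX = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)(E[+-]?\d+)?")


def string_value(s: str) -> Decimal:
    """Value of the longest numeric literal at the start of s, else zero."""
    match = _NUMERIC_PREFIX.match(s)
    return Decimal(match.group(0)) if match else Decimal(0)


print(string_value("12abc"))        # 12
print(string_value("-7 apples"))    # -7
print(string_value("abc"))          # 0
```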
Numeric Strings As Subscripts
In computation, there is no difference between the strings “04” and “4”. However, when such strings are used as subscripts for local or global arrays, InterSystems IRIS makes a distinction between them.
In InterSystems IRIS, numeric strings that contain leading zeroes (after the minus sign, if there is one), or trailing zeroes at the end of decimal fractions, will be treated as if they were strings when used as subscripts. As strings, they have a numeric value; they can be used in computations. But as subscripts for local or global variables, they are treated as strings and are collated as strings. Thus, in the list of pairs:

4 versus 04

10 versus 10.0

.001 versus 0.001

.3 versus 0.3

1 versus +01
those on the left are considered numbers when used as subscripts and those on the right are treated as strings. (The form on the left, without the extraneous leading and trailing zero parts, is sometimes referred to as canonical form.)
In normal collation, numbers sort before strings, as shown in this example:
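  // "2" is canonical and is stored as a number; "01" is stored as a string
  SET ^TEST("2") = "standard"
  SET ^TEST("01") = "not standard"
  SET NF = "Not Found"
  // Look up each node by its string spelling and by its numeric value
  WRITE """2""", ": ", $GET(^TEST("2"),NF), !
  WRITE 2, ": ", $GET(^TEST(2),NF), !
  WRITE """01""", ": ", $GET(^TEST("01"),NF), !
  WRITE 1, ": ", $GET(^TEST(1),NF), !, !
  // Walk the subscripts in collation order
  SET SUBS=$ORDER(^TEST(""))
  WRITE "Subscript Order:", !
  WHILE (SUBS '= "") {
    WRITE SUBS, !
    SET SUBS=$ORDER(^TEST(SUBS))
  }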
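The canonical-form rule and the resulting subscript order can be modeled outside of IRIS. The Python sketch below is an approximation: it ignores exponent notation and size limits, uses Python's code-point ordering for strings, and the names is_canonical_number and collation_key are hypothetical.

```python
import re
from decimal import Decimal

# Canonical numbers: no plus sign, no leading zeros (so ".3", not "0.3"),
# and no trailing zeros after the decimal point.
_CANONICAL = re.compile(r"(0|-?([1-9]\d*(\.\d*[1-9])?|\.\d*[1-9]))")


def is_canonical_number(s: str) -> bool:
    return _CANONICAL.fullmatch(s) is not None


def collation_key(s: str):
    """Sort key: canonical numbers first (by value), then strings."""
    if is_canonical_number(s):
        return (0, Decimal(s), "")
    return (1, Decimal(0), s)


subscripts = ["01", "2", "10", "4", "0.3", ".3"]
print(sorted(subscripts, key=collation_key))
# ['.3', '2', '4', '10', '0.3', '01']
```

In this model, "2" sorts ahead of "01", which is the same order the example above reports.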
Conversions: Decimal to $DOUBLE
InterSystems recommends that your application explicitly control conversions between decimal and $DOUBLE formats.
Conversion to $DOUBLE format is done explicitly via the $DOUBLE function. This function also permits the explicit construction of IEEE representations for not-a-number, infinity, and signed zero via the expression $DOUBLE(<S>), where <S> is:

the string "nan", to generate a NaN

any one of the strings "inf", "+inf", "-inf", "infinity", "+infinity", or "-infinity", to generate an infinity

the numeric literal 0 or the string literal "-0", to generate a positive or negative zero, respectively
The case of the string <S> is ignored on input. On output, only NAN, INF, and -INF are produced.
Conversions: $DOUBLE to Decimal
InterSystems recommends that your application explicitly control conversions between decimal and $DOUBLE formats.
Values in $DOUBLE form are converted to decimal values with the $DECIMAL function. The result of calling the function is a string suitable for conversion to a decimal value.
Although this description assumes the value presented to $DECIMAL is a $DOUBLE value, this is not a requirement. Any numeric value may be supplied as the argument and the same rules apply for rounding.
$DECIMAL(x)
The single argument form of the function converts the $DOUBLE value given as its argument to decimal. $DECIMAL rounds the decimal portion of the number to 19 digits. $DECIMAL always rounds to the nearest decimal value.
$DECIMAL(x, n)
The two-argument form allows precise control over the number of digits returned. If n is greater than 38, an <ILLEGAL VALUE> error occurs. If n is greater than 0, the value of x rounded to n significant digits is returned.
When n is zero, the following rules are used to determine the value:

If x is an Infinity, return INF or -INF as appropriate.

If x is a NaN, return NAN.

If x is a positive or negative zero, return 0.

If x can be exactly represented in 20 or fewer significant digits, return the canonical numeric string that contains those exact significant digits.

Otherwise, truncate the decimal representation to 20 significant digits, and

If the 20th digit is a 0, replace it with a 1;

If the 20th digit is a 5, replace it with a 6.
Then, return the resulting string.

This rounding rule, which truncates toward zero at the 20th digit except when doing so would inexactly make the 20th digit a 0 or a 5, has these properties:

If a $DOUBLE value is different from a decimal value, these two values will always have unequal representation strings.

When a $DOUBLE value can be converted to decimal without generating a <MAXNUMBER> error, the result is the same as converting the $DOUBLE value to a string and then converting that string to a decimal value. There is no possibility of a double round error when doing the two conversions.
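The rules for n equal to zero can be traced with the following Python sketch. It is a model built from the rules as stated above, not the InterSystems implementation: decimal_n0 is a hypothetical name, Python floats stand in for $DOUBLE values, and the result is formatted in Python's style (with a leading 0 before the decimal point) rather than in ObjectScript canonical form.

```python
import math
from decimal import Decimal, Context, ROUND_DOWN


def decimal_n0(x: float) -> str:
    """Model of the n = 0 rules applied to a binary64 value."""
    if math.isinf(x):
        return "INF" if x > 0 else "-INF"
    if math.isnan(x):
        return "NAN"
    if x == 0:                       # true for both +0 and -0
        return "0"
    exact = Decimal(x)               # exact decimal expansion of the binary value
    digits = "".join(map(str, exact.as_tuple().digits)).rstrip("0")
    if len(digits) <= 20:            # exactly representable in 20 digits or fewer
        return format(exact, "f")
    # Otherwise truncate toward zero to 20 significant digits ...
    truncated = Context(prec=20, rounding=ROUND_DOWN).create_decimal(exact)
    sign, coeff, exponent = truncated.as_tuple()
    last = coeff[-1]
    # ... and keep the 20th digit from being an inexact 0 or 5.
    if last in (0, 5):
        coeff = coeff[:-1] + (last + 1,)
    return format(Decimal((sign, coeff, exponent)), "f")


print(decimal_n0(0.5))    # 0.5
print(decimal_n0(0.1))    # 0.10000000000000000556
```

For 0.1, the binary value is slightly larger than one tenth, the truncated 20th digit is a 5, and the rule replaces it with a 6, so the resulting string cannot be mistaken for an exact 20-digit decimal.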
Conversions: Decimal to String
Decimal values can be converted to strings by default when they are used as such, for example, as one of the operands to the concatenation operator. When more control over the conversion is needed, use the $FNUMBER function.
Arithmetic Operations
Homogeneous Representations
Expressions involving only decimal values will always yield a decimal result. Similarly, expressions with only $DOUBLE values will always produce a $DOUBLE result. In addition,

If the result of a computation involving decimal values overflows, a <MAXNUMBER> error will result. There is no automatic conversion to $DOUBLE in this case as there is for literals.

If a decimal expression underflows, 0 is generated as the result of the expression.

By default, the IEEE errors of overflow, divide-by-zero, and invalid-operation will signal the <MAXNUMBER>, <DIVIDE>, and <ILLEGAL VALUE> errors, respectively, rather than generating an Infinity or NaN result. This behavior can be modified by the IEEEError() method of the %SYSTEM.Process class for an individual process or the IEEEError() method of the Config.Miscellaneous class for the system as a whole.

The expression 0 ** 0 (decimal) produces the decimal value, 0; but, the expression $DOUBLE(0) ** $DOUBLE(0) produces the $DOUBLE value, 1. The former has always been true in InterSystems IRIS; the latter is required by the IEEE standard.
Heterogeneous Representations
Expressions involving both decimal and $DOUBLE representations always produce a $DOUBLE value. The conversion of the value takes place when it is used. Thus, in the expression
1 + 2 * $DOUBLE(4.0)
InterSystems IRIS first adds 1 and 2 together as decimal values. Then it converts the result, 3, to $DOUBLE format and does the multiplication. The result is $DOUBLE(12).
Rounding
When necessary, numeric results are rounded to the nearest representable value. When the value to be rounded is equally close to two available values, then:

$DOUBLE values are rounded to even as defined in the IEEE standard

Decimal values are rounded away from zero, that is, toward a larger value (in absolute terms)
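The two tie-breaking rules differ only when a value lies exactly halfway between two candidates. The Python sketch below shows the difference at whole-number granularity using the decimal module's rounding modes; it illustrates the rules themselves, not the precision at which InterSystems IRIS applies them, and round_tie is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def round_tie(value: str, mode: str) -> Decimal:
    """Round a decimal string to a whole number using the given tie rule."""
    return Decimal(value).quantize(Decimal("1"), rounding=mode)


# Round half to even (the IEEE default, used for $DOUBLE).
print(round_tie("2.5", ROUND_HALF_EVEN))    # 2
print(round_tie("3.5", ROUND_HALF_EVEN))    # 4

# Round half away from zero (the rule described for decimal values).
print(round_tie("2.5", ROUND_HALF_UP))      # 3
print(round_tie("-2.5", ROUND_HALF_UP))     # -3
```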
Comparison Operations
Homogeneous Representations
Comparisons between $DOUBLE(+0) and $DOUBLE(-0) treat these values as equal. This follows the IEEE standard. This is the same as in InterSystems IRIS decimal because, when either $DOUBLE(+0) or $DOUBLE(-0) is converted to a string, the result in both cases is "0".
Comparisons between $DOUBLE("nan") and any other numeric value (including $DOUBLE("nan") itself) report that the values are not greater than, not equal to, and not less than each other. This follows the IEEE standard. It is a departure from the usual ObjectScript rule that an equality comparison is done by converting both operands to strings and checking the strings for equality.
The expression "nan" is equal to $DOUBLE("nan") because the comparison is done as a string compare.
Heterogeneous Representations
Comparisons between a decimal value and $DOUBLE value are fully accurate. The comparisons are done without any rounding of either value. If only finite values are involved then these comparisons get the same answer that would result if both values were converted to strings and those strings were compared using the default collation rules.
Comparisons involving the operators <, <=, >, and >= always produce a boolean result, 0 or 1, as a decimal value. If one of the operands is a string, that operand is converted to a decimal value before the comparison is performed. Other numeric operands are not converted. As noted, the comparison of mixed numeric types is done with full accuracy and no conversion.
In the case of the string comparison operators (=, '=, ], '], [, '[, ]], ']], and so on), any numeric operand is first converted to a string before the comparison is done.
Less-Than Or Equal, Greater-Than Or Equal
In InterSystems IRIS, the operators <= and >= are treated as synonyms for the operators '> and '<, respectively.
If the operators <= or >= are used in comparisons where either or both of the operands may be NaNs, the results will be different from those mandated by the IEEE standard.
The expression A >= B, when A or B (or both) is a NaN, is interpreted as follows:

The expression is transformed to A '< B.

It is further transformed to '(A < B).

As noted previously, comparisons involving NaNs give results that are (a) not equal, (b) not greater-than, and (c) not less-than, so the expression in parentheses results in a value of false.

The negation of that value results in a value of true.
The expression A >= B can be rewritten to provide the IEEE expected results if it is expressed as ((A > B) || (A = B)).
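The difference between the two readings of "greater than or equal" shows up only when a NaN is involved. The Python sketch below models both; Python floats are binary64 values, and the function names ge_as_not_less and ge_ieee are hypothetical.

```python
nan = float("nan")


def ge_as_not_less(a: float, b: float) -> bool:
    """A >= B treated as a synonym for not-less-than."""
    return not (a < b)


def ge_ieee(a: float, b: float) -> bool:
    """A >= B expressed as (A > B) or (A = B)."""
    return (a > b) or (a == b)


# With ordinary values the two forms agree.
print(ge_as_not_less(2.0, 1.0), ge_ieee(2.0, 1.0))    # True True
print(ge_as_not_less(1.0, 2.0), ge_ieee(1.0, 2.0))    # False False

# With a NaN operand they differ: every ordered comparison is false,
# so negating "less than" yields true.
print(ge_as_not_less(nan, 1.0), ge_ieee(nan, 1.0))    # True False
```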
Boolean Operations
For boolean operations (and, or, not, nor, nand, and so on), any string operand is converted to decimal. Any numeric operand (decimal or $DOUBLE) is left unchanged.
A numeric value that is zero is treated as FALSE; all other numeric values (including $DOUBLE("nan") and $DOUBLE("inf")) are treated as TRUE. The result is 0 or 1 (as decimal).
$DOUBLE(-0) is also false.
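The truth rule amounts to a test against zero. In the Python sketch below, floats stand in for $DOUBLE values and truth_value is a hypothetical helper that returns 0 or 1.

```python
def truth_value(x: float) -> int:
    """Return 1 for any nonzero numeric value, 0 for zero of either sign."""
    return 1 if x != 0 else 0


print(truth_value(0.0))             # 0
print(truth_value(-0.0))            # 0
print(truth_value(float("nan")))    # 1
print(truth_value(float("inf")))    # 1
```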
See Also
For more information, see the following sources:

The IEEE 754-2019 standard.

What Every Computer Scientist Should Know About Floating-Point Arithmetic, by David Goldberg, published in the March, 1991 issue of Computing Surveys.