|string||A string or expression that evaluates to a string.|
$WLENGTH returns the number of characters in string. $WLENGTH is functionally identical to $LENGTH, except that $WLENGTH recognizes surrogate pairs. It counts a surrogate pair as a single character. You can use the $WISWIDE function to determine if a string contains a surrogate pair.
A surrogate pair is a pair of 16-bit InterSystems IRIS character elements that together encode a single Unicode character. Surrogate pairs are used to represent certain ideographs which are used in Chinese, Japanese kanji, and Korean hanja. (Most commonly-used Chinese, kanji, and hanja characters are represented by standard 16-bit Unicode encodings.) Surrogate pairs provide InterSystems IRIS support for the Japanese JIS X0213:2004 (JIS2004) encoding standard and the Chinese GB18030 encoding standard.
A surrogate pair consists of high-order 16-bit character element in the hexadecimal range D800 through DBFF, and a low-order 16-bit character element in the hexadecimal range DC00 through DFFF.
The $WLENGTH function counts a surrogate pair as a single character. The $LENGTH function counts a surrogate pair as two characters. In all other aspects, $WLENGTH and $LENGTH are functionally identical. However, because $LENGTH is generally faster than $WLENGTH, $LENGTH is preferable for all cases where a surrogate pair is not likely to be encountered.
For further details on string length, refer to the $LENGTH function.
The following example shows how $WLENGTH counts a surrogate pair as a single character:
SET spair=$CHAR($ZHEX("D806"),$ZHEX("DC06")) SET str="AB"_spair_"CD" WRITE !,$LENGTH(str)," $LENGTH characters in string" WRITE !,$WLENGTH(str)," $WLENGTH characters in string"