Emit Telemetry Data to an OpenTelemetry-Compatible Monitoring Tool
OpenTelemetryOpens in a new tab (OTel) is an open source framework and toolkit for generating, exporting, and collecting telemetry data.
On supported systems, this version of InterSystems IRIS leverages the OpenTelemetry SDK to provide support for exporting and emitting telemetry data as OpenTelemetry ProtocolOpens in a new tab signals over HTTP (OTLP/HTTP) to the OpenTelemetry CollectorOpens in a new tab or any other compatible monitoring tool.
This feature is not available for macOS, Windows, and AIX systems in this version of InterSystems IRIS.
You can configure InterSystems IRIS to emit the following types of signals:
-
Metrics — the measurements which you have configured the InterSystems IRIS /api/monitor API to collect.
-
Logs — events which InterSystems IRIS records to either the system messages log or the audit database.
-
Traces — information about how a request moves through your application.
To learn more about how these different types of signals work within OTel, refer to the OTel documentation’s pages for metricsOpens in a new tab, logsOpens in a new tab, and tracesOpens in a new tab.
InterSystems IRIS pushes these signals to the endpoint that you specify, as described in Configure the Target Endpoint. InterSystems IRIS emits metrics and logs at a regular interval, based on a common configuration parameter; otherwise, you can enable and configure the emission of each type of signal independently, as described in the corresponding sections which follow.
Configure the Target Endpoint
To specify the endpoint to which an InterSystems IRIS instance sends OTLP/HTTP signals, set the environment variable OTEL_EXPORTER_OTLP_ENDPOINT on the instance’s host system to the desired address. For instructions on setting environment variables, refer to your operating system’s documentation.
If you enable OTLP/HTTP emission and an OTEL_EXPORTER_OTLP_ENDPOINT environment variable is not set, the instance emits signals to the default endpoint for the OpenTelemetry CollectorOpens in a new tab’s OTLP/HTTP receiver: http://localhost:4318.
Emit Metrics
InterSystems IRIS can emit all of the metric events that the /api/monitor API collects (including your custom application metrics) to the OTLP/HTTP endpoint that you designate, at regular intervals.
To configure your instance to emit metric events:
-
Configure the instance to collect all of the metrics that you want to collect. If you want to collect interoperability production metrics, you must manually enable them. You can also configure the instance to collect custom application metrics.
-
Configure your instance’s OpenTelemetry exporter to start emitting metrics when the instance starts. You can do this in any of the following ways:
-
From the Management Portal: navigate to System Administration > Configuration > Additional Settings > Monitor, and select Enable OTel Metrics. Then, select Save.
-
Change the OTELMetrics parameter by modifying the Config.MonitorOpens in a new tab class (as described in the class reference) or by editing the CPF file directly.
-
-
If needed, modify the frequency at which the exporter will emit metrics. By default, the exporter emits signals every 10 seconds. You can change this interval in any of the following ways:
-
From the Management Portal: navigate to System Administration > Configuration > Additional Settings > Monitor. Update the OTel Exporter Interval field with the length of the desired interval, in seconds. Then, select Save.
-
Change the OTELInterval parameter by modifying the Config.MonitorOpens in a new tab class (as described in the class reference) or by editing the CPF file directly.
-
-
If you configured your instance in the preceding steps by modifying CPF parameters, restart the instance to allow your changes to take effect.
Emit Logs
InterSystems IRIS can emit OTLP/HTTP signals for the same categories of log events which would be part of a structured log file—namely, events which are recorded to the system messages log (messages.log) or to the audit database. InterSystems IRIS emits structured log events to the OTLP/HTTP endpoint that you designate, at regular intervals.
To configure your instance to emit log events:
-
Configure your instance’s OpenTelemetry exporter to start emitting log events when the instance starts. You can do this in any of the following ways:
-
From the Management Portal: navigate to System Administration > Configuration > Additional Settings > Monitor, and select Enable OTel Logs. Then, select Save.
-
Change the OTELLogs parameter by modifying the Config.MonitorOpens in a new tab class (as described in the class reference) or by editing the CPF file directly.
-
-
As needed, configure the minimum severity level that a log event must meet or exceed in order to be emitted by the instance’s OpenTelemetry exporter. Severity levels are the same as those used in the structured log. The default severity level threshold is WARN. At this level, the exporter emits log events from the WARN, SEVERE, and FATAL levels; it does not emit log events from the DEBUG2, DEBUG, and INFO levels.
You can change the minimum severity level in any of the following ways:
-
From the Management Portal: navigate to System Administration > Configuration > Additional Settings > Monitor. Select the desired threshold severity level from the OTel Log Level drop-down menu. Then, select Save.
-
Change the OTELLogLevel parameter by modifying the Config.MonitorOpens in a new tab class (as described in the class reference) or by editing the CPF file directly.
-
-
If needed, modify the frequency at which the exporter will emit log events. By default, the exporter emits signals every 10 seconds.
You can change this interval in any of the following ways:
-
From the Management Portal: navigate to System Administration > Configuration > Additional Settings > Monitor. Update the OTel Exporter Interval field with the length of the desired interval, in seconds. Then, select Save.
-
Change the OTELInterval parameter by modifying the Config.MonitorOpens in a new tab class (as described in the class reference) or by editing the CPF file directly.
-
-
If you configured your instance in the preceding steps by modifying CPF parameters, restart the instance to allow your changes to take effect.
Emit Traces
A trace records how a request moves through an application. It consists of one or more nestable spans, which represent the constituent units of work that the application performs as part of responding to the request. To record a trace, the application’s code must include instruments which generate spans, populate them with information, and contextualize them as part of a continuous trace. For more information about the OpenTelemetry specification for traces, refer to the OpenTelemetry documentationOpens in a new tab.
Within InterSystems IRIS, the %Trace package provides a straightforward API for instrumenting your application code to produce traces in a way that conforms to the OTel specification. Once instruments within your application are producing traces, InterSystems IRIS emits them to the OTLP/HTTP endpoint that you designate.
Edit your application code to produce traces using the %Trace API as follows:
-
Create an instance of the %Trace.TracerProvider class to serve as your application’s Tracer ProviderOpens in a new tab.
The constructor for this class accepts an optional argument: an array which is used to set the TracerProvider object’s ResourceAttributes property. If you wish to specify global attributes about the application, define an array containing the desired key-value pairs and then pass it to the constructor method by reference, as in the following example:
set attributes("service.name") = "test_service" set attributes("service.version") = "2.0" set tracerProv = ##class(%Trace.TracerProvider).%New(.attributes)
attributes = {} attributes["service.name"] = "test_service" attributes["service.version"] = "2.0" attrArray = iris.arrayref(attributes) tracerProv = iris.cls('%Trace.TracerProvider')._New(.attrArray)
-
In most situations, it is preferable to instantiate a single TracerProvider object at startup, for shared use by instrumentation code across the namespace throughout the entire life cycle of the application. To do so, provide the newly created TracerProvider object to the SetTracerProvider() method of the %Trace.Provider class, as follows:
do ##class(%Trace.Provider).SetTracerProvider(tracerProv)
iris.cls('%Trace.Provider').SetTracerProvider(tracerProv)
When you need to access the TracerProvider object within your instrumentation code (as described in later steps), use the complementary GetTracerProvider() method to recall it:
set tracerProv = ##class(%Trace.Provider).GetTracerProvider()
tracerProv = iris.cls('%Trace.Provider').GetTracerProvider()
-
To instrument your application code, first use the TracerProvider object’s GetTracer() method to instantiate a Tracer. (The Tracer object generates the actual spans of the trace.)
GetTracer() accepts two arguments, Name and Version. Use these arguments to uniquely identify the application or application component that you are tracing and specify its version number. For example:
set tracer = tracerProv.GetTracer("service.orderprocessor", "2.0.2")
tracer = tracerProv.GetTracer("service.orderprocessor", "2.0.2")
-
Start a root span using the Tracer object’s StartSpan() method. In order, StartSpan() accepts the following arguments, which are used to define the properties of the span:
-
Name — A string, used to set the span’s name field
-
Parent — (Optional.) A %Trace.Context object that identifies the span which you wish to specify as the parent span for the new span. If you do not provide a Parent and you have not previously specified an active span (as described in a later step), then the method initializes the span as a root span, with a unique Trace ID.
-
Spankind — (Optional.) A string, identifying the span as belonging to one of the OpenTelemetry specification’s recognized span kindsOpens in a new tab. If you do not provide a Spankind, the span is classified as Internal by default.
-
AttributesOpens in a new tab — (Optional.) An array of key-value pairs, passed by reference.
-
StartTime — (Optional.) A timestamp recording the span’s start time, in $ZTIMESTAMP format. If not provided, StartTime is set to the current time.
For example, code which initializes a root span for processing a retail transaction may resemble the following:
set rootAttr("customer.id") = customer.ID set rootAttr("product.id") = product.ID set rootSpan = tracer.StartSpan("order", , "Server", .rootAttr)
rootspan = {} rootAttr("customer.id") = customer.ID rootAttr("product.id") = product.ID rootAttrArray = iris.arrayref(rootAttr) set rootSpan = tracer.StartSpan("order", , "Server", .rootAttrArray)
-
-
As needed, nest child spans hierarchically within this root span. For simple implementations, you can manually specify the parent span for a new span by creating a %Trace.Context object which identifies the desired parent as the ActiveSpan, and then providing that Context object as the Parent argument of StartSpan().
However, attempting to manage context across lexical scopes using this manual approach would be impractical. For this reason, the %Trace API provides a dynamic scoping mechanism for managing context.
To manage the distributed context of your trace dynamically, perform the following steps:
-
Designate the parent span as the "active" span using the Tracer object’s SetActiveSpan() method, as follows:
set rootScope = tracer.SetActiveSpan(rootSpan)
rootScope = tracer.SetActiveSpan(rootSpan)
SetActiveSpan() returns an instance of the %Trace.Scope class. This Scope object acts as a record that the corresponding span is active.
-
Start a new span by invoking StartSpan() without specifying a Parent, as follows:
set childSpan1 = tracer.StartSpan("order_payproc")
childSpan1 = tracer.StartSpan("order_payproc")
As long as the active span’s Scope object remains in memory, StartSpan() initializes new spans as children of the active span by default when no other Parent is specified.
-
To nest spans further, invoke SetActiveSpan() on a child span to designate it as the new active span and generate a new Scope object. The active span is identified by the newest Scope object which exists in memory at a given time. Therefore, once you have generated a Scope object for a new active span, StartSpan() will initialize new spans as its children by default.
Continuing the previous example, the following code starts a new span childSpan2 as a child of childSpan1 (which is itself a child of rootSpan):
set child1Scope = tracer.SetActiveSpan(childSpan1) set childSpan2 = tracer.StartSpan("order_payproc_addnewcard")
child1Scope = tracer.SetActiveSpan(childSpan1) childSpan2 = tracer.StartSpan("order_payproc_addnewcard")
-
When you destroy the Scope object for the current active span, the span which was previously active becomes the default parent for StartSpan() once again (assuming you have not destroyed its Scope object as well). Continuing the previous examples, the following code starts a new span childSpan3 as a child of rootSpan and a sibling of childSpan1:
kill child1Scope set childSpan3 = tracer.StartSpan("order_sendconfirm", , "Server")
iris.execute('kill child1Scope) childSpan3 = tracer.StartSpan("order_sendconfirm", , "Server")
-
-
As needed, define information about your spans. Available methods for this purpose include the following:
-
Use the Span object’s AddEvent() method to add a span eventOpens in a new tab.
-
Use the Span object’s AddLink() method to add a span linkOpens in a new tab.
-
Modify the TraceFlags and TraceState properties of the Span object’s Context property (an instance of %Trace.SpanContext) to modify a span’s trace flags and trace state, respectively.
Note:The TraceFlags property provides a bit which specifies whether or not the span will be sampled for export. By defining the logic which sets the value of this bit conditionally, you can define a sampling algorithm for tracing within your application.
Continuing the previous examples, the following code enhances the "order_payproc" span (childSpan1) with a "paymentdeclined" event and a link to a hypothetical span named paymentProcLivenessSpan:
set eventAttr("declined.reason")="Unknown error occurred." do childSpan1.AddEvent("paymentdeclined", .eventAttr) do childSpan1.AddLink(paymentProcLivenessSpan.Context)
eventAttr = {} eventAttr("declined.reason") = "Unknown error occurred." eventAttrArray = iris.arrayref(eventAttr) childSpan1.AddEvent("paymentdeclined", .eventAttrArray) childSpan1.AddLink(paymentProcLivenessSpan.Context)
-
-
Before you end a span, update the span’s status using the Span object’s SetStatus() method, as in the following example:
do childSpan1.SetStatus("Ok")
childSpan1.SetStatus("Ok")
SetStatus() accepts one argument (a string) which can have three possible values corresponding to the three span statusesOpens in a new tab recognized by the OpenTelemetry specification.
-
End each span using the Span object’s End() method. End() accepts an optional argument: a timestamp, in $TIMESTAMP format. If a timestamp is provided, End() records that time as the end time for the span. Otherwise, End() sets the span’s end time to the current time, as in the following example:
do childSpan1.End()
childSpan1.End()
InterSystems IRIS invokes the OpenTelemetry SDK to export the span when you end it, assuming that the ‘sampled’ bit in the span Context property’s TraceFlags has been set appropriately.
-
If you are using SetActiveSpan() to manage nested spans across lexical scopes (as suggested in a preceding step), destroy the Scope object for each active span after you end the span. Continuing the previous examples, the following code would conclude the trace encompassed by rootSpan and prepare the application to record a new trace:
do rootSpan.SetStatus("Ok") do rootSpan.End() kill rootScope
rootSpan.SetStatus("Ok") rootSpan.End() iris.execute('kill rootScope')
-
As needed, import and recompile the code you have edited to enable tracing for your application.
Deactivate Tracing
To deactivate tracing for your application after you have instrumented it, perform the following steps:
-
Edit your code so that it initializes any Tracer Provider that your application uses as an instance of %Trace.NoopTracerProvider (instead of %Trace.TracerProvider). Revisiting the example provided in the preceding instrumentation instructions, deactivation would require you to change the line of code which sets the tracerProv object so that it reads as follows:
set tracerProv = ##class(%Trace.NoopTracerProvider).%New(.attributes)
tracerProv = iris.cls('%Trace.NoopTracerProvider')._New(.attrArray)
Assuming that tracerProv has been set as the Tracer Provider which serves the entire namespace, no further edits would be necessary.
-
As needed, import and recompile the code you have edited to deactivate tracing for your application.
Error Handling and Recovery
If the OpenTelemetry-compatible tool which was receiving signals at the OTLP/HTTP endpoint becomes unavailable due to an unexpected system error, the InterSystems IRIS instance’s OpenTelemetry exporter logs an error to the system messages log (messages.log). It then stops emitting signals from the instance.
The SYS.Monitor.OTelOpens in a new tab class provides methods to help you test whether communication at the OTLP/HTTP endpoint has been restored: TestLogs()Opens in a new tab, TestMetrics()Opens in a new tab, and TestTraces()Opens in a new tab.
Once you have resolved the cause of the error and restored the OTLP/HTTP connection, you can resume the emission of signals as follows:
-
Open a Terminal session on the instance and navigate to the %SYS namespace.
-
To resume the emission of metrics and logs, execute the following command:
do ##class(SYS.Monitor.OTel).Start()
-
To resume the emission of traces, execute the following command:
do ##class(SYS.Monitor.OTel).EnableTraces()