Horizontally Scaling Systems for User Volume with InterSystems Distributed Caching
When vertical scaling alone proves insufficient for scaling your InterSystems IRIS data platform to meet your workload’s requirements, you can consider distributed caching, an architecturally straightforward, application-transparent, low-cost approach to horizontal scaling.
The InterSystems IRIS™ distributed caching architecture scales horizontally for user volume by distributing both application logic and caching across a tier of application servers sitting in front of a data server, enabling partitioning of users across this tier. Each application server handles user requests and maintains its own database cache, which is automatically kept in sync with the data server, while the data server handles all data storage and management. Interrupted connections between application servers and data server are automatically recovered or reset, depending on the length of the outage.
Distributed caching allows each application server to maintain its own, independent working set of the data, which avoids the expensive necessity of having enough memory to contain the entire working set on a single server and lets you add inexpensive application servers to handle more users. Distributed caching can also help when an application is limited by available CPU capacity; again, capacity is increased by adding commodity application servers rather than obtaining an expensive processor for a single server.
Distributed Cache Cluster
This architecture is enabled by the use of the Enterprise Cache Protocol (ECP), a core component of InterSystems IRIS® Data Platform™, for communication between the application servers and the data server.
The distributed caching architecture and application server tier are entirely transparent to the user and to application code. You can easily convert an existing standalone InterSystems IRIS instance that is serving data into the data server of a cluster by adding application servers.
The following sections provide more details about distributed caching:
To better understand distributed caching architecture, review the following information about how data is stored and accessed by InterSystems IRIS:
InterSystems IRIS stores data in a file in the local operating system called a database. An InterSystems IRIS instance may (and usually does) have multiple databases.
InterSystems IRIS applications access data by means of a namespace, which provides a logical view of the data stored in one or more databases. An InterSystems IRIS instance may (and usually does) have multiple namespaces.
Each InterSystems IRIS instance maintains a database cache, a local shared memory buffer used to cache data retrieved from the databases, so that repeated instances of the same query can retrieve results from memory rather than storage, providing a very significant performance benefit.
The architecture of a distributed cache cluster is conceptually simple, using these elements in the following manner:
An InterSystems IRIS instance becomes an application server by adding another instance as a remote server, and then adding any or all of its databases as remote databases. This makes the second instance a data server for the first instance.
Local namespaces on the application server are mapped to remote databases on the data server in the same way they are mapped to local databases. The difference between local and remote databases is entirely transparent to an application querying a namespace on the application server.
The application server maintains its own database cache in the same manner as it would if using only local databases. ECP efficiently shares data, locks, and executable code among multiple InterSystems IRIS instances, as well as synchronizing the application server caches with the data server.
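The relationship between the data server and the application server caches can be pictured with a toy model. The sketch below is not ECP and does not reflect its wire protocol; the class and method names (DataServer, AppServer, read, write, invalidate) are hypothetical, and invalidate-on-write is a deliberate simplification chosen only to show why an application server never returns stale data.

```python
class DataServer:
    """Toy data server: owns the data and notifies caches when a value changes."""

    def __init__(self):
        self.storage = {}
        self.app_servers = []

    def read(self, key):
        return self.storage.get(key)

    def write(self, key, value):
        self.storage[key] = value
        # Keep every application server cache coherent with storage.
        for app in self.app_servers:
            app.invalidate(key)


class AppServer:
    """Toy application server: caches whatever it reads from the data server."""

    def __init__(self, data_server):
        self.data_server = data_server
        self.cache = {}
        self.remote_reads = 0
        data_server.app_servers.append(self)

    def read(self, key):
        if key not in self.cache:
            self.remote_reads += 1
            self.cache[key] = self.data_server.read(key)
        return self.cache[key]

    def write(self, key, value):
        # All updates go to the data server, which stores the data.
        self.data_server.write(key, value)

    def invalidate(self, key):
        self.cache.pop(key, None)


data = DataServer()
app1, app2 = AppServer(data), AppServer(data)
app1.write("patient:1", "Ada")
first = app2.read("patient:1")    # fetched from the data server
second = app2.read("patient:1")   # served from app2's own cache
app1.write("patient:1", "Grace")  # invalidates the stale copy on app2
third = app2.read("patient:1")    # fetched again after invalidation
```

In this model only the first and third reads on app2 cross the network; the second is answered from its local cache.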
In practice, a distributed cache cluster of multiple application servers and a data server works as follows:
The data server continues to store, update, and serve the data. The data server also synchronizes and maintains the coherency of the application servers’ caches to ensure that users do not receive or keep stale data, and manages locks across the cluster.
Each query against the data is made in a namespace on one of the various application servers, each of which uses its own individual database cache to cache the results it receives; as a result, the total set of cached data is distributed across these individual caches. If there are multiple data servers, the application server automatically connects to the one storing the requested data. Each application server also monitors its data server connections and, if a connection is interrupted, attempts to recover it.
User requests can be distributed round-robin across the application servers by a load balancer, but the most effective approach takes full advantage of distributed caching by directing users with similar requests to the same application server, increasing cache efficiency. For example, a health care application might group clinical users who run one set of queries on one application server and front-desk staff running a different set on another. If the cluster handles multiple applications, each application's users can be directed to a separate application server. The illustrations that follow compare a single InterSystems IRIS instance to a cluster in which user connections are distributed in this manner. (Load balancing user requests can even be detrimental in some circumstances; for more information see Use Caution with Load-balanced Application Servers.)
The number of application servers in a cluster can be increased (or reduced) without requiring other reconfiguration of the cluster or operational changes, so you can easily scale as user volume increases.
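The difference between round-robin distribution and grouping users with similar requests can be sketched as a small routing function. This is only an illustration of the idea, not a feature of InterSystems IRIS or of any particular load balancer; the server names, the group names, and the route function are all hypothetical.

```python
import itertools

# Hypothetical application server names.
APP_SERVERS = ["app1", "app2", "app3"]

# Hypothetical affinity table: users who run similar queries share a server,
# so that server's database cache holds the data they all need.
GROUP_AFFINITY = {"clinical": "app1", "front_desk": "app2"}

_round_robin = itertools.cycle(APP_SERVERS)


def route(user_group):
    """Return the application server that should handle this user's requests."""
    if user_group in GROUP_AFFINITY:
        return GROUP_AFFINITY[user_group]
    # Users with no known group fall back to plain round-robin.
    return next(_round_robin)
```

With this table, every clinical user lands on app1 and every front-desk user on app2, while users outside any group are spread across all three servers.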
Local databases mapped to local namespaces on a single InterSystems IRIS instance
Remote databases on a data server mapped to namespaces on application servers in a distributed cache cluster
In a distributed cache cluster, the data server is responsible for the following:
Storing data in its local databases.
Synchronizing the application server database caches with the databases so the application servers do not see stale data.
Managing the distribution of locks across the network.
Monitoring the status of all application server connections and taking action if a connection is interrupted for a specific amount of time (see ECP Recovery).
In a distributed cache cluster, each application server is responsible for the following:
Establishing connections to a specific data server whenever an application requests data that is stored on that server.
Maintaining, in its cache, data retrieved across the network.
Monitoring the status of all connections to the data server and taking action if a connection is interrupted for a specific amount of time (see ECP Recovery).
A distributed cache cluster can include more than one data server (although this is uncommon). An InterSystems IRIS instance can simultaneously act as both a data server and an application server, but cannot act as a data server for the data it receives as an application server.
ECP supports the distributed cache architecture by providing the following features:
Automatic, fail-safe operation.
Once configured, ECP automatically establishes and maintains connections between application servers and data servers and attempts to recover from any disconnections (planned or unplanned) between application server and data server instances (see ECP Recovery). ECP can also preserve the state of a running application across a failover of the data server (see Distributed Caching and High Availability).
Along with keeping data available to applications, these features make a distributed cache cluster easier to manage; for example, it is possible to temporarily take a data server offline or fail over as part of planned maintenance without having to perform any operations on the application server instances.
InterSystems IRIS systems in a distributed cache cluster can run on different hardware and operating system platforms. ECP automatically manages any required data format conversions.
ECP is designed to automatically recover from interruptions in connectivity between an application server and the data server. In the event of such an interruption, ECP executes a recovery protocol that differs depending on the nature of the failure and on the configured timeout intervals. The result is that the connection is either recovered, allowing the application processes to continue as though nothing had happened, or reset, forcing transaction rollback and rebuilding of the application processes.
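The two possible outcomes can be pictured with a deliberately reduced model. The sketch below assumes a single timeout and looks only at the length of the outage; the function name and its parameters are hypothetical, and the real recovery protocol also depends on the nature of the failure and on more than one configured timeout.

```python
def connection_outcome(outage_seconds, recovery_timeout_seconds):
    """Classify an interrupted connection as 'recovered' or 'reset'.

    Simplified: one hypothetical timeout stands in for the configured
    timeout intervals, and the nature of the failure is ignored.
    """
    if outage_seconds < 0 or recovery_timeout_seconds <= 0:
        raise ValueError("durations must be positive")
    if outage_seconds <= recovery_timeout_seconds:
        # Application processes continue as though nothing had happened.
        return "recovered"
    # Open transactions are rolled back and application processes rebuilt.
    return "reset"
```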
While ECP recovery handles interrupted application server connections to the data server, the application servers in a distributed cache cluster are also designed to preserve the state of the running application across a failover of the data server. Depending on the nature of the application activity and the failover mechanism, some users may experience a pause until failover completes, but can then continue operating without interrupting their workflow.
Data servers can be mirrored for high availability in the same way as a stand-alone InterSystems IRIS instance, and application servers can be set to automatically redirect connections to the backup in the event of failover. (It is not necessary or even possible to mirror an application server, as it does not store any data.) For detailed information about the use of mirroring in a distributed cache cluster, see Configuring ECP Connections to a Mirror in the “Mirroring” chapter of the High Availability Guide.
The other failover strategies detailed in the “System Failover Strategies” chapter of the High Availability Guide can also be used in a distributed cache cluster. Regardless of the failover strategy employed for the data server, the application servers reconnect and recover their states following a failover, allowing application processing to continue where it left off prior to the failure.
An InterSystems IRIS distributed cache cluster consists of a data server providing data to one or more application servers, which in turn provide it to the application. This section describes procedures for deploying a distributed cache cluster.
The recommended method for deploying InterSystems IRIS Data Platform is InterSystems Cloud Manager (ICM). By combining plain text declarative configuration files, a simple command line interface, the widely-used Terraform infrastructure as code tool, and InterSystems IRIS deployment in Docker containers, ICM provides you with a simple, intuitive way to provision cloud or virtual infrastructure and deploy the desired InterSystems IRIS architecture on that infrastructure, along with other services. ICM can deploy distributed cache clusters and other InterSystems IRIS configurations on Amazon Web Services, Google Cloud Platform, Microsoft Azure or VMware vSphere. ICM can also deploy InterSystems IRIS on an existing physical or virtual cluster.
You can also deploy a distributed cache cluster by using the management portal to configure existing or newly installed InterSystems IRIS instances; instructions for this procedure are provided in Deploy the Cluster Using the Management Portal.
The most typical distributed cache cluster configuration involves one InterSystems IRIS instance per system, and one cluster role per instance, that is, either data server or application server. When deploying using ICM, this configuration is the only option. The provided procedure for using the management portal assumes this configuration as well.
There are several stages involved in provisioning and deploying a containerized InterSystems IRIS configuration, including a distributed cache cluster, with ICM. The ICM Guide provides complete documentation of ICM, including details of each of the stages. This section briefly reviews the stages and provides links to the ICM Guide.
ICM is provided as a Docker image. Everything required by ICM to carry out its provisioning, deployment, and management tasks is included in the ICM container, including a /Samples directory that provides you with samples of the elements required by ICM, customized to the four supported cloud providers. To launch ICM, on a system on which Docker is installed, you use the docker run command with the ICM image from the InterSystems repository to start the ICM container.
Before defining your deployment, you must obtain security-related files including cloud provider credentials and keys for SSH and SSL/TLS. For more information about these files and how to obtain them, see Obtain Security-Related Files in the “Using ICM” chapter.
ICM uses JSON files as both input and output. To provide the needed parameters to ICM, you must represent your target configuration and the platform on which it is to be deployed in two of ICM’s JSON configuration files: the defaults.json file, which contains information about the entire deployment, and the definitions.json file, which contains information about the types and numbers of the nodes provisioned and deployed by ICM, as well as details specific to each node type. For example, the defaults file determines which cloud provider your distributed cache cluster nodes are provisioned on and the locations of the required security files and InterSystems IRIS license keys, while the definitions file determines how many application servers are included in the distributed cache cluster and whether the data volume for the data server will be larger than for the application servers. Most ICM parameters have defaults; a limited number of parameters can be specified on the ICM command line as well as in the configuration files.
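As a rough picture of how the two files divide the work, the sketch below builds a minimal pair of structures in Python and serializes them to JSON. The Role and DockerImage fields and the intersystems/iris:stable value appear in this section; Provider, Count, and DataVolumeSize are assumed field names used for illustration only, and the real files need further settings (security file locations, license keys, and so on) that are described in the ICM Guide.

```python
import json

# Deployment-wide settings, as they might appear in defaults.json.
defaults = {
    "Provider": "AWS",                          # assumed field name
    "DockerImage": "intersystems/iris:stable",  # field and value from the text
}

# One entry per node type, as they might appear in definitions.json.
definitions = [
    # One data server with a larger data volume (Count and DataVolumeSize assumed).
    {"Role": "DM", "Count": "1", "DataVolumeSize": "50"},
    # Three application servers (Count assumed).
    {"Role": "AM", "Count": "3"},
]

defaults_json = json.dumps(defaults, indent=4)
definitions_json = json.dumps(definitions, indent=4)

print(defaults_json)
print(definitions_json)
```

A definitions file shaped like this one describes one DM node and three AM nodes, the same topology as a cluster with a single data server and three application servers.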
The following table briefly lists the types of node (determined by the Role field in the definitions file) that ICM can provision, configure, and deploy services on, with the distributed cache cluster roles highlighted. For detailed descriptions of the node types, see ICM Node Types in the “ICM Reference” chapter of the ICM Guide.
ICM Node Types
Sharded Cluster Role
Distributed cache data server (can also serve as shard master data server and stand-alone instance)
Distributed cache application server (can also serve as shard master application server)
Shard data server
Shard query server
Mirror arbiter (for mirrored deployments)
Load balancer (can be automatically provisioned with AM, WS and GG nodes)
Virtual machine (general purpose)
Custom and third-party container node
When your definitions files are complete, begin the provisioning phase by issuing the command icm provision on the ICM command line. This command allocates and configures the nodes specified in the definitions file. At completion, ICM also provides a summary of the compute nodes and associated components that have been provisioned, and outputs a command line which can be used to delete the infrastructure at a later date, for example:
Machine IP Address DNS Name
------- --------- -------
ACME-DM-TEST-0001 00.53.183.209 ec2-00-53-183-209.us-west-1.compute.amazonaws.com
ACME-AM-TEST-0002 00.56.59.42 ec2-00-56-59-42.us-west-1.compute.amazonaws.com
ACME-AM-TEST-0003 00.67.1.11 ec2-00-67-1-11.us-west-1.compute.amazonaws.com
ACME-AM-TEST-0004 00.193.117.217 ec2-00-193-117-217.us-west-1.compute.amazonaws.com
ACME-LB-TEST-0000 (virtual AM) ACME-AM-TEST-1546467861.amazonaws.com
To destroy: icm unprovision stateDir /Samples/AWS/ICM-8620265620732464265 provider AWS [-cleanUp] [-force]
ICM carries out deployment of InterSystems IRIS and other software services using Docker images, which it runs as containers by making calls to Docker. In addition to Docker, ICM also carries out some InterSystems IRIS-specific configuration over JDBC. There are many container management tools available that can be used to extend ICM’s deployment and management capabilities.
The icm run command downloads, creates, and starts the specified container on the provisioned nodes. The icm run command has a number of useful options, and also lets you specify Docker options to be included, so there are many versions on the command line depending on your needs. Here are just two examples:
When deploying InterSystems IRIS images, you must set the password for the predefined accounts on the deployed instances. The simplest way to do this is to omit a password specification from both the definitions files and the command line, which causes ICM to prompt you for the password (with typing masked) when you execute icm run. But this may not be possible in some situations, such as when running ICM commands with a script, in which case you need either the -iscPassword command line option or the iscPassword field in the defaults file.
You can deploy different containers on different nodes (for example, InterSystems IRIS on the DM and AM nodes and the InterSystems Web Gateway on the WS nodes) by specifying different values for the DockerImage field (such as intersystems/iris:stable) in the different node definitions in the definitions.json file. To deploy multiple containers on a node or nodes, however, you can run the icm run command more than once: the first time to deploy the image(s) specified by the DockerImage field, and subsequent times using the -image options (and possibly the -role option) to deploy a custom container.
At deployment completion, ICM sends a link to the appropriate node’s management portal, for example:
Management Portal available at: http://ec2-00-153-49-109.us-west-1.compute.amazonaws.com:52773/csp/sys/UtilHome.csp
In the case of a distributed cache cluster, the provided link is for the data server instance.
Because public cloud platform instances continually generate charges and unused instances in private clouds consume resources to no purpose, it is important to unprovision infrastructure in a timely manner. The icm unprovision command deallocates the provisioned infrastructure based on the state files created during provisioning. As described in Provision the Infrastructure, you must specify the state directory with this command; the needed command line is provided when the provisioning phase is complete, and is also contained in the ICM log file, for example:
To destroy: icm unprovision stateDir /Samples/AWS/ICM-8620265620732464265 provider AWS [-cleanUp] [-force]
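Because the command line is written to the provisioning output and the log, a script can recover it later. The following is a minimal sketch that assumes the line always begins with the literal prefix shown in the example above; the function name is hypothetical and the sample text is copied from this section rather than from a real log.

```python
def find_unprovision_command(output_text):
    """Return the command that follows 'To destroy:', or None if it is absent."""
    prefix = "To destroy:"
    for line in output_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(prefix):
            return stripped[len(prefix):].strip()
    return None


sample = (
    "ACME-LB-TEST-0000 (virtual AM) ACME-AM-TEST-1546467861.amazonaws.com\n"
    "To destroy: icm unprovision stateDir /Samples/AWS/ICM-8620265620732464265 "
    "provider AWS [-cleanUp] [-force]\n"
)

command = find_unprovision_command(sample)
print(command)
```

The bracketed items in the recovered string are optional flags as printed by ICM, so a script would still need to decide which of them to keep before running the command.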
Once you have installed or identified the InterSystems IRIS instances you intend to include, and arranged network access of sufficient bandwidth among their hosts, configuring a distributed cache cluster using the management portal involves the following steps:
An InterSystems IRIS instance cannot actually operate as the data server in a distributed cache cluster until it is configured as such on the application servers. The procedure for preparing the instance to be a data server, however, includes one required action and two optional actions.
In the This System as an ECP Data Server box on the right, enable the ECP service by clicking the Enable link for the service. This opens an Edit Service dialog for %Service_ECP; select Service Enabled and click Save to enable the service. (If the service is already enabled, as indicated by the presence of a Disable link in the box, go on to the next step.)
If you want multiple application servers to be able to connect simultaneously to the data server, in the This System as an ECP Data Server box, change the Maximum number of application servers setting to the number of application servers you want to configure, then click Save and restart the instance. (If the number of simultaneous application server connections becomes greater than the number you enter for this setting, the data server instance automatically restarts.)
The Time interval for Troubled state setting determines one of three timeouts used to manage recovery of interrupted connections between application servers and the data server; leave it at the default of 60 until you have some data about the cluster’s operation over time. For more information on the ECP recovery timeouts, see ECP Recovery Protocol.
An application server can connect only if Use SSL/TLS is selected for this data server.
An application server can connect regardless of whether Use SSL/TLS is selected for this data server.
An application server cannot connect if Use SSL/TLS is selected (default) for this data server.
The data server is now ready to accept connections from valid application servers.
Configuring an InterSystems IRIS instance as an application server in a distributed cache cluster involves two steps:
To add the data server to the application server, do the following:
to display the ECP Data Servers page and click Add Server. In the ECP Data Server dialog, enter the following information for the data server:
A descriptive name identifying the data server. (This name is limited to 64 characters.)
When adding a mirror primary as a data server (see the Mirror Connection setting), do not enter the virtual IP address (VIP) of the mirror, but rather the DNS name or IP address of the current primary failover member.
The port number defaults to 51773, the default InterSystems IRIS superserver (IP) port; change it as necessary to the superserver port of the InterSystems IRIS instance on the data server.
If the ECP SSL/TLS support setting for the data server you are adding is Disabled, it does not matter whether you select this checkbox; SSL/TLS will not be used to secure connections to the data server.
If the ECP SSL/TLS support setting for the data server you are adding is Enabled, select this checkbox to use SSL/TLS to secure connections to this data server; clear it to not use SSL/TLS.
The data server appears in the data server list; you can remove or edit the data server definition, or change its status (see Monitoring Distributed Applications) using the available links. You can also view a list of all application servers connecting to a data server by going to the ECP Settings page on the data server and clicking the Application Servers
To add each desired database on the data server as a remote database on the application server, you must create a namespace on the application server and map it to that database, as follows:
Enter a name for the new namespace, which typically reflects the name of the remote database it is mapped to.
The new namespace now appears in the Namespaces list.
Once you have added a data server database as a remote database on the application server, applications can query that database through the namespace it is mapped to on the application server.
Remember that even though a namespace on the application server is mapped to a database on the data server, changes to the namespace mapped to that database on the data server are unknown to the application server. (For information about mapping, see Global Mappings in the “Configuring InterSystems IRIS” chapter of the System Administration Guide.) For example, suppose the namespace DATA on the data server has the default globals database DATA; on the application server, the namespace REMOTEDATA is mapped to the same (remote) database, DATA. If you create a mapping in the DATA namespace on the data server mapping the global ^DATA2 to the DATA2 database, this mapping is not propagated to the application server. Therefore, if you do not add DATA2 as a remote database on the application server and create the same mapping in the REMOTEDATA namespace, queries the application server receives will not be able to read the ^DATA2 global.
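The DATA and REMOTEDATA example can be modeled with two small dictionaries. This is a conceptual sketch only: the dictionary layout and the resolve function are hypothetical and do not correspond to how InterSystems IRIS stores namespace definitions.

```python
def resolve(namespace, global_name):
    """Return the database a namespace uses for the given global."""
    return namespace["mappings"].get(global_name, namespace["default"])


# On the data server: namespace DATA has default database DATA,
# plus a mapping that sends ^DATA2 to the DATA2 database.
data_ns = {"default": "DATA", "mappings": {"^DATA2": "DATA2"}}

# On the application server: namespace REMOTEDATA is mapped to the
# remote database DATA and knows nothing about the mapping above.
remotedata_ns = {"default": "DATA", "mappings": {}}

on_data_server = resolve(data_ns, "^DATA2")
on_app_server = resolve(remotedata_ns, "^DATA2")

# After DATA2 is added as a remote database and the same mapping is
# created by hand in REMOTEDATA, both sides agree again.
remotedata_ns["mappings"]["^DATA2"] = "DATA2"
after_fix = resolve(remotedata_ns, "^DATA2")
```

Before the manual step, the application server looks for ^DATA2 in DATA while the data server keeps it in DATA2, which is why the global cannot be read from the application server.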
All InterSystems IRIS instances in a distributed cache cluster need to be within the secured InterSystems IRIS perimeter (that is, within an externally secured environment). This is because ECP is a basic security service, rather than a resource-based service, so there is no way to regulate which users have access to it. (For more information on basic and resource-based services, see the Available Services section of the “Services” chapter of the Security Administration Guide.)
However, the following security tools are available:
The use of SSL/TLS for application server connections to this data server is disabled, even for an application server on which Use SSL/TLS is selected.
The use of SSL/TLS for application server connections is enabled on the data server; SSL/TLS is used for connections from application servers on which Use SSL/TLS is selected, and is not used for connections from application servers on which Use SSL/TLS is not selected.
The data server requires application server connections to use SSL/TLS; an application server can connect to the data server only if Use SSL/TLS is selected for the data server, in which case SSL/TLS is used for all connections.
There are two requirements for establishing a connection from an application server to a data server using SSL/TLS, as follows:
By default, any InterSystems IRIS instance on which the data server instance is configured as a data server (as described in the previous section) can connect to the data server. However, you can restrict which instances can act as application servers for the data server by specifying the hosts from which incoming connections are allowed; if you do this, hosts not explicitly listed cannot connect to the data server. Do this by performing the following steps on the data server:
By default, the Allowed Incoming Connections box is empty, which means any application server can connect to this instance if the ECP service is enabled; click and enter a single IP address (such as 22.214.171.124) or fully-qualified domain name (such as mycomputer.myorg.com), or a range of IP addresses (for example, 8.61.202210.*). Once there are one or more entries on the list and you click Save in the Edit Service dialog, only the hosts specified by those entries can connect.
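The effect of the list can be sketched as a simple membership test. The code below is an assumption about the semantics (an empty list admits everyone, and a trailing * stands for any value in that position), not the matching logic InterSystems IRIS actually uses; the function name is hypothetical and the addresses come from the reserved documentation ranges.

```python
from fnmatch import fnmatchcase


def is_allowed(host, allowed_entries):
    """Return True if a connection from host would be accepted."""
    if not allowed_entries:
        # Empty list: any application server can connect.
        return True
    host = host.lower()
    # Each entry is an IP address, a domain name, or a pattern containing '*'.
    return any(fnmatchcase(host, entry.lower()) for entry in allowed_entries)


entries = ["192.0.2.*", "mycomputer.myorg.com"]
```

With these two entries, 192.0.2.17 and mycomputer.myorg.com are accepted, while 198.51.100.4 is turned away because no entry covers it.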
To be granted access to a database on the data server, the role held by the user initiating the process on the application server and the role set for the ECP connection on the data server must both include permissions for the same resource representing that database. For example, if a user belongs to a role on an application server that grants the privilege of read permission for a particular database resource, and the role set for the ECP connection on the data server also includes this privilege, the user can read data from the database on the application server.
By default, InterSystems IRIS grants ECP connections on the data server the %All privilege when the data server runs on behalf of an application server. This means that whatever privileges the user on the application server has are matched on the data server, and access is therefore controlled only on the application server. For example, a user on the application server who has privileges only for the %DB_USER resource but not the %DB_IRISLIB resource can access data in the USER database on the data server, but attempting to access the IRISLIB database on the data server results in a <PROTECT> error. If a different user on the application server has privileges for the %DB_IRISLIB resource, the IRISLIB database is available to that user.
InterSystems recommends the use of an LDAP server to implement centralized security, including user roles and privileges, across the application servers of a distributed cache cluster. For information about using LDAP with InterSystems IRIS, see the “Using LDAP” chapter of the Security Administration Guide.
However, you can also restrict the roles available to ECP connections on the data server based on the application server host. For example, on the data server you can specify that when interacting with a specific application server, the only available role is %DB_USER. In this case, users on the application server granted the %DB_USER role can access the USER database on the data server, but no users on the application server can access any other database on the data server regardless of the roles they are granted.
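The two-sided check described above can be expressed as a short function. This is a simplified sketch: the role-to-resource table and the can_access function are hypothetical, read and write permissions are not distinguished, and the special handling of %DB_IRISSYS and of public resources is left out.

```python
ALL = "%All"

# Hypothetical table: which database resource each role grants.
ROLE_RESOURCES = {
    "%DB_USER": {"%DB_USER"},
    "%DB_IRISLIB": {"%DB_IRISLIB"},
}


def _covers(roles, resource):
    """Return True if any of the roles grants the resource."""
    if ALL in roles:
        return True
    return any(resource in ROLE_RESOURCES.get(role, set()) for role in roles)


def can_access(resource, user_roles, connection_roles=(ALL,)):
    """Access needs the resource on both sides of the ECP connection.

    user_roles: roles held by the user on the application server.
    connection_roles: roles set for the ECP connection on the data server
    (defaults to %All, matching the default described in the text).
    """
    return _covers(user_roles, resource) and _covers(connection_roles, resource)


# Default %All connection: only the user's own roles matter.
user_only = can_access("%DB_USER", ["%DB_USER"])
irislib_denied = can_access("%DB_IRISLIB", ["%DB_USER"])

# Connection restricted to %DB_USER: other databases are out of reach
# even for a user who holds %DB_IRISLIB on the application server.
restricted = can_access("%DB_IRISLIB", ["%DB_IRISLIB"], ["%DB_USER"])
```

A False result here corresponds to the <PROTECT> error described in the text.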
InterSystems strongly recommends that you secure the cluster by specifying available roles for all application servers in the cluster, rather than allowing the data server to continue to grant the %All privilege to all ECP connections.
The following are exceptions to this behavior:
InterSystems IRIS always grants the data server the %DB_IRISSYS role since it requires Read access to the IRISSYS database to run. This means that a user on an application server with %DB_IRISSYS can access the IRISSYS database on the data server.
To prevent a user on the application server from having access to the IRISSYS database on the data server, there are two options:
If the data server has any public resources, they are available to any user on the ECP application server, regardless of either the roles held on the application server or the roles configured for the ECP connection.
To specify the available roles for ECP connections from a specific application server on the data server, do the following:
Go to the Services page (from the portal home page, select Security and then Services) and click %Service_ECP to display the Edit Service dialog.
Click the Edit link for the application server host you want to restrict to display the Select Roles
To specify roles for the host, select roles from those listed under Available and click the right arrow to add them to the Selected list.
To remove roles from the Selected list, select them and then click the left arrow.
To add all roles to the Selected list, click the double right arrow; to remove all roles from the Selected list, click the double left arrow.
to associate the roles with the IP address.
By default, a listed host holds the %All role, but if you specify one or more other roles, these roles are the only roles that the connection holds. Therefore, a connection from a host or IP range with the %Operator role has only the privileges associated with that role, while a connection from a host with no associated roles (and therefore %All) has all privileges.
Changes to the roles available to application server hosts and to the public permissions on resources on the data server require a restart of InterSystems IRIS before taking effect.
The behavior of security-related error reporting with ECP varies depending on whether the check fails on the application server or the data server and the type of operation:
If the check fails on the application server, there is an immediate <PROTECT> error.
For synchronous operations on the data server, there is an immediate <PROTECT> error.
A running distributed cache cluster consists of a data server instance (a data provider) connected to one or more application server systems (data consumers). Between each application server and the data server, there is an ECP connection, a TCP/IP connection that ECP uses to send data and commands.
The following sections describe status information for connections:
The ECP Data Servers page displays the following information for each data server:
The logical name of the data server system on this connection, as entered when the server was added to the application server configuration.
The host name of the data server system, as entered when the server was added to the application server configuration.
The IP port number used to connect to the data server.
The current status of this connection. Connection states are described in the ECP Connection States section.
If the current status of this connection is Not Connected, you can edit the port and host information of the data server.
From each data server row you can change the status of an existing ECP connection with that data server; see the ECP Connection Operations section for more information.
You can delete the data server information from the application server.
The logical name of the application server on this connection.
The current status of this connection. Connection states are described in the ECP Connection States section.
The host name or IP address of the application server.
The port number used to connect to the application server.
In an operating cluster, an ECP connection can be in one of the following states:
ECP Connection States
| State | Description |
|---|---|
| Not Connected | The connection is defined but has not been used yet. |
| Connection in Progress | The connection is in the process of establishing itself. This is a transitional state that lasts only until the connection is established. |
| Normal | The connection is operating normally and has been used recently. |
| Trouble | The connection has encountered a problem. If possible, the connection automatically corrects itself. |
| Disabled | The connection has been manually disabled by a system administrator. Any application making use of this connection receives a <NETWORK> error. |
The following sections describe each connection state as it relates to application servers or the data server:
The following sections describe the application server side of each of the connection states:
An application server-side ECP connection starts out in the Not Connected state. In this state, there are no ECP daemons for the connection. If an application server process makes a network request, daemons are created for the connection and the connection enters the Connection in Progress state.
In the Connection in Progress state, a network daemon exists for the connection and actively tries to establish a connection to the data server; when the connection is established, it enters the Normal state. While the connection is in the Connection in Progress state, the user process must wait for up to 20 seconds for it to be established. If the connection is not established within that time, the user process receives a <NETWORK> error.
The application server ECP daemon attempts to create a new connection to the data server in the background. If no connection is established within 20 minutes, the connection returns to the Not Connected state and the daemon for the connection goes away.
After a connection completes, it enters the Normal (data transfer) state. In this state, the application server-side daemons exist and actively send requests and receive answers across the network. The connection stays in the Normal state until the connection becomes unworkable or until the application server or the data server requests a shutdown of the connection.
If the connection from the application server to the data server encounters problems, the application server ECP connection enters the Trouble state. In this state, application server ECP daemons exist and actively try to restore the connection. An underlying TCP connection may or may not still exist. The recovery method is similar whether the underlying TCP connection is reset and must be recreated or merely stops working temporarily.
During the application server Time to wait for recovery timeout (default of 20 minutes), the application server attempts to reconnect to the data server to perform ECP connection recovery. During this interval, existing network requests are preserved, but the originating application server-side user process blocks new network requests, waiting for the connection to resume. If the connection returns within the Time to wait for recovery timeout, it returns to the Normal state and the blocked network requests proceed.
For example, if a data server goes offline, any application server connected to it has its state set to Trouble until the data server becomes available. If the problem is corrected gracefully, a connection’s state reverts to Normal; otherwise, if the trouble state is not recovered, it reverts to Not Connected.
Applications continue running until they require network access. All locally cached data is available to the application while the server is not responding.
Transitional recovery states are part of the Trouble state. If there is no current TCP connection to the data server, and a new connection is established, the application server and data server engage in a recovery protocol that flushes the application server cache, recovers transactions and locks, and returns the connection to the Normal state.
Similarly, if the data server shuts down, either gracefully or as a result of a crash, and then restarts, it enters a short period (approximately 30 seconds) during which it allows application servers to reconnect and recover their existing sessions. Once again, the application server and the data server engage in the recovery protocol.
If connection recovery is not complete within the Time to wait for recovery timeout, the application server gives up on connection recovery. Specifically, the application server returns errors to all pending network requests and changes the connection state to Not Connected. If it has not already done so, the data server rolls back all the transactions and releases all the locks from this application server the next time this application server connects to the data server.
If the recovery is successful, the connection returns to the Normal state and the blocked network requests proceed.
An ECP connection is marked Disabled if an administrator declares that it is disabled. In this state, no daemons exist and any network requests that would use that connection immediately receive <NETWORK> errors.
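The application server-side transitions described above can be summarized as a small state table. The following Python sketch is only a model for reasoning about the states; it is not an InterSystems IRIS API, and the event names (`network_request`, `connected`, and so on) are hypothetical labels chosen for this illustration.

```python
from enum import Enum


class State(Enum):
    NOT_CONNECTED = "Not Connected"
    IN_PROGRESS = "Connection in Progress"
    NORMAL = "Normal"
    TROUBLE = "Trouble"
    DISABLED = "Disabled"


# (current state, hypothetical event name) -> next state.
# Only transitions stated in the text above are listed.
TRANSITIONS = {
    (State.NOT_CONNECTED, "network_request"): State.IN_PROGRESS,
    (State.IN_PROGRESS, "connected"): State.NORMAL,
    # No connection established within 20 minutes.
    (State.IN_PROGRESS, "gave_up"): State.NOT_CONNECTED,
    (State.NORMAL, "connection_problem"): State.TROUBLE,
    # Connection resumed within the Time to wait for recovery timeout.
    (State.TROUBLE, "recovered"): State.NORMAL,
    # Time to wait for recovery expired without a successful recovery.
    (State.TROUBLE, "recovery_timeout"): State.NOT_CONNECTED,
}


def next_state(state, event):
    """Return the state that follows `state` when `event` occurs."""
    if event == "admin_disable":
        # An administrator can mark the connection Disabled at any time.
        return State.DISABLED
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state.value!r} on {event!r}")
```

For example, `next_state(State.TROUBLE, "recovery_timeout")` returns `State.NOT_CONNECTED`, which matches the case in which the application server gives up on connection recovery.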
The following sections describe the data server side of each of the connection states:
When an ECP server instance starts up, all incoming ECP connections are in an initial “unassigned” Free state and are available for connections from any application server that is listed in the connection access control list. If a connection from an application server previously existed and has since gone away, but does not require any recovery steps, the connection is placed in the “idle” Free state. The only difference between these two states is that in the idle state, this connection block is already assigned to a particular application server, rather than being available for any application server that passes the access control list.
In the data server Normal state, the application server connection is normal. At any point in the processing of incoming connections, whenever the application server disconnects from the data server (except as part of the data server’s own shutdown sequence), the data server rolls back any pending transactions and releases any incoming locks from that application server, and places the application server connection in the “idle” Free state.
If the application server is not responding, the data server shows a Trouble state. If the data server crashes or shuts down, it remembers the connections that were active at the time of the crash or shutdown. After restarting, the data server waits for a brief time (usually 30 seconds) for application servers to reclaim their sessions (locks and open transactions). If an application server does not complete recovery during this awaiting recovery interval, all pending work on that connection is rolled back and the connection is placed in the “idle” state.
The data server connection is in a recovery state for a very short time when the application server is in the process of reclaiming its session. The data server keeps the application server connection in the Trouble state for the Time interval for Troubled state timeout (60 seconds by default) to allow the application server to reclaim the connection; if it does not, the data server releases the application server’s resources (rolls back all open transactions and releases locks) and then sets the state to Free.
Change to Disabled
Set the state of this connection to Disabled. This releases any locks held for the application server, rolls back any open transactions involving this connection, and purges cached blocks from the data server. If this is an active connection, the change in status sends an error to all applications waiting for network replies from the data server.
Change to Normal
Set the state of this connection to Normal.
Change to Not Connected
Set the state of this connection to Not Connected. As with changing the state to Disabled, this releases any locks held for the application server, rolls back any open transactions involving this connection, and purges cached blocks from the data server. If this is an active connection, the change in status sends an error to all applications waiting for network replies from the data server.
This chapter discusses application development and design issues that are helpful if you would like to deploy your application on a distributed cache cluster, either as an option or as its primary configuration.
With InterSystems IRIS, the decision to deploy an application as a distributed system is primarily a runtime configuration issue (see Deploying a Distributed Cache Cluster). Using InterSystems IRIS configuration tools, map the logical names of your data (globals) and application logic (routines) to physical storage on the appropriate system.
This chapter discusses the following topics:
ECP is designed to automatically recover from interruptions in connectivity between an application server and the data server. In the event of such an interruption, ECP executes a recovery protocol that differs depending on the nature of the failure. The result is that the connection is either recovered, allowing the application processes to continue as though nothing had happened, or reset, forcing transaction rollback and rebuilding of the application processes. The main principles are as follows:
When the connection between an application server and data server is interrupted, the application server attempts to reestablish its connection with the data server, repeatedly if necessary, at an interval determined by the Time between reconnections setting (5 seconds by default).
When the interruption is brief, the connection is recovered.
If the connection is reestablished within the data server’s configured Time interval for Troubled state timeout period (60 seconds by default), the data server restores all locks and open transactions to the state they were in prior to the interruption.
If the interruption is longer, the data server resets the connection, so that it cannot be recovered when the interruption ends.
If the connection is not reestablished within the Time interval for Troubled state, the data server unilaterally resets the connection, allowing it to roll back transactions and release locks from the unresponsive application server so as not to block functioning application servers. When connectivity is restored, the connection is disabled from the application server point of view; all processes waiting for the data server on the interrupted connection receive a <NETWORK> error and enter a rollback-only condition. The next request received by the application server establishes a new connection to the data server.
If the interruption is very long, the application server also resets the connection.
If the connection is not reestablished within the application server’s longer Time to wait for recovery timeout period (20 minutes by default), the application server unilaterally resets the connection; all processes waiting for the data server on the interrupted connection receive a <NETWORK> error and enter a rollback-only condition. The next request received by the application server establishes a new connection to the data server, if possible.
ECP Timeout Values
| Management Portal Setting | Default | Description |
|---|---|---|
| Time between reconnections | 5 seconds | The interval at which an application server attempts to reconnect to the data server. |
| Time interval for Troubled state | 60 seconds | The length of time for which the data server waits for contact from the application server before resetting an interrupted connection. |
| Time to wait for recovery | 1200 seconds (20 minutes) | The length of time for which an application server continues attempting to reconnect to the data server before resetting an interrupted connection. |
The default values are intended to do the following:
Avoid tying up data server resources for a long time while waiting for an application server to become available, when those resources could be used for other application servers.
Allow an application server, which has nothing else to do while the data server is unavailable, to wait out an extended connection interruption by continuing to try to reconnect at frequent intervals for much longer.
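As a rough illustration of how the three default timeouts interact, the following Python sketch classifies a network interruption by its duration. It is a simplified model, not IRIS code: the constant and function names are made up for this example, and it ignores details such as the data server restart window and the exact timing of reconnection attempts.

```python
# Default values from the ECP Timeout Values table, in seconds.
TIME_BETWEEN_RECONNECTIONS = 5
TROUBLE_STATE_TIMEOUT = 60      # Time interval for Troubled state
RECOVERY_TIMEOUT = 1200         # Time to wait for recovery


def interruption_outcome(outage_seconds,
                         trouble_timeout=TROUBLE_STATE_TIMEOUT,
                         recovery_timeout=RECOVERY_TIMEOUT):
    """Classify a connectivity interruption by how long it lasts."""
    if outage_seconds <= trouble_timeout:
        # Locks and open transactions are restored on the data server.
        return "recovered"
    if outage_seconds <= recovery_timeout:
        # Waiting processes receive <NETWORK> once connectivity returns.
        return "reset by data server"
    # The application server has also given up on recovery.
    return "reset by data server and application server"


def reconnect_attempts(outage_seconds,
                       interval=TIME_BETWEEN_RECONNECTIONS,
                       recovery_timeout=RECOVERY_TIMEOUT):
    """Approximate number of reconnection attempts made during an outage."""
    return min(outage_seconds, recovery_timeout) // interval
```

With the defaults, a 30 second outage is classified as recovered, while a 5 minute outage is reset by the data server even though the application server is still trying to reconnect.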
ECP relies on the TCP physical connection to detect the health of the instance at the other end without using too much of its capacity. On most platforms, you can adjust the TCP connection failure and detection behavior at the system level.
When an application server connection becomes inactive, the data server maintains an active daemon waiting for new requests to arrive on the connection, or for a new connection to be requested by the application server. If the old connection returns, it can immediately resume operation without recovery. When the underlying heartbeat mechanism indicates that the application server is completely unavailable due to a system or network failure, the underlying TCP connection is quickly reset. Thus, an extended period without a response from an application server generally indicates some kind of problem on the application server that caused it to stop functioning, but without interfering with its connections.
Collectively, the nonresponsive state and the awaiting reconnection state are known as the data server Trouble state. The recovery required in both cases is very similar.
If the data server fails or shuts down, it remembers the connections that were active at the time of the crash or shutdown. After restarting, the data server has a short window (usually 30 seconds) during which it places these interrupted connections in the awaiting reconnection state. In this state, the application server and data server can cooperate to recover all the transaction and lock states as well as all the pending Set transactions from the moment of the data server shutdown.
During the recovery of an ECP-configured instance, InterSystems IRIS guarantees a number of recoverable semantics, and also specifies limitations to these guarantees. ECP Recovery Process, Guarantees, and Limitations describes these in detail, as well as providing additional details about the recovery process.
By default, ECP automatically manages the connection between an application server and a data server. When an ECP-configured instance starts up, all connections between application servers and data servers are in the Not Connected state (that is, the connection is defined, but not yet established). As soon as an application server makes a request (for data or code) that requires a connection to the data server, the connection is automatically established and the state changes to Normal. The network connection between the application server and data server is kept open indefinitely.
In some applications, you may wish to close open ECP connections. For example, suppose you have a system, configured as an application server, that periodically (a few times a day) needs to fetch data stored on a data server system, but does not need to keep the network connection with the data server open afterwards. In this case, the application server system can issue a call to the SYS.ECP.ChangeToNotConnected method to force the state of the ECP connection to Not Connected. Changing the state in this way does the following:
Completes sending any data modifications to the data server and waits for acknowledgment from the data server.
Removes any locks on the data server that were opened by the application server.
Rolls back the data server side of any open transactions. The application server side of the transaction goes into a “rollback only” condition.
Completes pending requests with a <NETWORK> error.
Flushes all cached blocks.
After completion of the state change to Not Connected, the next request for data from the data server automatically reestablishes the connection.
See Data Server Connections for information about changing data server connection status from the Management Portal.
To achieve the highest performance and reliability from distributed cache cluster-based applications, you should be aware of the following issues:
InterSystems strongly discourages establishing multiple duplicate ECP channels between an application server and a data server to try to increase bandwidth. You run the risk of having locks and updates for a single logical transaction arrive out of sync on the data server, which may result in data inconsistency.
In addition to buffering the blocks that are served over ECP, data servers use global buffers to store various ECP control structures. There are several factors that go into determining how much memory these structures might require, but the most significant is a function of the aggregate sizes of the clients' caches. To roughly approximate the requirements, so you can adjust the data server’s database caches if needed, use the following guidelines:
| Database Block Size | Memory Required for ECP Control Structures |
|---|---|
| 8 KB | 50 MB plus 1% of the sum of the sizes of all of the application servers’ 8 KB database caches |
| 16 KB (if enabled) | 0.5% of the sum of the sizes of all of the application servers’ 16 KB database caches |
| 32 KB (if enabled) | 0.25% of the sum of the sizes of all of the application servers’ 32 KB database caches |
| 64 KB (if enabled) | 0.125% of the sum of the sizes of all of the application servers’ 64 KB database caches |
For example, if the 16 KB block size is enabled in addition to the default 8 KB block size, and there are six application servers, each with an 8 KB database cache of 2 GB and a 16 KB database cache of 4 GB, you should adjust the data server’s 8 KB database cache to ensure that approximately 173 MB (50 MB + [12 GB * .01]) is available for control structures, and the 16 KB cache to ensure that approximately 123 MB (24 GB * .005) is available for control structures (rounding up in both cases).
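The guideline arithmetic can be sketched in a few lines of Python. This is only a calculator that takes the percentages in the guidelines above at face value and assumes 1 GB = 1024 MB; the function and table names are invented for this example and are not part of any InterSystems tool.

```python
import math

# Block size in KB -> (fixed overhead in MB, fraction of the aggregate
# application server cache for that block size), per the guidelines above.
GUIDELINES = {
    8: (50, 0.01),
    16: (0, 0.005),
    32: (0, 0.0025),
    64: (0, 0.00125),
}


def control_structure_mb(block_size_kb, app_server_caches_mb):
    """Estimate data server memory (MB, rounded up) for ECP control structures.

    app_server_caches_mb lists each application server's database cache
    size, in MB, for the given block size.
    """
    fixed_mb, fraction = GUIDELINES[block_size_kb]
    return math.ceil(fixed_mb + fraction * sum(app_server_caches_mb))


# Six application servers, each with a 2 GB 8 KB cache and a 4 GB 16 KB cache.
eight_kb_estimate = control_structure_mb(8, [2 * 1024] * 6)
sixteen_kb_estimate = control_structure_mb(16, [4 * 1024] * 6)
```

Passing one cache size per application server keeps the calculation usable when the application servers are not configured identically.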
Connecting users to application servers in a round-robin or load-balancing scheme may diminish the benefit of caching on the application server. This is particularly true if users work in functional groups that have a tendency to read the same data. As these users are spread among application servers, each application server may end up requesting exactly the same data from the data server, which not only diminishes the efficiency of distributed caching using multiple caches for the same data, but can also lead to increased block invalidation as blocks are modified on one application server and refreshed across other application servers. This is somewhat subjective, but someone very familiar with the application characteristics should consider this possible condition.
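One alternative to round-robin assignment is to route each functional group of users to the same application server, so that users who read the same data share one cache. The Python sketch below shows one possible way to do this with a stable hash; the function name, the group names, and the server names are all hypothetical, and a real deployment would typically implement such a rule in its load balancer or login logic.

```python
import hashlib


def pick_app_server(functional_group, app_servers):
    """Map a functional group to one application server, deterministically.

    Uses SHA-256 rather than Python's built-in hash() so that the mapping
    is the same in every process and after every restart.
    """
    digest = hashlib.sha256(functional_group.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(app_servers)
    return app_servers[index]


# Hypothetical application server names.
servers = ["appserver1", "appserver2", "appserver3"]
billing_server = pick_app_server("billing", servers)
```

Note that with this simple modulo scheme, adding or removing an application server changes the mapping for many groups, which temporarily reduces cache effectiveness while the caches repopulate.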
Restrict updates within a single transaction to either a single remote data server or the local server. When a transaction includes updates to more than one server (including the local server) and the TCommit cannot complete successfully, some servers that are part of the transaction may have committed the updates while others may have rolled them back. For details, see Commit Guarantee in “ECP Recovery Guarantees and Limitations”.
Updates to IRISTEMP are not considered part of the transaction for the purpose of rollback, and, as such, are not included in this restriction.
Temporary (scratch) globals should be local to the application server, assuming they do not contain data that needs to be globally shared. Often, temporary globals are highly active and write-intensive. If temporary globals are located on the data server, this may penalize other application servers sharing the ECP connection.
Repeated references to a global that is not defined (for example, $Data(^x(1)) where ^x is not defined) always require a network operation to test if the global is defined on the data server.
By contrast, repeated references to undefined nodes within a defined global (for example, $Data(^x(1)) where any other node in ^x is defined) do not require a network operation once the relevant portion of the global (^x) is in the application server cache.
This behavior differs significantly from that of a non-networked application. With local data, repeated references to the undefined global are highly optimized to avoid unnecessary work. Designers porting an application to a networked environment may wish to review the use of globals that are sometimes defined and sometimes not. Often it is sufficient to make sure that some other node of the global is always defined.
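The difference can be illustrated with a toy Python model that counts network operations. This is a deliberate simplification written for this example: it treats a whole global as the unit of caching, returns only 0 or 1 where $Data can return other values, and uses invented class and method names.

```python
class AppServerCacheModel:
    """Toy model of an application server cache in front of a data server."""

    def __init__(self, data_server_globals):
        # Global name -> {subscript: value}, standing in for the data server.
        self.data_server = data_server_globals
        self.cache = {}
        self.network_ops = 0

    def data(self, global_name, subscript):
        """Return 1 if the node exists, else 0, counting network operations."""
        if global_name not in self.cache:
            # Nothing cached for this global: ask the data server.
            self.network_ops += 1
            if global_name not in self.data_server:
                # Undefined global: nothing to cache, so the next
                # reference goes over the network again.
                return 0
            self.cache[global_name] = dict(self.data_server[global_name])
        return 1 if subscript in self.cache[global_name] else 0


model = AppServerCacheModel({"^y": {2: "defined"}})
for _ in range(3):
    model.data("^x", 1)          # ^x is not defined at all
undefined_global_ops = model.network_ops
for _ in range(3):
    model.data("^y", 1)          # ^y(1) is undefined, but ^y(2) exists
defined_global_ops = model.network_ops - undefined_global_ops
```

In this model the three references to the undefined global each cost a network operation, while the three references to an undefined node of a defined global cost only one in total.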
InterSystems IRIS addresses this by providing the $Increment function, which automatically increments a counter value (stored in a global) without any need of application-level locking. Concurrency for $Increment is built into the InterSystems IRIS database engine as well as ECP, making it very efficient for use in single-server as well as in distributed applications.
Applications built using the default structures provided by InterSystems IRIS objects (or SQL) automatically use $Increment to allocate object identifier values. $Increment is a synchronous operation involving journal synchronization when executed over ECP. For this reason, $Increment over ECP is a relatively slow operation, especially compared to others which may or may not already have data cached (in either the application server database cache or the data server database cache). The impact of this may be even greater in a mirrored environment due to network latency between the failover members. For this reason, it may be useful to redesign an application to assign a batch of new values to each application server and use $Increment with that local batch within each application server, involving the data server only when a new batch of values is needed. (This approach cannot be used, however, when consecutive application counter values are required.) The $Sequence function can also be helpful in this context, as an alternative to or used in combination with $Increment.
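The batch approach can be sketched as follows. This Python model only illustrates the control flow; `DataServerCounter` stands in for a counter global incremented over ECP, and all class and method names are hypothetical. In an actual application the reservation step would be a single $Increment call that adds the batch size to the counter.

```python
class DataServerCounter:
    """Stand-in for a counter global that lives on the data server."""

    def __init__(self):
        self.value = 0
        self.round_trips = 0

    def increment(self, amount=1):
        # Models one synchronous increment over the network.
        self.round_trips += 1
        self.value += amount
        return self.value


class BatchIdAllocator:
    """Hands out IDs locally, reserving a new batch only when needed."""

    def __init__(self, counter, batch_size):
        self.counter = counter
        self.batch_size = batch_size
        self.next_id = 1
        self.last_id = 0  # Empty batch: the first allocate() reserves one.

    def allocate(self):
        if self.next_id > self.last_id:
            # Reserve batch_size values with a single round trip.
            self.last_id = self.counter.increment(self.batch_size)
            self.next_id = self.last_id - self.batch_size + 1
        value = self.next_id
        self.next_id += 1
        return value


counter = DataServerCounter()
server_a = BatchIdAllocator(counter, batch_size=100)
server_b = BatchIdAllocator(counter, batch_size=100)
ids = [server_a.allocate(), server_a.allocate(), server_b.allocate()]
```

The values are unique across application servers but not consecutive across them, which is why this design does not suit applications that require consecutive counter values.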
There are several runtime errors that can occur on a system using ECP. An ECP-related error may occur immediately after a command is executed or, in the case of commands that are asynchronous in nature, such as Kill, the error occurs a short time after the command completes.
A <NETWORK> error indicates an error that could not be handled by the normal ECP recovery mechanism.
In an application, it is always acceptable to halt a process or roll back any pending work whenever a <NETWORK> error is received. Some <NETWORK> errors are essentially fatal error conditions. Others indicate a temporary condition that might clear up soon; however, the expected programming practice is to always roll back any pending work in response to a <NETWORK> error and start the current transaction over from the beginning.
A <NETWORK> error on a get-type request such as $Data can often be retried manually rather than simply rolling back the transaction immediately. ECP tries to avoid giving a <NETWORK> error that would lose data, but gives an error more freely for requests that are read-only.
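The recommended practice of rolling back and starting the transaction over can be sketched as a small retry loop. The Python below models only the control flow: `NetworkError` is a stand-in for a <NETWORK> error, and `transaction` and `rollback` are hypothetical callables supplied by the caller. It is not InterSystems IRIS code.

```python
class NetworkError(Exception):
    """Stand-in for a <NETWORK> error; not an InterSystems IRIS API."""


def run_with_restart(transaction, rollback, max_attempts=3):
    """Run a transaction, restarting it from the beginning on NetworkError.

    transaction: callable that performs all of the work and commits.
    rollback: callable that discards any pending work.
    """
    last_error = None
    for _ in range(max_attempts):
        try:
            return transaction()
        except NetworkError as error:
            # Always roll back pending work before trying again.
            rollback()
            last_error = error
    # The condition did not clear up; let the caller decide what to do.
    raise last_error
```

Bounding the number of attempts reflects the point that some <NETWORK> errors are effectively fatal, so the caller eventually needs to see the error rather than retry forever.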
The application-side rollback-only condition occurs when the data server detects a network failure during a transaction initiated by the application server and enters a state in which all network requests are met with errors until the transaction is rolled back.
The simplest case of ECP recovery is a temporary network interruption that is long enough to be noticed, but short enough that the underlying TCP connection stays active during the outage. During the outage, the application server notices that the connection is nonresponsive and blocks new network requests for that connection. Once the connection resumes, processes that were blocked are able to send their pending requests.
If the underlying TCP connection is reset, the data server waits for a reconnection for the Time interval for Troubled state setting (one minute by default). If the application server does not succeed in reconnecting during that interval, the data server resets its connection, rolls back its open transactions, and releases its locks. Any subsequent connection from that application server is converted into a request for a brand new connection and the application server is notified that its connection is reset.
The application server keeps a queue of locks to remove and transactions to roll back once the connection is reestablished. By keeping this queue, processes on the application server can always halt, whether or not the data server on which it has pending transactions and locks is currently available. ECP recovery completes any pending Set and Kill operations that had been queued for the data server before the network outage was detected, before it completes the release of locks.
Any time a data server learns that an application server has reset its own connection (due to application server restart, for example), even if it is still within the Time interval for Troubled state, the data server resets the connection immediately, rolling back transactions and releasing locks on behalf of that application server. Since the application server’s state was reset, there is no longer any state to be maintained by the data server on its behalf.
The final case is when the data server shuts down, either gracefully or as a result of a crash. The application server maintains the application state and tries to reconnect to the data server for the Time to wait for recovery setting (20 minutes by default). The data server remembers the application server connections that were active at the time of the crash or shutdown; after restarting, it waits up to thirty seconds for those application servers to reconnect and recover their connections. Recovery involves several steps on the data server, some of which involve the data server journal file in very significant ways. The result of the several different steps is that:
The data server’s view of the current active transactions from each application server has been restored from the data server’s journal file.
The data server’s view of the current active Lock operations from each application server has been restored, by having the application server upload those locks to the data server.
The application server and the data server agree on exactly which requests from the application server can be ignored (because it is certain they completed before the crash) and which ones should be replayed. Therefore, the last recovery step is to simply let the pending network requests complete, but only those network requests that are safe to replay.
Finally, the application server delivers to the data server any pending unlock or rollback indications that it saved from jobs that halted while the data server was restarting. All guarantees are maintained, even in the face of sudden and unanticipated data server crashes, as long as the integrity of the storage devices (for database, WIJ, and journal files) is maintained.
During the recovery of an ECP-configured system, InterSystems IRIS guarantees a number of recoverable semantics which are described in detail in ECP Recovery Guarantees. Limitations to these guarantees are described in detail in the ECP Recovery Limitations section of the aforementioned appendix.
During the recovery of an ECP-configured system, InterSystems IRIS guarantees the following recoverable semantics:
In the description of each guarantee the first paragraph describes a specific condition. Subsequent paragraphs describe the data guarantee applicable to that particular situation.
In these descriptions, Process A, Process B and so on refer to processes attempting to update globals on a data server. These processes may originate on the same or different application servers, or on the data server itself; in some cases the origins of processes are specified, in others they are not germane.
Process A updates two data elements sequentially, first global ^x and next global ^y, where ^x and ^y are located on the same data server.
If Process B sees the change to ^y, it also sees the change to ^x. This guarantee applies whether or not Process A and Process B are on the same application server as long as the two data items are on the same data server and the data server remains up.
Process B’s ability to view the data modified by Process A does not ensure that Set operations from Process B are restored after the Set operations from Process A. Only a Lock or a $Increment operation can ensure proper ordering of competing Set operations from two different processes during cluster failover or cluster recovery.
This guarantee does not apply if the data server crashes, even if ^x and ^y are journaled. See the Dirty Data Reads for ECP Without Locking limitation for a case in which processes that fit this description can see dirty data that never becomes durable before the data server crash.
The lock and the data it protects must reside on the same data server.
Process B on a cluster member acquires a lock on global ^x in a clustered database, a lock once held by Process A.
Additionally, if Process C on a cluster member sees the updates on a clustered database made by Process B (while holding a lock on ^x), Process C also sees the updates made by Process A on any clustered database (while holding a lock on ^x).
Serializability is guaranteed whether or not Process A, Process B, and Process C are located on the same cluster member, and whether or not any cluster member crashes.
All the updates made by Process A as part of the transaction are rolled back in the reverse order in which they originally occurred.
On each DataServer S that is part of the transaction, the data modifications on DataServer S are either committed or rolled back. If the process that executes the TCommit has the Perform Synchronous Commit property turned on (SynchCommit=1, in the configuration file) and the TCommit operation returns without errors, the transaction is guaranteed to have durably committed on all the data servers that are part of the transaction.
If the transaction includes updates to more than one server (including the local server) and the TCommit cannot complete successfully, some servers that are part of the transaction may have committed the updates while others may have rolled them back.
InterSystems IRIS guarantees that the lock on ^x is not released until the transaction has been either committed or rolled back. No other process can acquire a lock on ^x until Transaction T either commits or rolls back on DataServer S.
A data server crashes in the middle of an application server transaction, restarts, and completes recovery within the application server recovery timeout interval.
The transaction can be completed normally without violating any of the described guarantees. The data server does not perform any data operations that violate the ordering constraints defined by lock semantics. The only exception is the $Increment function (see the ECP and Clusters $Increment Limitation section for more information). Any transactions that cannot be recovered are rolled back in a way that preserves lock semantics.
InterSystems IRIS expects but does not guarantee that in the absence of continuing faults (whether in the network, the data server, or the application server hardware or software), all or most of the transactions pending on a data server at the time of a data server outage are recovered.
DataServer S has an unplanned shutdown, restarts, and completes recovery within the recovery interval.
The ECP Lock Guarantee still applies as long as all the modified data is journaled. If data is not being journaled, updates made to the data server before the crash can disappear without notice to the application server. InterSystems IRIS no longer guarantees that a process that acquires the lock sees all the updates that were made earlier by other processes while holding the lock.
If DataServer S shuts down gracefully, restarts, and completes recovery within the recovery interval, the ECP Lock Guarantee still applies whether or not data is being journaled.
Updates that are part of a transaction are always journaled; the ECP Transaction Recovery Guarantee applies in a stronger form. Other updates may or may not be journaled, depending on whether or not the destination global in the destination database is marked for journaling.
The $Increment function induces a loose ordering on a series of Set operations from separate processes, even if those operations are not protected by a lock.
$system.ECP.Sync() is relevant only for processes running on an application server. If either Process A or Process B is running on DataServer S itself, then that process does not need to issue a $system.ECP.Sync() call. If both are running on DataServer S, then neither needs $system.ECP.Sync(), and this is simply the statement that global updates are immediately visible to processes running on the same server.
During the recovery of an ECP-configured system, there are the following limitations to the InterSystems IRIS guarantees:
If a data server crashes while the application server has a $Increment request outstanding to the data server and the global is journaled, InterSystems IRIS attempts to recover the $Increment results from the journal; it does not re-increment the reference.
In the absence of continuing faults, application servers observe data that is no more than a few seconds out of date, but this is not guaranteed. Specifically, if an ECP connection to the data server becomes nonfunctional (network problems, data server shutdown, data server backup operation, and so on), the user process may observe data that is arbitrarily stale, up to an application server connection-timeout value. To ensure that data is not stale, use the Lock command around the data-fetch operation, or use $system.ECP.Sync(). Any network request that makes a round trip to the data server updates the contents of the application server ECP network cache.
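The staleness window described above can be illustrated with a small Python toy model. This is only a sketch under stated assumptions, not how ECP is implemented: the class names `DataServer` and `AppServerCache` and the `sync` method are hypothetical, and the model deliberately ignores the updates that the data server normally propagates within a few seconds. It shows only that a cached read can return an old value until a round trip to the data server refreshes the cache.

```python
# Toy model only: DataServer, AppServerCache, and sync() are hypothetical
# names for illustration and are not InterSystems IRIS APIs.

class DataServer:
    """Holds the authoritative copy of each global."""
    def __init__(self):
        self.globals = {}


class AppServerCache:
    """Per-application-server cache that can fall behind the data server."""
    def __init__(self, server):
        self.server = server
        self.cache = {}

    def read(self, key):
        # A cached entry is returned without contacting the data server,
        # so it may be stale if the connection has not delivered updates.
        if key not in self.cache:
            self.cache[key] = self.server.globals.get(key)  # round trip
        return self.cache[key]

    def sync(self):
        # Stand-in for any request that makes a round trip to the data
        # server: every cached entry is refreshed from the authoritative copy.
        for key in self.cache:
            self.cache[key] = self.server.globals.get(key)


server = DataServer()
server.globals["x"] = 1
app = AppServerCache(server)

first = app.read("x")      # fetched from the data server
server.globals["x"] = 2    # update made elsewhere on the data server
stale = app.read("x")      # still the cached value
app.sync()                 # round trip refreshes the cache
fresh = app.read("x")
```

In this model `stale` keeps the old value while `fresh` reflects the update, which mirrors the reason for wrapping a data fetch in a Lock or calling $system.ECP.Sync() when up-to-date data is required.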
If an application server downloads routines from a data server and the data server restarts (planned or unplanned), the routines downloaded from the data server are marked as if they had been edited.
Additionally, if the connection to the data server suffers a network outage (neither application server nor data server shuts down), the routines downloaded from the data server are marked as if they had been edited. In some cases, this behavior causes spurious <EDITED> errors as well as <ERRTRAP> errors.
In InterSystems IRIS, the Lock command is only advisory. If Process A starts a transaction that is updating global ^x under protection of a lock on global ^y, and another process modifies ^x without the protection of a lock on ^y, the rollback of ^x does not work.
On the rollback of Set operations, if the current value of the data item is what the operation set it to, the value is reset to what it was before the operation. If the current value is different from what the specific Set operation set it to, the current value is left unchanged.
If a data item is sometimes modified inside a transaction, and sometimes modified outside of a transaction and outside the protection of a Lock command, rollback is not guaranteed to work. To be effective, locks must be used everywhere a data item is modified.
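The rollback rule for Set operations, and the way an unlocked writer defeats it, can be modeled in a few lines of Python. This is a minimal sketch of the rule as stated above, not the InterSystems IRIS implementation: the `Transaction` class, its undo log, and the use of a plain dictionary as the database are all assumptions made for illustration.

```python
# Toy model of the documented rollback rule for Set operations.
# Transaction and its undo log are hypothetical; a dict stands in for
# the database, and a missing key (None) stands in for an undefined value.

class Transaction:
    def __init__(self, db):
        self.db = db
        self.undo_log = []  # entries: (key, value_before, value_set)

    def set(self, key, value):
        # Record what the value was before and what this Set wrote.
        self.undo_log.append((key, self.db.get(key), value))
        self.db[key] = value

    def rollback(self):
        # Undo in reverse order, applying the conditional rule to each Set.
        for key, before, value_set in reversed(self.undo_log):
            if self.db.get(key) == value_set:
                # Current value is what this Set wrote: restore the old value.
                if before is None:
                    self.db.pop(key, None)
                else:
                    self.db[key] = before
            # Otherwise some other writer changed the value: leave it as is.
        self.undo_log.clear()


# Case 1: every writer honors the lock, so nothing interferes.
db = {"x": 1}
t = Transaction(db)
t.set("x", 2)
t.rollback()
clean_result = db["x"]        # restored to the pre-transaction value

# Case 2: another process writes x without taking the lock.
db2 = {"x": 1}
t2 = Transaction(db2)
t2.set("x", 2)
db2["x"] = 99                 # unlocked write from a second process
t2.rollback()
interfered_result = db2["x"]  # rollback leaves the foreign value in place
```

In the second case the rollback does not restore the original value, which is why locks must be taken everywhere the data item is modified.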
Rollback depends on the reliability and completeness of the journal. If something interrupts the continuity of the journal data, rollbacks do not succeed past the discontinuity. InterSystems IRIS silently ignores this type of transaction rollback.
A journal discontinuity can be caused by executing ^JRNSTOP while InterSystems IRIS is running, by deleting the Write Image Journal (WIJ) file after an InterSystems IRIS shutdown and before restart, or by an I/O error during journaling on a system that is not set to freeze the system on journal errors.
A Set operation completes on a data server, but receives an error. The data server crashes after completing that packet, but before delivering that packet to the application server system.
ECP recovery does not replay this packet, but the application server has not found out about the error, which results in the application server missing Set operations on the data server.
There are certain cases where a Set operation can be journaled successfully, but receive an error before actually modifying the database. Given the particular way rollback of a data item is defined, this should not ever break transaction rollback; but the state of a database after a journal restore may not match the state of that database before the restore.
Cluster dejournaling is loosely ordered. The journal files from the separate cluster members are only synchronized wherever a lock, a $Increment, or a journal marker event occurs. This affects the database state after either a cluster failover or a cluster crash where the entire cluster must be brought down and restored. The database may be restored to a state that is different from the state just before the crash. The $Increment Ordering Guarantee places additional constraints on how different the restored database can be from its original form before the crash.
Process B’s ability to view the data modified by Process A does not ensure that Set operations from Process B are restored after the Set operations from Process A. Only a Lock or a $Increment operation can ensure proper ordering of competing Set operations from two different processes during cluster failover or cluster recovery.
A workaround is to use synchronous commit mode for transactions on the cluster slave Member A. When using synchronous commit mode, Transaction T1 is durable on disk before its locks are released, so Transaction T1 is not rolled back once the application sees that it is complete.
If an incoming ECP transaction reads data without locking, it may see data that is not durable on disk, which may disappear if the data server crashes. It can only see such data when the data location is set by other ECP connections or by the local data server system itself. It can never see nondurable data that is set by this connection itself. There is no possibility of seeing nondurable data when locking is used both in the process reading the data and the process writing the data. This is a violation of the In-order Updates Guarantee and there is no easy workaround other than to use locking.
If the data server side of a transaction receives an asynchronous error condition, such as a <FILEFULL>, while updating a database, and the application server does not see that error until the TCommit, the transaction is automatically rolled back on the data server. However, rollbacks are synchronous, whereas TCommit operations are usually asynchronous, because a rollback changes blocks that the application server should be notified of before the application server process surrenders any locks.
The data server and the database are fine, but on the application server, if the locks get traded to another process, that process may temporarily see data that is about to be rolled back. However, the application server does not usually do anything that causes asynchronous errors.
Content Date/Time: 2019-04-10 14:45:56