InterSystems IRIS Data Platform 2019.2  /  Scalability Guide

Scalability Guide
Horizontally Scaling InterSystems IRIS for Data Volume with Sharding
This chapter describes the deployment and use of an InterSystems IRIS sharded cluster, and covers the following topics:
Overview of InterSystems IRIS Sharding
Sharding is a significant horizontal scalability feature of InterSystems IRIS® Data Platform™. An InterSystems IRIS™ sharded cluster partitions both data storage and caching across a number of servers, providing flexible, inexpensive performance scaling for queries and data ingestion while maximizing infrastructure value through highly efficient resource utilization. Sharding is easily combined with the considerable vertical scaling capabilities of InterSystems IRIS and with horizontal scaling for user volume through the distributed caching architecture, greatly widening the range of solutions available.
Note:
For a brief introduction to sharding including a hands-on exploration of deploying and using a sharded cluster, see First Look: Scaling for Data Volume with an InterSystems IRIS Sharded Cluster.
Elements of Sharding
Sharding horizontally partitions large database tables and their associated indexes across multiple InterSystems IRIS instances, called shard data servers, while allowing applications to access these tables through a single instance, called the shard master data server, or shard master. This architecture provides the following advantages:
Together, the shard master and its associated shard data servers form a sharded cluster.
A shard is a subset of a table's rows, with each row contained within exactly one shard, and all shards of a table containing roughly the same number of rows. Each shard data server hosts a data shard, which comprises one shard of each sharded table to which the shard master provides access. A federated software component called the sharding manager keeps track of which data shards (and therefore which table rows) are located on which shard data servers and directs queries accordingly, as well as managing other sharded operations. From the perspective of the application SQL, the distinction between sharded and nonsharded tables is completely transparent.
The shard master provides application access through the master namespace, which contains the sharded tables and in general has all the capabilities of a standard InterSystems IRIS namespace. The master namespace can also contain nonsharded tables, which can be joined to sharded tables in queries. Data shards are contained in shard namespaces assigned to the master namespace; these are managed entirely by the sharding manager and are not exposed to end users.
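For example, as a minimal sketch assuming a hypothetical sharded SALES table and a nonsharded REGION lookup table created in the master namespace, no sharding-specific syntax is needed to join the two:
SELECT r.regionname, SUM(s.amount) AS total
FROM sales s JOIN region r ON s.regionid = r.regionid
GROUP BY r.regionname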
A shard key, composed of one or more fields in a sharded table, is used to horizontally partition the table across the data shards, providing a deterministic method of distributing data evenly. A shard key can be either a system-assigned ID or a user-defined set of fields.
Additional options for a sharded cluster include:
For definitions of the terms used in the preceding and other sharding-related terms, see Sharding Glossary.
Sharded Cluster Illustrations
Sample basic sharded cluster configurations are shown in the following diagrams — first, a basic cluster with four shard data servers, and then the same cluster with three shard master application servers added. For an illustration of a sharded cluster with query servers, see Deploy Query Shards.
All communication between the nodes of an InterSystems IRIS sharded cluster is managed by the sharding manager. The sharding manager connects the shard master app servers directly to the data shards in the same way as it connects the shard master data server to the data shards.
Basic sharded cluster
Sharded cluster with application servers
Evaluating the Benefits of Sharding
InterSystems IRIS sharding can benefit a wide range of applications, but provides the greatest gains in use cases involving the following:
Each of these factors on its own influences the potential gain from sharding, but the benefit may be enhanced where they combine. For example, a combination of all three factors — large amounts of data ingested quickly, large data sets, and complex queries that retrieve and process a lot of data — makes many of today’s analytic workloads very good candidates for sharding.
Sharding uses InterSystems IRIS ECP to further enhance performance in two ways, as described in Elements of Sharding:
As previously noted, and as discussed in more detail in Plan an InterSystems IRIS Sharded Cluster, in many circumstances the most beneficial approach is to combine InterSystems IRIS sharding with vertical scaling to address some of the factors described above.
Note:
In the current release, transactions initiated on the shard master do not open corresponding transactions on the shard data servers, and sharding is therefore not appropriate for workloads involving complex transactions requiring atomicity.
Node-Level Architecture
In future releases, the InterSystems IRIS sharding architecture will be simplified and streamlined. The current release includes a new API corresponding to this node-level architecture (see %SYSTEM.Cluster API), which makes it possible for you to explore it, and to implement it if you choose.
In the node-level architecture, there are just two types of cluster nodes, which are described in the following:
Note:
The namespace-level architecture described in Elements of Sharding and illustrated in Sharded Cluster Illustrations remains in place as the transparent foundation of the node-level architecture. Instead of a shard namespace, each data node has a cluster namespace (identically named across the cluster) that provides transparent access to all sharded and nonsharded data and code on the cluster. The master namespace, now located on the first data node, still stores metadata, nonsharded data, and code, but need not be accessed directly; distributing user connections across the data nodes’ cluster namespaces provides a more uniform and transparent model.
Deploying the Sharded Cluster
This section provides procedures for deploying an InterSystems IRIS sharded cluster consisting of a shard master, shard data servers, and optionally shard master application servers.
Note:
For an important discussion of performance planning, including memory management and scaling, CPU sizing and scaling, and other considerations, see the “Vertical Scaling” chapter of this guide.
The recommended method for deploying InterSystems IRIS Data Platform is InterSystems Cloud Manager (ICM). By combining plain text declarative configuration files, a simple command line interface, the widely used Terraform infrastructure-as-code tool, and InterSystems IRIS deployment in Docker containers, ICM provides you with a simple, intuitive way to provision cloud or virtual infrastructure and deploy the desired InterSystems IRIS architecture on that infrastructure, along with other services. ICM can deploy sharded clusters and other InterSystems IRIS configurations on Amazon Web Services, Google Cloud Platform, Microsoft Azure, or VMware vSphere. ICM can also deploy InterSystems IRIS on an existing physical or virtual cluster.
Deploy the Cluster Using InterSystems Cloud Manager offers an overview of the process of using ICM to deploy the sharded cluster.
Note:
For a brief introduction to ICM including a hands-on exploration of deploying a sharded cluster, see First Look: InterSystems Cloud Manager.
For complete ICM documentation, see the InterSystems Cloud Manager Guide.
You can also manually deploy a sharded cluster using the %SYSTEM.Sharding API or the Management Portal; instructions for this procedure are provided in Deploy the Cluster Using the API or the Management Portal.
Note:
The most typical sharded cluster configuration involves one InterSystems IRIS instance per system, and one cluster role per instance. When deploying using ICM, this configuration is the only option. The provided procedure for using the Sharding API assumes this configuration as well.
InterSystems recommends the use of an LDAP server to implement centralized security across the nodes of a sharded cluster. For information about using LDAP with InterSystems IRIS, see the “Using LDAP” chapter of the Security Administration Guide.
Regardless of the method you use to deploy the cluster, there are two decisions you must make first:
You also need to plan the sizes of the database caches and globals databases on the cluster members.
This section covers the following topics:
Plan Shard Data Servers
Decide how many shard data server instances to configure.
Depending on the anticipated working set of the sharded data you intend to store on the cluster and the nature of the queries you will run against it, as few as four shard data servers or up to sixteen (or possibly more) may be appropriate for your cluster.
A good basic method for an initial estimate of the ideal number of shard data servers needed (assuming one data shard per server) for a production configuration, subject to resource limitations, is to calculate the size of the database cache needed on each shard data server for a given number of servers and then determine which combination of number of servers and memory per server is optimal, given your circumstances. This calculation is outlined in the following; see Plan an InterSystems IRIS Sharded Cluster for more detail.
  1. Review the data you intend to store on the cluster to estimate the following:
    1. Total size of all sharded tables to be stored on the cluster, including their indexes.
    2. Total size of the nonsharded tables (including indexes) to be stored on the cluster that will be frequently joined with sharded tables.
    3. Total size of all nonsharded tables (including indexes) to be stored on the cluster.
  2. Translate these totals into estimated working sets, based on the proportion of the data that is regularly queried. For example, if 40% of the data in the sharded tables is never retrieved by queries, only the 60% that is retrieved and cached is relevant to the total cache size of the shard data servers. Add a safety factor of 50% to each estimate, and call the sizes of the working sets ShardSizeWS, NonshardSizeJoinedWS, and NonshardSizeTotalWS.
    Bear in mind that while queries joining a nonsharded table and a sharded table count towards the working set NonshardSizeJoinedWS, queries against that same nonsharded data table that do not join it to a sharded table count towards the working set NonshardSizeTotalWS; the same nonsharded data can be returned by both types of queries, and thus would count towards both working sets.
    Note:
    NonshardSizeTotalWS is not germane to the shard data servers, but is used in sizing the shard master data server’s database cache; it is included here for consistency with the instructions in Plan an InterSystems IRIS Sharded Cluster.
  3. Considering all your options regarding both number of systems and memory per system, configure enough shard data servers so that the database cache (global buffer pool) on each shard data server equals (ShardSizeWS/ShardCount) + NonshardSizeJoinedWS — that is, the shard data server’s share of the working set of sharded data, plus all of the working set of frequently joined nonsharded data. Where more systems with lower memory resources are available, you can add more shard data servers and allocate smaller amounts of memory to the database caches; if memory per system is higher, you can configure fewer servers and allocate more memory to each.
    Bear these points in mind, however:
All shard data servers in a sharded cluster should have identical or at least closely comparable specifications and resources; parallel query processing is only as fast as the slowest shard data server. In addition, the configuration of all InterSystems IRIS instances in the cluster should be consistent; database settings such as collation and those SQL settings configured at instance level (default date format, for example) should be the same on all nodes to ensure correct SQL query results. Standardized procedures and tools like ICM can help ensure this consistency.
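As a worked sketch of the calculation in step 3 (the estimates are hypothetical), suppose ShardSizeWS is 1,200 GB and NonshardSizeJoinedWS is 100 GB:
    per-server cache = (ShardSizeWS / ShardCount) + NonshardSizeJoinedWS
    ShardCount = 8:   (1,200 GB / 8)  + 100 GB = 250 GB per shard data server
    ShardCount = 16:  (1,200 GB / 16) + 100 GB = 175 GB per shard data server
Either configuration caches the same total working set; the choice depends on whether fewer large-memory hosts or a larger number of smaller hosts is the better fit for your infrastructure.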
Plan Shard Master App Servers
Decide whether you want to scale for user volume by configuring shard master app servers to distribute application load across multiple instances, and if so how many you will include. You can also do this later using the Sharding APIs after the cluster has been deployed, without reloading data.
If shard master app servers are to be deployed, a mechanism to distribute application connections across them is required. ICM can automatically provision and configure load balancers as needed when deploying in a public cloud. Ideally, however, application connections involving similar queries are grouped on the shard master app servers, increasing the benefit of distributed caching.
Plan Database Cache and Database Sizes
Before beginning the deployment process, you need to know the database cache size to be specified on each member of the sharded cluster. You also need to know the expected size of the data volume needed for the default globals database on the shard master data server and shard data servers.
When you deploy a sharded cluster using ICM, these settings are specified by including properties in the configuration files. When you deploy manually using the Sharding API, you specify these settings by hand.
Bear in mind that the sizes below are guidelines, not requirements, and that your estimates for these numbers are likely to be adjusted in practice.
Database Cache Sizes
As described in Plan Shard Data Servers and Plan an InterSystems IRIS Sharded Cluster, the amount of memory that should ideally be allocated to database caches is as follows:
Note:
As noted in Plan an InterSystems IRIS Sharded Cluster, all instances should have database directories and journal directories located on separate storage devices, if possible, especially when high volume or high speed data ingestion is concurrent with running queries.
Globals Database Sizes
As described in Plan an InterSystems IRIS Sharded Cluster, the target configured sizes of the default globals databases are as follows:
Note:
In a sharded cluster, the master namespace and shard namespaces all share a single default globals database and a single default routines database, physically located on the shard master data server and known as the master globals database and the master routines database. The default globals database created when a shard namespace is created remains on the shard, however, becoming the local globals database, which contains the data stored on the shard. Because the defaults for a shard namespace are not switched to the master databases until the instance is assigned to the cluster as a shard, which comes after the creation of the shard namespace, for clarity, the planning guidelines and instructions in this document refer to the eventual local globals database as the default globals database of the shard namespace.
Because the shard master data server and the shard data servers share the master globals database, mappings created in the master namespace are propagated to the data shards. Mappings created on shard master application servers, however, are not propagated.
Deploy the Cluster Using InterSystems Cloud Manager
There are several stages involved in provisioning and deploying a containerized InterSystems IRIS configuration, including a sharded cluster, with ICM. The ICM Guide provides complete documentation of ICM, including details of each of the stages. This section briefly reviews the stages and provides links to the ICM Guide.
Launch ICM
ICM is provided as a Docker image. Everything required by ICM to carry out its provisioning, deployment, and management tasks is included in the ICM container, including a /Samples directory that provides you with samples of the elements required by ICM, customized to the four supported cloud providers. To launch ICM, on a system on which Docker is installed, you use the docker run command with the ICM image from the InterSystems repository to start the ICM container.
For detailed information about launching ICM, see Launch ICM in the “Using ICM” chapter of the ICM Guide.
Obtain Security-Related Files
Before defining your deployment, you must obtain security-related files including cloud provider credentials and keys for SSH and SSL/TLS. For more information about these files and how to obtain them, see Obtain Security-Related Files in the “Using ICM” chapter.
Define the Deployment
ICM uses JSON files as both input and output. To provide the needed parameters to ICM, you must represent your target configuration and the platform on which it is to be deployed in two of ICM’s JSON configuration files: the defaults.json file, which contains information about the entire deployment, and the definitions.json file, which contains information about the types and numbers of the nodes provisioned and deployed by ICM, as well as details specific to each node type. For example, the defaults file determines which cloud provider your sharded cluster nodes are provisioned on and the locations of the required security files and InterSystems IRIS license keys, while the definitions file determines how many shards are included in the sharded cluster and whether the data volume for the shard master data server will be larger than those for the other nodes. Most ICM parameters have defaults; a limited number of parameters can be specified on the ICM command line as well as in the configuration file.
For sample defaults and definitions files for sharded cluster deployment, see Define the Deployment in the “Using ICM” chapter of the ICM Guide. You can create your files by adapting the template defaults.json and definitions.json files provided with ICM in the /Samples directory (for example, /Samples/AWS for AWS deployments), or start with the contents of the samples provided in the documentation. For a complete list of the fields you can include in these files, see ICM Configuration Parameters in the “ICM Reference” chapter of the ICM Guide.
The following table briefly lists the types of node (determined by the Role field in the definitions file) that ICM can provision, configure, and deploy services on, with sharded cluster roles highlighted. For detailed descriptions of the node types, see ICM Node Types in the “ICM Reference” chapter of the ICM Guide.
ICM Node Types
Node Type    Sharded Cluster Role
DM           Shard master data server (can also serve as data server and stand-alone instance)
AM           Shard master application server (can also serve as application server)
DS           Shard data server
QS           Shard query server
AR           Mirror arbiter (for mirrored deployments)
LB           Load balancer (can be automatically provisioned with AM, WS and GG nodes)
WS           Web server
VM           Virtual machine (general purpose)
CN           Custom and third-party container node
When creating your configuration files, bear in mind that they must represent the database cache and database sizes you determined in Plan Database Cache and Database Sizes using the following fields (described in General Parameters and Provider-Specific Parameters in the “ICM Reference” chapter of the ICM Guide):
The values provided for these fields in the sample configuration files that follow are based on arbitrary sizes for the data and working sets involved. What is important is that you determine their values based on your own situation and include them as needed. In general, appropriate sizing of the compute nodes in a sharded cluster, and configuration of the InterSystems IRIS instances on them, is a complex matter influenced by many factors; as your experience accumulates, you are likely to include other InterSystems IRIS configuration parameters in the configuration files.
For a complete list of the fields you can include in these files, see ICM Configuration Parameters in the “ICM Reference” chapter of the ICM Guide.
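For illustration only, a definitions.json file for the basic four-shard cluster shown earlier might contain entries like the following; the node counts and volume sizes are hypothetical, and you should confirm the exact field names against ICM Configuration Parameters for your ICM version:
[
    {
        "Role": "DM",
        "Count": "1",
        "DataVolumeSize": "250"
    },
    {
        "Role": "DS",
        "Count": "4",
        "DataVolumeSize": "250"
    },
    {
        "Role": "AM",
        "Count": "3",
        "LoadBalancer": "true"
    }
]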
Note:
All InterSystems IRIS instances in a sharded cluster must have sharding licenses.
Provision the Infrastructure
When your definitions files are complete, begin the provisioning phase by issuing the command icm provision on the ICM command line. This command allocates and configures the nodes specified in the definitions file. At completion, ICM also provides a summary of the compute nodes and associated components that have been provisioned, and outputs a command line which can be used to delete the infrastructure at a later date, for example:
Machine           IP Address      DNS Name                      
-------            ---------       -------                      
ACME-DM-TEST-0001  00.53.183.209   ec2-00-53-183-209.us-west-1.compute.amazonaws.com
ACME-AM-TEST-0002  00.56.59.42     ec2-00-56-59-42.us-west-1.compute.amazonaws.com
ACME-AM-TEST-0003  00.67.1.11      ec2-00-67-1-11.us-west-1.compute.amazonaws.com
ACME-AM-TEST-0004  00.193.117.217  ec2-00-193-117-217.us-west-1.compute.amazonaws.com
ACME-LB-TEST-0000  (virtual AM)    ACME-AM-TEST-1546467861.amazonaws.com
ACME-DS-TEST-0005  00.72.116.99    ec2-00-72-116-99.us-west-1.compute.amazonaws.com
ACME-DS-TEST-0006  00.67.11.111    ec2-00-67-11-111.us-west-1.compute.amazonaws.com
ACME-DS-TEST-0007  00.193.21.171   ec2-00-193-21-171.us-west-1.compute.amazonaws.com
ACME-DS-TEST-0008  00.56.103.98    ec2-00-56-103-98.us-west-1.compute.amazonaws.com
To destroy: icm unprovision -stateDir /Samples/AWS/ICM-8620265620732464265 -provider AWS [-cleanUp] [-force]
Once your infrastructure is provisioned, you can use several infrastructure management commands. For detailed information about these and the icm provision command, see Provision the Infrastructure in the “Using ICM” chapter of the ICM Guide.
Deploy and Manage Services
ICM carries out deployment of InterSystems IRIS and other software services using Docker images, which it runs as containers by making calls to Docker. In addition to Docker, ICM also carries out some InterSystems IRIS-specific configuration over JDBC. There are many container management tools available that can be used to extend ICM’s deployment and management capabilities.
The icm run command downloads, creates, and starts the specified container on the provisioned nodes. The icm run command has a number of useful options, and also lets you specify Docker options to be included, so the command line can take many forms depending on your needs.
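In its simplest form, sketched below on the assumption that the InterSystems IRIS image to deploy is specified by the DockerImage field in the defaults file, the command takes no options and deploys that image on all of the provisioned nodes:
icm run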
For a full discussion of the use of the icm run command, see The icm run Command in the “Using ICM” chapter of the ICM Guide.
At deployment completion, ICM sends a link to the appropriate node’s management portal, for example:
Management Portal available at: http://ec2-00-153-49-109.us-west-1.compute.amazonaws.com:52773/csp/sys/UtilHome.csp 
In the case of a sharded cluster, the provided link is for the shard master data server instance.
Once your containers are deployed, you can use a number of ICM commands to manage the deployed containers and interact with the containers and the InterSystems IRIS instances and other services running inside them; for more information, see Container Management Commands and Service Management Commands in the “Using ICM” chapter of the ICM Guide.
Unprovision the Infrastructure
Because public cloud platform instances continually generate charges and unused instances in private clouds consume resources to no purpose, it is important to unprovision infrastructure in a timely manner. The icm unprovision command deallocates the provisioned infrastructure based on the state files created during provisioning. As described in Provision the Infrastructure, you must specify the state directory with this command; the needed command line is provided when the provisioning phase is complete, and is also contained in the ICM log file, for example:
To destroy: icm unprovision -stateDir /Samples/AWS/ICM-8620265620732464265 -provider AWS [-cleanUp] [-force]
For more detailed information about the unprovisioning phase, see Unprovision the Infrastructure in the “Using ICM” chapter of the ICM Guide.
Deploy the Cluster Using the API or the Management Portal
Use the following procedure to deploy an InterSystems IRIS sharded cluster consisting of a shard master, shard data servers, and optionally shard master application servers using the %SYSTEM.Sharding API. Instructions are also provided for deploying the cluster using the Sharding Configuration page in the Management Portal (System Administration > Configuration > System Configuration > Sharding Configuration).
Note:
As with all classes in the %SYSTEM package, the %SYSTEM.Sharding methods are available through $SYSTEM.Sharding.
This procedure assumes each InterSystems IRIS instance is installed on its own system.
Provision or Identify the Infrastructure
Identify the needed number of networked host systems (physical, virtual, or cloud) — one host each for the shard master, shard data servers, and shard master app servers (if any).
A minimum network bandwidth of 1 Gbps is recommended, but 10 Gbps or more is preferred, if available; greater network throughput increases the performance of the sharded cluster.
Install InterSystems IRIS on the Cluster Nodes
  1. Install an InterSystems IRIS instance on each of the cluster nodes, as described in the Installation Guide.
    Note:
    All InterSystems IRIS instances in a sharded cluster must be of the same version.
    All InterSystems IRIS instances in a sharded cluster must have sharding licenses.
  2. Ensure that the storage device hosting each instance’s databases is large enough to accommodate the target globals database size, as described in Plan Database Cache and Database Sizes.
    All instances should have database directories and journal directories located on separate storage devices, if possible. This is particularly important when high volume data ingestion is concurrent with running queries. For guidelines for file system and storage configuration, including journal storage, see the “File System and Storage Configuration Recommendations” chapter of the Installation Guide and Journaling Best Practices in the “Journaling” chapter of the Data Integrity Guide.
  3. Allocate the database cache (global buffer pool) for each instance, depending on its eventual role in the cluster, according to the sizes you determined in Plan Database Cache and Database Sizes. For the procedure for allocating the database cache, see Memory and Startup Settings in the “Configuring InterSystems IRIS” chapter of the System Administration Guide.
    Note:
    In some cases, it may be advisable to increase the size of the generic memory heap on the cluster members. For information on how to allocate memory to the generic memory heap, see gmheap in the Configuration Parameter File Reference.
    For guidelines for allocating memory to an InterSystems IRIS instance’s routine and database caches as well as the generic memory heap, see Calculating Initial Memory Requirements in the “Vertical Scaling” chapter.
Configure the Cluster Nodes
Perform the following steps on the instances with each role in the cluster.
Configure the Shard Data Servers
On each shard data server instance:
  1. Create the shard namespace using the management portal, as described in Create/Modify a Namespace in the “Configuring InterSystems IRIS” chapter of the System Administration Guide. (The namespace need not be interoperability-enabled.)
    Create a new database for the default globals database, making sure that it is located on a device with sufficient free space to accommodate its target size, as described in Plan Database Cache and Database Sizes. If data ingestion performance is a significant consideration, set the initial size of the database to its target size.
    Select the globals database you created for the namespace’s default routines database.
    Note:
    As noted in Plan Database Cache and Database Sizes, the shard master data server and shard data servers all share a single default globals database physically located on the shard master and known as the master globals database. The default globals database created when a shard namespace is created remains on the shard, however, becoming the local globals database, which contains the data stored on the shard. Because the shard data server does not start using the master globals database until assigned to the cluster, for clarity, the planning guidelines and instructions in this document refer to the eventual local globals database as the default globals database of the shard namespace.
    A new namespace is automatically created with IRISTEMP configured as the temporary storage database; do not change this setting for the shard namespace.
  2. For a later step, record the DNS name or IP address of the host system, the superserver (TCP) port of the instance, and the name of the shard namespace you created. To see or set the instance’s superserver port number, select System Administration > Configuration > System Configuration > Memory and Startup in the management portal.
  3. In a Terminal window, in any namespace, call $SYSTEM.Sharding.EnableSharding (see %SYSTEM.Sharding API) to enable the instance to participate in a sharded cluster, as follows:
    set status = $SYSTEM.Sharding.EnableSharding()
    No arguments are required.
    Note:
    To display the status of each API call detailed in these instructions, in order to confirm that it succeeded, enter:
    do $SYSTEM.Status.DisplayError(status) 
  4. Restart the instance.
Management Portal
Take the following steps to deploy using the Management Portal instead of the API:
Configure the Shard Master Data Server
On the shard master data server instance:
  1. Create the master namespace using the management portal, as described in Create/Modify a Namespace in the “Configuring InterSystems IRIS” chapter of the System Administration Guide. (The namespace need not be interoperability-enabled.)
    Ensure that the default globals database you create is located on a device with sufficient free space to accommodate its target size, as described in Plan Database Cache and Database Sizes. If data ingestion performance is a significant consideration, set the initial size of the database to its target size.
    Select the globals database you created for the namespace’s default routines database.
    Note:
    A new namespace is automatically created with IRISTEMP configured as the temporary storage database; do not change this setting for the master namespace. Because the intermediate results of sharded queries are stored in IRISTEMP, this database should be located on the fastest available storage with significant free space for expansion, particularly if you anticipate many concurrent sharded queries with large result sets.
  2. In a Terminal window, in any namespace, do the following:
    1. Call $SYSTEM.Sharding.EnableSharding() (see %SYSTEM.Sharding API) to enable the instance to participate in a sharded cluster (no arguments are required), as follows:
      set status = $SYSTEM.Sharding.EnableSharding()
    2. Restart the instance.
    3. Optionally, if you want to identify shard data server hosts in the $SYSTEM.Sharding.AssignShard() calls (the next step) using IP addresses rather than host names, you must first use the $SYSTEM.Sharding.SetOption() call to set the shard master’s IP address, as follows:
      Set status=$system.Sharding.SetOption("master-namespace","MasterIpAddress","master-ip-address")
      where the arguments represent the name of the master namespace you created, the option to set (MasterIpAddress), and the shard master’s IP address, for example:
      Set status=$system.Sharding.SetOption("master","MasterIpAddress","00.53.183.209")
    4. Call $SYSTEM.Sharding.AssignShard() (see %SYSTEM.Sharding API) once for each shard data server, to assign the shard to the master namespace you created, as follows:
      set status = $SYSTEM.Sharding.AssignShard("master-namespace","shard-host",shard-superserver-port,\
          "shard_namespace")
      where the arguments represent the name of the master namespace you created and the information you recorded for that shard data server in the previous step, for example:
      set status = $SYSTEM.Sharding.AssignShard("master","shardserver3",51773,"shard3")
      The following shows the same call using the IP address rather than the host name to identify the shard data server, which is possible if you used the $SYSTEM.Sharding.SetOption() call to set the shard master’s IP address in the previous step:
      set status = $SYSTEM.Sharding.AssignShard("master","00.193.21.171",51773,"shard3")
    5. To verify that you have assigned the shards correctly, you can issue the following command and verify the hosts, ports, and namespace names:
      do $SYSTEM.Sharding.ListShards()
      Shard   Host                       Port    Namespc  Mirror  Role    VIP
      1       shard1.internal.acme.com   56775   SHARD1
      2       shard2.internal.acme.com   56777   SHARD2
      ...
    6. To confirm that the ports are correct and all needed configuration of the nodes is in place so that the shard master can communicate with the shard data servers, call $SYSTEM.Sharding.VerifyShards() (see %SYSTEM.Sharding API) as follows:
      do $SYSTEM.Sharding.VerifyShards() 
      The $SYSTEM.Sharding.VerifyShards() call identifies a number of errors. For example, if the port provided in a $SYSTEM.Sharding.AssignShard() call is a port that is open on the shard data server host but not the superserver port for an InterSystems IRIS instance, the shard is not correctly assigned; the $SYSTEM.Sharding.VerifyShards() call indicates this.
      After configuring shard master application servers as described in the next section, you can call $SYSTEM.Sharding.VerifyShards() on each of them as well to confirm that they can communicate with the shard master data server and the shards.
Management Portal:
Take the following steps to deploy using the Management Portal instead of the API:
Configure the Shard Master App Servers
On each shard master app server (if you are configuring them):
  1. In a Terminal window, in any namespace, call $SYSTEM.Sharding.EnableSharding() (see %SYSTEM.Sharding API) to enable the instance to participate in a sharded cluster, as follows:
    set status = $SYSTEM.Sharding.EnableSharding()
    No arguments are required.
  2. As described in Configuring an Application Server in the “Horizontally Scaling Systems for User Volume with InterSystems Distributed Caching” chapter of this guide:
If you have configured shard master app servers, configure the desired mechanism to distribute application connections across them.
Management Portal:
Take the following steps to deploy using the Management Portal instead of the API:
Creating Sharded Tables and Loading Data
Once the cluster is fully configured, you can plan and create the sharded tables and load data into them. The steps involved are as follows:
Evaluate Existing Tables for Sharding
In deciding which of your existing tables to load as sharded tables and which to load as nonsharded tables, your primary considerations should be improving query performance and/or the rate of data ingestion, based on the following factors (which were also discussed in Evaluating the Benefits of Sharding):
Regardless of other factors, tables that are involved in complex transactions requiring atomicity should never be sharded.
Create Target Sharded Tables
Sharded tables are created only in the master namespace on the shard master data server, using a SQL CREATE TABLE statement containing a sharding specification. This specification indicates that the table is to be sharded, and determines the shard key — the field or fields used to determine which rows of a sharded table are stored on which shards. By default, the RowID is used as the shard key, but you can optionally specify a user-defined shard key.
Once the table is created with the appropriate shard key, you can load data into it using INSERT and dedicated tools.
Choose a Shard Key
By default, data is loaded into a sharded table using the RowID as the shard key. This is the most effective approach for almost all sharded tables, because it offers the best guarantee of an even distribution of data and allows the most efficient parallel data loading. There is one case, however, in which another approach may be beneficial: that in which you will be frequently joining two large sharded tables in your queries. To address this situation, you can enable cosharded joins by specifying the same shard key for two or more tables.
A sharded query is decomposed into shard-local queries, each of which is run independently and locally on its shard and needs to see only the data that resides on that shard. When the sharded query involves one or more joins, however, the shard-local queries need to see data from other shards, which requires more processing time and uses more of the memory allocated to the database cache. But in a cosharded join, a row is guaranteed to join only with rows that reside on the same shard, so shard-local queries can run independently and locally.
For example, suppose you will be frequently joining the sharded DEPARTMENT and EMPLOYEE tables on the deptnum field. If you shard them both by the deptnum field, joins such as the following will be cosharded:
SELECT * FROM employee, department where employee.deptnum = department.deptnum
If joins will involve multiple fields, tables can be sharded on equivalent multiple-field shard keys. When this has been done, all of the fields in the key must be specified as equality predicates in the join in order for the query to be cosharded.
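For instance, in a sketch using two hypothetical tables that are both sharded with shard key (deptnum, startdate), a join is cosharded only when both fields appear as equality predicates:
SELECT * FROM tablea a, tableb b
WHERE a.deptnum = b.deptnum AND a.startdate = b.startdate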
Like queries with no joins, cosharded joins scale well with increasing numbers of shards, and they also scale well with increasing numbers of joined tables. Joins that are not cosharded perform well with moderate numbers of shards and joined tables, but scale less well with increasing numbers of either. For these reasons, you should carefully consider cosharded joins at this stage, just as, for example, indexing is taken into account to improve performance for frequently-queried sets of fields.
Bear in mind the following:
Evaluate Unique Constraints
When a sharded table has a unique constraint, uniqueness is guaranteed across all shards. Generally, this means uniqueness must be enforced across all shards for each row inserted or updated, which substantially slows insert/update performance. When the shard key is a subset of the fields of the unique key, however, uniqueness can be guaranteed across all shards by enforcing it locally on the shard on which a row is inserted or updated, which avoids this performance impact.
For example, if you shard the DEPARTMENT and EMPLOYEE tables on the deptnum field to create a cosharded join between these tables, as described in the foregoing, a unique constraint on the empnum field of the EMPLOYEE table requires cross-shard enforcement and thus slows insert/update performance, whereas a unique constraint on the deptnum field of the EMPLOYEE table can be enforced locally and thus has no such impact.
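The difference is visible in the table definition. In the following sketch (the constraint names and the trimmed column list are hypothetical), the first unique key includes the shard key deptnum and can be enforced locally on each shard, while the second cannot:
-- Shard key (deptnum) is a subset of the unique key; uniqueness is enforced shard-locally
CREATE TABLE EMPLOYEE (empid INT, empnum INT not null, deptnum INT not null,
    CONSTRAINT uq_dept_emp UNIQUE (deptnum, empnum), shard key (deptnum))

-- Shard key is not a subset of the unique key; uniqueness requires cross-shard enforcement
CREATE TABLE EMPLOYEE (empid INT, empnum INT not null, deptnum INT not null,
    CONSTRAINT uq_emp UNIQUE (empnum), shard key (deptnum))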
It is therefore advisable to avoid defining unique constraints on sharded tables unless the shard key is a subset of the unique key’s fields. Existing nonsharded tables with unique constraints are poor candidates for sharding unless one of the following is true:
Create the Target Table
The empty target sharded table is created using a standard CREATE TABLE statement (see CREATE TABLE in the SQL Reference) in the master namespace on the shard master. A sharding specification added to the statement indicates that the table is to be sharded and provides the shard key; it consists of the term shard, which indicates a sharded table with sharding on the ID key by default, and an optional shard key specification containing one or more fields.
For example, to create a target table that has a system-assigned shard key — that is, one sharded on its RowID — include the keyword shard in the CREATE TABLE statement, for example:
CREATE TABLE DEPARTMENT (deptnum INT, deptname VARCHAR(50) not null, divid INT not null, directorid INT not null, \
locationid INT not null, depttype CHAR(10) not null, primary key (deptnum), shard)
However, to explicitly define a shard key, follow shard with the keyword key and one or more fields in the table. To use multiple fields in the shard key, include a comma-separated list, for example:
shard key (deptnum, startdate)
As an example of creating cosharded tables with user-defined shard keys, if you wanted to support cosharded joins between the DEPARTMENT and EMPLOYEE tables, as described in Choose a Shard Key, you might use the following statements:
CREATE TABLE DEPARTMENT (deptnum INT, deptname VARCHAR(50) not null, divid INT not null, directorid INT not null, \
locationid INT not null, depttype CHAR(10) not null, primary key (deptnum), shard key (deptnum))
CREATE TABLE EMPLOYEE (empid INT, fname VARCHAR(50) not null, lname VARCHAR(75) not null, dob DATE not null, \
startdate DATE not null, deptnum INT not null, exempt BINARY not null, primary key (empid), shard key (deptnum))
While these statements are an example of a common cosharded join scenario, in which the shard key is the primary key on one side of the join, cosharding also works for many-to-many joins; the shard key is not required to be the primary key (or any other kind of key) on either side.
You can coshard a new sharded table with an existing table by using the same shard key, as in these examples, regardless of whether the existing table has data in it. To coshard a new table with an existing table that has a system-assigned shard key, you can use the coshard with keywords. For example, suppose the department table in the example had been previously created with a system-assigned shard key:
CREATE TABLE DEPARTMENT (deptnum INT, deptname VARCHAR(50) not null, divid INT not null, directorid INT not null, \
locationid INT not null, depttype CHAR(10) not null, primary key (deptnum), shard)
To coshard the employee table with the department table, you would use coshard with as follows:
CREATE TABLE EMPLOYEE (empid INT, fname VARCHAR(50) not null, lname VARCHAR(75) not null, dob DATE not null, \
startdate DATE not null, deptnum INT not null, exempt BINARY not null, primary key (empid), \
shard coshard with (DEPARTMENT))
Note:
If the PK_IS_IDKEY option is set when you create a table, as described in Defining the Primary Key in the “Create Table” entry in the SQL Reference, the table’s RowID is the primary key; in such a case, omitting the shard key specification from the CREATE TABLE statement sets the primary key as the shard key. The best practice, however, if you want to use the primary key as the shard key is to explicitly specify the shard key, so that there is no need to determine the state of this setting before creating tables.
You can display a list of all of the sharded tables on a cluster, including their names, owners, and shard keys, by navigating to the Sharding Configuration page of the Management Portal (System Administration > Configuration > System Configuration > Sharding Configuration) on the shard master data server or a shard data server, selecting the namespace that belongs to the cluster (master namespace on the shard master or shard namespace on a data shard), and selecting the Sharded Tables tab. For a table you have loaded with data, you can click the Details link to see how many of the table’s rows are stored on each shard data server in the cluster.
Sharded Table Creation Constraints
The following constraints apply to sharded table creation:
For further details on the topics and examples in this section, see CREATE TABLE in the InterSystems SQL Reference.
Defining Sharded Tables Using Sharded Classes
In this release, in addition to using DDL to define sharded tables, you can define classes as sharded using the Sharded class keyword. The class compiler has been extended to warn against using class definition features incompatible with sharding, such as customized storage definitions, at compile time. More developed workload mechanisms and support for some of these “incompatible” features, such as the use of stream properties, will be introduced in upcoming versions of InterSystems IRIS.
Load Data Onto the Cluster
Data can be loaded into sharded tables by using INSERT statements through any InterSystems IRIS interface that supports SQL, for example the management portal, the Terminal, or JDBC. Rapid bulk loading of data into sharded tables is supported by the transparent parallel load capability built into the InterSystems IRIS JDBC driver, as well as by the InterSystems IRIS Connector for Spark (see Deploy and Manage Services), which leverages the same capability. Java-based applications also transparently benefit from the InterSystems IRIS JDBC driver’s parallel loading capability.
Load Data Using INSERT
You can verify that a sharded table was created as intended by loading data using an INSERT or INSERT SELECT FROM through any InterSystems IRIS interface that supports SQL and then querying the table or tables in question.
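For example, a quick check against the DEPARTMENT table created earlier (the values are arbitrary) might be:
INSERT INTO DEPARTMENT (deptnum, deptname, divid, directorid, locationid, depttype)
    VALUES (64, 'Shipping', 3, 17, 2, 'OPS')

SELECT COUNT(*) FROM DEPARTMENT
The rows are stored on the shard data servers, but the INSERT and the query are issued against the master namespace (or a shard master app server) exactly as they would be for a nonsharded table.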
Load Data Using the InterSystems IRIS Spark Connector
The InterSystems IRIS Spark Connector allows you to add Apache Spark capabilities to a sharded cluster. The recommended configuration is to locate Spark slaves on the shard data server hosts and a Spark master on the shard master data server host, connected to the corresponding InterSystems IRIS instances. When you deploy a sharded cluster using ICM, the Apache Spark image provided by InterSystems lets you easily and conveniently create this configuration (see The icm run Command in the “Using ICM” chapter of the ICM Guide). For more information about the Spark Connector and using it to load data, see Using the InterSystems IRIS Spark Connector.
Load Data Using the InterSystems IRIS JDBC Driver
Using the transparent parallel load capability of the InterSystems IRIS JDBC driver, you can construct a tool that retrieves data from a data source and passes it to the target table on the sharded cluster by means of JDBC connections, as follows:
For your convenience, InterSystems provides Simple Data Transfer, a JDBC-based utility that can be used to load large amounts of data from a JDBC data source or flat CSV file into both sharded tables and nonsharded tables (in both sharded and nonsharded namespaces). For more information about Simple Data Transfer, see the “Using the Simple Data Transfer Utility” chapter of Using Java JDBC with InterSystems IRIS.
Create and Load Nonsharded Tables
You can create nonsharded tables in the master namespace on the shard master data server, and load data into them, using your customary methods. These tables are immediately available to the cluster for both nonsharded queries and sharded queries that join them to sharded tables. (This is in contrast to architectures in which nonsharded tables must be explicitly replicated to each node that may need them.) See Evaluate Existing Tables for Sharding for guidance in choosing which tables to load as nonsharded.
Querying the Sharded Cluster
The master namespace and the sharded tables it contains are fully transparent, and SQL queries involving any mix of sharded and nonsharded tables in the master namespace, or the corresponding namespace on a shard master app server, are no different from SQL queries against any tables in an InterSystems IRIS namespace. No special query syntax is required to identify sharded tables or shard keys. Queries can join multiple sharded tables, as well as sharded and nonsharded tables. Everything is supported except what is specified in the following, which represent limitations and restrictions of the initial version of the InterSystems IRIS sharded cluster; the goal is to remove them all in future versions.
Note:
If you want to explicitly purge cached queries on the shard data servers, you can either purge all cached queries from the master namespace, or purge cached queries for a specific table. Both of these actions propagate the purge to the shard data servers. Purging of individual cached queries is never propagated to the shard data servers. For more information about purging cached queries, see Purging Cached Queries in the “Cached Queries” chapter of the SQL Optimization Guide.
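For example, the following sketches the two propagated purge operations, issued from the master namespace in the Terminal; confirm the purge methods available in your release against the %SYSTEM.SQL class reference:
    // purge all cached queries in the master namespace (propagated to the shard data servers)
    do $SYSTEM.SQL.Purge()
    // purge cached queries for a specific table (also propagated)
    do $SYSTEM.SQL.PurgeForTable("SQLUser.DEPARTMENT")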
Additional Sharded Cluster Options
Sharding offers many configurations and options suited to a range of needs. This section provides brief coverage of additional options of interest, including:
For further assistance in evaluating the benefits of these options for your cluster, please contact the InterSystems Worldwide Response Center (WRC).
Add Shard Data Servers and Rebalance Data
As described in Plan an InterSystems IRIS Sharded Cluster, the number of shard data servers you include in a cluster when first deployed is influenced by a number of factors, including but not limited to the estimated working set for sharded tables and the compute resources you have available. Over time, you may want to increase the number of shard data servers because the size of your sharded data may grow significantly enough to make a higher shard count desirable, for example, or because a resource constraint has been removed. Shard data servers can be added by reprovisioning and redeploying the cluster using ICM (see Reprovisioning the Infrastructure and Redeploying Services in the ICM Guide), or using the Sharding API or Management Portal by repeating the steps outlined in Configure the Shard Data Servers.
When you add shard data servers to a cluster, there is no data stored on them. Sharded data that is already on the cluster and data that is loaded onto the cluster after they are added is distributed as follows:
You can, however, use the $SYSTEM.Sharding.Rebalance() API call to rebalance existing sharded data across the expanded set of shard data servers. For example, if you go from four shard data servers to eight, rebalancing takes you from four existing shard data servers with one fourth of the sharded data on each, plus four empty new servers, to eight servers with one eighth of the data on each. Rebalancing also allows rows added to existing sharded tables with user-defined shard keys to be evenly distributed across all of the shard servers. Thus, after you rebalance, all sharded data — including existing tables, rows added to existing tables, and new tables — is evenly distributed across all shard servers.
Rebalancing cannot coincide with queries and updates, and so can take place only when the sharded cluster is offline and no other sharded operations are possible. (In a future release, this limitation will be removed.) For this reason, the $SYSTEM.Sharding.Rebalance() call places the sharded cluster in a state in which queries and updates of sharded tables are not permitted to execute, and return an error if attempted.
Each rebalancing call can specify a time limit, however, so that the call can be scheduled in a maintenance window, move as much data as possible within the window, and return the sharded cluster to a fully-usable state before the window ends. By using this approach with repeated calls, you can fully rebalance the cluster over a series of scheduled maintenance outages without otherwise interfering with its operation. You can also specify the minimum amount of data to be moved by the call; if it is not possible to move that much data within the specified time limit, no rebalancing occurs.
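The general shape of such a call, scheduled inside a maintenance window, is sketched below. The argument names and their order here are placeholders only, not the documented signature; see the %SYSTEM.Sharding entry in the class reference for the actual parameters:
    // Placeholder arguments: the master namespace, a time limit for this pass,
    // and a minimum amount of data to move; consult the class reference for the real signature
    set status = $SYSTEM.Sharding.Rebalance("master", timeLimit, minimumToMove)
    do $SYSTEM.Status.DisplayError(status)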
Note:
Query and update operations execute correctly before rebalancing is performed (when new shard servers are still empty), in between the calls of a multicall rebalancing operation, and after rebalancing is complete, but they are most efficient after all of the data has been rebalanced across all of the shard servers.
The illustration that follows shows the process of adding shards and rebalancing data using a multicall rebalancing operation.
Adding a Shard and Rebalancing Data
Mirror for High Availability
An InterSystems IRIS mirror is a logical grouping of physically independent InterSystems IRIS instances simultaneously maintaining exact copies of production databases, so that if the instance providing access to the databases becomes unavailable, another can take over. A mirror can provide high availability through automatic failover, in which a failure of the InterSystems IRIS instance providing database access (or its host system) causes another instance to take over automatically and immediately. The “Mirroring” chapter of the High Availability Guide contains detailed information about InterSystems IRIS mirroring.
Both the shard master data server and the shard data servers in a sharded cluster can be mirrored; the recommended general best practice is that either all of these nodes are mirrored, or that none are. Note that when data shards are mirrored, sharded queries can transparently complete successfully even if one or more shards fail over during query execution.
Because they do not store persistent data, shard master app servers and shard query servers never require mirroring.
Note:
This release does not support the use of async members in mirrors serving as shard master data server or shard data server nodes in sharded clusters.
Deploy a Mirrored Cluster Using ICM
To deploy a fully mirrored cluster using InterSystems Cloud Manager, refer to Define the Deployment and make the following changes (a configuration sketch follows the list):
  1. Add “Mirror”: “True” to the defaults file.
  2. Define two DM nodes in the definitions file.
  3. Define an even number of DS nodes in the definitions file. (If you define an odd number of DS nodes when Mirror is set to True, provisioning fails.)
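A sketch of the corresponding configuration entries follows; the node counts are illustrative, and the field names should be confirmed against ICM Cluster Topology:
defaults.json (excerpt):
    "Mirror": "True"

definitions.json (excerpt):
    [
        { "Role": "DM", "Count": "2" },
        { "Role": "DS", "Count": "4" }
    ]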
For more information on deploying mirrored configurations with ICM, see ICM Cluster Topology and Mirroring in the “ICM Reference” chapter of the ICM Guide.
Deploy a Mirrored Cluster Using the Sharding API
To deploy a fully mirrored cluster using the Sharding API, do the following:
  1. After determining how many shard data servers you want, install InterSystems IRIS on twice that number of nodes, plus two more for the mirrored shard master data server, as described in Install InterSystems IRIS on the Cluster Nodes.
  2. Configure each pair of shard data servers and the pair of shard master data servers as the failover members of a mirror, as described in Creating a Mirror in the “Mirroring” chapter of the High Availability Guide.
  3. For each mirrored pair, use the following procedure:
    1. Configure the primary as described in Configure the Shard Data Servers and Configure the Shard Master Data Server, ensuring that when you create the shard namespaces and master namespace, you create the default globals database of each namespace as a mirrored database in the mirror, and including the $SYSTEM.Sharding.EnableSharding() call on all primary nodes and the $SYSTEM.Sharding.AssignShard(), $SYSTEM.Sharding.ListShards(), and $SYSTEM.Sharding.VerifyShards() calls on the shard master data server primary.
    2. Configure the backup as described in the same sections, choosing the mirrored database you created on the primary as the default globals database for the shard or master namespace on the backup, including the $SYSTEM.Sharding.EnableSharding() call, and omitting the $SYSTEM.Sharding.AssignShard() and the calls that follow — that is, calling only $SYSTEM.Sharding.EnableSharding() — on the shard master data server backup.
Deploy Query Shards
A query shard does not contain data, but provides query access through ECP to the data on the data shard it is assigned to. You can assign one or more query shards to each data shard in a sharded cluster to both
Query shards provide these performance enhancements through the following behavior:
The following diagram illustrates a sharded cluster with query shards. See Plan Shard Query Servers for information about circumstances under which query servers are beneficial and guidelines for planning their configuration within the cluster.
Sharded cluster with shard query servers
Deploy Query Shards Using ICM
To include shard query servers in the sharded cluster when deploying using InterSystems Cloud Manager (ICM), include the desired number of QS nodes in the definitions file (see Define the Deployment). Query shards are assigned to data shards in round robin fashion; to follow the recommendation for the same number of query shards per data shard, define the same number of QS nodes as DS nodes, or twice as many, or three times as many, and so on.
Deploy Query Shards Using the Sharding API
To assign a query shard using the Sharding API, use the following steps (a brief example follows these steps):
  1. On the shard query server, follow the procedure described for data shard servers in Configure the Shard Servers. Choose any database for the default globals and routines databases of the shard namespace; on a shard query server, this is not significant.
  2. On the shard master data server, following the procedure described in Configure the Shard Master Data Server for assigning data shards, use the $SYSTEM.Sharding.AssignShard() API call to assign the query shard, with an added argument, as follows:
    set status = $SYSTEM.Sharding.AssignShard("master_namespace","shard_host",superserver port,"shard_namespace",N)
    where the arguments represent the name of the master namespace on the shard master data server, the shard query server host, the superserver port of the shard query server InterSystems IRIS instance, the name of the shard namespace, and the number of the data shard to which you are assigning the query shard. You can get the data shard number by calling $SYSTEM.Sharding.ListShards(), as follows:
    set status = $SYSTEM.Sharding.ListShards("master namespace")
    where the argument is the name of the master namespace.
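For example, the following sketch assigns a query shard hosted on shard query server qshost1 (superserver port 51773, shard namespace SHARDQ1) to data shard number 2 of master namespace MASTER; all of the names, the port, and the shard number are placeholder values.
    set status = $SYSTEM.Sharding.ListShards("MASTER")                              // lists the assigned shards; note the target data shard's number
    set status = $SYSTEM.Sharding.AssignShard("MASTER","qshost1",51773,"SHARDQ1",2) // assign the query shard to data shard 2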
Install Multiple Shard Data Servers per System
With a given number of systems hosting shard data servers, configuring multiple shard data server instances per system, using the %SYSTEM.Sharding API, can significantly increase data ingestion throughput. When the goal is the highest data ingestion throughput at the lowest cost, you can therefore install more shard data server instances than there are host systems, with each system hosting two or three instances, as sketched below. The gain achieved depends on server type, server resources, and overall workload. Adding to the total number of systems might achieve the same throughput gain, or more (because it does not divide a host system’s memory among multiple database caches), but adding instances is less expensive than adding systems.
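Because $SYSTEM.Sharding.AssignShard() identifies a shard data server instance by host and superserver port, assigning two instances on the same host is simply a matter of using the same host name with each instance’s own port. A minimal sketch, with placeholder host name, ports, and namespaces:
    // two shard data server instances on the same host system, distinguished by superserver port
    set status = $SYSTEM.Sharding.AssignShard("MASTER","datahost1",51773,"SHARD1")
    set status = $SYSTEM.Sharding.AssignShard("MASTER","datahost1",51774,"SHARD2")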
InterSystems IRIS Sharding Reference
This section contains additional information about planning, deploying, and using a sharded configuration, including the following:
Plan an InterSystems IRIS Sharded Cluster
This section provides some first-order guidelines for planning a basic sharded cluster, and for adding shard query servers and shard master app servers. It is not intended to represent detailed instructions for a full-fledged design and planning process.
Combine Sharding with Vertical Scaling
Planning for sharding typically involves considering the tradeoff between resources per system and number of systems in use. At the extremes, the two main approaches can be stated as follows:
In practice, in most situations a combination of these approaches works best. Unlike other horizontal scaling approaches, InterSystems IRIS sharding is easily combined with InterSystems IRIS’s considerable vertical scaling capabilities. In many cases, a cluster hosted on reasonably high-capacity systems with a range of 4 to 16 shard servers will yield the greatest benefit.
Plan a Basic Cluster of Shard Master Data Server and Shard Data Servers
To use these guidelines, you need to estimate several variables related to the amount of data to be stored on the cluster.
  1. First, review the data you intend to store on the cluster to estimate the following:
    1. Total size of all the sharded tables to be stored on the cluster, including their indexes.
    2. Total size of the nonsharded tables (including indexes) to be stored on the cluster that will be frequently joined with sharded tables.
    3. Total size of the nonsharded tables (including indexes) to be stored on the cluster.
  2. Translate these totals into estimated working sets, based on the proportion of the data that is regularly queried.
    Estimating working sets can be a complex matter. You may be able to derive useful information about these working sets from historical usage statistics for your existing database cache(s). In addition to or in place of that, divide your tables into the three categories and determine a rough working set for each by doing the following:
    • For the significant SELECT statements frequently made against the table, examine the WHERE clauses. Do they typically look at a subset of the data whose size you might be able to estimate based on table and column statistics? Do the subsets retrieved by different SELECT statements overlap with each other, or are they additive?
    • Review significant INSERT statements for size and frequency. It may be more difficult to translate these into a working set, but as a simplified approach, you might estimate the average hourly ingestion rate in MB (records per second * average record size * 3600) and add that to the working set for the table. For example, at 100 records per second and an average record size of 1 KB, this adds roughly 100 * 1 KB * 3600 ≈ 350 MB per hour.
    • Consider any other frequent queries for which you may be able to specifically estimate results returned.
    • Bear in mind that while queries joining a nonsharded table to a sharded table count towards the working set NonshardSizeJoinedWS, queries against that same nonsharded table that do not join it to a sharded table count towards the working set NonshardSizeTotalWS. The same nonsharded data can therefore be returned by both types of queries, and thus counts towards both working sets.
    You can then add these estimates together to form a single estimate for the working set of each table, and add those estimates to roughly calculate the overall working sets. These overall estimates are likely to be fairly rough and may turn out to need adjustment in production. Add a safety factor of 50% to each estimate.
  3. Record the total data sizes and the working sets as the following variables:
    Cluster Planning Variables
    ShardSize, ShardSizeWS: Total size and working set of sharded tables
    NonshardSizeJoined, NonshardSizeJoinedWS: Total size and working set of nonsharded tables that are frequently joined to sharded tables
    NonshardSizeTotal, NonshardSizeTotalWS: Total size and working set of nonsharded tables
    ShardCount: Number of shard data server instances
In reviewing the guidelines in the table that follows, bear the following in mind:
Cluster Planning Guidelines
Database cache on each shard data server (see Memory and Startup Settings in the System Administration Guide): at least (ShardSizeWS / ShardCount) + NonshardSizeJoinedWS (a worked example follows these guidelines).
Default globals database for the shard namespace on each shard data server (see Configuring Namespaces): at least ShardSize / ShardCount, plus a margin for growth. When data ingestion performance is a major consideration, consider configuring the initial size of the database (see Configuring Databases) to equal the expected maximum size. This database should have the same characteristics, such as locale and collations, on all shard data servers.
Database cache on the shard master data server (see Memory and Startup Settings): at least NonshardSizeTotalWS.
Default globals database for the master namespace on the shard master data server (see Configuring Namespaces): at least NonshardSizeTotal.
IRISTEMP database on the shard master data server: no specific recommendation, but ensure that the database is located on the fastest possible storage, with space for significant expansion.
CPU: no specific recommendations. All InterSystems IRIS servers can benefit from greater numbers of CPUs, whether or not sharding is involved. Vertical scaling of CPU, memory, and storage resources can always be used in conjunction with sharding to provide additional benefit, but is not specifically required, and is governed by the usual cost/performance tradeoffs.
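As a worked example of the first two guidelines, suppose the hypothetical estimates are ShardSizeWS = 1.2 TB, NonshardSizeJoinedWS = 100 GB, ShardSize = 4 TB, and ShardCount = 8. Each shard data server’s database cache should then be at least (1.2 TB / 8) + 100 GB = 250 GB, and the default globals database for each shard namespace should allow for at least 4 TB / 8 = 500 GB, plus a margin for growth.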
All shard data servers in a sharded cluster should have identical or at least closely comparable specifications and resources; parallel query processing is only as fast as the slowest shard data server. In addition, the configuration of all IRIS instances in the cluster should be consistent; database settings such as collation and those SQL settings configured at instance level (default date format, for example) should be the same on all nodes to ensure correct SQL query results. Standardized procedures and tools like ICM can help ensure this consistency.
Plan Shard Master App Servers
Considerations for deciding whether to use master app servers, how many to use, and how to configure them are the same as for application servers in a distributed cache cluster (see the chapter “Horizontally Scaling Systems for User Volume with InterSystems Distributed Caching”). As with any distributed cluster, when applications connect to a shard master app server, all the application work (including SQL execution, which for sharding means execution of the combining queries assembling the results of the shard-local queries) happens on the shard master app server, with the shard master data server behaving only as a page server. Thus, when shard master app servers are in use:
Plan Shard Query Servers
The scenarios most likely to benefit from the addition of shard query servers to a cluster are as follows:
When query shards are in use:
Coordinated Backup and Restore of Sharded Clusters
When data is distributed across multiple systems, as in an InterSystems IRIS sharded cluster, backup and restore procedures may involve additional complexity. Where strict consistency of the data across a sharded cluster is required, independently backing up and restoring individual nodes is insufficient, because the backups may not all be created at the same logical point in time. This makes it impossible to be certain, when the entire cluster is restored following a failure, that ordering is preserved and the logical integrity of the restored databases is thereby ensured.
For example, suppose update A of data on shard data server S1 was committed before update B of data on shard data server S2. Following a restore of the cluster from backup, logical integrity requires that if update B is visible, update A must be visible as well. But if backups of S1 and S2 are taken independently, it is impossible to guarantee that the backup of S1 was made after A was committed, even if the backup of S2 was made after B was committed, so restoring the backups independently could lead to S1 and S2 being inconsistent with each other.
If, on the other hand, the procedures used coordinate either backup or restore and can therefore guarantee that all systems are restored to the same logical point in time — in this case, following update B — ordering is preserved and the logical integrity that depends on it is ensured. This is the goal of coordinated backup and restore procedures.
To greatly reduce the chances of having to use any of the procedures described here to restore your sharded cluster, you can deploy it with mirrored data servers, as described in Mirror for High Availability. Even if the cluster is unmirrored, most data errors (data corruption, for example, or accidental deletion of data) can be remedied by restoring the data server on which the error occurred from the latest backup and then recovering it to the current logical point in time using its journal files. The procedures described here are for use in much rarer situations requiring a cluster-wide restore.
This section covers the following topics:
Coordinated Backup and Restore Approaches for Sharded Clusters
Coordinated backup and restore of a sharded cluster always involves all of the data servers in the cluster — that is, the shard master data server and the shard data servers. The InterSystems IRIS Backup API includes a Backup.ShardedCluster class that supports three approaches to coordinated backup and restore of a sharded cluster’s data servers.
Bear in mind that the goal of all approaches is to restore all data servers to the same logical point in time, but the means of doing so varies. In one, it is the backups themselves that share a logical point in time; in the others, InterSystems IRIS database journaling provides the common logical point in time, called a journal checkpoint, to which the databases are restored. The approaches include:
• Create coordinated backups of all data servers in the cluster.
• Create uncoordinated backups followed by coordinated journal checkpoints.
• Include a coordinated journal checkpoint in uncoordinated backups.
Each approach is described in the correspondingly named procedure under Procedures for Coordinated Backup and Restore.
To understand how these approaches work, it is important that you understand the basics of InterSystems IRIS data integrity and crash recovery, which are discussed in the “Introduction to Data Integrity” chapter of the Data Integrity Guide. Database journaling, a critical feature of data integrity and recovery, is particularly significant for this topic. Journaling records all updates made to an instance’s databases in journal files. This makes it possible to recover updates made between the time a backup was taken and the moment of failure (or another selected point) by restoring updates from the journal files following restore from backup. Journal files are also used to ensure transactional integrity by rolling back transactions that were left open by the failure. For detailed information about journaling, see the “Journaling” chapter of the Data Integrity Guide.
Considerations when selecting an approach to coordinated backup and restore include the following:
These issues are discussed in detail later in this section.
Coordinated Backup and Restore API Calls
The methods in the Backup.ShardedCluster class can be invoked on a sharded cluster’s shard master data server or on one of its shard master application servers (if they exist). All of the methods take a ShardMasterNamespace argument; this is the name of either the master namespace on the shard master data server, or the namespace on a shard master application server that is mapped to the default globals database of the master namespace. (For information about how this relationship is configured with the API, see Configure the Shard Master App Servers; ICM creates this configuration automatically, but the result is the same.)
The available methods are as follows:
• Backup.ShardedCluster.Quiesce() and Backup.ShardedCluster.Resume(), which pause and resume activity on all data servers in the cluster so that coordinated backups can be created.
• Backup.ShardedCluster.JournalCheckpoint(), which creates a coordinated journal checkpoint and returns the name of the precheckpoint journal file on each data server.
• Backup.ShardedCluster.ExternalFreeze() and Backup.ShardedCluster.ExternalThaw(), which suspend and resume the write daemons on all data servers, creating a coordinated journal checkpoint so that backups taken while the cluster is frozen can later be restored to it.
You can review the technical documentation of these calls in the InterSystems Class Reference.
Procedures for Coordinated Backup and Restore
The steps involved in the three coordinated backup and restore approaches provided by the Sharding API are described in the following sections.
Data server backups should, in general, include not only database files but all files used by InterSystems IRIS, including the journal directories, write image journal, and installation data directory, as well as any needed external files. The locations of these files depend in part on how the cluster was deployed (see Deploying the Sharded Cluster); the measures required to include them in backups may have an impact on your choice of approach.
Important:
The restore procedures described here assume that the data server being restored has no mirror failover partner available, and would be used with a mirrored data server only in a disaster recovery situation in which mirror recovery procedures (see Disaster Recovery Procedures in the “Mirroring” chapter of the High Availability Guide) are insufficient.  If the data server being restored is part of a mirror, remove it from the mirror, complete the restore procedure described, and then rebuild it as described in Rebuilding a Mirror Member in the “Mirroring” chapter.
Create Coordinated Backups
  1. Call Backup.ShardedCluster.Quiesce, which pauses activity on all data servers in the cluster (and thus all application activity) and waits until all pending writes have been flushed to disk. When this process is completed and the call returns, all databases and journal files across the cluster are at the same logical point in time.
  2. Create backups of all data servers in the cluster. Although the database backups are coordinated, they may include open transactions; when the data servers are restarted after being restored from backup, InterSystems IRIS recovery uses the journal files to restore transactional integrity by rolling these open transactions back. (A sketch of the Quiesce/backup/Resume sequence follows this procedure.)
  3. When backups are complete, call Backup.ShardedCluster.Resume to restore normal data server operation.
    Important:
    Resume() must be called within the same job that called Quiesce(). A failure return may indicate that the backup images taken under Quiesce() were not reliable and may need to be discarded.
  4. Following a failure, on each data server:
    1. Restore the backup image.
    2. Verify that the only journal files present are those in the restored image from the time of the backup.
      Caution:
      This is critically important because at startup, recovery restores the journal files and rolls back any transactions that were open at the time of the backup. If journal data later than the time of the backup exists at startup, it could be restored and cause the data server to be inconsistent with the others.
    3. Restart the data server.
    The data server is restored to the logical point in time at which database activity was quiesced.
Note:
As an alternative to the first three steps in this procedure, you can gracefully shut down all data servers in the cluster, create cold backups, and restart the data servers.
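The following is a minimal ObjectScript sketch of the Quiesce/backup/Resume sequence, assuming a master namespace named MASTER, that the Backup.ShardedCluster methods are invoked as class methods returning a %Status (check the class reference for the exact signatures), and that the entire sequence runs in a single job; the backup step itself is a placeholder for your chosen backup method.
    // run the entire sequence in one job; Resume() must be called by the job that called Quiesce()
    set status = ##class(Backup.ShardedCluster).Quiesce("MASTER")
    if '$SYSTEM.Status.IsOK(status) { do $SYSTEM.Status.DisplayError(status)  quit }
    // ... create backups of all data servers in the cluster here (site-specific) ...
    set status = ##class(Backup.ShardedCluster).Resume("MASTER")
    // a failure return here may mean the backup images taken under Quiesce() are not reliable
    if '$SYSTEM.Status.IsOK(status) { do $SYSTEM.Status.DisplayError(status) }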
Create Uncoordinated Backups Followed by Coordinated Journal Checkpoints
  1. Create backups of the databases on all data servers in the cluster while the data servers are in operation and application activity continues. These backups may be taken at entirely different times using any method of your choice and at any intervals you choose.
  2. Call Backup.ShardedCluster.JournalCheckpoint() on a regular basis, preferably as a scheduled task. This method creates a coordinated journal checkpoint and returns the names of the last journal file to include in a restore on each data server in order to reach that checkpoint. Bear in mind that it is the time of the latest checkpoint and the availability of the precheckpoint journal files that dictate the logical point in time to which the data servers can be recovered, rather than the timing of the backups.
    Note:
    Before switching journal files, JournalCheckpoint() briefly quiesces all data servers in the sharded cluster to ensure that the precheckpoint files all end at the same logical moment in time; as a result, application activity may be very briefly paused during execution of this method.
  3. Ensure that, for each data server, you store a complete set of journal files from the time of its last backup to the time at which the most recent coordinated journal checkpoint was created, ending with the precheckpoint journal file, and that all of these files will remain available following a server failure (possibly by backing up the journal files regularly). The database backups are not coordinated and may also include partial transactions, but when the data servers are restarted after being restored from backup, recovery uses the coordinated journal files to bring all databases to the same logical point in time and to restore transactional integrity.
  4. Following a failure, identify the latest checkpoint available as a common restore point for all data servers. This means that for each data server you must have a database backup that precedes the checkpoint, plus all intervening journal files up to and including the precheckpoint journal file.
    Caution:
    This is critically important because at startup, recovery restores the journal files and rolls back any transactions that were open at the time of the backup. If journal files later than the precheckpoint journal file exist at startup, they could be restored and cause the data server to be inconsistent with the others.
  5. On each data server, restore the databases from the backup preceding the checkpoint, restoring journal files up to the checkpoint. Ensure that no journal data after that checkpoint is applied; the simplest way to do so is to check whether the server has any journal files later than the precheckpoint journal file and, if so, move or delete them, and then delete the journal log.
    The data server is now restored to the logical point in time at which the coordinated journal checkpoint was created.
Include a Coordinated Journal Checkpoint in Uncoordinated Backups
  1. Call Backup.ShardedCluster.ExternalFreeze(). This method freezes all activity on all data servers in the sharded cluster by suspending their write daemons; application activity continues, but updates are written to the journal files only and are not committed to disk. Before returning, the method creates a coordinated journal checkpoint and switches each data server to a new journal file, then returns the checkpoint number and the names of the precheckpoint journal files. At this point, the precheckpoint journal files represent a single logical point in time.
  2. Create backups of all data servers in the cluster. The database backups are not coordinated and may include partial transactions, but when restoring the data servers you will ensure that they are recovered to the journal checkpoint, bringing all databases to the same logical point in time and restoring transactional integrity.
    Note:
    By default, when the write daemons have been suspended by Backup.ShardedCluster.ExternalFreeze() for 10 minutes, application processes are blocked from making further updates (due to the risk that journal buffers may become full). However, this period can be extended using an optional argument to ExternalFreeze() if the backup process requires more time.
  3. When all backups are complete, call Backup.ShardedCluster.ExternalThaw() to resume the write daemons and restore normal data server operation.
    Important:
    A failure return may indicate that the backup images taken under ExternalFreeze() were not reliable and may need to be discarded.
  4. Following a failure, on each data server:
    1. Restore the backup image.
    2. Remove any journal files present in the restored image that are later than the precheckpoint journal file returned by ExternalFreeze().
    3. Follow the instructions in Starting InterSystems IRIS Without Automatic WIJ and Journal Recovery in the “Backup and Restore” chapter of the Data Integrity Guide to manually recover the InterSystems IRIS instance. When you restore the journal files, start with the journal file that was switched to by ExternalFreeze() and end with the precheckpoint journal file returned by ExternalFreeze(). (Note that these may be the same file, in which case it is the one and only journal file to restore.)
      Note:
      If you are working with containerized InterSystems IRIS instances, see Upgrading When Manual Startup is Required in Running InterSystems Products in Containers for instructions for doing a manual recovery inside a container.
    The data server is restored to the logical point in time at which the coordinated journal checkpoint was created by the ExternalFreeze() method.
Note:
This approach requires that the databases and journal files on each data server be located such that a single backup can include them both.
Sharding APIs
At this release, InterSystems IRIS provides two APIs for use in configuring and managing a sharded cluster: the %SYSTEM.Cluster API and the %SYSTEM.Sharding API, described in the following sections.
%SYSTEM.Cluster API
For more detail on the %SYSTEM.Cluster API methods, and instructions for calling the methods, see the %SYSTEM.Cluster class documentation in the InterSystems Class Reference.
Use the %SYSTEM.Cluster API methods in the following ways:
%SYSTEM.Cluster methods include the following:
%SYSTEM.Sharding API
This section describes the use of the %SYSTEM.Sharding API and the methods it includes. For more information about the API and instructions for calling the methods, see the %SYSTEM.Sharding class documentation in the InterSystems Class Reference.
Use the %SYSTEM.Sharding API methods in the following ways:
%SYSTEM.Sharding methods include the following:
Reserved Names
The following names are used by InterSystems IRIS and should not be used in the names of user-defined elements:
Sharding Glossary
sharding
Transparent horizontal partitioning of database tables. Sharding provides horizontal performance scaling for queries and data ingestion.
sharded table
A database table that is horizontally partitioned, that is, divided by rows. Each shard of a sharded table contains a roughly equal subset of the table's rows, and each row is contained within exactly one shard.
master namespace
An InterSystems IRIS namespace that has been assigned one or more shards and can therefore contain sharded tables.
A master namespace defines a sharded cluster, and is hosted by a shard master instance.
A master namespace can also contain nonsharded tables, which can be joined to its sharded tables in queries, and in general has all the capabilities of a standard namespace.
Any namespace (except a shard namespace) can become a master namespace by being assigned one or more shards.
shard
An assigned partition of a master namespace, hosted by a shard server instance. A shard is implemented as a shard namespace, which is transparent to end users of sharding.
A data shard contains one horizontal partition of each sharded table in the master namespace; a query shard provides query access to the data shard it is assigned to.
Note: The term shard is also used to refer to a single horizontal partition of an individual sharded table.
sharded cluster
A set of InterSystems IRIS instances configured together to support sharding.
A basic sharded cluster is comprised of one shard master data server on which the master namespace is defined, one or more shard servers each hosting one or more shards of the master namespace, and optionally one or more shard master app servers, across which applications can be load balanced.
A sharded cluster is internally managed by the sharding manager.
shard namespace
An InterSystems IRIS namespace that has been assigned to serve as a shard of a master namespace.
Shard namespaces are transparent to end users of sharding. A namespace cannot be both a master namespace and a shard namespace.
While shard namespaces are created in the same manner as standard namespaces, once assigned to a master namespace they are managed only by the sharding manager and never accessed directly by end users or used for any purpose other than as shards.
shard master
The member or members of a sharded cluster to which sharded queries are directed. In a cluster without shard master app servers, the shard master data server is the shard master; when shard master app servers are configured, they are all shard masters.
shard master data server
An InterSystems IRIS instance, or a mirrored pair of instances, on which one or more master namespaces are defined. Each master namespace defines a separate sharded cluster in which the shard master data server participates.
In a sharded cluster without shard master app servers, the shard master data server is typically referred to simply as the shard master. Either term can be used to indicate the system hosting the InterSystems IRIS instance (or mirrored pair), rather than the instance itself.
If any master namespace on a shard master data server is mirrored, all master namespaces on the instance must be mirrored.
shard master app server
An InterSystems IRIS instance that is configured as an application server to a shard master data server. On the shard master app server, the default globals and routines databases of the master namespace on the shard master data server are configured as remote databases and a namespace is defined with these remote databases as its default globals and routines databases.
shard server
An InterSystems IRIS instance that hosts one or more shards (data shards, query shards, or both).
A shard server hosting only data shards is referred to as a shard data server, and one hosting only query shards as a shard query server, but the general term shard server can be used for either of these as well as for an instance hosting both shard types. Any of these terms may be used to indicate the system hosting the InterSystems IRIS instance (or mirrored pair), rather than the instance itself. A shard data server can be a mirrored shard server; a shard query server or a mixed shard server cannot be mirrored. All data shards on a mirrored shard server must be mirrored.
A single shard server can host shards of multiple master namespaces, thereby participating in multiple sharded clusters.
data shard
A shard that stores one horizontal partition of each sharded table in the master namespace to which the data shard is assigned. A data shard must be mirrored if it is hosted by a mirrored shard server.
query shard
A shard that provides query access by ECP to the data on the data shard to which it is assigned. Query shards can be used to minimize interference between query and data ingestion workloads, and to increase the bandwidth of a sharded configuration for high volume multiuser query workloads.
One or more query shards may optionally be assigned to each data shard in a sharded cluster. A query shard is never located on the same shard server as the data shard to which it is assigned. A query shard must have the same default globals database as its associated data shard; to enable this, the data shard’s globals database is configured as a remote database on the query shard.
When SQL operations are executed on sharded tables, read-only queries are automatically executed on the query shards for all data shards that have them assigned, but directly on data shards that have no query shards assigned. If more than one query shard has been assigned to a data shard, queries are automatically load balanced among them.
All write operations (insert, update, delete, and DDL operations) are automatically executed on data shards.
sharding manager
A federated software component running on the shard master data server and each of the shard servers and shard master app servers in the sharded cluster, responsible for managing the master namespace and executing sharded operations.
sharded query
A query against one or more of the sharded tables contained in a master namespace that is executed in parallel across the shards hosted on the cluster’s shard servers.
shard-local query
The individual partition of a sharded query distributed by the sharding manager to a shard.
combining query
The query used by the sharding manager to assemble the results of the shard-local queries into a single result to be returned to the user in response to a sharded query.
shard key
The field or fields in a sharded table used to horizontally partition the table, that is, to distribute its rows across a cluster’s data shards.
A shard key can be either a system-assigned ID or a user-defined key.
cosharded join
A query that joins sharded tables that have equivalent user-defined shard keys and are thus distributed across the data shards in an equivalent manner, and that specifies equal predicates for all fields of the shard key. Such a query can be performed locally on each data shard (or the query shards assigned to it), enhancing performance. Use of equivalent shard keys is therefore beneficial when frequent use of specific joins is anticipated. The shard key fields of all tables involved in a cosharded join must have the same number, order, and datatypes.
shard server job
A background job that runs on a shard server and executes commands against a given shard on behalf of a given application.
mirrored master
A shard master data server configured as a mirror for high availability. On a mirrored master, the globals and routines databases of each master namespace must be mirrored.
A shard master app server is never mirrored; mirroring applies instead to the sharded cluster’s shard master data server, which can optionally be mirrored.
mirrored shard server
A shard server configured as a mirror for high availability. A mirrored shard server can host only data shards, and each data shard’s default globals database must be mirrored.
A query shard is never mirrored; mirroring applies instead to the globals database of the data shard it is assigned to, which can optionally be mirrored.

