Configure Parallel Query Processing
Parallel query hinting directs the system to perform parallel query processing when running on a multi-processor system. This can substantially improve performance of certain types of queries. The SQL optimizer determines whether a specific query could benefit from parallel processing, and performs parallel processing where appropriate. Specifying a parallel query hint does not force parallel processing of every query; it is only applied to those queries that may benefit from parallel processing. If the system is not a multi-processor system, this option has no effect. To determine the number of processors on the current system use the %SYSTEM.Util.NumberOfCPUs()Opens in a new tab method.
Adaptive Mode controls parallel query processing by default. If it has been turned off, you can specify parallel query processing in two ways:
System-wide, by setting the auto parallel option.
Per query, by specifying the %PARALLEL keyword in the FROM clause of an individual query.
Parallel query processing is applied to SELECT queries. It is not applied to INSERT, UPDATE, or DELETE operations.
Avoid parallel processing for queries that involve process-specific functions such as $job and $tlevel. Also avoid them in queries of process-specific variables such as %ROWID.
System-Wide Parallel Query Processing
When Adaptive Mode is off, you can still turn on system-wide parallel query processing separately by using either of the following options:
From the Management Portal choose System Administration, then Configuration, then SQL and Object Settings, then SQL. View or change the Execute queries in a single process check box. Note that the default for this check box is unselected, which means that parallel processing is activated by default.
Invoke the $SYSTEM.SQL.Util.SetOption()Opens in a new tab method, as follows: SET status=$SYSTEM.SQL.Util.SetOption("AutoParallel",1,.oldval). The default is 1 (automatic parallel processing activated). To determine the current setting, call $SYSTEM.SQL.CurrentSettings()Opens in a new tab which displays the Enable auto hinting for %PARALLEL option.
Note that changing this configuration setting purges all cached queries in all namespaces.
It is possible that server-side SQL queries that include literal constants that have a conversion between a Logical format and a Display/ODBC format when compiled in %Runtime mode and are not executed in ODBC mode may return incorrect results when run in parallel. To prevent this, you should add %NOPARALLEL to the affected query.
Parallel Query Processing for a Specific Query
The optional %PARALLEL keyword is specified in the FROM clause of a query. It suggests that InterSystems IRIS perform parallel processing of the query, using multiple processors (if applicable). This can significantly improve performance of some queries that uses one or more COUNT, SUM, AVG, MAX, or MIN aggregate functions, and/or a GROUP BY clause, as well as many other types of queries. These are commonly queries that process a large quantity of data and return a small result set. For example, SELECT AVG(SaleAmt) FROM %PARALLEL User.AllSales GROUP BY Region would likely use parallel processing.
A “one row” query that specifies only aggregate functions, expressions, and subqueries performs parallel processing, with or without a GROUP BY clause. However, a “multi-row” query that specifies both individual fields and one or more aggregate functions does not perform parallel processing unless it includes a GROUP BY clause. For example, SELECT Name,AVG(Age) FROM %PARALLEL Sample.Person does not perform parallel processing, but SELECT Name,AVG(Age) FROM %PARALLEL Sample.Person GROUP BY Home_State does perform parallel processing.
If a query that specifies %PARALLEL is compiled in Runtime mode, all constants are interpreted as being in ODBC format.
Specifying %PARALLEL may degrade performance for some queries. Running a query with %PARALLEL on a system with multiple concurrent users may result in degraded overall performance.
Parallel processing can be performed when querying a view. However, parallel processing is never performed on a query that specifies a %VID, even if the %PARALLEL keyword is explicitly specified.
For further details, refer to the FROM clause in the InterSystems SQL Reference.
%PARALLEL in Subqueries
%PARALLEL is intended for SELECT queries and their subqueries. An INSERT command subquery cannot use %PARALLEL.
%PARALLEL is ignored when applied to a subquery that is correlated with an enclosing query. For example:
SELECT name,age FROM Sample.Person AS p WHERE 30<(SELECT AVG(age) FROM %PARALLEL Sample.Employee WHERE Name = p.Name)
%PARALLEL is ignored when applied to a subquery that includes a complex predicate, or a predicate that optimizes to a complex predicate. Predicates that are considered complex include the FOR SOME and FOR SOME %ELEMENT predicates.
Parallel Query Processing Ignored
Regardless of the auto parallel option setting or the presence of the %PARALLEL keyword in the FROM clause, some queries may still use linear processing, not parallel processing. InterSystems IRIS makes the decision whether or not to use parallel processing for a query after applying other query optimization options (if specified). InterSystems IRIS may determine that the optimized form of the query is not suitable for parallel processing, even if the user-specified form of the query would appear to benefit from parallel processing. You can determine if and how InterSystems IRIS has partitioned a query for parallel processing using Show Plan.
In the following circumstances specifying %PARALLEL does not perform parallel processing. The query executes successfully and no error is issued, but parallelization is not performed:
The query contains the FOR SOME predicate.
The query contains both a TOP clause and an ORDER BY clause. This combination of clauses optimizes for fastest time-to-first-row which does not use parallel processing. Adding the FROM clause %NOTOPOPT optimize-option keyword optimizes for fastest retrieval of the complete result set. If the query does not contain an aggregate function, this combination of %PARALLEL and %NOTOPOPT performs parallel processing of the query.
A query containing a LEFT OUTER JOIN or INNER JOIN in which the ON clause is not an equality condition. For example, FROM %PARALLEL Sample.Person p LEFT OUTER JOIN Sample.Employee e ON p.dob > e.dob. This occurs because SQL optimization transforms this type of join to a FULL OUTER JOIN. %PARALLEL is ignored for a FULL OUTER JOIN.
The %PARALLEL and %INORDER optimizations cannot be used together; if both are specified, %PARALLEL is ignored.
The query references a view and returns a view ID (%VID).
COUNT(*) does not use parallel processing if the table has a BITMAPEXTENT index.
%PARALLEL is intended for tables using standard data storage definitions. Its use with customized storage formats may not be supported. %PARALLEL is not supported for GLOBAL TEMPORARY tables or tables with extended global reference storage.
%PARALLEL is intended for a query that can access all rows of a table, a table defined with row-level security (ROWLEVELSECURITYOpens in a new tab) cannot perform parallel processing.
%PARALLEL is intended for use with data stored in the local database. It does not support global nodes mapped to a remote database.
Shared Memory Considerations
For parallel processing, InterSystems IRIS supports multiple InterProcess Queues (IPQ). Each IPQ handles a single parallel query. It allows parallel work unit subprocesses to send rows of data back to the main process so the main process does not have to wait for a work unit to complete. This enables parallel queries to return their first row of data as quickly as possible, without waiting for the entire query to complete. It also improves performance of aggregate functions.
Parallel query execution uses shared memory from the generic memory heap (gmheap). Users may need to increase gmheap size if they are using parallel SQL query execution. As a general rule, the memory requirement for each IPQ is 4 x 64k = 256k. InterSystems IRIS splits a parallel SQL query into the number of available CPU cores. Therefore, users need to allocate this much extra gmheap:
<Number of concurrent parallel SQL requests> x <Number cores> x 256 = <required size increase (in kilobytes) of gmheap>
Note that this formula is not 100% accurate, because a parallel query can spawn sub queries which are also parallel. Therefore it is prudent to allocate more gmheap than is specified by this formula.
Failing to allocate adequate gmheap results in errors reported to messages.log. SQL queries may fail. Other errors may also occur as other subsystems try to allocate gmheap.
To review gmheap usage by an instance, including IPQ usage in particular, from the home page of the Management Portal choose System Operation then System Usage, and click the Shared Memory Heap Usage link; see Generic (Shared) Memory Heap Usage in the “Monitoring InterSystems IRIS Using the Management Portal” chapter of the Monitoring Guide for more information.
To change the size of the gmheap (sometimes also known as the shared memory heap or SMH), from the home page of the Management Portal choose System Administration then Configuration then Additional Settings then Advanced Memory; see Memory and Startup Settings in the “Configuring InterSystems IRIS” chapter in System Administration Guide for more information.
Cached Query Considerations
If you are running a cached SQL query which uses %PARALLEL and while this query is being initialized you do something that purges cached queries, then this query could get a <NOROUTINE> error reported from one of the worker jobs. Typical actions that cause cached queries to be purged are calling $SYSTEM.SQL.Purge() or recompiling a class which this query references. Recompiling a class automatically purges any cached queries relating to that class.
If this error occurs, running the query again will probably execute successfully. Removing %PARALLEL from the query will avoid any chance of getting this error.
SQL Statements and Plan State
An SQL query which uses %PARALLEL can result in multiple SQL Statements. The Plan State for these SQL Statements is Unfrozen/Parallel. A query with a plan state of Unfrozen/Parallel cannot be frozen by user action. Refer to the “SQL Statements” chapter for further details.