Configure Parallel Query Processing
Parallel query hinting directs the system to perform parallel query processing when running on a multi-processor system. This can substantially improve performance of certain types of queries. The SQL optimizer determines whether a specific query could benefit from parallel processing, and performs parallel processing where appropriate. Specifying a parallel query hint does not force parallel processing of every query; it is only applied to those queries that may benefit from parallel processing. If the system is not a multi-processor system, this option has no effect. To determine the number of processors on the current system use the %SYSTEM.Util.NumberOfCPUs()Opens in a new tab method.
Adaptive Mode controls parallel query processing by default. If it has been turned off, you can specify parallel query processing in two ways:
-
System-wide, by setting the auto parallel option.
-
Per query, by specifying the %PARALLEL keyword in the FROM clause of an individual query.
Parallel query processing is applied to SELECT queries. It is not applied to INSERT, UPDATE, or DELETE operations.
Avoid parallel processing for queries that involve process-specific functions such as $job and $tlevel. Also avoid them in queries of process-specific variables such as %ROWID.
System-Wide Parallel Query Processing
When Adaptive Mode is off, you can still turn on system-wide parallel query processing separately by using either of the following options:
-
From the Management Portal choose System Administration, then Configuration, then SQL and Object Settings, then SQL. View or change the Execute queries in a single process check box. Note that the default for this check box is unselected, which means that parallel processing is activated by default.
-
Invoke the $SYSTEM.SQL.Util.SetOption()Opens in a new tab method, as follows: SET status=$SYSTEM.SQL.Util.SetOption("AutoParallel",1,.oldval). The default is 1 (automatic parallel processing activated). To determine the current setting, call $SYSTEM.SQL.CurrentSettings()Opens in a new tab which displays the Enable auto hinting for %PARALLEL option.
Note that changing this configuration setting purges all cached queries in all namespaces.
For more information about what system-wide parallel query processing entails, see Auto-Parallel in Using Adaptive Mode to Improve Performance.
Parallel Query Processing for a Specific Query
The optional %PARALLEL keyword is specified in the FROM clause of a query. It suggests that InterSystems IRIS perform parallel processing of the query, using multiple processors (if applicable). This can significantly improve performance of some queries that uses one or more COUNT, SUM, AVG, MAX, or MIN aggregate functions, and/or a GROUP BY clause, as well as many other types of queries. These are commonly queries that process a large quantity of data and return a small result set. For example, SELECT AVG(SaleAmt) FROM %PARALLEL User.AllSales GROUP BY Region would likely use parallel processing.
A “one row” query that specifies only aggregate functions, expressions, and subqueries performs parallel processing, with or without a GROUP BY clause. However, a “multi-row” query that specifies both individual fields and one or more aggregate functions does not perform parallel processing unless it includes a GROUP BY clause. For example, SELECT Name,AVG(Age) FROM %PARALLEL Sample.Person does not perform parallel processing, but SELECT Name,AVG(Age) FROM %PARALLEL Sample.Person GROUP BY Home_State does perform parallel processing.
If a query that specifies %PARALLEL is compiled in Runtime mode, all constants are interpreted as being in ODBC format.
Specifying %PARALLEL may degrade performance for some queries. Running a query with %PARALLEL on a system with multiple concurrent users may result in degraded overall performance.
Parallel processing can be performed when querying a view. However, parallel processing is never performed on a query that specifies a %VID, even if the %PARALLEL keyword is explicitly specified.
For further details, refer to the FROM clause.
%PARALLEL in Subqueries
%PARALLEL is intended for SELECT queries and their subqueries. An INSERT command subquery cannot use %PARALLEL.
%PARALLEL is ignored when applied to a subquery that is correlated with an enclosing query. For example:
SELECT name,age FROM Sample.Person AS p
WHERE 30<(SELECT AVG(age) FROM %PARALLEL Sample.Employee WHERE Name = p.Name)
%PARALLEL is ignored when applied to a subquery that includes a complex predicate, or a predicate that optimizes to a complex predicate. Predicates that are considered complex include the FOR SOME and FOR SOME %ELEMENT predicates.
Parallel Query Processing Ignored
Regardless of the auto parallel option setting or the presence of the %PARALLEL keyword in the FROM clause, some queries may still use linear processing, not parallel processing. InterSystems IRIS makes the decision whether or not to use parallel processing for a query after applying other query optimization options (if specified). InterSystems IRIS may determine that the optimized form of the query is not suitable for parallel processing, even if the user-specified form of the query would appear to benefit from parallel processing. You can determine if and how InterSystems IRIS has partitioned a query for parallel processing using Show Plan.
In the following circumstances specifying %PARALLEL does not perform parallel processing. The query executes successfully and no error is issued, but parallelization is not performed:
-
The query contains both a TOP clause and an ORDER BY clause. This combination of clauses optimizes for fastest time-to-first-row which does not use parallel processing. Adding the FROM clause %NOTOPOPT optimize-option keyword optimizes for fastest retrieval of the complete result set. If the query does not contain an aggregate function, this combination of %PARALLEL and %NOTOPOPT performs parallel processing of the query.
-
The query references a view and returns a view ID (%VID).
-
%PARALLEL is intended for tables using standard data storage definitions. Its use with customized storage formats may not be supported. %PARALLEL is not supported for GLOBAL TEMPORARY tables or tables with extended global reference storage.
-
%PARALLEL is intended for a query that can access all rows of a table, a table defined with row-level security (ROWLEVELSECURITYOpens in a new tab) cannot perform parallel processing.
-
%PARALLEL is intended for use with data stored in the local database. It does not support global nodes mapped to a remote database.
-
%PARALLEL is ignored on queries when the process-level NLS collation does not exactly match the NLS collation of all globals involved in query processing. When parallel processing is ignored in this way, Show Plan includes the following warning:
WARNING: IRISTEMP globals and at lease one global in table %Dictionary.PropertyDefinition have NLS collations that are different from the NLS collation of this process. This may result in wrong results.
Shared Memory Considerations
For parallel processing, InterSystems IRIS supports multiple InterProcess Queues (IPQ). Each IPQ handles a single parallel query. It allows parallel work unit subprocesses to send rows of data back to the main process so the main process does not have to wait for a work unit to complete. This enables parallel queries to return their first row of data as quickly as possible, without waiting for the entire query to complete. It also improves performance of aggregate functions.
Parallel query execution uses shared memory from the generic memory heap (gmheap). Users may need to increase gmheap size if they are using parallel SQL query execution.
Failing to allocate adequate gmheap results in errors reported to messages.log. SQL queries may fail. Other errors may also occur as other subsystems try to allocate gmheap. For example:
ERROR #7828: Unable to allocate IPQ due to lack of shared memory heap, current IPQ used '1587281920' and shared memory heap free '50734886015' configured shared memory heap size '52428800000:1'.
To review gmheap usage by an instance, including IPQ usage in particular, from the home page of the Management Portal choose System Operation then System Usage, and click the Shared Memory Heap Usage link; see Generic (Shared) Memory Heap Usage for more information.
SQL Statements and Plan State
An SQL query which uses %PARALLEL can result in multiple SQL Statements. The Plan State for these SQL Statements is Unfrozen/Parallel. A query with a plan state of Unfrozen/Parallel cannot be frozen by user action. Refer to SQL Statements for further details.