Table Statistics for Query Optimizer
To ensure maximum performance of InterSystems SQL tables, you can create and employ representative or anticipated data metrics. The optimizations can have a significant effect on any queries run against this table. The following performance optimizing considerations are discussed in this chapter:
ExtentSize, Selectivity, and BlockCount to specify table data estimates before populating the table with data; this metadata is used to optimize future queries.
Running Tune Table to analyze representative table data in a populated table; this generated metadata is used to optimize future queries.
Tune Table Calculated Values include ExtentSize, Selectivity, Outlier Selectivity, Average Field Size, and BlockCount.
ExtentSize, Selectivity, and BlockCount
When the Query Optimizer decides the most efficient way to execute a specific SQL query, three of the things it considers are:
ExtentSize row count for each table used within the query.
Selectivity the percentage of distinct values calculated for each column used by the query.
BlockCount count for each SQL map used by the query.
In order to ensure that the Query Optimizer can make the correct decisions, it is important that these values are set correctly.
You can explicitly set any of these statistics during class (table) definition, prior to populating the table with data.
After populating the table with representative data, you can run Tune Table to calculate these statistics.
After running Tune Table, you can override a calculated statistic by specifying an explicit value.
You can compare your explicitly set statistics to the Tune Table generated results. If the assumptions made by Tune Table result in less-than-optimal results from the Query Optimizer, you can use an explicitly set statistic rather than a Tune Table generated statistic.
In Studio the Class Editor window displays the class source code. At the bottom of the source code it displays the Storage definition, which includes the class ExtentSize, and the Selectivity (and, where appropriate, the OutlierSelectivity) for each property.
The ExtentSize value for a table is simply the number of rows (roughly) stored within the table.
At development time, you can provide an initial ExtentSize value. If you do not specify an ExtentSize, the default is 100,000.
Typically, you provide a rough estimate of what you expect the size of this table will be when populated with data. It is not important to have an exact number. This value is used to compare the relative costs of scanning over different tables; the most important thing is to make sure that the relative values of ExtentSize between associated tables represent an accurate ratio (that is, small tables should have a small value and large tables a large one).
CREATE TABLE provides an %EXTENTSIZE parameter keyword to specify the expected number of rows in the table, as shown in the following example:
CREATE TABLE Sample.DaysInAYear (%EXTENTSIZE 366, MonthName VARCHAR(24),Day INTEGER, Holiday VARCHAR(24),ZodiacSign VARCHAR(24))
A persistent class definition for a table can specify an ExtentSize parameter within the storage definition:
<Storage name="Default"> <Data name="MyClassDefaultData"> ... <ExtentSize>200</ExtentSize> ... </Storage>
In this example, the fragment is the storage definition for the MyClass class, which specifies a value of 200 for ExtentSize.
If your table has real (or realistic) data, you can automatically calculate and set its ExtentSize value using the Tune Table facility within the Management Portal; for details, see the following section on Tune Table.
Within an InterSystems SQL table (class), every column (property) has a Selectivity value associated with it. The Selectivity value for a column is the percentage of rows within a table that would be returned as a result of query searching for a typical value of the column. The presence of outlier values may change how InterSystems SQL interprets Selectivity values.
Selectivity is based on roughly equal quantities of the distinct values. For example, suppose a table contains a Gender column whose values are roughly evenly distributed between “M” and “F”. The Selectivity value for the Gender column would be 50%. The Selectivity value for a more distinguishing property, such as Street Name, is typically a small percentage.
A field in which all the values are the same has a selectivity of 100%. To determine this, the optimizer first tests a small number or records and if these all have the same field value, it will test up to 100,000 randomly selected records to support the assumption that all of the values of a non-indexed field are the same. If other values for a field might not be detected in a test of 100,000 randomly-selected records, you should set the selectivity manually.
A field that is defined as Unique (all values different) has a selectivity of 1 (which should not be confused with a selectivity of 1.0000%). For example, a RowID has a selectivity of 1.
At development time, you can provide this value by defining a Selectivity parameter within the storage definition that is part of the class definition for the table:
<Storage name="Default"> <Data name="MyClassDefaultData"> ... <Property name="Gender"> <Selectivity>50%</Selectivity> </Property> ... </Storage>
Typically you provide an estimate of what you expect the Selectivity will be when used within an application. As with ExtentSize, it is not important to have an exact number. Many of the data type classes provided by InterSystems IRIS will provide reasonable default values for Selectivity.
You can also use the SetFieldSelectivity()Opens in a new tab method to set the Selectivity value for a specific field (property).
If your table has real (or realistic) data, you can automatically calculate and set its Selectivity values using the Tune Table facility within the Management Portal. Tune Table determines if a field has an outlier value, a value that is far more common than any other value. If so, Tune Table calculates a separate Outlier Selectivity percentage, and calculates Selectivity based on the presence of this outlier value. The presence of an outlier value may dramatically change the Selectivity value.
Selectivity is used for query optimization. The same Selectivity value is used for a field specified in a SELECT query and the same field specified in the SELECT clause of a view. Note that a view may have a different distribution of rows than the source table. This can affect the accuracy of view field selectivity.
When you compile a persistent class, the class compiler computes approximate numbers of map blocks used by each SQL map based on the ExtentSize and the property definitions. You can view these BlockCount values in the Map BlockCount tab of the Tune Table facility. The BlockCount is identified in Tune Table as Estimated by class compiler. Note that if you change the ExtentSize, you must close and re-open the SQL Tune Table window to see this change reflected in the BlockCount values.
When you run Tune Table, it measures the actual block count for each SQL map. Unless specified otherwise, the Tune Table measured values replace the class compiler approximate values. These Tune Table measured values are represented in the class definition as negative integers, to distinguish them from specified BlockCount values. This is shown in the following example:
<SQLMap name="IDKEY"> <BlockCount>-4</BlockCount> </SQLMap>
Tune Table measured values are represented in Tune Table as positive integers, identified as Measured by TuneTable.
You can define explicit BlockCount values in the class definition. You can explicitly specify a block count as a positive integer, as shown in the following example:
<SQLMap name="IDKEY"> <BlockCount>12</BlockCount> </SQLMap>
When you define a class you can omit defining the BlockCount for a map, explicitly specify a BlockCount as a positive integer, or explicitly define the BlockCount as NULL.
If you do not specify a BlockCount, or specify a BlockCount of 0, the class compiler estimates the block count. Running Tune Table replaces the class compiler estimated value.
If you specify an explicit positive integer BlockCount, running Tune Table does not replace this explicit BlockCount value. Explicit class definition block count values are represented in Tune Table as positive integers, identified as Defined in class definition. These block count values are not changed by subsequently running Tune Table.
If you specify an explicit BlockCount of NULL, the SQL Map uses the BlockCount value estimated by the class compiler. Because BlockCount is “defined” in the class definition, running Tune Table does not replace this estimated BlockCount value.
The size of all InterSystems SQL map blocks is 2048 bytes (2K bytes).
Tune Table does not measure BlockCount in the following circumstances:
If the table is a child table projected by an array or a list collection. The BlockCount values for these types of child tables are the same as BlockCount for the data map of the parent table.
If a global map is a remote global (a global in a different namespace). The estimated BlockCount used during class compilation is used instead.
Tune Table is a utility that examines the data in a table and returns statistics about the ExtentSize (the number of rows in the table), the relative distribution of distinct values in each field, and the Average Field Size (average length of values in each field). It also generates the BlockCount for each SQL map. You can specify that Tune Table use this information to update the metadata associated with a table and each of its fields. The query optimizer can subsequently use these statistics to determine the most efficient execution plan for a query.
Using Tune Table on an external table will only calculate the ExtentSize. Tune Table cannot calculate field Selectivity values, Average Field Size, or map BlockCount values for an external table.
When to Run Tune Table
You should run Tune Table on each table after that table has been populated with a representative quantity of real data. Commonly, you only need to run Tune Table once, as a final step in application development, before the data goes “live.” Tune Table is not a maintenance utility; it should not be run periodically on live data.
Because efficient query processing depends on Tune Table statistics, InterSystems IRIS runs Tune Table:
Automatically when the first SELECT query is executed against a table that has never been tuned. Automatic Tune Table execution is only performed under very specific circumstances.
Manually when you execute Tune Table using any of the supported user interfaces.
Generally, Tune Table should not be re-run when table data is added, modified, or deleted, unless there is an order-of-magnitude change to the characteristics of the current data, as follows:
Relative Table Sizes: Tune Table assumes that it is analyzing a representative subset of the data. This subset can be only a small percentage of the full data set, if it is a representative subset. Tune Table results remain relevant as the number of rows in a table changes, provided that the ExtentSizes of tables involved in joins or other relationships maintain roughly the same relative sizes. ExtentSize needs to be updated if the ratio between joined tables changes by an order of magnitude. This is important for JOIN statements, because the SQL optimizer uses ExtentSize when optimizing the table join order. As a general rule, a smaller table is joined before a larger table, regardless of the join order specified in the query. Therefore, you would want to re-run Tune Table on one or more tables if the ratio of rows in tableA to tableB changes from 1000:2000 to 10000:2000, but not if it changes to 2100:4000.
Even Value Distribution: Tune Table assumes that every data value is equally likely. If it detects an outlier value, it assumes that every data value other than the outlier value is equally likely. Tune Table establishes Selectivity by analyzing the current data values for each field. Equal likelihood in real data is always a rough approximation; normal variation in the number of distinct data values and their relative distribution should not warrant re-running Tune Table. However, an order-of-magnitude change in the number of possible values for a field (the ratio of distinct values to records), or the overall likelihood of a single field value can result in inaccurate Selectivity. Dramatically changing the percentage of records with a single field value can cause Tune Table to designate an outlier value or to remove outlier value designation, significantly changing the calculated Selectivity. If the Selectivity of a field no longer reflects the actual distribution of data values, you should re-run Tune Table.
A significant InterSystems IRIS upgrade, or a new site installation may warrant re-running Tune Table.
Automatic Tune Table
InterSystems IRIS automatically runs Tune Table the first time a SELECT query is executed against a table, subject to the following circumstances:
The table has never been tuned, either by a prior execution of Tune Table or by setting ExtentSize or Selectivity values. All non-unique fields must have selectivity = "".
The query has never been run. The query has not been cached. Executing the query invokes the optimizer to generate a query plan.
The table is a regular table. It is not a view or a some other non-table data structure.
The table is not sharded.
The table is not a linked table.
The query is not being run as part of the build routine.
Manual Tune Table
There are three manual interfaces for running Tune Table:
Using the Management Portal SQL interface Actions drop-down list, which allows you to run Tune Table on a single table or on multiple tables.
Invoking the $SYSTEM.SQL.Stats.Table.GatherTableStats()Opens in a new tab method for a single table, or all tables in the current namespace.
Issuing the SQL command TUNE TABLE for a single table.
Tune Table purges cached queries that reference the table being tuned. The TUNE TABLE command provides a recompile cached queries option to regenerate the cached queries using the new Tune Table calculated values.
If the table is mapped to a readonly database, Tune Table cannot be performed and an error message is generated.
After running the Tune Table facility, the resulting ExtentSize and Selectivity values are saved in the class’s storage definition. To view the storage definition, in Studio, from the View menu, select View Storage; Studio includes the storage at the bottom of the source code for the class.
There are rare cases where running Tune Table can decrease SQL performance. While Tune Table can be run on live data, it is recommended that you run Tune Table on a test system with real data, rather than on a production system. You can use the optional System Mode configuration parameter to indicate whether the current system is a test system or a live system. When set, the System Mode is displayed at the top of the Management Portal page, and can be returned by the $SYSTEM.Version.SystemMode()Opens in a new tab method.
Tune Table from the Management Portal
To run Tune Table from the Management Portal:
Select System Explorer, then SQL. Select a namespace by clicking the Switch option at the top of the page, then selecting a namespace from the displayed list. (You can set the Management Portal default namespace for each user.)
Select a Schema from the drop-down list on the left side of the screen, or use a Filter. For further details on how to use Schema and Filter, refer to Browsing SQL Schemas in the “Using the Management Portal SQL Interface” of this manual.
Do one of the following:
Tune a Single Table: Expand the Tables category and select a table from the list. Once you have selected a table, click the Actions drop-down list and select Tune Table Information. This displays the table’s current ExtentSize and Selectivity information. If Tune Table has never been run, ExtentSize=100000, no Selectivity, Outlier Selectivity, Outlier Value, or Average Field Size information is shown (other than the RowID having a selectivity of 1), and the Map BlockCount information is listed as Estimated by class compiler.
From the Selectivity tab, select the Tune Table button. This runs Tune Table on the table, calculating the ExtentSize, Selectivity, Outlier Selectivity, Outlier Value, and Average Field Size values based on the data in the table. The Map BlockCount information is listed as Measured by Tune Table.
Tune Table on a single table always runs as a background process and refreshes the table when done. This prevents timeout issues. While this background process is running, an in process message is displayed. The Close button is available to close the Tune Table window while the background process executes.
Tune All Tables in the Schema: click the Actions drop-down list and select Tune All Tables in Schema. This displays the Tune Table box. Select the Finish button to run Tune Table on all tables in the schema. When Tune Table completes this box displays a Done button. Select Done to exit the Tune Table box.
The SQL Tune Table window has two tabs: Selectivity and Map BlockCount. These tabs display the current values generated by Tune Table. They also allow you to manually set different values than the values generated by Tune Table.
The Selectivity tab contains the following fields:
Current Table Extentsize. This field has an edit button that allows you to enter a different Table Extentsize.
Keep class up to date check box. Any changes to statistics generated by Tune Table, or by user input value from the Tune Table interface, or from Tune Table methods are immediately represented in the class definition:
If this box is not checked (No), the up-to-date flag on the modified class definition is not set. This indicates that the class definition is out of date and should be recompiled. This is the default.
If this box is checked (Yes), the class definition remains flagged as up-to-date. This is the preferred option when making changes to statistics on a live system, because it makes it less likely that a table class definition will be recompiled.
Fields table with columns for Field Name, Selectivity, Notes, Outlier Selectivity, Outlier Value, and Average Field Size. By clicking on a Fields table heading, you can sort by that column’s values. By clicking on a Fields table row, you can manually set values for Selectivity, Outlier Selectivity, Outlier Value, and Average Field Size for that field.
The Map BlockCount tab contains the following fields:
Map Name table with columns for SQL Map Name, BlockCount, and Source of BlockCount. The SQL Map Name for an index is the SQL index name; this may differ from the persistent class index property name.
By clicking on an individual map name, you can manually set a BlockCount value for that map name.
From the Selectivity tab, you can click the Tune Table button to run Tune Table on this table.
Tune Table using a Method
You can use the $SYSTEM.SQL.Stats.Table.GatherTableStats()Opens in a new tab method to run the Tune Table facility in the current namespace.
GatherTableStats(“Sample.MyTable”) to run Tune Table on a single table.
GatherSchemaStats(“Sample”) to run Tune Table on all tables in the specified schema.
GatherTableStats(“*”) to run Tune Table on all tables in the current namespace.
When using the GatherTableStats() method, the following error messages may be generated:
Non-existent table: DO $SYSTEM.SQL.Stats.Table.GatherTableStats("NoSuchTable")
No such table 'SQLUser.NoSuchTable'
View: DO $SYSTEM.SQL.Stats.Table.GatherTableStats("Sample.MyView")
'Sample.MyView' is a view, not a table. No tuning will be performed.
When running GatherTableStats(”*”) or GatherSchemaStats(”SchemaName”), the system will use multiple processes to tune multiple tables in parallel, if the system supports parallel processing.
Running Tune Table on a Sharded Table
If Tune Table is run on a sharded table, the Tune Table operation is forwarded to each of the shards, where it runs against that shard of the table. Tune Table does not execute in the master namespace from which it was invoked. If Tune Table is run on a non-sharded table which has had its class definition exported to the shards, because that table is joined to a sharded table, the Tune Table operation is forwarded to each of the shards, and it is also executed in the master namespace.
The following guidelines should be followed when running Tune Table on a sharded table:
Tune the shard-master table, not the shard-local table.
EXTENTSIZE and BLOCKCOUNT values are per-shard values, not a total value for all shards.
If using $SYSTEM.SQL.Stats.Table.Export() and $SYSTEM.SQL.Stats.Table.Import(), export/import the table statics for the shard-master table, not the shard-local table.
When gather statistics for a sharded table, the RecompileCachedQueries argument is ignored, and cached queries for the table are always purged.
Gathering statistics for a sharded table will define table statistics in both the shard-master and shard-local class/table definition. If manually editing tune table metadata in the class definitions, the suggested procedure is to modify the definition of the shard-master class, then recompile the shard-master class. When the shard-master class is compiled, the shard-master table statistics will be copied to the shard-local version of the class.
If GatherTableStats() or GatherSchemaStats() specifies a logFile parameter, the log file in the shard master instance has an entry for the specified table, such as the following:
Sharded table: TABLE: <tablename> Invoking TuneTable on shards for sharded table <tablename>
Non-sharded table: TABLE: <tablename> Invoking TuneTable on shards for mapped non-sharded table <tablename>
On each of the shard instances, a log file of the same name is created in the mgr/<shard-namespace> directory, logging the Tune Table information for this table on this shard. If a directory path was specified for the log file, it is ignored on the shards, and the file is always stored in mgr/<shard- namespace>.
Tune Table Calculated Values
The Tune Table operation calculates and sets table statistics based on the representative data in the table:
ExtentSize, which may or may not be the actual number of rows in the table (Row Count).
Selectivity for each property (field) in the table. You can optionally prevent selectivity calculation for individual properties.
Outlier Selectivity for a property in which one value appears much more commonly than other values. An efficient query can make use of outlier optimization or use RTPC to choose a plan based on the query parameters as runtime.
Notes for each property that identify certain property characteristics.
Average Field Size for each property.
SQL Map Name, BlockCount, and Source of BlockCount for the table.
Extent Size and the Row Count
When running the Tune Table facility from the Management Portal, the ExtentSize is the actual count of the rows currently in the table. By default, the GatherTableStats() method also uses the actual row count as the ExtentSize. When a table contains a large number of rows, it may be preferable to perform analysis on a smaller number of rows. You can use the SQL TUNE TABLE command and specify %SAMPLE_PERCENT to perform analysis on only a percentage of the total rows. You can use this option to improve performance when running against a table with a large number of rows. This %SAMPLE_PERCENT value should be large enough to sample representative data. If ExtentSize < 1000, TUNE TABLE analyzes all rows, regardless of the %SAMPLE_PERCENT value.
A specified ExtentSize can be smaller or larger than the actual number of rows. However, ExtentSize should not significantly exceed the actual number of rows in the current table data. When you specify an ExtentSize, Tune Table extrapolates row Ids for that number of rows, then performs sampling. If the ExtentSize greatly exceeds the actual number of rows, most of the sampled row Ids will not correspond to actual row data. If this is the case, field selectivities cannot be calculated; instead, Tune Table lists the ExtentSize you specified as the CALCULATED ExtentSize and a smaller number as the SAMPLESIZE; Tune Table returns <Not Specified> for these non-existent Calculated values.
You can set an ExtentSize of 0. This may be desirable when you have a table that is never intended to be populated with data, but used for other purposes such as query joins. When you set ExtentSize to 0, InterSystems IRIS sets the Selectivity of each field as 100%, and the Average Field Size of each field as 0.
Selectivity and Outlier Selectivity
Tune Table calculates a Selectivity for each property (field) value as a percentage. It does this by sampling the data, so selectivity is always an estimate, not an exact value. Selectivity is based on the assumption that all property values are, or could be, equally likely. This is a reasonable assumption for most data. For example, in a general population table most data values are typical: any given date of birth will appear in around .27% of the data (1 in 365); roughly half will be female and half male (50%). A field that is defined as Unique has a selectivity of 1 (which should not be confused with a selectivity of 1.0000 (1%). A selectivity percentage is sufficient for most properties.
For a few properties, Tune Table also calculates an Outlier Selectivity. This is a percentage for a single property value that appears much more frequently in the sample than the other data values. Tune Table only returns an outlier selectivity when there is a substantial difference between the frequency of one data value and the frequency of the other data values. Tune Table returns, at most, one outlier for a table, regardless of the distribution of data values. If an outlier is selected, Tune Table displays this value as the Outlier Value. NULL is represented as <Null>. If Tune Table returns an outlier selectivity, the normal selectivity is still the percentage of each non-outlier data value within the whole set of rows.
If the Tune Table initial sample returns only a single value, but additional sampling returns more than a single distinct value, the normal selectivity is modified by these sampling results. For example, an initial random sample of 990 values detects only one value, but subsequent sampling detects 10 single instances of other distinct values. In this situation, the initial outlier value influences the selectivity value, which is now set to 1/1000 (0.1%) since each of the 10 non-outlier values occurs only once in the 1000 records.
The most common example of outlier selectivity is a property that permits NULLs. If the number of records with NULL for a property greatly exceeds the number of records that have any specific data value for that property, NULL is the outlier. The following is the Selectivity and Outlier Selectivity for the FavoriteColors field:
SELECTIVITY of FIELD FavoriteColors CURRENT = 1.8966% CALCULATED = 1.4405% CURRENT OUTLIER = 45.0000%, VALUE = <Null> CALCULATED OUTLIER = 39.5000%, VALUE = <Null>
If a field only contains one distinct value (all rows have the same value), that field has a Selectivity of 100%. A value that has a selectivity of 100% is not considered to be an outlier. Tune Table establishes Selectivity and Outlier Selectivity values by sampling the data. To determine this, the Tune Table first tests a small number or records and if these all have the same field value, it will test up to 100,000 randomly selected records to support the assumption that all of the values of a non-indexed field are the same. Tune Table can only fully determine if all the values of a field are the same if the field is indexed, the field is the first field of the index, and the field and the index have the same collation type.
If a non-indexed field is known to have other values that might not be detected in a test of 100,000 randomly-selected records, you should set the selectivity and outlier selectivity manually.
If a non-indexed field is known to have no other values, you can manually specify a Selectivity of 100%, delete any outlier selectivity, and set CALCSELECTIVITY=0 to prevent Tune Table attempting to calculate selectivity or specify this value as an outlier.
To modify these Selectivity, Outlier Selectivity, and Outlier Value calculated values, select an individual field from the Tune Table display. This displays these values for that field in the Details area to the right of the display. You can modify Selectivity, Outlier Selectivity, and/or Outlier Value to values that better fit the anticipated full data set.
You can specify Selectivity either as a percentage of rows with a percent (%) sign, or as an integer number of rows (no percent sign). If specified as an integer number of rows, InterSystems IRIS uses the extent size to calculate the Selectivity percentage.
You can specify an Outlier Selectivity and Outlier Value for a field that previously had no outlier. Specify Outlier Selectivity as a percentage with a percent (%) sign. If you specify just the Outlier Selectivity, Tune Table assumes the Outlier Value is <Null>. If you specify just the Outlier Value, Tune Table will not save this value unless you also specify an Outlier Selectivity.
CALCSELECTIVITY Parameter and Not Calculating Selectivity
Under certain circumstances, you may not want the Tune Table facility to calculate the Selectivity for a property. To prevent Selectivity from being calculated, specify the value of the property’s CALCSELECTIVITY parameter to 0 (the default is 1). In Studio, you can set CALCSELECTIVITY on the Property Parameters page of the New Property Wizard or in the list of a property’s parameters in the Inspector (you may need to contract and re-expand the property parameter list to display it).
One circumstance where you should specify CALCSELECTIVITY=0 is a field that is known to contain only one value in all rows (Selectivity=100%), if that field is not indexed.
The Notes Column
The Management Portal Tune Table Information option displays a Notes column for each field. The values in this field are system-defined and non-modifiable. They include the following:
RowID field: A table has one RowID, which is defined by the system. Its name is commonly ID, but it can have a different system-assigned name. Because all of its values are (by definition) unique, its Selectivity is always 1. If the class definition includes SqlRowIdPrivate, the Notes column value is RowID field, Hidden field.
Hidden field: A hidden field is defined as private, and is not displayed by SELECT *. By default, CREATE TABLE defines the RowID field as hidden; you can specify the %PUBLICROWID keyword to make the RowID not hidden and public. By default, tables defined by a persistent class definition define the RowID as not hidden; you can specify SqlRowIdPrivate to define the RowID as hidden and private. Container fields are defined as hidden.
Stream field: Indicates a field defined with a stream data type, either character stream (CLOB) or binary stream (BLOB). A stream file has no Average Field Size.
Parent reference field: A field that references a parent table.
An IDENTITY field, ROWVERSION field, SERIAL field, or UNIQUEIDENTIFIER (GUID) field is not identified in the Notes column.
Average Field Size
Running Tune Table calculates the average field size (in characters) for all non-Stream fields, based on the current table data set. This is (unless otherwise noted) the same as AVG($LENGTH(field)), rounded to two decimal places. You can change this average field size for individual fields to reflect the anticipated average size of the field’s data.
NULL: Because the $LENGTH function treats NULL fields as having a length of 0, NULL fields are averaged in, with a length 0. This may result in an Average Field Size of less than one character.
Empty column: If a column contains no data (no field values for all of the rows), the average field size value is 1, not 0. The AVG($LENGTH(field)) is 0 for a column that contains no data.
ExtentSize=0: When you set ExtentSize to 0, Average Field Size for all fields is reset to 0.
Logical field values: Average Field Size is always calculated based on the field’s Logical (internal) value.
List fields: InterSystems IRIS List fields are calculated based on their Logical (internal) encoded value. This encoded length is longer than the total length of the elements in the list.
Container fields: A container field for a collection is larger than the total length of its collection objects. For example, in Sample.Person the Home container field Average Field Size is larger than the total of the average field sizes of Home_Street, Home_City, Home_State, and Home_Zip. For further details, refer to “Controlling the SQL Projection of Collection Properties” in Using InterSystems IRIS Objects.
Stream fields: A stream field does not have an average field size.
If the property parameter CALCSELECTIVITY is set to 0 for a property/field, Tune Table does not calculate the Average Field Size for that property/field.
You can modify an Average Field Size calculated value by selecting an individual field from the Tune Table display. This displays the values for that field in the Details area to the right of the display. You can modify the Average Field Size to a value that better fits the anticipated full data set. Because Tune Table performs no validation when you set this value, you should make sure that the field is not a Stream field, and that the value you specify is not larger than the maximum field size (MaxLen).
The Average Field Size is also displayed in the Management Portal Catalog Details tab Fields option table. Tune Table must have been run for the Fields option table to display Average Field Size values. For further details, refer to the Catalog Details Tab section in the “Using the Management Portal SQL Interface” chapter of this guide.
Map BlockCount Tab
The Tune Table Map BlockCount tab displays the SQL Map Name, BlockCount (as a positive integer), and Source of BlockCount. The Source of BlockCount can be Defined in class definition, Estimated by class compiler, or Measured by TuneTable. Running Tune Table changes Estimated by class compiler to Measured by TuneTable; it does not affect Defined in class definition values.
You can modify a BlockCount calculated value by selecting an individual SQL Map Name from the Tune Table display. This displays the block count for that Map Name in the Details area to the right of the display. You can modify the BlockCount to a value that better fits the anticipated full data set. Because Tune Table performs no validation when you set this value, you should make sure that the block count is a valid value. Modifying BlockCount changes the Source of BlockCount to Defined in class definition. For further details, refer to the BlockCount section in this chapter.
Exporting and Re-importing Tune Table Statistics
You can export Tune Table statistics from a table or group of tables and then import these Tune Table statistics into a table or group of tables. The following are three circumstances in which you might want to perform this export/import. (For simplicity, these describe the export/import of statistics from a single table; in actual use, export/import of statistics from multiple inter-related tables is often performed):
To model a production system: A production table is fully populated with actual data and optimized using Tune Table. In a test environment you create a table with the same table definition but far less data. By exporting the Tune Table statistics from the production table and importing them into the test table, you can model the production table optimization on the test table.
To replicate a production system: A production table is fully populated with actual data and optimized using Tune Table. A second production table with the same table definition is created. (For example, a production environment and its backup environment, or a multiple identical table definitions with each table containing the patient records for a different hospital.) By exporting the Tune Table statistics from the first table and importing them into the second table, you can give the second table the same optimization as the first table without the overhead of running Tune Table a second time or waiting for the second table to be populated with representative data.
To revert to a prior set of statistics: You can create optimization statistics for a table by running Tune Table or by explicitly setting statistics. By exporting these statistics you can preserve them while experimenting with other statistics settings. Once you have determined the optimal set of statistics, you can import them back into the table.
You can export Tune Table statistics to an XML file using the $SYSTEM.SQL.Stats.Table.Export()Opens in a new tab method. This method can export the Tune Table statistics for one, more than one, or all tables within a namespace, as shown in the following examples:
DO $SYSTEM.SQL.Stats.Table.Export("C:\AllStats.xml") /* Exports TuneTable Statistics for all schemas/tables in the current namespace */
DO $SYSTEM.SQL.Stats.Table.Export("C:\SampleStats.xml","Sample") /* Exports TuneTable Statistics for all tables in the Sample schema */
DO $SYSTEM.SQL.Stats.Table.Export("C:\SamplePStats.xml","Sample","P*") /* Exports TuneTable Statistics for all tables beginning with the letter "P" in the Sample schema */
DO $SYSTEM.SQL.Stats.Table.Export("C:\SamplePersonStats.xml","Sample","Person") /* Exports TuneTable Statistics for the Sample.Person table */
You can re-import Tune Table statistics that were exported using $SYSTEM.SQL.Stats.Table.Export() by using the $SYSTEM.SQL.Stats.Table.Import()Opens in a new tab method.
$SYSTEM.SQL.Stats.Table.Import() has a KeepClassUpToDate boolean option. If TRUE (and update is TRUE), $SYSTEM.SQL.Stats.Table.Import() will update the class definition with the new EXTENTSIZE and SELECTIVITY values, but the class definition will be kept as up-to-date. In many cases, however, it is desirable to recompile the class after its table has been tuned so that queries in the class definition can be recompiled and the SQL query optimizer can use the updated data statistics. The default is FALSE (0). Note that if the class is deployed the class definition will not be updated.
$SYSTEM.SQL.Stats.Table.Import() has a ClearCurrentStats boolean option. If TRUE, $SYSTEM.SQL.Stats.Table.Import() will clear any prior EXTENTSIZE, SELECTIVITY, BLOCKCOUNT and other Tune Table statistics from the existing table before importing the stats. This can be used if you want to completely clear those table stats that are not specified in the import file, instead of leaving them defined in the persistent class for the table. The default is FALSE (0).
If $SYSTEM.SQL.Stats.Table.Import() does not find the corresponding table, it skips that table and proceeds to the next table specified in the import file. If a table is found, but some of the fields are not found, those fields will simply be skipped.
The BlockCount for a map in a class storage definition cannot be inherited. The BlockCount can only appear in the storage definition of the class where the map originated. $SYSTEM.SQL.Stats.Table.Import() only sets the projected table's BlockCount metadata and not the class storage BlockCount metadata if the map originated in a super class.