Compiling and Building Cubes
This page describes how to compile and build Business Intelligence cubes.
During the build process, users cannot execute queries. (However, if a query is currently running, you can build the cube.)
When to Recompile and Rebuild
Upon upgrade from a previous version of InterSystems IRIS, it is best practice to recompile all cube and subject area classes, to take advantage of any new optimizations.
If you make any change to a cube class or a subject area class, you must recompile that class before those changes take effect. For many changes to a cube, you must also rebuild the cube before those changes take effect.
The following table lists the required actions after changes:
Element Type | Type of Change | Required Actions |
---|---|---|
Cube (root element) | Edits to Name or Source class | Recompile and rebuild |
Filter Value | Other changes that apply to the cube but not to specific elements in the cube | Recompile |
Measure | Edits to the following options of an existing measure (many other elements have some or all of these common options).
|
Recompile |
Deleting measures | Recompile | |
All other changes, including adding measures | Recompile and perform a selective build for the changed measure(s), or recompile and perform a full cube rebuild | |
Dimension (not a computed dimension) | Edits to the following options of an existing dimension:
|
Recompile |
Deleting dimensions† | Recompile | |
All other changes, including adding dimensions | Recompile and perform a selective build for the levels that make up the changed dimension(s), or recompile and perform a full cube rebuild | |
Computed dimension | All changes | Recompile |
NLP dimension | All changes | Recompile |
Hierarchy | Edits to the common options of an existing hierarchy (as listed in Measure) | Recompile |
Hierarchy | All other changes, including adding and deleting hierarchies | Recompile and rebuild |
Level | Edits to the following options of an existing level:
|
Recompile |
Deleting levels† | Recompile | |
All other changes, including adding levels* | Recompile and perform a selective build for the changed level(s), or recompile and perform a full cube rebuild | |
Property | Edits to the following options of an existing property:
|
Recompile |
Property | All other changes, including adding and deleting properties | Recompile and rebuild |
Listing | All changes | Recompile |
Calculated member | All changes | Recompile |
Named set | All changes | Recompile |
Subject area | All changes | Recompile |
Compound cube (a kind of subject area) | All changes | Recompile (after recompiling all cubes used in the compound cube) |
Quality measure | All changes | Recompile the quality measure class |
KPI or plug-in | All changes | Recompile the KPI or plug-in class |
*The current server locale determines the names of members of a time dimension. (See Using the Locale to Control the Names of Time Members.) If you change the locale, it is necessary to recompile and rebuild the cube.
†When you delete a dimension or a level and recompile, that does not delete the associated level tables and indexes. Rebuilding the cube also does not delete the no-longer-needed level tables and indexes.
Compiling a Cube
To compile a cube class in the Architect:
-
Click Compile.
The system starts to compile the class and displays a dialog box that shows progress.
If you have made changes that you have not yet saved, the system saves them before compiling the cube.
-
Click OK.
Or open the cube class in an IDE and compile it in the same way that you compile other classes.
When you compile a cube class, the system automatically generates the fact table and all related classes if needed. If the fact table already exists, the system regenerates it only if it is necessary to make a structural change.
If there are any cached results for this cube, the system purges them.
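As a rough sketch of the non-Architect route, a cube class can also be compiled from the Terminal with $SYSTEM.OBJ.Compile(). The class name HoleFoods.Cube is an assumption taken from the HoleFoods sample; substitute the name of your own cube class.

```objectscript
    // Assumption: HoleFoods.Cube is the cube class to compile; replace with your own class name.
    // The "c" flag compiles the class, and "k" keeps the generated source code.
    set sc = $SYSTEM.OBJ.Compile("HoleFoods.Cube", "ck")
    // Display any compile errors rather than ignoring the returned status
    if $SYSTEM.Status.IsError(sc) { do $SYSTEM.Status.DisplayError(sc) }
```

Checking the returned status matters here because a cube that fails to compile cannot be built.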
Building a Cube
The phrase building a cube refers to two tasks: adding data to the fact table and other tables and building the indexes used to access this data.
To perform a full cube build in the Architect:
-
Click Build.
The system displays a dialog box that summarizes the build procedure it will perform. If any related cubes depend on the cube you are building, Business Intelligence builds those dependent cubes as well and determines the appropriate build order automatically.
Note that the Build option may be greyed out. In this case, you must compile the cube before performing a build.
-
Optionally specify a value for Maximum Number of Records to Build.
By default, the system iterates through all records in the source table and builds the same number of records to the fact table. You can override this behavior when you build the cube. If you specify the Maximum Number of Records to Build option, the system iterates through only that number of records. The result is a smaller fact table that the system builds more quickly.
If the Maximum Number of Records to Build field is initialized with a number, that means that the cube class overrides the default behavior. (For details, see the maxFacts attribute for <cube> in Reference Information for Cube Classes.) In this case, you can either use the value provided by the cube class or enter a smaller value.
-
Select Build Everything in the Build Option section of the dialog box.
-
Click Build.
The system starts to build the cube (and dependent cubes, as applicable) and displays progress as it does so. The Compile button is deactivated (greyed out) for the duration of the build.
Note: Clicking the Close button during the build process does not interrupt the build. You can reopen the dialog at any time to see the current state of any build that is in progress. If the build completes while the dialog is closed, the dialog reappears to notify you that the build is complete.
-
Click Close.
Upon completion, the cube is available for use as described in Using the Analyzer.
Using Selective Build
You can use the Selective Build feature to build certain elements in a cube without rebuilding the entire cube and experiencing the attendant downtime. For example, if you recently made changes to a specific dimension or have source data changes that affect only one dimension, you can use Selective Build to build the relevant levels in that dimension. You can also use Selective Build to build a recently added level, measure, or relationship.
You can use Selective Build to build specific levels, measures, or relationships in a cube. More specifically, when you use Selective Build, columns in the fact tables (which each correspond to a level, measure, or relationship) are built. Using Selective Build on a cube does not trigger a need to update that cube’s dependent cubes.
A cube may inherit elements of its definition from another cube. When Selective Build is enabled for a cube inherited by another, the inheriting cube is able to read the designated factNumbers in the supercube definition and assign factNumbers to the subcube definition accordingly. The subcube does not assume that the factNumbers of the supercube remain the same, and therefore regenerates all of its own factNumbers. This protects the current cube from any changes in the supercube that might have been assigned a factNumber that conflicts with a compiled factNumber in the current cube.
Implications of Selective Build
When a selective build is taking place, only the cube elements that are being built are unavailable for queries. Selective Build and cube synchronization cannot run simultaneously: although the cube is not entirely inactive as it is during a full build, synchronization is unavailable while a selective build is taking place, and vice versa. Any significant build operation can therefore block a planned synchronization.
Selective build attempts to synchronize the cube at the end of the main build procedure. Synchronization prevents any mismatch between your current source data and any columns in the fact table which you excluded from the selective build.
For this reason, InterSystems recommends using selective build only for cubes where synchronization is possible. If you perform a selective build on a cube where synchronization is not possible, you must subsequently perform a full build to ensure the accuracy of columns which were excluded from the selective build.
Multiple selective builds may run at the same time. In this case, each selective build will only build its selected cube elements. You can build multiple columns at once with Selective Build, but you cannot build any column more than once at the same time.
The system handles Selective Build errors the same way it handles errors for full cube builds.
If you implement %OnProcessFact() to process facts in a cube conditionally based on the value of a certain level or measure, that level or measure must be included in a Selective Build of that cube. Otherwise, the Selective Build will yield errors.
Using Selective Build in the Architect
Selective Build is automatically enabled for all cubes. You must compile your cube before you can use Selective Build.
The following procedure provides an example of using the Selective Build feature:
-
Navigate to the Analyzer and open the HoleFoods cube.
-
In the Model Contents pane, expand the Outlet dimension, then expand the Region level. Drag the Region level over to the Rows area. Observe the resulting pivot table.
-
Next, open the Architect. Click the Region level of the Outlet dimension in the Model Viewer.
-
In the Details Area to the right, under Source Values, select Expression. Enter the following in the Expression text box:
%source.Outlet.Country.Region.%ID _ "-" _ %source.Outlet.Country.Region.Name
-
Compile the HoleFoods cube.
-
Click Build. When the Build Cube dialog appears, note that the system automatically detects that the [Outlet].[H1].[Region] level has changed and preselects a Selective Build for [Outlet].[H1].[Region]. Click Build.
-
Navigate back to the Analyzer. In the Model Contents pane, expand the Outlet dimension, then expand the Region level. Drag the Region level over to the Rows area. Observe the resulting pivot table and note the differences for the Region level.
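A comparable selective build can, in principle, be started from the Terminal through the pFactList argument of %BuildCube(), which is described in Building the Cube Programmatically. The following is a minimal sketch only: the fact property name DxRegion is hypothetical, so look up the actual property name in the generated fact class for your cube before trying it.

```objectscript
    // Hypothetical fact property name; check the cube's fact class for the real one
    set factList = "DxRegion"
    // Arguments 2 through 7 are omitted so that they keep their defaults;
    // pFactList is the eighth argument of %BuildCube()
    set sc = ##class(%DeepSee.Utils).%BuildCube("HoleFoods",,,,,,,factList)
    if $SYSTEM.Status.IsError(sc) { do $SYSTEM.Status.DisplayError(sc) }
```

Because pFactList can only be used when pCubeList names a single cube, the sketch passes exactly one cube name.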
Building the Cube Programmatically
To build the cube programmatically, execute the %BuildCube() class method of the %DeepSee.Utils class. This method has the following signature:
classmethod %BuildCube(pCubeList As %String = "", pAsync As %Boolean = 1, pVerbose As %Boolean = 1,
pIndexOnly As %Boolean = 0, pMaxFacts As %Integer = 0, pTracking As %Boolean = 1,
ByRef pBuildStatistics As %String = "", pFactList As %String = "") as %Status
Where:
-
pCubeList is a comma-separated list which contains the names of the cubes which you want to build. You must provide the logical name of the cube. In the Architect, this is the cube’s Cube Name. In the cube class’s XData block, this is the <cube> element’s Name attribute. A cube’s logical name is not case-sensitive.
%BuildCube() builds the cubes which you specify in pCubeList, as well as any related cubes which depend upon them, determining the correct build order automatically.
-
pAsync controls whether the system performs the build operation using multiple background processes. If this argument is true, the system uses multiple processes to build each cube in order wherever possible, and does not return until they are all complete. If this argument is false, the system uses a single process and does not return until it is complete.
Note: If you have specified the cube option Initial build order for a cube, the system ignores the value of pAsync and uses a single process to build the cube. This option is described in Specifying Cube Options.
If you are using %SetAgentCount to limit the assignment of worker agents to background tasks within a namespace, only one build may be active within that namespace at any given time. Build operations in other namespaces are not affected.
-
pVerbose controls whether the method writes status information. If this argument is 1, the system writes status updates to the current device. (This argument does not affect whether the method writes build errors or other logging information.)
-
pIndexOnly controls whether the method only updates the indexes. If this argument is 1, the system only updates the indexes of the fact table.
-
pMaxFacts specifies the maximum number of rows from the base table that the system should use to populate the fact table when it builds the cube (or cubes).
If pMaxFacts is 0 (the default), the system processes all of the rows in the base table.
-
pTracking is for internal use.
-
pBuildStatistics returns an array of information about the build operation, by reference. This array contains the following values:
-
pBuildStatistics("elapsedTime"): the total elapsed build time, in seconds.
-
pBuildStatistics("errors"): the total number of errors that were encountered while building all the cubes.
-
pBuildStatistics("factCount"): the total number of facts that were built and indexed across all of the cubes which were built.
-
pBuildStatistics("missingReferences"): the total number of missing references across all of the cubes which were built.
-
pBuildStatistics("expressionTime"): the total length of time which was spent processing source expressions to build cube elements across all the cubes which were built.
-
pBuildStatistics("iKnowTime"): the total length of time which was spent building NLP indexes across all the cubes which were built.
-
pBuildStatistics("cubes", <cubeName>, <statisticName>): the value of one of the preceding statistics (identified by <statisticName>) for the build operation on an individual cube (identified by <cubeName>). For example, pBuildStatistics("cubes", "PATIENTS", "factCount") provides the number of facts that were built and indexed during the building of the PATIENTS cube.
-
pBuildStatistics("cubes", <cubeName>, "async"): whether the system used multiple background processes to build the cube identified by <cubeName>. If the system built the cube using a single process (for example, if you have specified an Initial Build Order for the cube), the value is 0.
-
-
pFactList is a list of specific Property names which may be found in the fact classes for the specified cube. If pFactList is supplied, the build will only update the columns listed in that fact list. This list can be in either comma-delimited or $LB format. The specific facts being updated will be individually marked as unavailable for queries while the build operation is underway. Queries referencing dimensions based on those facts will throw an error when they are prepared.
Note: You can only use pFactList when pCubeList identifies one cube as the target of a selective build. If other cubes depend upon the cube that you have targeted, %BuildCube() will synchronize the dependent cubes automatically to reflect the changes to the target.
When you specify a pFactList, you must ensure that each cube that the system will update during the build operation meets the following criteria:
-
Selective build is enabled.
-
The fact table for the cube contains the columns identified in pFactList.
If any cube in the build operation does not meet the preceding conditions, the build operation fails with an error.
-
Where supported, the system attempts to synchronize the cubes at the end of the main build procedure.
Upon conclusion, this method returns a status. If errors occur during the cube build, the status code indicates the number of build errors.
For example:
set status = ##class(%DeepSee.Utils).%BuildCube("patients")
This method writes output that indicates the following information:
-
Number of processors used.
-
Total elapsed time taken by the build operation.
-
Total amount of time spent evaluating source expressions, summed across all processors.
For example:
Building cube [patients]
Existing cube deleted.
Fact table built: 1,000 fact(s) (2 core(s) used)
Fact indexes built: 1,000 fact(s) (2 core(s) used)
Complete
Elapsed time: 1.791514s
Source expression time: 0.798949s
If Source expression time seems too high, you should re-examine your source expressions to be sure that they are as efficient as possible; in particular, if the expressions use SQL queries, double-check that you have the appropriate indexes on the tables that the queries use.
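To act on these numbers in code rather than reading them from the Terminal, you can capture them through the pBuildStatistics argument. The following is a minimal sketch that assumes a cube named patients, as in the example above; the variable name stats is arbitrary.

```objectscript
    // Build with a single process (pAsync=0) and without status output (pVerbose=0).
    // pIndexOnly, pMaxFacts, and pTracking are omitted so that they keep their defaults.
    // The seventh argument, .stats, receives the build statistics by reference.
    set sc = ##class(%DeepSee.Utils).%BuildCube("patients", 0, 0, , , , .stats)
    if $SYSTEM.Status.IsError(sc) { do $SYSTEM.Status.DisplayError(sc) }
    // $get() guards against nodes that are not defined
    write "Facts built:  ", $get(stats("factCount")), !
    write "Build errors: ", $get(stats("errors")), !
    write "Elapsed (s):  ", $get(stats("elapsedTime")), !
    write "Source expression time: ", $get(stats("expressionTime")), !
```

Per-cube values are available under the "cubes" subscript, for example stats("cubes", "PATIENTS", "factCount").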
Cube Build Status
If there is a build in progress, you can monitor its progress using the %BuildStatus() method. In the Terminal, call:
DO ##class(%DeepSee.Utils).%BuildStatus("cubeName")
%BuildStatus() reports on the progress of a build regardless of whether you initiated the build programmatically or using the Architect. If there is no build in progress, %BuildStatus() displays the timestamp of the most recent build, like so: There is no build in progress. Last build was finished on 06/23/2020 11:31:07.
The build dialog in the Architect also reports on the progress of builds which are initiated programmatically.
Minimizing Cube Size During Development
While you are developing a cube, you typically recompile and rebuild it frequently. If you are using a large data set, you might want to limit the number of facts in the fact table so that the cube rebuilds more quickly. To do this, do one of the following:
-
If you build the cube in the Architect, specify a value for Maximum Number of Records to Build.
-
Edit the cube class in an IDE and add the maxFacts attribute to the <cube> element. See Reference Information for Cube Classes.
If you do so, be sure to remove this attribute before deployment.
-
Build the cube in the Terminal and specify the pMaxFacts argument. See Building the Cube Programmatically.
Note that all these options are ignored during a selective build.
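For the Terminal option, a call might look like the following sketch. The cube name patients and the limit of 1000 are illustrative values, not recommendations.

```objectscript
    // Development-only build: process at most 1000 rows of the base table.
    // Arguments in order: pCubeList, pAsync, pVerbose, pIndexOnly, pMaxFacts
    set sc = ##class(%DeepSee.Utils).%BuildCube("patients", 1, 1, 0, 1000)
    if $SYSTEM.Status.IsError(sc) { do $SYSTEM.Status.DisplayError(sc) }
```

Passing the limit as an argument, rather than setting maxFacts in the cube class, leaves nothing to remove before deployment.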
Using Parallel Processing During a Cube Build
If all the following items are true, the system uses multiple cores to perform the build:
-
You specify pAsync as 1 when you build the cube (see Building the Cube Programmatically).
-
The source for a cube is a persistent class (rather than a data connector). Data connectors are described in Implementing InterSystems Business Intelligence.
-
The persistent class is bitmap-friendly.
-
The Initial build order option of the cube has not been set. These options are described in Specifying Cube Options.
When you build a cube asynchronously, the system sets up %SYSTEM.WorkMgr agents to do the work, if it is possible to use parallel processing.
These agents are also used to execute queries.
On rare occasions, you might need to reset these agents. To do so, use the %Reset() method of %DeepSee.Utils. This method also clears any pending tasks and clears the result cache for the current namespace, which would have an immediate impact on any users. This method is intended for use only during development.
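A reset from the Terminal can be sketched as follows. The call is shown without arguments, which relies on the method's defaults.

```objectscript
    // Development only: resets the agents, clears pending tasks, and clears
    // the result cache for the current namespace
    do ##class(%DeepSee.Utils).%Reset()
```

Run this in the namespace that contains the affected cubes, because the reset applies to the current namespace.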
Build Errors
When you build a cube, pay attention to any error messages and to the number of facts that it builds and indexes. This section discusses the following topics:
-
Fact count, which is a useful indicator of build problems in all scenarios
For more information on troubleshooting options, see the InterSystems Developer Community.
Seeing Build Errors
When you build a cube in the Architect or in the Terminal, the system indicates if there are any build errors but does not show all of them. To see all the recorded build errors, do either of the following:
-
Look for the log file install-dir/mgr/DeepSeeUpdate_cube_name_NAMESPACE.log, where cube_name is the name of the cube, and NAMESPACE is the namespace in which this cube is defined.
The time stamp in this file uses $NOW to write the local date and time, ignoring daylight saving time.
-
Use the %PrintBuildErrors() method of %DeepSee.Utils, as follows:
do ##class(%DeepSee.Utils).%PrintBuildErrors(cubename)
Where cubename is the logical name of the cube, in quotes.
This method displays information about all build errors. For example (with added line breaks):
SAMPLES>do ##class(%DeepSee.Utils).%PrintBuildErrors("holefoods")
  1 Source ID: 100000
    Time: 05/09/2019 14:12:52
    ERROR #5002: ObjectScript error: <DIVIDE>%UpdateFacts+106^HoleFoods.Cube.Fact.1
  2 Source ID: 200000
    Time: 05/09/2019 14:12:41
    ERROR #5002: ObjectScript error: <DIVIDE>%UpdateFacts+106^HoleFoods.Cube.Fact.1
  3 Source ID: 300000
    Time: 05/09/2019 14:13:13
    ERROR #5002: ObjectScript error: <DIVIDE>%UpdateFacts+106^HoleFoods.Cube.Fact.1
...
10 build error(s) for 'holefoods'
In some cases, the system might not generate an error, so it is important to also check the fact count as discussed in the next section.
Checking the Fact Count
When you build a cube, the system reports the number of facts that it builds and indexes.
Each fact is a record in the fact table. The fact table should have the same number of records as the base table, except in the following cases:
-
You limit the fact count as discussed earlier in this page.
-
The cube class also defines the %OnProcessFact() callback, which you can use to exclude records from the cube. See Using Advanced Features of Cubes and Subject Areas.
Also, when the system builds the indexes, the index count should equal the number of records in the fact table. For example, the Architect should show the same number for Building facts and for Building indexes. If there is a discrepancy between these numbers, check the log files.
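One way to compare the two record counts directly is with a pair of SQL queries, sketched below. Both table names are assumptions: HoleFoods.SalesTransaction is taken to be the SQL name of the base table and HoleFoods_Cube.Fact the SQL name of the generated fact table. Verify both names for your own cube before relying on the result.

```objectscript
    // Assumptions: HoleFoods.SalesTransaction is the base table and
    // HoleFoods_Cube.Fact is the generated fact table.
    set rs = ##class(%SQL.Statement).%ExecDirect(, "SELECT COUNT(*) AS Cnt FROM HoleFoods.SalesTransaction")
    if rs.%Next() { set sourceCount = rs.%Get("Cnt") }
    set rs = ##class(%SQL.Statement).%ExecDirect(, "SELECT COUNT(*) AS Cnt FROM HoleFoods_Cube.Fact")
    if rs.%Next() { set factCount = rs.%Get("Cnt") }
    // $get() prints an empty value if a query returned no row
    write "Source rows: ", $get(sourceCount), !
    write "Fact rows:   ", $get(factCount), !
```

A difference between the two numbers is expected if you limited the fact count or if %OnProcessFact() excludes records.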
Possible Causes of Build Errors
If you see build errors or unexplained discrepancies in the fact count, do the following:
-
Examine any levels that use range expressions, and verify that these levels do not drop records. See Validating Your Levels.
An error of this kind affects the index count but not the fact count.
-
Try disabling selected dimensions or measures. Then recompile and rebuild to isolate the dimension or measure that is causing the problem.
<STORE> Errors
In some cases, the build log might include errors like the following:
ERROR #5002: ObjectScript error: <STORE>%ConstructIndices+44^Cube.cube_name.Fact.1
This error can occur when a level has a very large number of members. By default, when the system builds the indexes, it uses local memory to store the indexes in chunks and then write these to disk. If a level has a very large number of members, it is possible to run out of local memory, which causes the <STORE> errors.
To avoid such errors, try either of the following:
-
Build the cube with a single process. To do so, use %BuildCube() in the Terminal, and use 0 for its second argument.
-
In the <cube> element, specify bitmapChunkInMemory="false" (this is the default). When this cube is built using background processes, the system will use process-private globals instead of local variables (and will not be limited by local memory).
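The single-process option can be sketched as follows, where cube_name is a placeholder for the logical name of your cube.

```objectscript
    // The second argument (pAsync) is 0, so the build runs in a single process
    set sc = ##class(%DeepSee.Utils).%BuildCube("cube_name", 0)
    if $SYSTEM.Status.IsError(sc) { do $SYSTEM.Status.DisplayError(sc) }
```

A single-process build is typically slower, so this is a trade of build time for lower memory pressure.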
Missing Reference Errors
If your cubes have relationships to other cubes, the build log might include errors like the following:
ERROR #5001: Missing relationship reference in RelatedCubes/Patients: source ID 1 missing reference to RxHomeCity 4
The missing relationship reference error can occur when new source data becomes available during the cube build process — that is, after only some of the cubes have been built. For example, consider the sample cubes RelatedCubes/Cities and RelatedCubes/Patients (which are available in the SAMPLES namespace). Suppose that you build the cube RelatedCubes/Cities, and after that, the source table for RelatedCubes/Patients receives a record that uses a new city. When you build the cube RelatedCubes/Patients, there will be a missing relationship reference error.
The default procedure that Business Intelligence uses for building cubes ensures that related cubes are always built in the appropriate order. However, in rare cases where you build cubes using an unsupported procedure, this error may mean that you have built the cubes in the wrong order.
See the next section for information on recovering from these build errors without rebuilding the entire cube.
Recovering from Build Errors
The system provides a way to rebuild only the records that previously generated build errors, rather than rebuilding the entire cube. To do this:
-
Correct the issues that cause these errors.
-
Use the %FixBuildErrors() method of %DeepSee.Utils, as follows:
set sc=##class(%DeepSee.Utils).%FixBuildErrors(cubename)
Where cubename is the logical name of the cube, in quotes. This method accepts a second argument, which specifies whether to display progress messages; for this argument, the default is true.
For example:
Fact '100' corrected
Fact '500' corrected
Fact '700' corrected
3 fact(s) corrected for 'patients'
0 error(s) remaining for 'patients'
Or rebuild the entire cube.
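Put together, a recovery session in the Terminal might look like the following sketch, which assumes a cube named patients.

```objectscript
    // List the recorded build errors for the cube
    do ##class(%DeepSee.Utils).%PrintBuildErrors("patients")
    // After correcting the underlying issues, rebuild only the records that failed
    set sc = ##class(%DeepSee.Utils).%FixBuildErrors("patients")
    if $SYSTEM.Status.IsError(sc) { do $SYSTEM.Status.DisplayError(sc) }
```

If errors remain afterward, repeat the two steps or fall back to a full rebuild.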
Business Intelligence Task Log
The system creates an additional log file (apart from the previously described build logs). After it builds the cube or tries to build the cube, the system also writes the DeepSeeTasks_NAMESPACE.log file to the directory install-dir/mgr. You can use the %SetLoggingOptions() method of the %DeepSee.WorkMgr class to turn on logging for the background agents that the system uses during the build process. To do so, make a call like the following:
do ##class(%DeepSee.WorkMgr).%SetLoggingOptions(,,1)
To see this file from the Management Portal, select Analytics > Admin > Logs.
This file also contains information about runtime errors of various kinds such as listing errors and KPI errors.
The time stamps in this file use the local date and time (taking daylight saving time into account).