Defining Models for InterSystems IRIS Business Intelligence
Details for the Fact and Dimension Tables
Internally, the system uses a star schema, which consists of the fact table and dimension tables, which the fact table references. Note that dimension tables are also called star tables. You can directly query these tables, which the system populates when you build a cube, and which the system updates when changes occur. This chapter discusses the following:
Do not redefine these tables or attempt to write data to them except via the authorized interfaces; see the chapter “Keeping the Cubes Current
” in Implementing InterSystems Business Intelligence
When you compile a cube definition class, the system also generates a table name Listing, in the same package as the fact table. Do not use the Listing table, which is for internal use only.
When you compile a cube definition class, the system generates the corresponding fact table with the name Package_CubeClass.Fact where Package_CubeClass corresponds to the package and class of the cube definition, following the usual rules for translating package and class name to table names. The system also generates foreign key definitions in the fact table that declare the links to the star table IDs.
The system updates the fact table whenever you build the cube or update it incrementally.
It is useful to examine this table, as well as the class definition that uses it.
The fact table contains the following fields:
ID — ID of this row, assigned when the row is created.
%dspartition — For future use. Ignore this field.
%sourceID — The ID of the record in the base class on which this record is based. This field is a pointer to the source table.
This field stores the actual measure values. The fact table class defines bitslice indices on these values. For example:
/// Index for measure M1.
Index MxAge On MxAge [ Type = bitslice ];
/// Measure: MxAge <br/>
/// Source: Age
Property MxAge As %Integer;
One field for each level, to indicate the level member to which this record belongs. (There is one exception; see “Reuse of a Source Property in the Fact Table
,” later in this appendix.)
For levels in data dimensions, the fact table stores the ID of the level member, which is a pointer to a table that defines the members.
The fact table class defines bitmap indices on these values. For example:
/// Index for fact 1. Index DxGender On DxGender [ Type = bitmap ];
/// Dimension: DxGender <br/>
/// Source: Gender
Property DxGender As Test.TestCube.DxGender;
For a time- or age-type level, the fact table stores an integer.
The fact table class defines the corresponding property as a computed value and defines a bitmap index for it. For example:
/// Index for fact 4. Index DxBirthDateFxYear On DxBirthDateFxYear [ Type = bitmap ];
/// Dimension: DxBirthDateFxMonthYear<br/>
/// Source: BirthDate Property DxBirthDateFxMonthYear As %Integer [ details omitted ];
Property DxBirthDateFxYear As %Integer [ Calculated, SqlComputeCode = ... , SqlComputed ];
For a cube-to-cube <relationship>, if this cube specifies the sourceProperty or sourceExpression for the relationship, the fact table stores the ID of the corresponding row in the other fact table. That ID is a pointer to the other fact table.
An additional field for each time-type or age-type dimension, to contain the complete value of the time or age dimension for this record. (There is one exception; see “Reuse of a Source Property in the Fact Table
,” later in this appendix.)
The class does not include properties for any NLP levels or NLP measures; those are handled differently.
If you specified the dependsOn attribute for any levels, the fact table contains additional indices on the level combinations:
Index DxPostalCodeViaHomeCityANDDx2642257510 On (DxPostalCodeViaHomeCity, Dx2642257510) [ Type = bitmap ];
If you defined any <index> elements in the <cube>, the fact table contains additional, custom indices. For example:
Index %UserIndexName On (MxAge, DxGender) [ Type = bitmap ];
These custom indices are for your own use; the system does not use them.
The following table summarizes how the system determines the names for the measure, level, and dimension fields (in the case where you do not specify the factName attribute):
|Item and Scenario
||Field Name in Fact Table (If Not Overridden by factName attribute)
|Measure based on a source property
||Mxprop_name where prop_name is the name of the property.
|Measure based on a source property in another table, via dot syntax (rare)
||Mxother_prop_nameViaprop_name where other_prop_name is the name of the property in the other class. However, if the resulting field name would be too long, the system generates a unique number.
|Measure based on a source expression
||MxnnnnnnnnnT where nnnnnnnnn is an integer and T indicates the measure type. (For example, I represents an integer measure.)
|Data level based on a source property (with no range expression)
|Data level based on a source property with a range expression
||Dxprop_nameRgnnnnnnnnnn where nnnnnnnnn is an integer.
|Data level based on a source property in another table, via dot syntax
However, if the resulting field name would be too long, the system generates a unique number.
|Data level based on a source expression
|Time- or age-type dimension
||Dxdim_name where dim_name is the name of the dimension. (This field is for use by levels in this dimension.)
|Time- or age-type level
||Dxdim_nameFxfunc_name, where func_name is the name specified in the timeFunction attribute for this level.
||Rxgenerated_name, where generated_name is the name of the source property or a generated name based on the source expression.
If a cube contains multiple measures that use the same property (via sourceProperty), the fact table contains a field for only one of those measures (the last of those measures found in the cube definition). For example, suppose that the cube contained the following measure definitions:
<measure name="Age" sourceProperty="Age" aggregate="SUM" factName="AgeFact"/>
<measure name="Avg Age" sourceProperty="Age" aggregate="AVG" factName="AvgAgeFact"/>
These two measures are different only by how they aggregate across multiple records; the fact table would contain the same value for these measures, for any given record. For efficiency, the fact table would include the field AvgAgeFact, but not the field AgeFact.
The same logic applies when a cube contains multiple levels that use the same property. The logic also applies when a cube contains multiple age or time dimensions that use the same property.
For a given measure, level, or dimension, if you use sourceExpression and access the property via %source.propertyname, the system always generates a separate field for that value.
When you compile a cube definition class, the system also generates a table for each level, other than the age- and time-type levels. These tables are in the same package as the fact table. The system also generates foreign key definitions in the dimension tables that declare links to the other dimension table IDs within the model schema.
The system updates the dimension tables (also called star tables) whenever you build the cube or update it incrementally.
The dimension table for a level contains one row for each member of that level. The dimension tables are created dynamically as the system processes records in the base table. For a given level, each time a new unique value is discovered, that value is added as a new row to the appropriate dimension table. This means that the system automatically adds rows to the dimension tables when needed; no intervention is required.
If the cube definition specifies the factName attribute for the corresponding level, the applicable dimension table uses that value as its name.
Otherwise, the name of the dimension table has the following form:
Where generated_name is the corresponding field name in the fact table, without the Dx at the start. For example, suppose that in the fact table, the field name for the Home City level is DxPostalCodeViaHomeCity. In this case, the corresponding dimension table is named StarPostalCodeViaHomeCity.
The columns in this row are as follows:
ID — ID of this row, assigned when the row is created.
One column to contain the key for this member. The field name is the same as the column name field in the fact table that corresponds to this level; see the previous section.
One column for each property of this level, to contain the actual property value for this member.
The field name starts with Dx and is based either on the source property name or is generated as a unique number, as described earlier.
If you use linkClass and linkProperty to define both the property and the level to which it belongs, the field names for the property and the level would be identical. In this scenario, the system appends _Link to the end of the field name for the property.
One column for the parent level of this level, to contain the ID of the parent of this member.
The field name is the same as the column name in the fact table that corresponds to the parent level.
Depending on how the level is defined, you can find the member names as follows:
By default, the key is used as the name, and the name is not stored separately.
If the level includes a property that is defined with isName="true", then the member names are stored in the column that contains that property (with one exception). The exception is when the property is also defined with isReference="true"; in this case, the field is computed at runtime.
Content Date/Time: 2019-09-19 06:44:29