COURSE OVERVIEW

Photo of Triangle Shape Digital Wallpaper

Training Module:

Slowly changing dimensions

To handle ever-changing data and thereby ensuring accurate reporting and analysis, Slowly Changing Dimensions (SCDs) are a dynamic solution, ensuring accuracy in reporting and analysis by effectively handling evolving dimension attributes over time.

Geometrical Illustration

Table of ContentS

  • Key Concepts: SCD
    • What are Slowly Changing Dimensions (SCD)?
    • History & Types of SCD
    • Enable History, and SCD (Types I & II)
  • Surrogate & Supernatural Keys
    • What are Surrogate Keys?
    • Create the Surrogate Key in MDW
    • Supernatural Keys
    • Add Product_Key as Supernatural Key
  • Track Changes
    • Track Historical and Current Changes
    • Use the History Fields for Current Sales People
    • Use the History Fields for Historical Sales People
Hexagons Background

Key Concepts: SCD

So, What the heck are

slowly changing dimensions (SCD)?

Source

Orange Arrow

SCD

Classy Circle
Orange Arrow
Orange Arrow
Orange Arrow

Update (SCD Type 1)

Insert (SCD Type 2)

Ignore (SCD Type 0)

Slowly Changing Dimensions (SCD) are essential in data warehousing to track historical data that changes gradually over time.


A good example is customer addresses - when a customer moves to a new location, and you need to update their billing address in your ERP or CRM system, the old address data may be lost if the source system doesn't retain it. This can be a challenge if you want to analyze historical data and see what a customer's address was at a certain point in time. The problem is that we want to keep both the original address, the current address, (and any additional addresses in between) due to business rules and deeper understanding of the customer, as a dynamic, and evolving entity.


That's where an SCD comes in - it allows you to keep a record of the customer's address history by creating a new "version" of the customer record that includes both the old and new addresses. This means you can track changes over time and create accurate reports based on historical data.


TimeXtender makes handling SCD a breeze for data engineers and analysts to keep track of historical data and create accurate reports. Using History is how we implement SCD in TimeXtender. Surrogate keys are used to ensure uniqueness, particularly when dealing with Slowly Changing Dimensions in data warehousing.


history & Types of SCD

History is a feature that allows for "tracking changes" for tables. When the history feature is activated for a table in TimeXtender, it compares the incoming records from the data source with the ones already present in the data warehouse. If a new record is found, it is added to the warehouse. If a record with the same key exists, the system compares hashed versions of the two records to identify changes. If there are no changes, nothing is updated. However, if changes are detected, the system takes different actions depending on the type of field that has been changed. TimeXtender supports the following field types.

Types of Slowly Changing Dimensions (SCD)

Type 0 (ignore): Fields that are basically ignored by the history logic.

Type I (update record): Fields that are overwritten with new data when data changes in the source. This means that there will be no history of the change.

Type II (insert new record): Fields that will cause a new record to be created, thus creating history about the change.

Hybrid Types: hybrid types (Type 6 or 7) can be accomplished by combining Type I, Type II, and Type III, providing more flexibility beyond the limitations of solely using any one of these types.

enable history & Natural Key

  1. In the DSA right click Customer table and choose Table Settings.
  2. Go to tab History and set a tick in the Enable History box.
  3. Go to tab Data Extraction and remove the tick at "Truncate valid table before data cleansing" and click OK button.
  4. Expand the table and notice the History Settings folder with new fields and the History Settings pane to the right. CustomerID is automatically chosen as Natural Key, because it is the primary key in the table.
  5. Deploy and Execute the Customer table without making any changes to the settings. This ensures no rows will be deleted from the table and there will be only one row per occurrence of each Natural Key value.


enable SCD: Type I & II

  1. In the DSA expand Customer table and select History Setting.
  2. Select Customer Name as Type I fields and check the checkbox for EmailAddress as Type II fields or select as desired
  3. Deploy and Execute the Customer table.


Enable Type I and II for Customer table


Hexagons Background

Surrogate & Supernatural Keys

what Are surrogate keys?

A surrogate key is a replacement for the primary key in a table. It usually represents a unique row number and is commonly used in referencing a specific record in another table. In TimeXtender, all tables in the data warehouse instance have a field named "DW_ID" which represents the surrogate key in each respective table. The use of surrogate key is a common practice to implement Slowly Changing Dimension.

Create the surrogate key in mdw

  1. Drag the SalesPerson table from the DSA into the MDW.
  2. Drag the SalesPerson.DW_Id from the DSA onto the SalesPerson table in the MDW.
  3. Open Edit Field on the SalesPerson DW_Id and rename it SalesPersonKey.


supernatural keys

Supernatural keys in TimeXtender are durable and independent keys that remain unchanged, unlike natural keys such as customer numbers. To create a supernatural key, a hashing algorithm is applied to other fields in the table, and the resulting hash value is compared to the values in a key store table. If the hash exists, the corresponding key value is returned; otherwise, a new key value is generated. The key store is used to tie the supernatural keys together and should be used consistently for related tables. You can choose to read-only from the key store if you don't want to create a new entry. Using a different hashing algorithm may affect performance and break existing data, so TimeXtender recommends using SHA-1 SQL Server 2005+ for most hashed fields.

Supernatural keys offer better performance, robustness, and quicker processing, and they can handle early arriving facts efficiently. However, when dealing with Slowly Changing Dimensions with Type II Fields, you must use DW_Id instead of supernatural keys.

add product_Key as supernatural key

Add a Supernatural Key Field

  1. In the DSA right click Product table and choose Add Supernatural Key Field
  2. Type Product_Key in the Name box
  3. Click the Add button next to Key store box
  4. In the Name box type Product
  5. Optionally, select a case-insensitive Hashing Algorithm, if the source database is also case-insensitive.
  6. Leave the remaining options at their default values and click OK button
  7. In the Business key selection list click ProductID and the Add button followed by OK button
  8. Notice the new field Product_Key and its datatype.
  9. Deploy and Execute
Hexagons Background

Track Changes

track historical and current changes

Below example highlights tracking of Sales people data as it changes, and relate this historical data to Sales Order data.

Enable Type II dimensions

  1. Open the ODX table selector, search for the Sales.SalesPerson table.
  2. Drag and drop the table into the DSA, enable history on the table with history settings - Natural Key: BusinessEntityID and Type II fields: Bonus, CommisionPct
  3. Deploy and Execute the table to enact the changes.

Use the History fields

for current sales people

  1. Click and drag SalesPerson.BusinessEntityID onto SalesOrderHeader.SalesPersonID, and select Yes to create the relation
  2. Click and drag SalesPerson.DW_Id onto SalesOrderHeader table, and select Use default relation instead of joins, then click OK.
  3. Select Use current values, then click OK. (This is using our Is_Current field to determine the unique, current record.)
  4. Select Yes in the next dialogue box, to set the Lookup Operator to 'None'.
  5. Open Edit Conditional Lookup on the new DW_Id2 field and rename it SalesPersonKey_Current. Click OK.



Use the History fields

for historical sales people


  1. Click and drag SalesPerson.DW_Id onto SalesOrderHeader table, and select Use default relation instead of joins, then click OK.
  2. Select Use the time from this field and select OrderDate from the drop-down menu. Click OK. (This is using our dates fields, valid to and from, to determine the date range limited records.)
  3. Select Yes in the next dialogue box, to set the Lookup Operator to 'None'.
  4. Open Edit Conditional Lookup on the new DW_Id2 field and rename it SalesPersonKey_Historical. Click OK.


exercise handle early arriving facts

Use what you've learned so far to complete the following exercise.

  1. Use the SalesPerson table to create "unknown member" for early arriving facts.
  2. Select Include in Primary Key on the SalesPerson.SalesPerson Key in the MDW
  3. Open Custom Data from the Advanced menu on the SalePerson table in the MDW and Type -1 in the SalesPersonKey field in the Edit Custom Data window.
  4. In the SalesOrderHeader table in the DSA, select Field Transformations under the lookup SalesPersonKey_Current. Select Fixed value in the Operator field, enter -1 in the Value field, and select Add.
  5. Select Add Condition on the Fixed Value -1 field transformation.
  6. Select the SalesPersonKey_Current field in the Conditions pane on the right. Select the operator 'Is empty', and click Add.
  7. Do the same steps for SalesPersonKey_Historical

Section Quiz...

Planning to take the Solution Architect exam? Then you want to be sure these questions are easy to answer.

What table-level feature is used to Implement Slowly Changing Dimensions?

What type of field can be used as a Surrogate key that can handle early arriving facts?

When implementing SCD Type II, what field can be used as a Surrogate Key?

Brushstroke Arrow Smooth Curve Down Small

When you're ready, see Answers Below

Section Quiz Answers

Planning to take the Solution Architect exam? Then you want to be sure these questions are easy to answer.

What table-level feature is used to Implement Slowly Changing Dimensions?






History

What type of field can be used as a Surrogate key that can handle early arriving facts?






Supernatural Key

When implementing SCD Type II, what field can be used as a Surrogate Key?






DW_ID

WANT TO LEARN EVEN MORE?

Learn even more about Slowly Changing Dimension from our TimeXtender Tuesdays

Congratulations! You've completed the training module

Slowly Changing DimensioNS

give FEEDBACK

Thumbs Up Illustration
Thumbs Up Illustration