Training Module:
Slowly changing dimensions
To handle ever-changing data and thereby ensuring accurate reporting and analysis, Slowly Changing Dimensions (SCDs) are a dynamic solution, ensuring accuracy in reporting and analysis by effectively handling evolving dimension attributes over time.
Table of ContentS
Key Concepts: SCD
go to Table of Contents
So, What the heck are
slowly changing dimensions (SCD)?
Source
SCD
Update (SCD Type 1)
Insert (SCD Type 2)
Ignore (SCD Type 0)
Slowly Changing Dimensions (SCD) are essential in data warehousing to track historical data that changes gradually over time.
A good example is customer addresses - when a customer moves to a new location, and you need to update their billing address in your ERP or CRM system, the old address data may be lost if the source system doesn't retain it. This can be a challenge if you want to analyze historical data and see what a customer's address was at a certain point in time. The problem is that we want to keep both the original address, the current address, (and any additional addresses in between) due to business rules and deeper understanding of the customer, as a dynamic, and evolving entity.
That's where an SCD comes in - it allows you to keep a record of the customer's address history by creating a new "version" of the customer record that includes both the old and new addresses. This means you can track changes over time and create accurate reports based on historical data.
TimeXtender makes handling SCD a breeze for data engineers and analysts to keep track of historical data and create accurate reports. Using History is how we implement SCD in TimeXtender. Surrogate keys are used to ensure uniqueness, particularly when dealing with Slowly Changing Dimensions in data warehousing.
history & Types of SCD
History is a feature that allows for "tracking changes" for tables. When the history feature is activated for a table in TimeXtender, it compares the incoming records from the data source with the ones already present in the data warehouse. If a new record is found, it is added to the warehouse. If a record with the same key exists, the system compares hashed versions of the two records to identify changes. If there are no changes, nothing is updated. However, if changes are detected, the system takes different actions depending on the type of field that has been changed. TimeXtender supports the following field types.
Types of Slowly Changing Dimensions (SCD)
Type 0 (ignore): Fields that are basically ignored by the history logic.
Type I (update record): Fields that are overwritten with new data when data changes in the source. This means that there will be no history of the change.
Type II (insert new record): Fields that will cause a new record to be created, thus creating history about the change.
Hybrid Types: hybrid types (Type 6 or 7) can be accomplished by combining Type I, Type II, and Type III, providing more flexibility beyond the limitations of solely using any one of these types.
Read more on using History to implement SCD
enable history & Natural Key
enable SCD: Type I & II
Enable Type I and II for Customer table
Surrogate & Supernatural Keys
go to Table of Contents
what Are surrogate keys?
A surrogate key is a replacement for the primary key in a table. It usually represents a unique row number and is commonly used in referencing a specific record in another table. In TimeXtender, all tables in the data warehouse instance have a field named "DW_ID" which represents the surrogate key in each respective table. The use of surrogate key is a common practice to implement Slowly Changing Dimension.
Create the surrogate key in mdw
supernatural keys
Supernatural keys in TimeXtender are durable and independent keys that remain unchanged, unlike natural keys such as customer numbers. To create a supernatural key, a hashing algorithm is applied to other fields in the table, and the resulting hash value is compared to the values in a key store table. If the hash exists, the corresponding key value is returned; otherwise, a new key value is generated. The key store is used to tie the supernatural keys together and should be used consistently for related tables. You can choose to read-only from the key store if you don't want to create a new entry. Using a different hashing algorithm may affect performance and break existing data, so TimeXtender recommends using SHA-1 SQL Server 2005+ for most hashed fields.
Supernatural keys offer better performance, robustness, and quicker processing, and they can handle early arriving facts efficiently. However, when dealing with Slowly Changing Dimensions with Type II Fields, you must use DW_Id instead of supernatural keys.
add product_Key as supernatural key
Add a Supernatural Key Field
Track Changes
go to Table of Contents
track historical and current changes
Below example highlights tracking of Sales people data as it changes, and relate this historical data to Sales Order data.
Enable Type II dimensions
Use the History fields
for current sales people
Use the History fields
for historical sales people
exercise handle early arriving facts
Use what you've learned so far to complete the following exercise.
Section Quiz...
Planning to take the Solution Architect exam? Then you want to be sure these questions are easy to answer.
What table-level feature is used to Implement Slowly Changing Dimensions?
What type of field can be used as a Surrogate key that can handle early arriving facts?
When implementing SCD Type II, what field can be used as a Surrogate Key?
When you're ready, see Answers Below
Section Quiz Answers
Planning to take the Solution Architect exam? Then you want to be sure these questions are easy to answer.
What table-level feature is used to Implement Slowly Changing Dimensions?
History
What type of field can be used as a Surrogate key that can handle early arriving facts?
Supernatural Key
When implementing SCD Type II, what field can be used as a Surrogate Key?
DW_ID
WANT TO LEARN EVEN MORE?
Learn even more about Slowly Changing Dimension from our TimeXtender Tuesdays
Congratulations! You've completed the training module
Slowly Changing DimensioNS