To handle ever-changing data and thereby ensuring accurate reporting and analysis, Slowly Changing Dimensions (SCDs) are a dynamic solution, ensuring accuracy in reporting and analysis by effectively handling evolving dimension attributes over time.
Slowly Changing Dimensions (SCD) are essential in data warehousing to track historical data that changes gradually over time.
A good exampleis customer addresses - when a customer moves to a new location, and you need to update their billing address in your ERP or CRM system, the old address data may be lost if the source system doesn't retain it. This can be a challenge if you want to analyze historical data and see what a customer's address was at a certain point in time. The problem is that we want to keep both the original address, the current address, (and any additional addresses in between) due to business rules and deeper understanding of the customer, as a dynamic, and evolving entity.
That's where an SCD comes in - it allows you to keep a record of the customer's address history by creating a new "version" of the customer record that includes both the old and new addresses. This means you can track changes over time and create accurate reports based on historical data.
TimeXtender makes handling SCD a breeze for data engineers and analysts to keep track of historical data and create accurate reports. Using History is how we implement SCD in TimeXtender. Surrogate keys are used to ensure uniqueness, particularly when dealing with Slowly Changing Dimensions in data warehousing.
history & Types of SCD
History is a feature that allows for "tracking changes" for tables. When the history feature is activated for a table in TimeXtender, it compares the incoming records from the data source with the ones already present in the data warehouse. If a new record is found, it is added to the warehouse. If a record with the same key exists, the system compares hashed versions of the two records to identify changes. If there are no changes, nothing is updated. However, if changes are detected, the system takes different actions depending on the type of field that has been changed. TimeXtender supports the following field types.
Types of Slowly Changing Dimensions (SCD)
Type 0 (ignore): Fields that are basically ignored by the history logic.
Type I (update record): Fields that are overwritten with new data when data changes in the source. This means that there will be no history of the change.
Type II (insert new record): Fields that will cause a new record to be created, thus creating history about the change.
Hybrid Types: hybrid types (Type 6 or 7) can be accomplished by combining Type I, Type II, and Type III, providing more flexibility beyond the limitations of solely using any one of these types.
In the DSA right click Customer table and choose Table Settings.
Go to tab History and set a tick in the Enable History box.
Go to tab Data Extraction and remove the tick at "Truncate valid table before data cleansing" and click OK button.
Expand the table and notice the History Settings folder with new fields and the History Settings pane to the right. CustomerID is automatically chosen as Natural Key, because it is the primary key in the table.
Deploy and Execute the Customer table without making any changes to the settings. This ensures no rows will be deleted from the table and there will be only one row per occurrence of each Natural Key value.
enable SCD: Type I & II
In the DSA expand Customer table and select History Setting.
Select Customer Name as Type I fields and check the checkbox for EmailAddress as Type II fields or select as desired
A surrogate key is a replacement for the primary key in a table. It usually represents a unique row number and is commonly used in referencing a specific record in another table. In TimeXtender, all tables in the data warehouse instance have a field named "DW_ID" which represents the surrogate key in each respective table. The use of surrogate key is a common practice to implement Slowly Changing Dimension.
Create the surrogate key in mdw
Drag the SalesPerson table from the DSA into the MDW.
Drag the SalesPerson.DW_Id from the DSA onto the SalesPerson table in the MDW.
Open Edit Field on the SalesPerson DW_Id and rename it SalesPersonKey.
Supernatural keys in TimeXtender are durable and independent keys that remain unchanged, unlike natural keys such as customer numbers. To create a supernatural key, a hashing algorithm is applied to other fields in the table, and the resulting hash value is compared to the values in a key store table. If the hash exists, the corresponding key value is returned; otherwise, a new key value is generated. The key store is used to tie the supernatural keys together and should be used consistently for related tables. You can choose to read-only from the key store if you don't want to create a new entry. Using a different hashing algorithm may affect performance and break existing data, so TimeXtender recommends using SHA-1 SQL Server 2005+ for most hashed fields.
Supernatural keys offer better performance, robustness, and quicker processing, and they can handle early arriving facts efficiently. However, when dealing with Slowly Changing Dimensions with Type II Fields, you must use DW_Id instead of supernatural keys.
add product_Key as supernatural key
Add a Supernatural Key Field
In the DSA right click Product table and choose Add Supernatural Key Field
Type Product_Key in the Name box
Click the Add button next to Key store box
In the Name box type Product
Optionally, select a case-insensitive Hashing Algorithm, if the source database is also case-insensitive.
Leave the remaining options at their default values and click OK button
In the Business key selection list click ProductID and the Add button followed by OK button
Notice the new field Product_Key and its datatype.