Training Module:
data quality
Learn how to implement data cleansing rules
and monitor overall data quality.
Table of ContentS
Data Quality
go to Table of Contents
primary key Validation
TimeXtender can automatically check or remove duplicate or empty primary keys and even show these results. By default, primary key behavior is set to error at the instance level, which will remove any records in a table with a null or duplicate primary key value. There are three types of Primary Key Behavior:
error
Invalid records will be removed from the valid table during execution. Each invalid record will be visible in the Errors report.
warning
Invalid records will remain in the valid table. Each invalid record will be visible in the Warnings report.
none
TimeXtender will not remove or flag duplicate or empty primary keys during execution.
instance Level Setting
Affects all tables in the entire Instance.
Table Level Setting
Overrides Instance level setting for a specific table.
Referential Integrity
TimeXtender can use the relations that are defined to perform a referential integrity check, or foreign key constraint check. During execution, TimeXtender compares the values or primary and foreign keys of a relation to ensure a matching value exists. If not, TimeXtender considers the record invalid. There are four options that define how invalid records may be handled:
error
Records without a related primary key will be removed from the valid table. Each invalid record will be visible in the Errors report.
error with physical relation
Similar to Error, However, Foreign Key Relation will be stored in the target database for other tools to see.
warning
Records without a related primary key will remain in the valid table. Each invalid record will be visible in the Warnings report.
relation
Relation will remain in place Referential integrity will not be enforced
instance Level Setting
Affects all tables in the entire Instance.
Table Level Setting
Overrides Instance level setting for a specific table.
Data Validation Rules
go to Table of Contents
validation rules
TimeXtender allows you to set validation rules which ensures a high level of accuracy and reliability of the data in the data warehouse and are used to discover invalid data. You can apply validation rules at the field level. You can make a validation rule conditional if you want the rule to apply in specific situations only.
Here is the list of operators supported for field validation:
For each validation rule you apply to a field, you must also classify the severity of a violation, as an Error or Warning.
Create a data validation rule
Be sure to run a Full Load
access warning and error reports
Data Selection Rules
go to Table of Contents
data selection rules
TimeXtender can filter out specific records loaded into a table using Data Selection Rules. This is similar to using a "WHERE" clause in a SQL statement. It is important to know that Data Selection Rules do not load any data that does not fit into the "rule" that is applied.
This option can be useful for reducing the overall amount of data that is loaded during the transfer step, which can reduce overall load times.
Here is the list of operators supported for data selection:
Create a data selection rule
To apply multiple rules as an "AND" click on the existing data selection statement and add a second rule.
To apply an "OR", right-click on the table and add another data selection rule.
Be sure to run a Full Load
Synchronize Changes
go to Table of Contents
Synchronize Data Types (MDW)
Synchronize Changes in Semantic Model
Synchronize for Data Type Changes
Learn more about Synchronizing Semantic Models.
Synchronize for Remapping TaBles and Fields
exercise TranSform Nulls
Use what you've learned so far to complete the following exercise.
Section Quiz...
Planning to take the Solution Architect exam? Then you want to be sure these questions are easy to answer.
Why might you not see records with duplicate primary keys in your table?
When you specify a validation rule on a field, what options are available to classify the severity of a violation?
How might Data Selection Rules be used to improve performance?
When you're ready, see Answers Below
Section Quiz Answers
Planning to take the Solution Architect exam? Then you want to be sure these questions are easy to answer.
Why might you not see records with duplicate primary keys in your table?
By default, primary key behavior is set to error at the instance level which removes any records with duplicate of null primary keys from the valid table.
When you specify a validation rule on a field, what options are available to classify the severity of a violation?
How might Data Selection Rules be used to improve performance?
Data Selection rules reduce the overall amount of data that is loaded during the transfer step, which can reduce overall load times.
Congratulations! You've completed the training module
Data quality