COURSE OVERVIEW

Photo of Triangle Shape Digital Wallpaper

Training Module:

data quality

Learn how to implement data cleansing rules

and monitor overall data quality.

Geometrical Illustration

Table of ContentS


Hexagons Background

Data Quality

primary key Validation

TimeXtender can automatically check or remove duplicate or empty primary keys and even show these results. By default, primary key behavior is set to error at the instance level, which will remove any records in a table with a null or duplicate primary key value. There are three types of Primary Key Behavior:

Error Sign Icon

er​ror

Invalid records will be removed from the valid table during execution. Each invalid record will be visible in the Errors report.

warning

wa​rning

Invalid records will remain in the valid table. Each invalid record will be visible in the Warnings report.

Circle

no​ne

TimeXtender will not remove or flag duplicate or empty primary keys during execution.

instance Level Setting

Affects all tables in the entire Instance.

  1. Right-click MDW instance > edit instance.
  2. Under primary key violation > Select Desired Option

Table Level Setting

Overrides Instance level setting for a specific table.

  1. Right-click your table > Table Settings
  2. In the General tab, select the Primary Key Behavior

Referential Integrity

TimeXtender can use the relations that are defined to perform a referential integrity check, or foreign key constraint check. During execution, TimeXtender compares the values or primary and foreign keys of a relation to ensure a matching value exists. If not, TimeXtender considers the record invalid. There are four options that define how invalid records may be handled:

Error Sign Icon

error

Records without a related primary key will be removed from the valid table. Each invalid record will be visible in the Errors report.

Error Sign Icon

error with physical relation

Similar to Error, However, Foreign Key Relation will be stored in the target database for other tools to see.

warning

wa​rning

Records without a related primary key will remain in the valid table. Each invalid record will be visible in the Warnings report.

Chain Link Icon

re​lation

Relation will remain in place Referential integrity will not be enforced

instance Level Setting

Affects all tables in the entire Instance.

  1. Right-click MDW instance > edit instance.
  2. Under Relation > Select Desired Option

Table Level Setting

Overrides Instance level setting for a specific table.

  1. At the bottom of a tables list of fields navigate to the desired table relation.
  2. Right-click the relation > Relationship Type.
  3. Select the type of relationship for your use case.
Hexagons Background

Data Validation Rules

validation rules

TimeXtender allows you to set validation rules which ensures ​a high level of accuracy and reliability of the data in the data ​warehouse and are used to discover invalid data. You can ​apply validation rules at the field level. You can make a ​validation rule conditional if you want the rule to apply in ​specific situations only.


Here is the list of operators supported for field validation:

  • Is Null, Is Not Null - Checks for if the field is Null or Not ​Null.
  • Is Empty, Is Not Empty - Checks for if field is empty or ​not.
  • Equal, Not Equal - Checks for if the field is equal or not.
  • Greater than, Greater or equal - Checks for field value ​greater than or greater or equal to the specified value.
  • Less than, Less or equal - Checks for field value less than ​or less or equal to the specified value.
  • In List, Not In List - Checks for field values in the list or ​not specified in the list.
  • Min. Length, Max Length - Checks for minimum or ​maximum length of the field value as specified.


For each validation rule you apply to a field, you must also classify the severity of a violation, as an Error or Warning.

  • Error means the violation is critical to the data quality and requires immediate attention. The data is considered invalid and will ​not be made available to the end users.
  • Warning means the violation is not critical to the data quality and does not require immediate attention. The data is considered ​valid and will still be made available to the end users.


Create a data validation rule

Be sure to run a Full Load

Circle Pin Icon
White Circle Vector
Cute Organic Light Bulb
  1. In DSA right-click the Customer.PhoneNumber table > Field Validations
    1. Operator: Is not empty
    2. Severity: Warning
    3. Click Add
  2. Deploy and Execute the Customer table

access warning and error reports

  1. Click Reports in the application menu > Warnings
  2. Click Reports in the application menu > Errors
Hexagons Background

Data Selection Rules

data selection rules

TimeXtender can filter out specific records loaded into a table using Data Selection Rules. This is similar to using a "WHERE" clause in a SQL statement. It is important to know that Data Selection Rules do not load any data that does not fit into the "rule" that is applied.


This option can be useful for reducing the overall amount of data that is loaded during the transfer step, which can reduce overall load times.

Here is the list of operators supported for data selection:

  • Is Null, Is Not Null - Checks for if the field is Null or Not Null.
  • Is Empty, Is Not Empty - Checks for if field is empty or not.
  • Equal, Not Equal - Checks for if the field is equal or not.
  • Greater than, Greater or equal - Checks for field value greater than or greater or equal to the specified value.
  • Less than, Less or equal - Checks for field value less than or less or equal to the specified value.
  • In List, Not In List - Checks for field values in the list or not specified in the list.
  • Min. Length, Max Length - Checks for minimum or maximum length of the field value as specified.
  • Like, Not like - Checks if a string contains a specified string (or, pattern).


Create a data selection rule

To apply multiple rules as an "AND" click on the existing data selection statement and add a second rule.

To apply an "OR", right-click on the table and add another data selection rule.

  1. In MDW right-click the Product table > Add Data Selection Rule
    1. Select FinishedGoodsFlag
    2. Operator: Equal
    3. Value: 1
    4. Click Add
  2. Deploy and Execute

Be sure to run a Full Load

Circle Pin Icon
White Circle Vector
Cute Organic Light Bulb
  1. Add Field Transformation to Product.ProductModelName
    1. Select: ProductModelName
    2. Operator > Fixed Value
    3. Value: None
  2. Add Condition to Fixed value 'none'
    1. Select: ProductModelName
    2. Operator: Is empty
  3. Deploy and Execute
  4. Preview the Product table
    1. You can view ProductModelName with a value of None
Hexagons Background

Synchronize Changes

Synchronize Data Types (MDW)

  1. In MDW right-click the Data Warehouse, and select Advanced > Synchronize Data Types

Synchronize Changes in Semantic Model

Synchronize for Data Type Changes

  1. Right-click SSL instance (’Sales Dev’) and select Synchronize.
  2. Click OK.

Synchronize for Remapping TaBles and Fields

  1. Right-click SSL instance (’Sales Dev’) and select Synchronize with Remapping.
  2. Review the tables and fields, then click OK.

exercise TranSform Nulls

Use what you've learned so far to complete the following exercise.

  1. Use the "Transform Nulls to None" guide above to ensure the ProductCategory and ProductSubcategory Field values in the Product Table show "None" instead of NULL

Section Quiz...

Planning to take the Solution Architect exam? Then you want to be sure these questions are easy to answer.

Why might you not see records with duplicate primary keys in your table?

When you specify a ​validation rule on a ​field, what options are ​available to classify ​the severity of a ​violation?

How might Data Selection Rules be used to improve performance?

Brushstroke Arrow Smooth Curve Down Small

When you're ready, see Answers Below

Section Quiz Answers

Planning to take the Solution Architect exam? Then you want to be sure these questions are easy to answer.

Why might you not see records with duplicate primary keys in your table?





By default, primary key behavior is set to error at the instance level which removes any records with duplicate of null primary keys from the valid table.

When you specify a validation rule on a field, what options are available to classify the severity of a violation?



  1. Error
  2. Warning

How might Data Selection Rules be used to improve performance?





Data Selection rules reduce the overall amount of data that is loaded during the transfer step, which can reduce overall load times.

Congratulations! You've completed the training module

Data quality

give FEEDBACK

Thumbs Up Illustration
Thumbs Up Illustration