Pages

Monday, April 25, 2011

ETL Testing

ETL basically stands for Extract Transform Load - which simply implies the process where you extract data from Source Tables, transform them in to the desired format based on certain rules and finally load them onto Target tables. There are numerous tools that help you with ETL process - Informatica, Control-M being a few notable ones.

So ETL Testing implies - Testing this entire process using a tool or at table level with the help of test cases and Rules Mapping document.

In ETL Testing, the following are prima facie validated -
1) Data File loads from Source system on to Source Tables.
2) The ETL Job that is designed to extract data from Source tables and then move them to staging tables. (Transform process)
3) Data validation within the Staging tables to check all Mapping Rules / Transformation Rules are followed.
4) Data Validation within Target tables to ensure data is present in required format and there is no data loss from Source to Target tables.

check out Service Oriented Architecture (SOA) testing. This will help you better understand how Informatica works and how you should test them.
--------------------------------------------
ETL stands for extract, transform, and load. It can consolidate the scattered data for any organization while working with different departments. It can very well handle the data coming from different departments.
For example, a health insurance organization might have information on a customer in
several departments and each department might have that customer's information listed in a different way. The membership department might list the customer by name, whereas the claims department might list the customer by number. ETL can bundle all this data and consolidate it into a uniform presentation, such as for storing in a database or data warehouse.
ETL can transform not only data from different departments but also data from different sources altogether. For example, any organization is running its business on different environments like SAP and Oracle Apps for their businesses. If the higher management wants to take discussion on their business, they want to make the data integrated and used it for their reporting purposes. ETL can take these two source system data and make it integrated in to single format and load it into the tables.

Generally the normal testing steps are:
• Requirements Analysis
• Testing Methodologies
• Test Plans and approach
• Test Cases
• Test Execution
• Verification and Validation
• Reviews and Walkthroughs

The main difference in testing a ETL is that we basically involve the SQL queries in our test case documents. It is vital to test both the initial loads of the Data
Warehouse from the source i.e. when it gets extracted and then updating it on the target table i.e. the loading step. In specific cases, where trouble shooting is required, we verify intermediate steps as well.

A defect or bug detection can be appreciated if and only if it is detected early and is fixed at the right time without leading to a high cost. So to achieve it, it is very important to set some basic testing rules. They are:
• No Data losses
• Correct transformation rules
• Data validation
• Regression Testing
• Oneshot/ retrospective testing
• Prospective testing
• View testing
• Sampling
• Post implementation

No comments:

Post a Comment