Best Practices in ETL
ETL Process: –
- An ETL tool is used to extract data from different data sources, transform the data, and load it into a DW system
- Extracting the Data
It involves extracting the data from different heterogeneous data sources. Data
extraction from a transactional system varies as per the requirement and the ETL tool in
use. It is normally done by running scheduled jobs in off-business hours like running
jobs at night or over the weekend.
- Transforming the Data
It involves transforming the data into a suitable format that can be easily loaded into a
DW system. Data transformation involves applying calculations, joins, and defining
primary and foreign keys on the data. For example, if you want % of total revenue which
is not in database, you will apply % formula in transformation and load the data.
- Loading the Data into a DW System
It involves loading the data into a DW system for analytical reporting and information.
The target system can be a simple delimited flat file or a data warehouse.
- ETL testing starts with understanding the business requirements till the generation of a summary report.
The common steps under ETL Testing lifecycle are listed below:
- Understanding the business requirement
- Validation of the business requirement
- Test Estimation is used to provide the estimated time to run test-cases and to
- complete the summary report.
- Test Planning involves finding the Testing technique based on the inputs as per
- business requirement.
- Creating test scenarios and test cases
- Once the test-cases are ready and approved, the next step is to perform pre-execution check
- Execute all the test-cases.
- The last step is to generate a complete summary report and file a closure process.
ETL testing is categorized into four different engagements: –
- New Data Warehouse Testing – New DW is built and verified from scratch. Data input is taken from customer
requirements and different data sources and new data warehouse is build and verified with the help of ETL
2. Migration Testing– In this type of project customer will have an existing DW and ETL performing the job but
they are looking to bag new tool to improve efficiency.
3. Change Request– In this type of project new data is added from different sources to an existing DW. Also,
there might be a condition where customer needs to change their existing business rule, or they might integrate
the new rule.
4. Report Testing– Report are the result of any Data Warehouse and the basic propose for which DW is build.