ETL vs. ELT: How Data Transformation Changes in the Cloud
The old saying “garbage in, garbage out” goes back more than 60 years in computing, but it's as relevant today as ever. ETL helps fix that age-old problem.
ETL stands for Extract/Transform/Load, and it’s essential in data warehousing. It’s a three-step process: extracting data from source databases, transforming the raw data into a format suitable for analysis, and loading the reformatted data into the data warehouse.
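To make the three steps concrete, here is a minimal sketch in Python. It rests on assumptions that are not from the article: the source rows, the `orders` table, and the function names are hypothetical, and an in-memory SQLite database stands in for a real data warehouse.

```python
import sqlite3

# Hypothetical raw rows, standing in for data pulled from a source database.
SOURCE_ROWS = [
    {"id": 1, "email": " Ada@Example.com ", "amount": "10.50"},
    {"id": 2, "email": "bob@example.com", "amount": "7"},
    {"id": 1, "email": "ada@example.com", "amount": "10.50"},  # duplicate of id 1
]


def extract():
    """Extract: read raw rows from the source (here, an in-memory list)."""
    return list(SOURCE_ROWS)


def transform(rows):
    """Transform: trim and lowercase emails, cast amounts, drop repeated ids."""
    seen = set()
    clean = []
    for row in rows:
        if row["id"] in seen:
            continue  # keep only the first row seen for each id
        seen.add(row["id"])
        clean.append(
            (row["id"], row["email"].strip().lower(), float(row["amount"]))
        )
    return clean


def load(conn, rows):
    """Load: write the already-cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(id INTEGER PRIMARY KEY, email TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


# SQLite is only a stand-in here (an assumption), not a real warehouse.
warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract()))
print(warehouse.execute("SELECT id, email, amount FROM orders ORDER BY id").fetchall())
```

In this sketch the transformation runs in application code before anything reaches the warehouse, so only two cleaned rows (one per id) are ever loaded.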
ETL is vital for data quality because it produces a “clean” version of the data: integrated across sources, joined, and deduplicated. And, significantly for anyone building a cloud data warehouse, the process works differently in the cloud.
With on-premises data warehouses, data transformation typically happens in a staging environment before the data is moved into the warehouse. In the cloud, transformation often (but not always) happens inside the cloud data warehouse itself, where there is plenty of processing power.
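The cloud pattern described here (ELT) can be sketched as follows: raw data is loaded untouched, and the cleanup then runs as SQL inside the warehouse. This is an illustration under stated assumptions, not a reference implementation. The `raw_orders` and `orders` tables, the sample rows, and the function names are hypothetical, and an in-memory SQLite database stands in for a cloud data warehouse.

```python
import sqlite3

# Hypothetical raw rows as they might arrive from a source system.
RAW_ROWS = [
    (1, " Ada@Example.com ", "10.50"),
    (2, "bob@example.com", "7"),
    (1, "ada@example.com", "10.50"),  # duplicate of id 1
]


def load_raw(conn, rows):
    """Load: copy the rows into the warehouse as-is, with no cleanup."""
    conn.execute("CREATE TABLE raw_orders (id INTEGER, email TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", rows)
    conn.commit()


def transform_in_warehouse(conn):
    """Transform: let the warehouse engine normalize and deduplicate via SQL."""
    conn.execute(
        """
        CREATE TABLE orders AS
        SELECT id,
               MIN(LOWER(TRIM(email)))   AS email,
               MIN(CAST(amount AS REAL)) AS amount
        FROM raw_orders
        GROUP BY id
        """
    )
    conn.commit()


# SQLite is only a stand-in here (an assumption), not a real cloud warehouse.
warehouse = sqlite3.connect(":memory:")
load_raw(warehouse, RAW_ROWS)      # all three raw rows land in the warehouse
transform_in_warehouse(warehouse)  # the cleanup runs where the data already lives
print(warehouse.execute("SELECT id, email, amount FROM orders ORDER BY id").fetchall())
```

The design point is where the work runs: the raw table keeps all three rows, including the duplicate, and the warehouse's own SQL engine derives the cleaned `orders` table from it.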
In other words, in the cloud, data transformation is the last step in the three-step pro…