Databases Intermediate

What is ETL (Extract, Transform, Load)?

A data pipeline process that extracts data from sources, transforms it into a suitable format, and loads it into a destination system.

ETL processes move data between systems. Extract pulls data from databases, APIs, files, or streams. Transform cleans, validates, enriches, and restructures the data. Load writes the processed data to a data warehouse or target system.
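The three stages can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a production pipeline: the `RAW_CSV` data, the `orders` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source data; in practice this would come from a file, API, or database.
RAW_CSV = """id,name,amount
1, Alice ,10.50
2,Bob,not_a_number
3,Carol,7
"""

def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: trim whitespace, convert types, and drop invalid rows."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # validation: skip rows whose amount is not numeric
        clean.append((int(row["id"]), row["name"].strip(), amount))
    return clean

def load(conn, records):
    """Load: write the processed records to the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()

def run_etl(conn, text=RAW_CSV):
    """Run the stages in order: extract, then transform, then load."""
    load(conn, transform(extract(text)))

# SQLite in memory stands in for the target system (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
run_etl(conn)
rows = conn.execute("SELECT id, name, amount FROM orders ORDER BY id").fetchall()
```

Here the row with a non-numeric amount is dropped during the transform step, so only validated records reach the target table. A real pipeline would more likely log or quarantine rejected rows than silently discard them.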

Modern variations include ELT (load raw data first, then transform it inside the warehouse) and real-time streaming pipelines. Common tools include Apache Airflow (workflow orchestration), dbt (transform layer), Apache Spark (distributed processing), and cloud services like AWS Glue. ETL is fundamental to data warehousing and analytics.
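For contrast, an ELT flow might look like the sketch below: the raw data is loaded untouched, and the cleaning and aggregation happen afterwards as SQL inside the target database. The `raw_events` and `user_totals` tables and the `run_elt` function are hypothetical names, and in-memory SQLite again stands in for a warehouse.

```python
import sqlite3

def run_elt(conn, raw_events):
    """Load raw rows first, then transform them with SQL in the database."""
    # Load: store the raw data as-is (all text), with no cleaning yet.
    conn.execute("CREATE TABLE raw_events (user_id TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_events VALUES (?, ?)", raw_events)
    # Transform: build a cleaned, aggregated table inside the "warehouse".
    conn.execute(
        """
        CREATE TABLE user_totals AS
        SELECT TRIM(user_id) AS user_id,
               SUM(CAST(amount AS REAL)) AS total
        FROM raw_events
        WHERE TRIM(amount) <> ''
        GROUP BY TRIM(user_id)
        """
    )
    conn.commit()

# Hypothetical raw input with stray whitespace and an empty amount.
elt_conn = sqlite3.connect(":memory:")
run_elt(elt_conn, [("u1", "5.0"), (" u1", "2.5"), ("u2", "4"), ("u2", "")])
totals = elt_conn.execute(
    "SELECT user_id, total FROM user_totals ORDER BY user_id"
).fetchall()
```

Because the raw table is preserved, the transform can be rewritten and re-run later without extracting from the source again, which is one common reason for choosing ELT.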

Related Terms

Database Constraint
Rules enforced by the database to maintain data integrity, including NOT NULL, UNIQUE, CHECK, PRIMARY KEY, and FOREIGN KEY.
NoSQL
A category of databases that store data in non-tabular formats, optimized for specific data models and access patterns.
Transaction
A sequence of database operations treated as a single unit: either all succeed or all are rolled back.
Replication
The process of copying and maintaining database data across multiple servers for redundancy, failover, and read scaling.
Index
A data structure that improves the speed of data retrieval operations on database tables at the cost of additional storage.
Write-Ahead Log (WAL)
A technique where changes are first written to a log before being applied to the database, ensuring crash recovery and data integrity.