Databases Intermediate

What is ETL (Extract, Transform, Load)?

A data pipeline process that extracts data from sources, transforms it into a suitable format, and loads it into a destination system.

ETL processes move data between systems. Extract pulls data from databases, APIs, files, or streams. Transform cleans, validates, enriches, and restructures the data. Load writes the processed data to a data warehouse or target system.
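The three stages can be sketched as three small functions. This is only an illustrative sketch, not a reference implementation: the sample CSV, the `orders` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for the target system.

```python
import csv
import io
import sqlite3

# Hypothetical sample source data; a real pipeline would read a file, API, or database.
RAW_CSV = """id,name,amount
1, Alice ,10.50
2,Bob,not-a-number
3,carol,7
"""


def extract(text):
    """Extract: read rows from a CSV source (an in-memory string here)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim and normalize names, and drop rows with an invalid amount."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # validation step: skip rows whose amount does not parse
        cleaned.append((int(row["id"]), row["name"].strip().title(), amount))
    return cleaned


def load(rows, conn):
    """Load: write the processed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline(conn):
    """Run extract, transform, and load in order, then return the loaded rows."""
    load(transform(extract(RAW_CSV)), conn)
    return conn.execute("SELECT id, name, amount FROM orders ORDER BY id").fetchall()


if __name__ == "__main__":
    print(run_pipeline(sqlite3.connect(":memory:")))
```

Keeping each stage as a separate function makes it possible to test the transform logic without touching the source or the destination.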

Modern variations include ELT (load raw data first, then transform it inside the warehouse) and real-time streaming pipelines. Common tools include Apache Airflow (workflow orchestration), dbt (the transform layer), Apache Spark (distributed processing), and managed cloud services such as AWS Glue. ETL is fundamental to data warehousing and analytics.
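To contrast ELT with ETL, the sketch below loads raw values unchanged into a staging table and then transforms them with SQL inside the database. It is a minimal illustration under stated assumptions: SQLite stands in for a data warehouse, and the `raw_orders` and `orders` tables and the `run_elt` function are hypothetical names.

```python
import sqlite3


def run_elt(conn, raw_rows):
    """Load raw rows as-is, then transform them with SQL in the database."""
    # Load: every column is stored as untyped text, exactly as received.
    conn.execute("CREATE TABLE raw_orders (id TEXT, name TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", raw_rows)

    # Transform: cast, trim, and filter inside the database engine.
    # The GLOB pattern is a crude validity check that keeps only amounts
    # starting with a digit; real pipelines would validate more carefully.
    conn.execute(
        """
        CREATE TABLE orders AS
        SELECT CAST(id AS INTEGER) AS id,
               TRIM(name) AS name,
               CAST(amount AS REAL) AS amount
        FROM raw_orders
        WHERE amount GLOB '[0-9]*'
        """
    )
    conn.commit()
    return conn.execute("SELECT id, name, amount FROM orders ORDER BY id").fetchall()


if __name__ == "__main__":
    sample = [("1", " Alice ", "10.50"), ("2", "Bob", "n/a"), ("3", "carol", "7")]
    print(run_elt(sqlite3.connect(":memory:"), sample))
```

Because the raw table is preserved, the transformation can be rerun or revised later without extracting the data again, which is one common reason for choosing ELT.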

Related Terms

Prepared Statement
A pre-compiled SQL template that uses parameters instead of literal values, which helps prevent SQL injection and can improve performance for repeated queries.
Materialized View
A database object that stores the precomputed result of a query, offering faster reads at the cost of periodic refresh.
MVCC (Multi-Version Concurrency Control)
A technique where the database maintains multiple versions of data so that concurrent reads and writes do not block each other.
Crosstab Query
A query that transforms rows into columns, creating a pivot table view of aggregated data.
Time-Series Database
A database optimized for storing and querying timestamped data points like metrics, sensor readings, and event logs.
Database Connection Pooling
A technique that maintains a cache of database connections for reuse, reducing the overhead of creating new connections.