
Databases Intermediate

What is ETL (Extract, Transform, Load)?

A data pipeline process that extracts data from sources, transforms it into a suitable format, and loads it into a destination system.

ETL processes move data between systems. Extract pulls data from databases, APIs, files, or streams. Transform cleans, validates, enriches, and restructures the data. Load writes the processed data to a data warehouse or target system.
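The three stages can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library: the in-memory CSV, the `orders` table, and the function names `extract`, `transform`, `load`, and `run_etl` are assumptions made for the example, not part of any particular ETL tool.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real pipeline would read a file, API, or database.
RAW_CSV = """id,name,amount
1, Alice ,10.50
2,Bob,not_a_number
3,carol,7
"""


def extract(text):
    """Extract: read rows from the CSV source into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: clean names, validate amounts, and drop invalid rows."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # validation: skip rows whose amount is not numeric
        cleaned.append((int(row["id"]), row["name"].strip().title(), amount))
    return cleaned


def load(rows, conn):
    """Load: write the processed rows to the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn):
    """Run extract, transform, and load in order, then return the loaded rows."""
    load(transform(extract(RAW_CSV)), conn)
    return conn.execute("SELECT id, name, amount FROM orders ORDER BY id").fetchall()


if __name__ == "__main__":
    print(run_etl(sqlite3.connect(":memory:")))
```

Here the invalid row is discarded before anything reaches the target, which is the defining trait of ETL: only transformed data is loaded.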

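Modern variations include ELT, which loads raw data first and transforms it inside the warehouse, and real-time streaming pipelines, which process records continuously rather than in scheduled batches. Common tools include Apache Airflow (workflow orchestration), dbt (the transform layer), Apache Spark (distributed processing), and managed cloud services such as AWS Glue. ETL is fundamental to data warehousing and analytics.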
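For contrast, an ELT flow might look like the sketch below, with SQLite standing in for the warehouse. The table names `raw_orders` and `orders`, the sample rows, and the `run_elt` function are hypothetical, and the `GLOB` filter is a deliberately crude numeric check chosen for brevity.

```python
import sqlite3

# Hypothetical raw records, kept as untyped text exactly as received.
RAW_ROWS = [
    ("1", " Alice ", "10.50"),
    ("2", "Bob", "not_a_number"),
    ("3", "carol", "7"),
]


def run_elt(conn):
    """Load raw data first, then transform it with SQL inside the database."""
    # Load: store the raw rows unchanged in a staging table.
    conn.execute("CREATE TABLE raw_orders (id TEXT, name TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", RAW_ROWS)

    # Transform: cast and clean in SQL, keeping only rows whose amount
    # starts with a digit (a simplistic validation rule for this sketch).
    conn.execute(
        """
        CREATE TABLE orders AS
        SELECT CAST(id AS INTEGER) AS id,
               TRIM(name) AS name,
               CAST(amount AS REAL) AS amount
        FROM raw_orders
        WHERE amount GLOB '[0-9]*'
        """
    )
    conn.commit()
    return conn.execute("SELECT id, name, amount FROM orders ORDER BY id").fetchall()


if __name__ == "__main__":
    print(run_elt(sqlite3.connect(":memory:")))
```

Unlike the ETL ordering, the raw rows remain available in the staging table, so the transformation can be revised and rerun without extracting from the source again.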

Related Terms

Graph Database
A database that uses graph structures with nodes, edges, and properties to store and query highly connected data.
Connection String
A formatted string containing all parameters needed to establish a connection to a database server.
Database Constraint
Rules enforced by the database to maintain data integrity, including NOT NULL, UNIQUE, CHECK, PRIMARY KEY, and FOREIGN KEY.
MVCC (Multi-Version Concurrency Control)
A technique where the database maintains multiple versions of data so that concurrent reads and writes do not block each other.
Stored Procedure
A precompiled collection of SQL statements stored in the database that can be executed as a single unit.
Database Index Types
Different index structures (B-tree, Hash, GIN, GiST, BRIN) optimized for various query patterns and data types.