What is Observability?
The ability to understand a system's internal state from its external outputs through metrics, logs, and traces.
Observability rests on three pillars: Metrics (numerical measurements over time — CPU, request rate, error rate), Logs (discrete events with context — error messages, audit trails), and Traces (request flow across services — distributed tracing).
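The three signals above can be sketched for a single toy request handler. This is a minimal illustration using only the Python standard library, not any particular vendor's API: the in-memory sinks (`metrics`, `logs`, `spans`) and the helpers `record_span` and `handle_request` are hypothetical names invented for this example, and a real system would export the data to a backend instead of holding it in lists.

```python
import json
import time
import uuid
from collections import Counter

# Hypothetical in-memory sinks; a real system would export these to a backend.
metrics = Counter()  # metrics: numeric counts aggregated over time
logs = []            # logs: discrete structured events with context
spans = []           # traces: timed spans that share one trace_id


def record_span(trace_id, name, start, end, parent=None):
    """Store one span; spans with the same trace_id form one trace."""
    spans.append({
        "trace_id": trace_id,
        "name": name,
        "parent": parent,
        "duration_ms": (end - start) * 1000,
    })


def handle_request(path, fail=False):
    """Toy request handler that emits all three signals."""
    trace_id = uuid.uuid4().hex
    start = time.perf_counter()
    metrics["requests_total"] += 1

    # Child span standing in for a call to a downstream service.
    db_start = time.perf_counter()
    status = 500 if fail else 200
    record_span(trace_id, "db.query", db_start, time.perf_counter(),
                parent="http.request")

    if fail:
        metrics["errors_total"] += 1
        # The log line carries the trace_id so it can be joined to the trace.
        logs.append(json.dumps({"level": "error", "msg": "query failed",
                                "path": path, "trace_id": trace_id}))

    record_span(trace_id, "http.request", start, time.perf_counter())
    return status, trace_id


handle_request("/orders")
status, trace_id = handle_request("/orders", fail=True)

error_rate = metrics["errors_total"] / metrics["requests_total"]
print(f"error rate: {error_rate:.0%}")  # prints "error rate: 50%"
```

The metric shows that something is wrong (half the requests fail), the log entry says what went wrong, and the shared `trace_id` links that log entry to the spans showing where in the request path it happened.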
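OpenTelemetry is a widely adopted, vendor-neutral standard for instrumentation. Common tools include Prometheus (metrics), the ELK stack or Loki (logs), Jaeger or Zipkin (traces), and Datadog or New Relic (all-in-one platforms). Observability goes beyond monitoring: monitoring checks for failure conditions you defined in advance, while observability helps you answer questions you did not anticipate.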