What is Auto Scaling?
The practice of automatically adjusting the number of compute resources to match current demand, maintaining performance while keeping costs down.
Auto scaling adds servers when demand increases and removes them when demand decreases. This ensures applications handle traffic spikes without manual intervention and avoids paying for idle resources during quiet periods.
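Scaling can be horizontal (adding or removing instances) or vertical (resizing an instance's CPU or memory). Scaling policies act on metrics such as CPU utilization, request count, or queue depth. Cloud providers offer managed services for this, such as AWS Auto Scaling groups, Azure Virtual Machine Scale Sets, and Google Cloud managed instance groups.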
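To make the idea of a metric-based policy concrete, here is a minimal sketch of a horizontal scaling decision in Python. It assumes a proportional rule similar in spirit to target tracking: scale the instance count by the ratio of the observed metric to its target, then clamp to a configured range. The function name, parameters, and default limits are hypothetical and not taken from any provider's API.

```python
import math


def desired_instances(current: int, metric: float, target: float,
                      min_size: int = 1, max_size: int = 10) -> int:
    """Return how many instances a simple proportional policy would ask for.

    current  -- number of instances running now (hypothetical input)
    metric   -- observed average metric, e.g. CPU utilization in percent
    target   -- value the policy tries to keep the metric near
    min_size, max_size -- illustrative bounds on the group size
    """
    if current < 1 or target <= 0:
        raise ValueError("current must be >= 1 and target must be > 0")
    # Scale the fleet in proportion to how far the metric is from target,
    # rounding up so the group is never left under-provisioned.
    raw = math.ceil(current * metric / target)
    # Never go below the minimum or above the maximum group size.
    return max(min_size, min(max_size, raw))
```

For example, with 4 instances averaging 90% CPU against a 60% target, the sketch asks for 6 instances; at 20% CPU it asks for 2. Real policies typically add cooldown periods and tolerance bands so that short-lived fluctuations do not cause the group to scale back and forth.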