Load Balancing and Auto Scaling
Load balancers and automatic scaling
Estimated reading time: 20 minutes
Elastic Load Balancer - Introduction
Load balancers distribute incoming traffic across multiple targets (EC2 instances, containers, IPs) in one or more Availability Zones.
ELB Types:
- Application Load Balancer (ALB): Layer 7 (HTTP/HTTPS), content-based routing
- Network Load Balancer (NLB): Layer 4 (TCP/UDP/TLS), ultra-high performance
- Gateway Load Balancer (GWLB): Layer 3 (IP), for virtual appliances
- Classic Load Balancer (CLB): Legacy, layer 4 and 7 (not recommended)
Benefits:
- Automatic multi-AZ high availability
- Health checks to detect unhealthy instances
- SSL/TLS termination
- Public/private traffic separation
- Automatic capacity scaling
- Integration with AWS services (Auto Scaling, CloudWatch, Route 53)
Key Points
- Load balancers are AWS-managed (highly available, scale automatically)
- ALB for HTTP/HTTPS; NLB for TCP/UDP where extreme performance is needed
- Health checks determine whether a target can receive traffic
- A load balancer operates within a single Region (not cross-Region)
- Targets can be spread across multiple AZs (recommended)
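The health-check behaviour described above can be sketched as a small toy model. This is not how ELB is implemented; the class and function names, and the threshold values (3 consecutive passes to become healthy, 2 consecutive failures to become unhealthy), are assumptions chosen for illustration. It shows that a target changes state only after several consecutive contradicting results, and that traffic is distributed only across targets currently marked healthy.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Target:
    name: str
    healthy: bool = True
    streak: int = 0  # consecutive results that contradict the current state

    def record_check(self, passed: bool, healthy_threshold: int = 3,
                     unhealthy_threshold: int = 2) -> None:
        """Flip state only after enough consecutive contradicting results."""
        if passed == self.healthy:
            self.streak = 0
            return
        self.streak += 1
        needed = healthy_threshold if passed else unhealthy_threshold
        if self.streak >= needed:
            self.healthy = passed
            self.streak = 0


def route(targets, n_requests):
    """Round-robin n_requests over the targets that are currently healthy."""
    healthy = [t for t in targets if t.healthy]
    if not healthy:
        return []
    picker = cycle(healthy)
    return [next(picker).name for _ in range(n_requests)]


targets = [Target("i-a"), Target("i-b"), Target("i-c")]
targets[1].record_check(False)
targets[1].record_check(False)  # second consecutive failure: marked unhealthy
print(route(targets, 4))        # i-b no longer receives traffic
```

Requiring consecutive results in both directions keeps a single slow response from removing a target, and keeps a flapping target from being put back into rotation too early.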
Application Load Balancer (ALB)
ALB operates at layer 7 (HTTP/HTTPS) and enables advanced content-based routing.
Features:
- Target Groups: Groups targets (EC2, ECS, Lambda, IP)
- Routing rules: Based on path, hostname, query strings, headers
- HTTP/2 and WebSocket support
- Redirects: Automatic HTTP to HTTPS
- Fixed response: Static responses without backend
Advanced Routing:
- Path-based: /api/* → Target Group A, /images/* → Target Group B
- Hostname-based: api.example.com → TG A, www.example.com → TG B
- Query string: ?platform=mobile → TG Mobile
- Headers: User-Agent: iPhone → TG iOS
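The routing conditions above can be modelled as an ordered rule list, where the first matching rule (lowest priority number) decides the target group. The sketch below is a simplified model, not the ALB's implementation: the Rule class, the target group names, and the use of shell-style wildcards via fnmatch are assumptions for illustration, and header names are matched exactly for brevity.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import Optional
from urllib.parse import parse_qs, urlsplit


@dataclass
class Rule:
    priority: int
    target_group: str
    path: Optional[str] = None   # e.g. "/api/*"
    host: Optional[str] = None   # e.g. "api.example.com"
    query: dict = field(default_factory=dict)    # key -> required value
    headers: dict = field(default_factory=dict)  # header name -> wildcard pattern

    def matches(self, host: str, url: str, headers: dict) -> bool:
        """A rule matches only if every condition it defines is satisfied."""
        parts = urlsplit(url)
        params = parse_qs(parts.query)
        if self.path and not fnmatchcase(parts.path, self.path):
            return False
        if self.host and not fnmatchcase(host.lower(), self.host):
            return False
        for key, value in self.query.items():
            if value not in params.get(key, []):
                return False
        for name, pattern in self.headers.items():
            if not fnmatchcase(headers.get(name, ""), pattern):
                return False
        return True


def pick_target_group(rules, host, url, headers, default="tg-default"):
    """Evaluate rules in priority order; fall back to the default action."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.matches(host, url, headers):
            return rule.target_group
    return default


RULES = [
    Rule(10, "tg-api", path="/api/*"),
    Rule(20, "tg-images", path="/images/*"),
    Rule(30, "tg-mobile", query={"platform": "mobile"}),
    Rule(40, "tg-ios", headers={"User-Agent": "*iPhone*"}),
]

# The path rule has the lowest priority number, so it wins over the query rule.
print(pick_target_group(RULES, "www.example.com", "/api/users?platform=mobile", {}))
```

Because evaluation stops at the first match, the order of priorities matters whenever a request could satisfy more than one rule.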
Use Cases:
- Microservices and container-based applications
- Lambda functions as targets
- Multiple applications on same instances (different paths)
- Intelligent routing based on request content
Key Points
- ALB can route to multiple target groups
- Health checks are configured at the target group level
- Supports SSL/TLS termination with ACM certificates
- Deregistration delay (connection draining) lets in-flight requests finish before a target is removed, preventing failed requests during scaling
- The X-Forwarded-For header contains the original client IP
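Because the ALB terminates the client connection, a backend sees the load balancer's address as the peer and has to read the client address from X-Forwarded-For. A minimal sketch of that lookup follows. It assumes the load balancer appends the address it observed as the last entry, that no other proxy sits between the ALB and the application, and that entries carry no port; the function name is hypothetical.

```python
import ipaddress


def client_ip_seen_by_alb(xff_header: str) -> str:
    """Return the rightmost X-Forwarded-For entry.

    Entries to the left were supplied by the client or by upstream proxies
    and can be spoofed; the rightmost one is the address the load balancer
    itself observed (under the assumptions stated above).
    """
    entries = [part.strip() for part in xff_header.split(",") if part.strip()]
    if not entries:
        raise ValueError("empty X-Forwarded-For header")
    candidate = entries[-1]
    ipaddress.ip_address(candidate)  # raises ValueError if not a valid IP
    return candidate


print(client_ip_seen_by_alb("10.0.0.1, 198.51.100.9"))
```

Taking the leftmost entry instead is a common mistake: a client can send its own X-Forwarded-For value, and that value ends up at the front of the list.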
Network Load Balancer (NLB)
NLB operates at layer 4 (TCP/UDP/TLS) and is designed to handle millions of requests per second with ultra-low latency.
Features:
- Ultra-high performance: Millions RPS, <100 μs latency
- Static IP: One static IP per AZ (optionally an Elastic IP)
- Preserves source IP: Targets see the real client IP (by default for instance targets)
- Protocols: TCP, UDP, TLS
- Zonal isolation: An AZ failure doesn't affect other AZs
Differences from ALB:
- NLB = Layer 4 (doesn't see HTTP content)
- ALB = Layer 7 (HTTP-based routing)
- NLB has much lower latency
- NLB preserves source IP (ALB uses X-Forwarded-For)
- NLB supports static IP (ALB only DNS)
Use Cases:
- Gaming (critical low latency)
- IoT (millions of TCP connections)
- Applications requiring static IP
- Extreme non-HTTP traffic (e.g., databases)
Key Points
- NLB for extreme performance and ultra-low latency
- ALB for intelligent HTTP/HTTPS routing
- NLB has static IPs, ideal for IP allowlisting
- Security groups can be attached to an NLB only if they are associated when it is created; otherwise traffic is filtered by the targets' security groups
- NLB can route to targets outside AWS (on-prem via IP)
SSL/TLS on Load Balancers
Load Balancers can terminate SSL/TLS, decrypting HTTPS traffic and sending HTTP to targets.
SSL Termination:
- HTTPS Listener: LB decrypts inbound traffic
- HTTP Backend: LB → Targets traffic unencrypted (private AWS network)
- HTTPS Backend: LB re-encrypts traffic to targets for end-to-end encryption (more CPU on targets)
Certificate Management:
- AWS Certificate Manager (ACM): Free certificates, automatic renewal
- Import: Upload third-party certificate
- SNI (Server Name Indication): Multiple SSL certificates on same listener
SNI (Server Name Indication):
- Enables multiple HTTPS sites with different certificates
- Client indicates hostname in the TLS handshake
- LB selects correct certificate
- Supported by: ALB, NLB, CloudFront (NOT Classic LB)
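The SNI steps above (client sends a hostname, the load balancer selects a certificate) can be sketched as a lookup. This is a toy model under stated assumptions: the certificate labels are hypothetical, the selection order (exact name first, then wildcard, then the listener's default certificate) is a simplification, and a wildcard is taken to cover exactly one leftmost label.

```python
def name_matches(cert_name: str, hostname: str) -> bool:
    """Compare a certificate name, possibly a wildcard, with a hostname."""
    cert_name, hostname = cert_name.lower(), hostname.lower()
    if cert_name.startswith("*."):
        # A wildcard covers exactly one leftmost label.
        head, _, rest = hostname.partition(".")
        return bool(head) and rest == cert_name[2:]
    return cert_name == hostname


def select_certificate(certs: dict, sni_hostname, default: str) -> str:
    """certs maps a certificate label to the list of names it covers."""
    if sni_hostname:
        wanted = sni_hostname.lower()
        # Prefer a certificate that lists the exact hostname.
        for label, names in certs.items():
            if any(name.lower() == wanted for name in names):
                return label
        # Otherwise accept a wildcard match.
        for label, names in certs.items():
            if any(name_matches(name, wanted) for name in names):
                return label
    # No SNI sent, or nothing matched: use the default certificate.
    return default


CERTS = {
    "cert-api": ["api.example.com"],
    "cert-wildcard": ["*.example.com"],
}
print(select_certificate(CERTS, "www.example.com", "cert-default"))
```

The fallback to a default certificate is why a listener still needs one certificate designated as the default even when SNI is in use.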
SSL Policies:
- Define which protocols and ciphers are allowed
- Recommended: TLS 1.2+, disable SSL 3.0, TLS 1.0, TLS 1.1
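What a policy that allows only TLS 1.2+ means for clients can be illustrated with a small negotiation model. The policy contents and function name below are assumptions for illustration, not the definition of any named AWS security policy, and cipher negotiation is left out.

```python
# Protocol versions from oldest to newest.
ORDER = ["SSLv3", "TLSv1.0", "TLSv1.1", "TLSv1.2", "TLSv1.3"]


def negotiate(policy_protocols, client_protocols):
    """Return the newest protocol both sides allow, or None if there is none."""
    common = set(policy_protocols) & set(client_protocols)
    if not common:
        return None  # handshake fails: the client is too old for this policy
    return max(common, key=ORDER.index)


POLICY_TLS12_PLUS = {"TLSv1.2", "TLSv1.3"}  # hypothetical policy contents

print(negotiate(POLICY_TLS12_PLUS, {"TLSv1.1", "TLSv1.2", "TLSv1.3"}))
print(negotiate(POLICY_TLS12_PLUS, {"TLSv1.0", "TLSv1.1"}))
```

The second call returns None, which is the trade-off of a strict policy: legacy clients that cannot speak TLS 1.2 are refused rather than downgraded.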
Key Points
- SSL termination reduces CPU load on backend instances
- ACM provides free certificates with automatic renewal
- SNI enables multiple domains with different certificates
- Classic LB: one certificate per LB (no SNI)
- ALB/NLB: multiple certificates with SNI
Auto Scaling Groups (ASG)
Auto Scaling Groups automatically adjust the number of EC2 instances based on demand.
Components:
- Launch Template/Configuration: Defines what to launch (AMI, type, storage, SG)
- Min/Max/Desired Capacity: Scaling limits
- Scaling Policies: When and how to scale
- Health Checks: EC2 (default) or ELB (recommended)
Scaling Types:
Manual Scaling:
- Manually change desired capacity
Scheduled Scaling:
- Based on predictable patterns (e.g., increase at 9am)
Dynamic Scaling:
- Target Tracking: Maintain metric at target value (e.g., CPU 50%)
- Step Scaling: Scale in steps based on CloudWatch alarms
- Simple Scaling: Legacy, one change per alarm with cooldown
Predictive Scaling:
- ML predicts future traffic
- Provisions instances proactively
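The arithmetic behind the two main dynamic policies can be sketched as follows. AWS does not publish the exact target tracking algorithm, so the proportional formula here is an assumption that only illustrates the idea; the function names and the step table are likewise hypothetical. Both results are meant to be clamped to the group's min and max size.

```python
import math


def target_tracking_desired(current, metric, target, min_size, max_size):
    """Scale capacity in proportion to how far the metric is from its target."""
    raw = math.ceil(current * metric / target)
    return max(min_size, min(max_size, raw))


def step_adjustment(breach, steps):
    """Pick the capacity change for a given alarm breach.

    steps is a list of (lower, upper, change) tuples, where the bounds are
    offsets from the alarm threshold; lower is inclusive, upper is exclusive,
    and upper=None means no upper limit.
    """
    for lower, upper, change in steps:
        if breach >= lower and (upper is None or breach < upper):
            return change
    return 0


# Target tracking: 4 instances at 80% CPU with a 50% target.
print(target_tracking_desired(4, 80, 50, min_size=1, max_size=10))

# Step scaling: alarm threshold 60% CPU, bigger breach means a bigger step.
STEPS = [(0, 10, 1), (10, 20, 2), (20, None, 4)]
print(step_adjustment(75 - 60, STEPS))
```

Target tracking needs only one number (the target), while step scaling requires the alarm and each step to be defined by hand, which is why the former is usually simpler to operate.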
Key Points
- ASG itself is free; you pay only for the instances it launches
- Launch Templates are newer and more flexible than Launch Configurations
- With ELB health checks enabled, the ASG replaces instances that fail the load balancer's health check
- The cooldown period prevents excessive scaling (default 300s)
- Target Tracking is the simplest policy type and the recommended one
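The effect of the cooldown period can be sketched with a small model of a simple scaling policy. The class and method names are hypothetical and time is passed in as plain seconds. An alarm that fires while a previous scaling action is still cooling down is ignored, so the group does not keep adding instances before the earlier ones have had time to absorb load.

```python
class SimpleScaler:
    """Toy model of simple scaling with a cooldown (default 300 seconds)."""

    def __init__(self, desired, min_size, max_size, cooldown=300):
        self.desired = desired
        self.min_size = min_size
        self.max_size = max_size
        self.cooldown = cooldown
        self.last_action_at = None  # time of the last applied scaling action

    def on_alarm(self, change, now):
        """Apply a capacity change unless a cooldown is in progress.

        Returns True if capacity changed, False if the alarm was ignored.
        """
        cooling = (self.last_action_at is not None
                   and now - self.last_action_at < self.cooldown)
        if cooling:
            return False
        new_desired = max(self.min_size, min(self.max_size, self.desired + change))
        if new_desired == self.desired:
            return False  # already at the min or max limit
        self.desired = new_desired
        self.last_action_at = now
        return True


asg = SimpleScaler(desired=2, min_size=1, max_size=5)
print(asg.on_alarm(+1, now=0), asg.desired)    # applied
print(asg.on_alarm(+1, now=120), asg.desired)  # ignored: still cooling down
print(asg.on_alarm(+1, now=300), asg.desired)  # applied: cooldown has elapsed
```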
Creating an Auto Scaling Group
# Create Launch Template
aws ec2 create-launch-template \
  --launch-template-name my-template \
  --version-description v1 \
  --launch-template-data '{
    "ImageId": "ami-0c55b159cbfafe1f0",
    "InstanceType": "t3.micro",
    "SecurityGroupIds": ["sg-0123456789abcdef0"]
  }'
# Create Auto Scaling Group
aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name my-asg \
  --launch-template LaunchTemplateName=my-template \
  --min-size 1 --max-size 5 --desired-capacity 2 \
  --vpc-zone-identifier "subnet-aaa,subnet-bbb" \
  --target-group-arns arn:aws:elasticloadbalancing:...