AWS provides two services, Elastic Load Balancing (ELB) and Auto Scaling Groups (ASG), that work in tandem to deliver high availability and scalability for applications running on EC2 instances.
Two scaling concepts underpin these services:
- Vertical Scaling: This involves increasing the resources of a single instance (e.g., upgrading CPU, RAM). While simpler to implement initially, it has limitations in terms of maximum capacity and can lead to single points of failure.
- Horizontal Scaling (Elasticity): This strategy involves distributing load across multiple smaller instances. ELB facilitates this distribution, while ASG automates the process of adding or removing instances based on demand or health, providing true elasticity.
Let's explore these critical services in detail.
Elastic Load Balancing (ELB): Your Traffic Manager in the Cloud
ELB acts as a single point of contact for your application traffic, distributing incoming requests across multiple backend instances (typically EC2 instances) to improve availability and fault tolerance. AWS offers different types of load balancers tailored to specific needs:
- ALB - Application Load Balancer: Operating at the HTTP/S layer (Layer 7 of the OSI model), the ALB is ideal for routing traffic to web applications. It can make intelligent routing decisions based on the content of the request, such as the hostname, path, or HTTP headers.
- NLB - Network Load Balancer: Operating at the transport layer (Layer 4), the NLB handles TCP (Transmission Control Protocol) and UDP (User Datagram Protocol) traffic and is designed for high-performance, low-latency applications. It can handle millions of requests per second and is well-suited for workloads like email services (TCP) and streaming or gaming applications (UDP). NLB health checks can use TCP, HTTP, or HTTPS.
- GWLB - Gateway Load Balancer: This specialized load balancer operates at the network layer (Layer 3) and is designed for deploying and managing third-party network virtual appliances, such as firewalls and intrusion detection and prevention systems. It uses the GENEVE (Generic Network Virtualization Encapsulation) protocol on port 6081 to forward traffic to these appliances.
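To make the ALB's content-based routing concrete, here is a minimal Python sketch of priority-ordered rule matching on hostname and path. It is a toy model, not the ALB implementation: the `Rule` class, the `route` function, and the target group names are hypothetical, and `fnmatchcase` only approximates the ALB's `*` and `?` wildcards.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List, Optional


@dataclass
class Rule:
    priority: int                 # lower numbers are evaluated first
    target_group: str             # hypothetical target group name
    host: Optional[str] = None    # e.g. "api.example.com" or "*.example.com"
    path: Optional[str] = None    # e.g. "/images/*"

    def matches(self, host: str, path: str) -> bool:
        # A condition that is not set (None) matches every request.
        host_ok = self.host is None or fnmatchcase(host.lower(), self.host.lower())
        path_ok = self.path is None or fnmatchcase(path, self.path)
        return host_ok and path_ok


def route(rules: List[Rule], host: str, path: str, default: str = "tg-default") -> str:
    """Return the target group of the first matching rule, else the default."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.matches(host, path):
            return rule.target_group
    return default


RULES = [
    Rule(priority=10, target_group="tg-api", host="api.example.com"),
    Rule(priority=20, target_group="tg-images", path="/images/*"),
]

print(route(RULES, "api.example.com", "/v1/users"))        # matched by the host rule
print(route(RULES, "www.example.com", "/images/cat.png"))  # matched by the path rule
print(route(RULES, "www.example.com", "/"))                # falls through to the default
```

A Layer 4 load balancer such as the NLB never sees the hostname or path, which is why this kind of rule is only available on the ALB.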
Key ELB Features:
- Sticky Sessions: Also known as session affinity, this feature binds a client's session to a specific backend target. With an ALB, stickiness relies on a cookie, either a duration-based cookie that the load balancer generates or an application-based cookie that the target generates, so that subsequent requests from the same client are routed to the same target. This is useful for applications that keep session state locally on an instance, though it can lead to uneven load across targets.
- Cross-Zone Load Balancing: This feature controls whether each load balancer node distributes traffic only to targets in its own Availability Zone or across targets in all enabled Availability Zones.
  - Application Load Balancer: Cross-zone load balancing is enabled by default at the load balancer level (though it can be disabled at the Target Group level). There are no charges for inter-AZ data transfer when using an ALB with cross-zone load balancing enabled. With it enabled, each load balancer node distributes traffic evenly across all registered instances in all enabled AZs. Without it, each load balancer node only distributes traffic to instances within its own AZ.
  - Network and Gateway Load Balancers: Cross-zone load balancing is disabled by default. If you enable it, you incur charges for inter-AZ data transfer. As with the ALB, enabling it makes each load balancer node distribute traffic evenly across all registered instances in all enabled AZs. Without it, traffic is only distributed to instances within the same AZ as the load balancer node.
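The effect of cross-zone load balancing is easiest to see with a small calculation. The sketch below assumes one load balancer node per Availability Zone and that clients spread their requests evenly across those nodes. The function name and AZ labels are illustrative only.

```python
from typing import Dict


def traffic_share_per_instance(instances_per_az: Dict[str, int], cross_zone: bool) -> Dict[str, float]:
    """Return, per AZ, the fraction of total traffic each instance in that AZ receives.

    Assumes one load balancer node per AZ and an even split of client
    traffic across nodes. AZs without instances are ignored for simplicity.
    """
    active = {az: n for az, n in instances_per_az.items() if n > 0}
    if cross_zone:
        # Every node spreads its traffic over all instances in all AZs.
        total = sum(active.values())
        return {az: 1 / total for az in active}
    # Each node keeps its share of traffic inside its own AZ.
    per_node = 1 / len(active)
    return {az: per_node / n for az, n in active.items()}


fleet = {"az-a": 2, "az-b": 8}
print(traffic_share_per_instance(fleet, cross_zone=True))   # 10% per instance everywhere
print(traffic_share_per_instance(fleet, cross_zone=False))  # 25% each in az-a, 6.25% each in az-b
```

With an unbalanced fleet, disabling cross-zone load balancing leaves the two instances in `az-a` carrying four times the per-instance load of those in `az-b`.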
Securing Your Connections with SSL/TLS
- TLS (Transport Layer Security) and its now-deprecated predecessor SSL (Secure Sockets Layer) are cryptographic protocols that encrypt connections between clients (e.g., web browsers) and servers, ensuring data confidentiality and integrity.
- SNI (Server Name Indication): This extension to the TLS protocol lets a client specify the hostname it is trying to reach at the beginning of the TLS handshake. The server can then present the correct certificate for the requested hostname, which makes it possible to serve multiple certificates from a single endpoint with a single IP address. SNI is supported by the newer-generation load balancers (ALB and NLB) and by CloudFront. The older Classic Load Balancer does not support it.
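As a rough illustration of what SNI enables on the server side, the following sketch picks a certificate by the hostname the client sent. It is a simplified model under stated assumptions: the function name and certificate labels are hypothetical, and a real load balancer also weighs factors such as the key types the client supports.

```python
from typing import Dict, Optional


def select_certificate(server_name: Optional[str],
                       certificates: Dict[str, str],
                       default_cert: str) -> str:
    """Pick a certificate label for the hostname sent in the TLS ClientHello.

    `certificates` maps exact hostnames or single-label wildcards
    ("*.example.org") to certificate labels.
    """
    if not server_name:
        return default_cert  # the client sent no SNI extension
    name = server_name.lower()
    if name in certificates:
        return certificates[name]  # an exact match wins
    # A wildcard covers exactly one extra left-most label.
    _, _, parent = name.partition(".")
    if parent:
        return certificates.get("*." + parent, default_cert)
    return default_cert


CERTS = {"www.example.com": "cert-www", "*.example.org": "cert-org-wildcard"}
print(select_certificate("www.example.com", CERTS, "cert-default"))
print(select_certificate("shop.example.org", CERTS, "cert-default"))
print(select_certificate(None, CERTS, "cert-default"))
```

Without SNI the server would have no hostname to look up at this point in the handshake, so it could only ever return the default certificate.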
Managing Instance Transitions with Deregistration Delay
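- For Application and Network Load Balancers, the Deregistration Delay setting (called connection draining on the Classic Load Balancer) specifies how long the load balancer waits for in-flight requests to complete when an instance is being deregistered, for example when you remove it manually or an Auto Scaling Group terminates it. During this window the load balancer stops sending new requests to the instance. The default is 300 seconds, and the value can be set between 0 and 3600 seconds. This gives a smoother transition and avoids interrupting user requests.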
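The draining behaviour can be sketched as a small state model. This is an illustration only: the `Target` class and its method names are hypothetical, time is passed in explicitly rather than read from a clock, and the early-completion rule follows the documented ALB behaviour of finishing deregistration once no requests remain in flight.

```python
from dataclasses import dataclass
from typing import Optional

DEREGISTRATION_DELAY = 300.0  # seconds; matches the AWS default


@dataclass
class Target:
    name: str
    in_flight: int = 0                      # requests currently being served
    draining_since: Optional[float] = None  # None while the target is in service

    def accepts_new_requests(self) -> bool:
        # A draining target receives no new requests.
        return self.draining_since is None

    def start_deregistration(self, now: float) -> None:
        self.draining_since = now

    def is_fully_deregistered(self, now: float,
                              delay: float = DEREGISTRATION_DELAY) -> bool:
        if self.draining_since is None:
            return False
        # Done as soon as nothing is in flight, or when the delay expires.
        return self.in_flight == 0 or now - self.draining_since >= delay


target = Target("i-hypothetical", in_flight=2)
target.start_deregistration(now=0.0)
print(target.accepts_new_requests())            # False: draining has started
print(target.is_fully_deregistered(now=10.0))   # False: requests still in flight
target.in_flight = 0
print(target.is_fully_deregistered(now=10.0))   # True: nothing left to wait for
```

A shorter delay speeds up deployments and scale-in, while a longer one protects slow requests such as large uploads.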
Auto Scaling Group (ASG): Dynamic Capacity Management
An Auto Scaling Group (ASG) automates the process of maintaining a desired number of EC2 instances and automatically adjusts capacity based on defined scaling policies, instance health, and schedules. This ensures both high availability (by replacing unhealthy instances) and scalability (by adding or removing instances to handle fluctuating demand).
- Scale Out (Increase): When demand increases (e.g., high CPU utilization, increased request latency), the ASG can automatically launch new EC2 instances to handle the additional load.
- Scale In (Decrease): When demand decreases, the ASG can automatically terminate instances to optimize costs.
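The scale-out and scale-in decisions above can be approximated with a proportional rule that keeps a metric such as average CPU utilization near a target value. AWS does not publish the exact algorithm its scaling policies use, so treat the formula, the function name, and the parameter names below as assumptions made for illustration.

```python
import math


def desired_capacity(current: int, metric_value: float, target_value: float,
                     min_size: int, max_size: int) -> int:
    """Estimate the instance count that brings the metric back to its target.

    Uses a proportional rule (an assumption, not the AWS algorithm):
    capacity scales with metric_value / target_value, rounded up, and is
    then clamped to the group's minimum and maximum size.
    """
    if current <= 0 or target_value <= 0:
        raise ValueError("current and target_value must be positive")
    proposed = math.ceil(current * metric_value / target_value)
    return max(min_size, min(max_size, proposed))


# 4 instances at 80% average CPU with a 50% target: scale out to 7.
print(desired_capacity(4, 80.0, 50.0, min_size=2, max_size=10))
# 4 instances at 20% average CPU: scale in, but never below the minimum of 2.
print(desired_capacity(4, 20.0, 50.0, min_size=2, max_size=10))
```

The clamp to `min_size` and `max_size` mirrors the minimum and maximum capacity limits that bound an ASG regardless of what a policy requests.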
By combining the traffic distribution of ELB with the dynamic capacity management of ASG, you can build resilient, scalable applications on AWS that handle both predictable and unexpected changes in traffic while maintaining performance and availability. This combination is a cornerstone of modern cloud-native architectures.