Understanding Advanced Load Balancing and System Design Concepts
The Role of DNS in Load Balancing and Scaling
1. Randomized IP Lists and the Waterfall Model
For global-scale products like Google or Facebook:
DNS does not return a single IP address but instead provides a randomized list of IPs for redundancy.
Under the waterfall model, the client first attempts to connect to the first IP in the list. If that attempt fails, the client retries with the next IP, which improves resilience and uptime.
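The sequential retry described above can be sketched as follows. This is a minimal illustration rather than how any particular browser or resolver behaves: `connect_with_fallback` and `fake_connect` are hypothetical names, and the addresses come from a documentation-only range.

```python
from typing import Callable, Iterable, Optional


def connect_with_fallback(
    addresses: Iterable[str],
    connect: Callable[[str], object],
) -> Optional[object]:
    """Try each address in order and return the first successful connection."""
    for address in addresses:
        try:
            return connect(address)
        except OSError:
            # This address is unreachable; fall through to the next one.
            continue
    return None  # every address in the list failed


# Hypothetical stand-in for a real socket connect: only one address is "up".
def fake_connect(address: str) -> str:
    if address != "203.0.113.12":
        raise OSError(f"cannot reach {address}")
    return f"connected to {address}"
```

Passing the connect function in as a parameter keeps the retry logic separate from the transport, so the same loop works with a real socket call or a test double.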
2. Domain Aliases and Shared IPs
Domain Aliases: Multiple domains (e.g., FB.com and Facebook.com) can point to the same backend services via DNS aliasing.
Shared IPs: Domains do not each need a unique IP. Several domains can resolve to the same IP, and the server tells them apart by the hostname sent with the request (the HTTP Host header, or SNI for TLS).
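To make the aliasing idea concrete, here is a small sketch of alias resolution. The record tables are invented for illustration (they are not the real DNS records of any site), the addresses are documentation-range placeholders, and `resolve` is a hypothetical helper.

```python
from typing import Dict, List

# Hypothetical zone data: alias records point at a canonical name,
# and only the canonical name carries address records.
ALIASES: Dict[str, str] = {
    "fb.com": "facebook.com",
    "www.facebook.com": "facebook.com",
}
ADDRESSES: Dict[str, List[str]] = {
    "facebook.com": ["203.0.113.10", "203.0.113.11"],
}


def resolve(name: str, max_hops: int = 8) -> List[str]:
    """Follow alias records until a name with addresses is reached."""
    name = name.lower()  # DNS names are case-insensitive
    for _ in range(max_hops):
        if name in ADDRESSES:
            return ADDRESSES[name]
        if name not in ALIASES:
            return []  # unknown name
        name = ALIASES[name]
    raise RuntimeError("alias chain too long")
```

Both the alias and the canonical name end up at the same address list, which is what lets several domains front one set of backend services.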
3. DNS Caching in Incognito Mode
Incognito mode prevents the storage of browser history, but the browser can still use the existing DNS cache to improve response times.
The cache may not be updated during incognito sessions, but previously cached entries remain accessible.
Authentication Mechanisms in Distributed Systems
1. Access and Refresh Tokens
Access Token: Short-lived credentials issued at login to authenticate requests.
Refresh Token: Allows a client to request a new access token without requiring a full re-login, maintaining seamless user sessions.
2. Session Expiry
If the refresh token is not used to obtain a new access token within its validity period, the session expires, and the user must log in again.
This behavior ensures security while offering flexibility for active users.
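The token lifecycle above can be sketched with an in-memory service. Everything here is an assumption made for illustration: the `TokenService` class, the two lifetimes, and the opaque random tokens. A production system would typically use signed tokens and persistent storage.

```python
import secrets
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple

ACCESS_TTL = 15 * 60          # assumed lifetime: 15 minutes
REFRESH_TTL = 7 * 24 * 3600   # assumed lifetime: 7 days


@dataclass
class TokenService:
    now: Callable[[], float]  # injected clock, which makes expiry testable
    _access: Dict[str, Tuple[str, float]] = field(default_factory=dict)
    _refresh: Dict[str, Tuple[str, float]] = field(default_factory=dict)

    def login(self, user: str) -> Tuple[str, str]:
        """Issue a fresh (access, refresh) pair after a successful login."""
        access, refresh = secrets.token_urlsafe(16), secrets.token_urlsafe(32)
        self._access[access] = (user, self.now() + ACCESS_TTL)
        self._refresh[refresh] = (user, self.now() + REFRESH_TTL)
        return access, refresh

    def authenticate(self, access: str) -> Optional[str]:
        """Return the user for a valid access token, else None."""
        user, expires = self._access.get(access, (None, 0.0))
        return user if self.now() < expires else None

    def refresh(self, refresh: str) -> Optional[str]:
        """Trade a valid refresh token for a new access token, else None."""
        user, expires = self._refresh.get(refresh, (None, 0.0))
        if user is None or self.now() >= expires:
            return None  # session expired: the user must log in again
        access = secrets.token_urlsafe(16)
        self._access[access] = (user, self.now() + ACCESS_TTL)
        return access
```

Once the access token lapses, `refresh` quietly issues a new one; once the refresh token itself lapses, `refresh` returns `None` and the only way forward is `login`.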
Gateways, Load Balancers, and Their Configurations
1. Gateway vs. Load Balancer
Gateway:
The first point of contact for external requests.
Focused on security (accept/reject logic) and forwarding traffic.
Load Balancer:
Distributes incoming requests across multiple backend machines to ensure optimal utilization and fault tolerance.
Can operate with stateful or stateless configurations.
2. Active-Passive Configuration
The active gateway handles all traffic.
The passive gateway remains on standby and takes over immediately if the active one fails, ensuring high availability.
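A minimal sketch of the failover decision, assuming some health check exists. The gateway names, the `health` map, and `pick_gateway` are hypothetical; real deployments usually rely on heartbeats and a floating IP rather than a lookup like this.

```python
from typing import Callable, Dict, List


def pick_gateway(gateways: List[str], is_healthy: Callable[[str], bool]) -> str:
    """Return the first healthy gateway; earlier entries have priority."""
    for gateway in gateways:
        if is_healthy(gateway):
            return gateway
    raise RuntimeError("no healthy gateway available")


# Hypothetical health state; a real system would probe the gateways instead.
health: Dict[str, bool] = {"gw-active": True, "gw-passive": True}
gateways = ["gw-active", "gw-passive"]
```

While `gw-active` is healthy it receives all traffic; if its health flips to `False`, the same call returns `gw-passive`.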
Public and Private IPs in Hybrid Networks
A single machine can have both public (external) and private (internal) IPs.
This setup allows communication within a local network while enabling access from external clients via the public IP.
The MAC address (unique hardware identifier) remains constant, irrespective of the IP assignments.
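As a small illustration of the public/private distinction, Python's standard `ipaddress` module can classify an address. The `classify` helper and the sample addresses are assumptions chosen for the example, not values from any real machine.

```python
import ipaddress


def classify(address: str) -> str:
    """Label an IP address as 'private' (internal) or 'public' (external)."""
    return "private" if ipaddress.ip_address(address).is_private else "public"


# Hypothetical machine with one internal and one external interface.
interfaces = {"internal": "10.0.0.5", "external": "8.8.8.8"}
labels = {role: classify(address) for role, address in interfaces.items()}
```

Here `labels` maps the internal interface to `"private"` and the external one to `"public"`.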
Stateful vs. Stateless Load Balancing Revisited
Stateless Load Balancing
Requests can go to any backend machine because all are equally capable of handling them.
Strategies:
Round Robin: Distribute requests sequentially across machines.
Least Response Time First: Direct traffic to the machine responding fastest.
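Both stateless strategies fit in a few lines. This is a sketch under simple assumptions: the machine names are placeholders, and the latency figures are presumed to come from some monitoring source that is not shown.

```python
import itertools
from typing import Dict, Iterator, List


def round_robin(machines: List[str]) -> Iterator[str]:
    """Yield machines in a repeating sequence."""
    return itertools.cycle(machines)


def least_response_time(avg_latency_ms: Dict[str, float]) -> str:
    """Pick the machine with the lowest observed average latency."""
    return min(avg_latency_ms, key=avg_latency_ms.get)
```

For example, calling `next()` repeatedly on `round_robin(["a", "b", "c"])` walks a, b, c, a, b, and so on, while `least_response_time` simply returns the key with the smallest value.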
Stateful Load Balancing
Requests for a specific session or user must go to the same machine to preserve context.
Strategies:
Mapping Table: Direct user requests based on a stored map.
Pros: Simple lookup.
Cons: Can grow unmanageable with large user bases.
Modulo Operation: Calculate the machine ID using a modulo function based on user ID.
Pros: Lightweight and easy to implement.
Cons: Inefficient when adding or removing machines, because changing the machine count changes the modulo result for most user IDs, so most assignments move.
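The cost of resizing a modulo-based pool is easy to measure. In this sketch, `machine_for` and `moved_users` are hypothetical helper names, and user IDs are assumed to be plain integers.

```python
from typing import Iterable


def machine_for(user_id: int, machine_count: int) -> int:
    """Map a user ID to a machine index with a plain modulo."""
    return user_id % machine_count


def moved_users(user_ids: Iterable[int], old_count: int, new_count: int) -> int:
    """Count users whose machine changes when the pool is resized."""
    return sum(
        1
        for uid in user_ids
        if machine_for(uid, old_count) != machine_for(uid, new_count)
    )
```

For user IDs 0 to 999, growing the pool from 4 to 5 machines gives `moved_users(range(1000), 4, 5) == 800`: only IDs whose remainder is the same under both divisors stay put, which is 4 out of every 20.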
Preparing for Consistent Hashing
The challenges of reassigning requests in modulo-based systems are mitigated by consistent hashing, which:
Reduces the data movement required when adding or removing machines.
Ensures scalability and flexibility in dynamic environments.
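As a preview, here is a minimal hash-ring sketch. It is one common way to implement consistent hashing, not necessarily the variant a later session will present; the `HashRing` class, the SHA-256 based hash, and the choice of 100 virtual nodes per machine are all assumptions.

```python
import bisect
import hashlib
from typing import List, Tuple


def _hash(key: str) -> int:
    """Stable 64-bit hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, replicas: int = 100) -> None:
        self.replicas = replicas                 # virtual nodes per machine
        self._ring: List[Tuple[int, str]] = []   # sorted (position, machine)

    def add(self, machine: str) -> None:
        """Place the machine's virtual nodes on the ring."""
        for i in range(self.replicas):
            bisect.insort(self._ring, (_hash(f"{machine}#{i}"), machine))

    def remove(self, machine: str) -> None:
        """Drop all of the machine's virtual nodes."""
        self._ring = [entry for entry in self._ring if entry[1] != machine]

    def lookup(self, key: str) -> str:
        """Return the first machine clockwise from the key's position."""
        if not self._ring:
            raise LookupError("ring is empty")
        index = bisect.bisect_left(self._ring, (_hash(key), ""))
        return self._ring[index % len(self._ring)][1]  # wrap past the end
```

When a machine is added, the only keys that change owner are the ones the new machine takes over; every other key keeps its existing assignment, in contrast to the modulo scheme.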
Practical Insights
1. Distributed DNS Responses
For globally distributed systems:
DNS handles large-scale load balancing by providing multiple IP addresses for redundancy.
Clients use the waterfall model, attempting connections sequentially until successful.
2. Context Awareness in Stateful Systems
Stateful load balancing directs traffic based on session-specific context.
Example: A chatbot in the style of ChatGPT that keeps conversation context in one server's memory must route follow-up queries to the same machine that processed the initial request.
3. Gateway and API Gateway Distinctions
A gateway manages general traffic and acts as a security barrier.
An API gateway focuses on routing API-specific traffic to appropriate services (e.g., user service, product service).
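The routing role of an API gateway can be sketched as a prefix lookup. The route table, the service names, and the `route` function are hypothetical; real gateways add authentication, rate limiting, and much richer matching.

```python
from typing import Dict, Optional

# Hypothetical route table: path prefix -> backend service name.
ROUTES: Dict[str, str] = {
    "/users": "user-service",
    "/products": "product-service",
}


def route(path: str) -> Optional[str]:
    """Return the service for the longest matching prefix, or None."""
    matches = [p for p in ROUTES if path == p or path.startswith(p + "/")]
    return ROUTES[max(matches, key=len)] if matches else None
```

A request for `/users/42` goes to `user-service`, while an unknown path such as `/orders/1` yields `None`, which the gateway could turn into a 404 response.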
What's Next?
In the following session, we'll dive into consistent hashing, exploring how it addresses the limitations of stateful systems and facilitates scalable, robust architectures.
This marks a pivotal step in understanding advanced system design principles that underpin today's distributed computing landscape.