Added contents for scalability

pull/1070/head
J-Atulya 2025-04-12 23:36:50 +05:30
parent 40d5d2edcc
commit bf648502ed
1 changed files with 106 additions and 7 deletions

113
README.md

@@ -379,13 +379,112 @@ First, you'll need a basic understanding of common principles, learning about wh
[Scalability Lecture at Harvard](https://www.youtube.com/watch?v=-W9F__D3oY4)
* Topics covered:
    * Vertical scaling
    * Horizontal scaling
    * Caching
    * Load balancing
    * Database replication
    * Database partitioning

## What is Scalability?

**Scalability** is a system's ability to handle a growing amount of work, or its potential to be enlarged to accommodate that growth. A scalable system maintains or improves its performance as load increases, provided resources are added in proportion.

## Vertical Scaling
**Vertical scaling** (scale-up) increases the capacity of a single server by adding more CPU, RAM, or storage.
**Example**: Upgrading your DB server from 8GB RAM to 64GB RAM.
**✅ Pros**
- Easy to implement
- Usually no application code changes needed
**❌ Cons**
- Physical hardware limits
- Downtime may be required
- Becomes expensive quickly
---
## Horizontal Scaling
**Horizontal scaling** (scale-out) adds more servers to the system and distributes the workload across them.
**Example**: Adding more web servers behind a load balancer.
**✅ Pros**
- Scales better for large systems
- Enables high availability and redundancy
**❌ Cons**
- More complex to manage
- Requires stateless services (or session state moved to a shared store) and coordination between nodes
---
## Caching
**Caching** stores frequently accessed data in a faster storage layer (often memory) so it can be retrieved quickly, reducing load on backend systems.
### Common caching types:
- **Browser cache**: Your browser saves website files like images, CSS, or JavaScript so it doesn't have to download them again the next time you visit. This makes websites load faster.
- **CDN cache**: A Content Delivery Network (CDN) stores copies of your content in many locations around the world. When a user visits your site, the CDN serves the copy closest to them, which loads faster.
- **Server-side cache**: This is caching done on the backend using tools like Redis or Memcached. For example, if a database query is expensive (slow or heavy), its result can be saved in memory so the query doesn't need to be repeated.
**✅ Pros**
- Significantly improves response times
- Reduces backend load
**❌ Cons**
- **Stale data**: The data in the cache can be old and no longer match the latest data in the database.
- **Invalidation is hard**: It's tricky to know when to delete or update cached data. If you do it too soon, you lose the benefit of caching; if too late, users may see outdated info.
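
The server-side caching trade-off described above can be sketched as a cache-aside lookup with a time-to-live (TTL). This is a minimal illustration, not how Redis or Memcached work internally: `TTLCache`, `get_or_load`, `invalidate`, and `slow_query` are hypothetical names, and an in-process dictionary stands in for the external cache.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny in-process cache; a stand-in for Redis or Memcached in this sketch."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be demonstrated without sleeping
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[str], Any]) -> Any:
        """Cache-aside: return the cached value, or call the loader and cache its result."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit, entry has not expired yet
        value = loader(key)  # cache miss or expired entry: go to the backend
        self._store[key] = (now + self.ttl_seconds, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop an entry explicitly, e.g. right after the underlying row is updated."""
        self._store.pop(key, None)


# Hypothetical expensive query; the list records how often the backend is hit.
backend_calls = []


def slow_query(user_id: str) -> dict:
    backend_calls.append(user_id)
    return {"id": user_id, "name": f"user-{user_id}"}


fake_now = [0.0]  # manual clock, assumed only for this demo
cache = TTLCache(ttl_seconds=30, clock=lambda: fake_now[0])

first = cache.get_or_load("42", slow_query)   # miss: queries the backend
second = cache.get_or_load("42", slow_query)  # hit: served from memory
fake_now[0] = 31.0                            # advance past the TTL
third = cache.get_or_load("42", slow_query)   # expired: queries the backend again
```

A short TTL limits how stale the data can get but sends more requests to the backend; a long TTL does the opposite, which is why explicit invalidation on writes is often combined with it.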
## Load Balancing
**Load balancing** distributes incoming traffic across multiple servers to ensure no one server is overwhelmed.
### Common strategies:
- **Round-robin**: Requests go to each server one by one, in a circle (like taking turns).
- **Least connections**: Requests go to the server that is currently handling the fewest active connections. This helps keep load balanced more fairly.
- **IP hashing**: The system uses the user's IP address to decide which server handles their requests. This way, the same user often gets routed to the same server.
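
The three strategies above can be sketched in a few lines each. This is an illustrative sketch only: the class and function names (`RoundRobin`, `LeastConnections`, `ip_hash`) and the server names are hypothetical, and real load balancers also add health checks, weights, and timeouts.

```python
import hashlib
import itertools
from typing import Dict, List


class RoundRobin:
    """Hand out servers in a fixed rotation."""

    def __init__(self, servers: List[str]) -> None:
        self._cycle = itertools.cycle(servers)

    def pick(self) -> str:
        return next(self._cycle)


class LeastConnections:
    """Pick the server with the fewest connections currently open."""

    def __init__(self, servers: List[str]) -> None:
        self.active: Dict[str, int] = {server: 0 for server in servers}

    def pick(self) -> str:
        # Ties resolve to the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server: str) -> None:
        """Call when a connection to the server closes."""
        self.active[server] -= 1


def ip_hash(servers: List[str], client_ip: str) -> str:
    """Map a client IP to a server; the same IP always maps to the same server
    as long as the server list does not change."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


servers = ["web-1", "web-2", "web-3"]  # hypothetical backend names

rr = RoundRobin(servers)
rr_order = [rr.pick() for _ in range(4)]

lc = LeastConnections(servers)
for _ in range(3):
    lc.pick()                  # each server now holds one connection
lc.release("web-2")            # web-2 finishes its request first
next_server = lc.pick()        # so the next request goes back to web-2

sticky = ip_hash(servers, "203.0.113.7")
```

Note that plain modulo hashing remaps most clients when a server is added or removed; consistent hashing is a common way to reduce that churn.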
**✅ Pros**
- High availability
- Fault tolerance
- Enables horizontal scaling
**❌ Cons**
- The load balancer itself can become a single point of failure unless it is deployed redundantly
## Database Replication
**Database replication** copies data from a primary (master) DB to one or more replicas (slaves).
### Types:
- **Master-slave**: Writes to master, reads from replicas
- **Master-master**: Multiple writable nodes (more complex)
**✅ Pros**
- Improved read scalability
- Redundancy and failover support
**❌ Cons**
- **Replication lag**: When you copy data from the main database to replicas, there's a small delay. The replicas might not have the very latest updates right away.
- **Consistency issues in write-heavy apps**: If your app writes a lot of data (e.g., saving user actions), the replicas may fall behind, and different servers might show different versions of the data for a short time.
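
Replication lag can be illustrated with a toy model in which writes go to a primary and a replica applies them only when it catches up. This is a sketch under simplifying assumptions: `Primary`, `Replica`, and `catch_up` are hypothetical names, and real databases ship a write-ahead or binary log over the network rather than sharing a Python list.

```python
from typing import Any, Dict, List, Optional, Tuple


class Primary:
    """Accepts all writes and records them in an ordered log."""

    def __init__(self) -> None:
        self.data: Dict[str, Any] = {}
        self.log: List[Tuple[str, Any]] = []

    def write(self, key: str, value: Any) -> None:
        self.data[key] = value
        self.log.append((key, value))

    def read(self, key: str) -> Optional[Any]:
        return self.data.get(key)


class Replica:
    """Serves reads from its own copy, which may trail the primary."""

    def __init__(self, primary: Primary) -> None:
        self.primary = primary
        self.data: Dict[str, Any] = {}
        self.applied = 0  # number of log entries applied so far

    def catch_up(self) -> None:
        """Apply every log entry not yet seen (asynchronous in a real system)."""
        for key, value in self.primary.log[self.applied:]:
            self.data[key] = value
        self.applied = len(self.primary.log)

    def read(self, key: str) -> Optional[Any]:
        return self.data.get(key)


primary = Primary()
replica = Replica(primary)

primary.write("user:1:email", "old@example.com")
replica.catch_up()

primary.write("user:1:email", "new@example.com")
stale = replica.read("user:1:email")   # replica has not applied the update yet
fresh = primary.read("user:1:email")   # reading from the primary avoids the lag
replica.catch_up()
after = replica.read("user:1:email")   # replica now matches the primary
```

One common mitigation is to route a user's reads to the primary for a short time after that user writes, so they always see their own changes.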
## Database Partitioning ([Sharding](https://learn.microsoft.com/en-us/azure/architecture/patterns/sharding))
**Partitioning** splits a large database into smaller parts. When those parts, called shards, are stored on separate machines, the approach is known as **sharding**.
### Types:
- **Horizontal partitioning**: Splits by rows (e.g., user_id ranges); this is the form usually meant by sharding
- **Vertical partitioning**: Splits by columns (e.g., profile vs activity data)
**✅ Pros**
- Improves performance and scaling
- Avoids overloading a single node
**❌ Cons**
- Querying across shards is difficult
- Requires smart shard key design
- Rebalancing shards can be tricky
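
The shard-key and rebalancing concerns above can be made concrete with a hash-based routing sketch. Everything here is assumed for illustration: `shard_for`, `put_user`, `get_user`, and `count_all_users` are hypothetical helpers, and plain dictionaries stand in for separate database servers.

```python
import hashlib
from typing import Dict, List, Optional


def shard_for(user_id: int, shard_count: int) -> int:
    """Map a shard key (here user_id) to a shard index with a stable hash."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


SHARD_COUNT = 4
shards: List[Dict[int, dict]] = [{} for _ in range(SHARD_COUNT)]


def put_user(user_id: int, profile: dict) -> None:
    shards[shard_for(user_id, SHARD_COUNT)][user_id] = profile


def get_user(user_id: int) -> Optional[dict]:
    # A lookup by shard key touches exactly one shard.
    return shards[shard_for(user_id, SHARD_COUNT)].get(user_id)


def count_all_users() -> int:
    # A query without the shard key has to fan out to every shard.
    return sum(len(shard) for shard in shards)


for uid in range(100):
    put_user(uid, {"name": f"user-{uid}"})

# Rebalancing cost: how many of these keys would land on a different shard
# if a fifth shard were added and the same modulo scheme were kept?
moved = sum(1 for uid in range(100)
            if shard_for(uid, SHARD_COUNT) != shard_for(uid, SHARD_COUNT + 1))
```

With modulo hashing, most keys move when the shard count changes, which is one reason rebalancing is tricky; consistent hashing or a lookup directory are common alternatives.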
### Step 2: Review the scalability article