System Design: From Zero to Production
A practical guide to designing scalable systems. Learn real-world patterns used by companies like Netflix, Uber, and Stripe.
Ioodu · 30 min read
#System Design #Architecture #Engineering
System Design Fundamentals
Every senior engineer must think about system design. Whether you’re building a startup’s first product or scaling to millions of users, the principles remain the same.
The Four Pillars
┌─────────────────────────────────────────────────────────────┐
│                         SCALABILITY                         │
│          Handle growth in users, data, and traffic          │
└──────────────────────────────┬──────────────────────────────┘
                               │
         ┌─────────────────────┼─────────────────────┐
         │                     │                     │
         ▼                     ▼                     ▼
┌─────────────────┐   ┌─────────────────┐   ┌─────────────────┐
│   RELIABILITY   │   │  AVAILABILITY   │   │ MAINTAINABILITY │
│ Tolerate faults │   │ Stay reachable  │   │  Code clarity   │
└─────────────────┘   └─────────────────┘   └─────────────────┘
1. Load Balancing
The Problem
A single server can’t handle millions of users. We need multiple servers with traffic distribution.
The Solution
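interface LoadBalancer {
  registerServer(server: Server): void;
  removeServer(server: Server): void;
  getNextServer(request?: Request): Server;
}

// Shared pool management; each strategy only implements pick().
// Assumes Server exposes `weight` (positive integer) and `activeConnections`.
abstract class BaseLB implements LoadBalancer {
  protected servers: Server[] = [];
  registerServer(server: Server): void { this.servers.push(server); }
  removeServer(server: Server): void {
    this.servers = this.servers.filter((s) => s !== server);
  }
  getNextServer(): Server {
    if (this.servers.length === 0) throw new Error('No servers registered');
    return this.pick();
  }
  protected abstract pick(): Server;
}

class RoundRobinLB extends BaseLB {
  private currentIndex = 0;
  protected pick(): Server {
    // Modulo first so the index stays valid after a server is removed
    const server = this.servers[this.currentIndex % this.servers.length];
    this.currentIndex = (this.currentIndex + 1) % this.servers.length;
    return server;
  }
}

class WeightedRoundRobinLB extends BaseLB {
  private counter = 0;
  // Servers with more capacity get more traffic (w of every sum(weights) requests)
  protected pick(): Server {
    const total = this.servers.reduce((sum, s) => sum + s.weight, 0);
    let slot = this.counter++ % total;
    for (const server of this.servers) {
      slot -= server.weight;
      if (slot < 0) return server;
    }
    return this.servers[0]; // only reached if weights are not positive
  }
}

class LeastConnectionsLB extends BaseLB {
  // Route to server with fewest active connections
  protected pick(): Server {
    return this.servers.reduce((min, s) => (s.activeConnections < min.activeConnections ? s : min));
  }
}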
Health Checks
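class HealthCheck {
  async check(server: Server): Promise<boolean> {
    try {
      // fetch has no `timeout` option; abort through a signal after 5 seconds
      const response = await fetch(server.healthEndpoint, {
        method: 'GET',
        signal: AbortSignal.timeout(5000)
      });
      return response.ok;
    } catch {
      // Network errors and timeouts both count as unhealthy
      return false;
    }
  }

  // Returns the timer handle so callers can stop monitoring with clearInterval
  monitor(servers: Server[], interval: number): ReturnType<typeof setInterval> {
    return setInterval(async () => {
      // Check all servers concurrently so one slow server does not delay the rest
      await Promise.all(servers.map(async (server) => {
        server.setHealthy(await this.check(server));
      }));
    }, interval);
  }
}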
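Health results only help if the balancer acts on them. The following is a minimal sketch of round-robin selection that skips servers marked unhealthy. The `PoolServer` shape and the `HealthAwareRoundRobin` class are hypothetical names introduced for illustration, since the article's `Server` type is not shown.

```typescript
// Hypothetical server shape; the article's Server type is not shown.
interface PoolServer {
  id: string;
  healthy: boolean;
}

class HealthAwareRoundRobin {
  private index = 0;

  constructor(private readonly servers: PoolServer[]) {}

  // Returns the next healthy server, or null if none is healthy.
  next(): PoolServer | null {
    const n = this.servers.length;
    for (let tried = 0; tried < n; tried++) {
      const candidate = this.servers[this.index];
      this.index = (this.index + 1) % n;
      if (candidate.healthy) return candidate;
    }
    return null;
  }
}

const pool: PoolServer[] = [
  { id: 'a', healthy: true },
  { id: 'b', healthy: false },
  { id: 'c', healthy: true },
];
const lb = new HealthAwareRoundRobin(pool);
const picks = [lb.next(), lb.next(), lb.next()].map((s) => (s ? s.id : 'none'));
// picks is ['a', 'c', 'a']: 'b' is skipped while it is marked unhealthy
```

Returning `null` rather than throwing lets the caller decide whether to answer with a 503 or fall back to another pool.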
2. Caching Strategies
Cache Patterns
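interface Cache<T> {
  get(key: string): Promise<T | null>;
  set(key: string, value: T, ttl?: number): Promise<void>;
  delete(key: string): Promise<void>;
}

// Read-through cache: on a miss, load from the data source and populate the cache
class ReadThroughCache<T> implements Cache<T> {
  constructor(
    private cache: Cache<T>,
    private dataSource: DataSource<T>
  ) {}

  async get(key: string): Promise<T | null> {
    // Check cache first; compare with null so falsy values (0, '') still count as hits
    const cached = await this.cache.get(key);
    if (cached !== null) return cached;

    // Load from source and cache (assumes load() resolves to null when the key is missing)
    const value = await this.dataSource.load(key);
    if (value !== null) {
      await this.cache.set(key, value, 3600); // 1 hour TTL, in seconds
    }
    return value;
  }

  set(key: string, value: T, ttl?: number): Promise<void> {
    return this.cache.set(key, value, ttl);
  }

  delete(key: string): Promise<void> {
    return this.cache.delete(key);
  }
}

// Write-through cache: every write goes to the database and to the cache
class WriteThroughCache<T> {
  constructor(
    private cache: Cache<T>,
    private database: KeyValueStore<T> // assumed interface with save(key, value)
  ) {}

  async set(key: string, value: T): Promise<void> {
    // Write the database first so the cache never holds a value that failed to persist
    await this.database.save(key, value);
    await this.cache.set(key, value);
  }
}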
Cache Invalidation
┌─────────────────────────────────────────┐
│           Cache Invalidation            │
├─────────────────────────────────────────┤
│ 1. TTL-based (simple, eventual)         │
│    cache.set(key, value, ttl=300)       │
│                                         │
│ 2. Write-through (consistent, slower)   │
│    DB write → Cache update              │
│                                         │
│ 3. Write-behind (fast, complex)         │
│    Cache write → Queue → DB write       │
│                                         │
│ 4. Delete-based (eventual)              │
│    DB write → Delete cache              │
└─────────────────────────────────────────┘
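The TTL-based and delete-based options can be illustrated with a small in-memory cache. This is a minimal sketch, not a production cache: `TtlCache` and its injectable `now` clock are hypothetical names introduced so that expiry can be shown without waiting.

```typescript
// Minimal in-memory TTL cache. `now` is injectable so expiry can be tested.
class TtlCache<T> {
  private entries = new Map<string, { value: T; expiresAt: number }>();

  constructor(private readonly now: () => number = () => Date.now()) {}

  set(key: string, value: T, ttlSeconds: number): void {
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }

  get(key: string): T | null {
    const entry = this.entries.get(key);
    if (entry === undefined) return null;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // lazily evict expired entries on read
      return null;
    }
    return entry.value;
  }

  // Explicit invalidation, for example after a database write commits.
  delete(key: string): void {
    this.entries.delete(key);
  }
}

let fakeNow = 0;
const cache = new TtlCache<string>(() => fakeNow);
cache.set('user:1', 'Ada', 300);
const fresh = cache.get('user:1');   // 'Ada'
fakeNow = 300000;                    // five minutes later
const expired = cache.get('user:1'); // null
```

TTL bounds how stale a value can get without any coordination, while `delete` removes a known-stale entry immediately.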
3. Database Design
Sharding Strategies
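// Non-cryptographic string hash (FNV-1a); returns a non-negative 32-bit integer
function hash(input: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// Horizontal sharding by user ID
class ShardedDatabase {
  private shards: Map<number, Database> = new Map();
  private shardCount: number = 4;

  constructor() {
    for (let i = 0; i < this.shardCount; i++) {
      this.shards.set(i, new Database(`shard_${i}`));
    }
  }

  private getShard(userId: string): number {
    // Hash-modulo placement. This is not consistent hashing: changing
    // shardCount remaps most keys, so resharding means moving data.
    return hash(userId) % this.shardCount;
  }

  async saveUser(user: User): Promise<void> {
    const shardId = this.getShard(user.id);
    await this.shards.get(shardId)!.save(user);
  }

  async getUser(userId: string): Promise<User | null> {
    const shardId = this.getShard(userId);
    return this.shards.get(shardId)!.find(userId);
  }
}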
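The `getShard` method above places keys with `hash(userId) % shardCount`, which remaps most keys when the shard count changes (going from 4 to 5 shards moves roughly 80% of uniformly hashed keys). Consistent hashing limits that movement. The following is a minimal sketch of a hash ring with virtual nodes; `HashRing`, `fnv1a`, and the choice of 100 virtual nodes are assumptions for illustration, not the article's implementation.

```typescript
// FNV-1a: a small non-cryptographic 32-bit string hash.
function fnv1a(input: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

class HashRing {
  private ring: { point: number; node: string }[] = [];

  constructor(private readonly virtualNodes = 100) {}

  // Each node is placed at several points on the ring to smooth the distribution.
  addNode(node: string): void {
    for (let v = 0; v < this.virtualNodes; v++) {
      this.ring.push({ point: fnv1a(`${node}#${v}`), node });
    }
    this.ring.sort((a, b) => a.point - b.point);
  }

  // The owner is the first ring point at or after the key's hash (binary search).
  getNode(key: string): string {
    if (this.ring.length === 0) throw new Error('ring is empty');
    const h = fnv1a(key);
    let lo = 0;
    let hi = this.ring.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.ring[mid].point < h) lo = mid + 1;
      else hi = mid;
    }
    return this.ring[lo % this.ring.length].node; // wrap past the last point
  }
}

const ring = new HashRing();
['shard_0', 'shard_1', 'shard_2', 'shard_3'].forEach((n) => ring.addNode(n));
const keys = Array.from({ length: 1000 }, (_, i) => `user-${i}`);
const before = keys.map((k) => ring.getNode(k));
ring.addNode('shard_4');
const moved = keys.filter((k, i) => ring.getNode(k) !== before[i]);
// Every key in `moved` now belongs to shard_4; no key moves between existing shards.
```

Because a new node only takes over the ring segments that precede its own points, data migration is limited to keys the new shard will own.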
CQRS Pattern
// Separate read and write models for scale
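class CQRSStore {
  constructor(
    // Write model - optimized for writes
    private writeDb: SQLDatabase,
    private eventStore: EventStore,
    // Read model - optimized for reads
    private readReplicas: ReadDatabase[]
  ) {}

  async saveOrder(order: Order): Promise<void> {
    await this.writeDb.transaction(async (trx) => {
      await trx.orders.insert(order);
    });
    // Publish only after the transaction commits so consumers never see an
    // event for a rolled-back order. A crash between commit and publish loses
    // the event; the transactional outbox pattern closes that gap.
    await this.eventStore.publish('OrderCreated', order);
  }

  async getOrderSummary(orderId: string): Promise<OrderSummary> {
    // Read from replica for scale; replicas lag the primary, so reads are eventually consistent
    const replica = this.getLeastLoadedReplica();
    return replica.query(`
      SELECT o.*, u.name as user_name
      FROM orders o
      JOIN users u ON o.user_id = u.id
      WHERE o.id = ?
    `, [orderId]);
  }

  // Assumes each replica reports an `activeQueries` count
  private getLeastLoadedReplica(): ReadDatabase {
    return this.readReplicas.reduce((min, r) => (r.activeQueries < min.activeQueries ? r : min));
  }
}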
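On the read side, CQRS usually keeps a denormalized view that is updated from write-side events, so queries avoid joins entirely. The following is a minimal in-memory sketch of such a projection. The `OrderCreatedEvent` and `OrderSummaryView` shapes and the `OrderSummaryProjection` class are hypothetical and only illustrate the idea.

```typescript
// Hypothetical event and read-model shapes for illustration.
interface OrderCreatedEvent {
  type: 'OrderCreated';
  orderId: string;
  userName: string;
  total: number;
}

interface OrderSummaryView {
  orderId: string;
  userName: string;
  total: number;
}

// The read side keeps a denormalized view keyed for the query it serves.
class OrderSummaryProjection {
  private summaries = new Map<string, OrderSummaryView>();

  // Applying the same event twice leaves the view unchanged (idempotent),
  // which matters because most brokers deliver at least once.
  apply(event: OrderCreatedEvent): void {
    this.summaries.set(event.orderId, {
      orderId: event.orderId,
      userName: event.userName,
      total: event.total,
    });
  }

  get(orderId: string): OrderSummaryView | null {
    return this.summaries.get(orderId) ?? null;
  }
}

const projection = new OrderSummaryProjection();
const created: OrderCreatedEvent = { type: 'OrderCreated', orderId: 'o-1', userName: 'Ada', total: 42 };
projection.apply(created);
projection.apply(created); // duplicate delivery is harmless
const summary = projection.get('o-1');
```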
4. Message Queues
Event-Driven Architecture
interface MessageQueue {
  publish(topic: string, message: any): Promise<void>;
  subscribe(topic: string, handler: Handler): Promise<void>;
}

class OrderService {
  constructor(
    private queue: MessageQueue,
    private orderRepo: OrderRepository // assumed repository with save(order)
  ) {}

  async createOrder(order: Order): Promise<Order> {
    // Create order
    const saved = await this.orderRepo.save(order);

    // Publish events; consumers process them asynchronously and stay decoupled
    await this.queue.publish('order.created', {
      orderId: saved.id,
      userId: saved.userId,
      total: saved.total,
      items: saved.items
    });
    return saved;
  }
}

class NotificationService {
  constructor(private queue: MessageQueue) {}

  async start(): Promise<void> {
    await this.queue.subscribe('order.created', async (event) => {
      await this.sendConfirmationEmail(event.userId, event.orderId);
      await this.updateInventory(event.items);
    });
  }
}
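To see the decoupling in isolation, here is a minimal in-process stand-in for a broker. It is a sketch under simplifying assumptions: `InMemoryQueue` is a hypothetical class, delivery is synchronous, and there is no persistence or retry, all of which a real broker would provide.

```typescript
type QueueHandler<T> = (message: T) => void;

// In-process stand-in for a broker; delivery is synchronous for brevity.
class InMemoryQueue<T> {
  private handlers = new Map<string, QueueHandler<T>[]>();

  subscribe(topic: string, handler: QueueHandler<T>): void {
    const list = this.handlers.get(topic) ?? [];
    list.push(handler);
    this.handlers.set(topic, list);
  }

  // Returns how many subscribers received the message.
  publish(topic: string, message: T): number {
    const list = this.handlers.get(topic) ?? [];
    for (const handler of list) handler(message);
    return list.length;
  }
}

interface OrderCreatedMessage {
  orderId: string;
  userId: string;
}

const queue = new InMemoryQueue<OrderCreatedMessage>();
const emailed: string[] = [];
const reserved: string[] = [];

// Two independent consumers; the publisher knows about neither.
queue.subscribe('order.created', (m) => { emailed.push(m.orderId); });
queue.subscribe('order.created', (m) => { reserved.push(m.orderId); });

const delivered = queue.publish('order.created', { orderId: 'o-1', userId: 'u-1' });
// delivered is 2; emailed and reserved each contain 'o-1'
```

Adding a third consumer requires no change to the publishing code, which is the property the event-driven design is after.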
Handling Failures
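class ReliableMessageHandler {
  constructor(private deadLetterQueue: MessageQueue) {}

  async handle(message: Message): Promise<void> {
    const maxRetries = 3;
    let attempts = 0;
    let lastError: unknown; // declared outside the loop so the DLQ payload can report it

    while (attempts < maxRetries) {
      try {
        await this.process(message);
        return;
      } catch (error) {
        lastError = error;
        attempts++;
        if (attempts < maxRetries) {
          // Exponential backoff: 2s after the first failure, 4s after the second
          await this.sleep(Math.pow(2, attempts) * 1000);
        }
      }
    }

    // Send to dead letter queue after all retries fail
    await this.deadLetterQueue.publish(message.topic, {
      originalMessage: message,
      failedAttempts: attempts,
      lastError: lastError instanceof Error ? lastError.message : String(lastError)
    });
  }

  private sleep(ms: number): Promise<void> {
    return new Promise((resolve) => setTimeout(resolve, ms));
  }
}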
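The backoff schedule is easier to reason about as a pure function. The sketch below caps the delay and adds optional jitter so that many failing clients do not retry in lockstep. The function names, the 1-based attempt numbering, and the default base and cap values are assumptions for illustration, and the schedule differs slightly from the handler above, which starts at 2 seconds.

```typescript
// Delay before retry number `attempt` (1-based): base * 2^(attempt - 1), capped.
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 30000): number {
  return Math.min(capMs, baseMs * 2 ** (attempt - 1));
}

// "Full jitter": a random delay in [0, backoff) spreads retries out over time.
// `random` is injectable so the function can be tested deterministically.
function jitteredDelayMs(attempt: number, random: () => number = Math.random): number {
  return Math.floor(random() * backoffDelayMs(attempt));
}

const schedule = [1, 2, 3, 4, 5, 6].map((attempt) => backoffDelayMs(attempt));
// schedule is [1000, 2000, 4000, 8000, 16000, 30000]; the sixth delay hits the cap
```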
5. API Design
Versioning Strategy
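// URL-based versioning
/app/v1/users
/app/v2/users

// Header-based versioning
GET /users
Accept-Version: v1

// GraphQL: one endpoint with no version in the URL; the schema evolves
// by adding fields and deprecating old ones
POST /graphql
{ "query": "{ users { id name } }" }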
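A server that supports both URL-based and header-based versioning needs a rule for which one wins. The following is a minimal sketch of one possible rule: the URL segment takes precedence, then the `Accept-Version` header, then a default. `resolveVersion`, the supported-version list, and the lowercase header key are assumptions for illustration.

```typescript
const SUPPORTED_VERSIONS = ['v1', 'v2'] as const;
type ApiVersion = (typeof SUPPORTED_VERSIONS)[number];

function isSupported(version: string): version is ApiVersion {
  return (SUPPORTED_VERSIONS as readonly string[]).indexOf(version) !== -1;
}

// Returns the version to serve, or null if the client asked for an unsupported one.
// Header names are assumed to be lowercased already, as Node's http module does.
function resolveVersion(
  path: string,
  headers: Record<string, string | undefined>,
  fallback: ApiVersion = 'v1',
): ApiVersion | null {
  const match = /\/(v\d+)(?:\/|$)/.exec(path);
  const requested = match ? match[1] : headers['accept-version'];
  if (requested === undefined) return fallback;
  return isSupported(requested) ? requested : null;
}

const fromUrl = resolveVersion('/app/v2/users', {});                        // 'v2'
const fromHeader = resolveVersion('/users', { 'accept-version': 'v1' });    // 'v1'
const defaulted = resolveVersion('/users', {});                             // 'v1'
const unsupported = resolveVersion('/app/v9/users', {});                    // null
```

Returning `null` for an unknown version lets the caller respond with an explicit error instead of silently serving a different version.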
Rate Limiting
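class RateLimiter {
  constructor(
    private redis: Redis,
    private handler: RequestHandler // assumed type exposing handle(req)
  ) {}

  // Fixed-window counter: at most `limit` requests per `windowSeconds` per user
  async isAllowed(
    userId: string,
    limit: number,
    windowSeconds: number
  ): Promise<boolean> {
    const key = `rate:${userId}`;
    const current = await this.redis.incr(key);
    if (current === 1) {
      // The first request in a window starts the clock. INCR and EXPIRE are
      // separate commands; a crash between them leaves a key with no expiry.
      await this.redis.expire(key, windowSeconds);
    }
    return current <= limit;
  }

  async handleRequest(req: Request): Promise<Response> {
    const limit = 1000;
    const windowSeconds = 60; // 1 minute
    const allowed = await this.isAllowed(req.userId, limit, windowSeconds);
    if (!allowed) {
      return new Response('Rate limit exceeded', {
        status: 429,
        headers: {
          'Retry-After': String(windowSeconds)
        }
      });
    }
    return this.handler.handle(req);
  }
}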
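A fixed-window counter like the one above can admit up to twice the limit in a short burst that straddles a window boundary. A token bucket is one common alternative that smooths this out. The sketch below is a single-process, in-memory version for illustration only; `TokenBucket` and its injectable clock are hypothetical, and a distributed limiter would need shared state such as Redis.

```typescript
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private readonly capacity: number,         // maximum burst size
    private readonly refillPerSecond: number,  // sustained request rate
    private readonly now: () => number = () => Date.now(),
  ) {
    this.tokens = capacity;
    this.lastRefill = this.now();
  }

  // Consumes one token if available; returns whether the request is allowed.
  tryConsume(): boolean {
    const t = this.now();
    const elapsedSeconds = (t - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefill = t;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

let clock = 0;
const bucket = new TokenBucket(2, 1, () => clock);
const burst = [bucket.tryConsume(), bucket.tryConsume(), bucket.tryConsume()];
// burst is [true, true, false]: the bucket holds two tokens
clock = 1000; // one second later, one token has been refilled
const afterRefill = bucket.tryConsume(); // true
```

The capacity bounds bursts and the refill rate bounds sustained throughput, so the two can be tuned independently.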
6. Real-World Architecture Example
E-commerce Platform
┌─────────────────────────────────────────────────────────────┐
│                      CDN (Cloudflare)                       │
└──────────────────────────────┬──────────────────────────────┘
                               │
┌──────────────────────────────▼──────────────────────────────┐
│                  Load Balancer (ALB/Nginx)                  │
│               Health checks, SSL termination                │
└──────────────────────────────┬──────────────────────────────┘
                               │
          ┌────────────────────┼────────────────────┐
          │                    │                    │
          ▼                    ▼                    ▼
     ┌─────────┐          ┌─────────┐          ┌─────────┐
     │   Web   │          │   API   │          │  Admin  │
     │  Tier   │          │  Tier   │          │  Tier   │
     └────┬────┘          └────┬────┘          └────┬────┘
          │                    │                    │
          └────────────────────┼────────────────────┘
                               │
             ┌─────────────────┼─────────────────┐
             │                 │                 │
             ▼                 ▼                 ▼
       ┌───────────┐     ┌───────────┐     ┌───────────┐
       │   Redis   │     │   Kafka   │     │  Search   │
       │   Cache   │     │  Events   │     │   (ES)    │
       └───────────┘     └─────┬─────┘     └───────────┘
                               │
             ┌─────────────────┼─────────────────┐
             │                 │                 │
             ▼                 ▼                 ▼
       ┌───────────┐     ┌───────────┐     ┌───────────┐
       │  Primary  │     │ Replicas  │     │ Analytics │
       │ Postgres  │     │  (Read)   │     │  (Click)  │
       └───────────┘     └───────────┘     └───────────┘
Conclusion
System design isn’t about memorizing solutions—it’s about understanding trade-offs:
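- Consistency vs Availability: during a network partition, a distributed system can preserve one but not both (CAP theorem)
- Latency vs Throughput: batch processing raises throughput at the cost of per-request latency; real-time processing does the reverse
- Complexity vs Reliability: every added component is another potential failure point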
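The complexity trade-off can be put in numbers. As a rough sketch that assumes component failures are independent (real failures are often correlated), availability multiplies across components a request must pass through in series, while redundant replicas fail only when all of them fail. The function names are hypothetical.

```typescript
// Components in series: all must be up, so availabilities multiply.
function serialAvailability(components: number[]): number {
  return components.reduce((acc, a) => acc * a, 1);
}

// Redundant replicas: the group is down only if every replica is down.
function parallelAvailability(replicas: number[]): number {
  return 1 - replicas.reduce((acc, a) => acc * (1 - a), 1);
}

const chain = serialAvailability([0.999, 0.999, 0.999]); // about 0.997
const pair = parallelAvailability([0.99, 0.99]);         // about 0.9999
```

Three "three nines" components in series deliver less than three nines overall, while two modest replicas in parallel deliver four.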
Practice by designing systems for familiar products. Start simple, iterate, and always measure in production.
Next: Deep dive into specific technologies and their trade-offs.