Phase 4 of 5  ·  Modern Engineering
Week 16 / 20   ·   Ch 17

Distributed
Systems

"CAP theorem in 30 seconds — and the 8 fallacies every dev must know"

📚 Ch 17 — Distributed SE📐 CAP Theorem💀 8 Fallacies⏱ ~20 min read

🔍Concept Deep Dives

Click each concept to expand — real examples, diagrams, pros & cons.

📐

CAP Theorem

You can only guarantee 2 of 3: Consistency, Availability, Partition Tolerance. You must choose.

When to Use

Designing any distributed database or service — you must choose your trade-off.

Real-World Example

DynamoDB: AP (available + partition-tolerant). PostgreSQL: CP (consistent + partition-tolerant). No real CA system exists at scale.

✓ Advantages

  • Forces explicit trade-off decisions
  • Explains why distributed DBs differ

⚠ Watch Out

  • Oversimplified — PACELC extends it
Consistency / \ CP CA / \ Partition T. Availability \ / AP ── \ / (choose 2) DynamoDB=AP, Postgres=CP, Zookeeper=CP
💀

8 Fallacies of Distributed Computing

False assumptions developers make about distributed systems. By Peter Deutsch, Sun Microsystems.

When to Use

Every time you design a distributed system — check against these.

Real-World Example

Every distributed systems outage is caused by someone violating one of these. The network is NOT reliable.

✓ Advantages

  • Prevents architectural mistakes
  • Exposes hidden assumptions

⚠ Watch Out

  • Easy to forget under deadline pressure
1. The network is reliable 2. Latency is zero 3. Bandwidth is infinite 4. The network is secure 5. Topology doesn't change 6. There is one administrator 7. Transport cost is zero 8. The network is homogeneous → ALL are false in production
🌐

Middleware

Software layer between distributed components — handles communication, transaction management, security.

When to Use

Any distributed system with multiple services that need to communicate reliably.

Real-World Example

Kafka (message queue), gRPC (RPC framework), Redis (distributed cache), Consul (service discovery).

✓ Advantages

  • Hides network complexity
  • Standard communication patterns
  • Enables loose coupling

⚠ Watch Out

  • Another layer to fail
  • Learning curve
  • Cost and operational overhead
Service A → [Kafka] → Service B Service C → [Redis cache] ← Service D Service E → [gRPC] → Service F → Middleware handles delivery guarantees
🏗️

Distributed Architectures

Client-server, multi-tier, peer-to-peer, service-oriented — different models for different needs.

When to Use

Choosing the right distributed model for your scale and requirements.

Real-World Example

Spotify: microservices. BitTorrent: P2P. AWS: multi-tier (UI tier → app tier → data tier).

✓ Advantages

  • Match architecture to problem
  • Well-understood patterns

⚠ Watch Out

  • Wrong model = wrong trade-offs
2-tier: Client ↔ Server 3-tier: UI ↔ App Server ↔ Database N-tier: add caching, queue, etc. layers P2P: every node is client + server Microservices: many small services

📋Quick Reference

θ Ch 17 Cheat Sheet — Distributed Systems
CAP Theorem
Consistency, Availability, Partition Tolerance — pick 2. Real choice is CP vs AP (P is mandatory).
Consistency
Every read gets the most recent write. No stale reads.
Availability
Every request gets a response (possibly stale). Always responsive.
Partition Tolerance
System continues despite network splits. ALWAYS required in real distributed systems.
8 Fallacies
Network is NOT reliable, zero latency, infinite bandwidth, secure, static, one admin, free, homogeneous.
Eventual Consistency
Nodes will eventually agree on the same value — but not immediately. DynamoDB default.
Idempotency
An operation can be applied multiple times without changing result beyond first application. Critical for retries.
θ
Sommerville's Key Points — Ch 17
Author's own summary from the end of the chapter.
  • 1Distributed systems: software that runs on multiple computers communicating over a network.
  • 2CAP theorem: can only guarantee 2 of: Consistency, Availability, Partition Tolerance.
  • 38 Fallacies: network is NOT reliable, zero-latency, infinite bandwidth, secure, homogeneous.
  • 4Middleware: software layer handling communication, transactions, security between components.
  • 5Client-server, multi-tier, P2P, service-oriented — different distributed architectures.
  • 6Design for failure: network partitions WILL happen. Plan timeouts, retries, circuit breakers.

🧠Quiz — Test Yourself

Think through your answer first, then reveal.

Q1
Recall
Explain the CAP theorem. Which two properties do most modern web-scale databases choose?
CAP: in a distributed system, during a network partition, you must choose between Consistency (all nodes return the same data) or Availability (all requests get a response). Partition Tolerance is mandatory in real networks. Most web-scale databases choose AP (DynamoDB, Cassandra, CouchDB) — they prioritize availability with eventual consistency. Strongly consistent systems (PostgreSQL, ZooKeeper, etcd) choose CP.
Q2
Apply
Name 3 of the 8 Fallacies and explain why each matters in practice.
1. 'Network is reliable' — it's not; design with timeouts, retries, circuit breakers. 2. 'Latency is zero' — network calls take time; avoid synchronous calls in hot paths. 3. 'The network is secure' — it's not; encrypt all inter-service communication (mTLS, TLS).
Q3
Analyze
What is eventual consistency and when is it acceptable?
Eventual consistency: nodes will eventually converge to the same value, but reads may return stale data temporarily. Acceptable for: social media likes/views counts (slightly stale is fine), product catalog prices (few seconds delay OK), user profiles (not mission-critical). Not acceptable for: bank balances, inventory reservation, medical records.
Up Next → Week 17
Microservices Architecture
When to choose microservices (and when NOT to)
Continue → Week 17