
Case 1: Monolithic Architecture


title: System Design Introduction: Building and Scaling High-Availability Web Applications - Course 1
description: Learn system design fundamentals, from monolithic applications to database read-write splitting and caching, understanding how to build and progressively scale a high-availability web service.


Evolutionary Learning Plan for System Architecture

Goal: Many complex systems evolve gradually from a simple starting point. This course uses a progressive scenario-driven approach to simulate the architectural challenges a typical web application (blog platform) might encounter during real business growth, guiding you step-by-step to master classic problem-solving approaches and think like an architect.


Phase 0: Initial System (Monolithic Architecture, The Starting Point)

System Description

  • Imagine starting with the simplest blog platform, basic but functional:
    • User registration and login
    • Publish, edit, and delete blogs
    • View blog lists and details
  • Tech Stack: the most common combination:
    • Frontend: HTML + JavaScript (maybe jQuery early on, later perhaps React/Vue)
    • Backend: Python Flask or Java Spring Boot (a classic monolithic application)
    • Database: MySQL (a single instance, sufficient initially)

Current Architecture Diagram

[User Browser] → [Web Server (Monolithic App)] → [MySQL (Single Instance)]
Characteristics at this point:
  • All code lives in one project, simple and direct.
  • The application connects directly to the database for reads and writes, with no middle layer.
  • Scalability? Not considered yet.


Phase 1: Traffic Arrives → Database is the First Bottleneck

Challenge Emerges

  • The platform is gaining traction, but as the user base grows, problems follow: database queries slow down, especially on the high-traffic blog list pages.
  • Server CPU spikes during peak hours, page response time deteriorates from tens of milliseconds to half a second or more.

❓ Architect's Thinking Moment: Bottleneck is the database, what to do?

(What's usually the first reaction for performance optimization? Add cache? Index? Or directly implement read-write splitting?)

✅ Evolution Direction: Cache First, Consider Indexing

  1. Introduce Cache for Rescue: Facing read pressure, introducing memory cache (like Redis) is the most cost-effective method.
    • Cache popular blog lists to significantly reduce database queries.
    • Initially adopt the common Cache-Aside pattern: application reads from cache first, if not found, queries the database, then writes back to the cache. (Cache consistency issues need attention later.)
  2. Database Self-Optimization: Don't forget the basics.
    • Check indexes on the blogs table, ensure fields used for sorting and querying like created_at have appropriate indexes.
  3. Minor Architecture Tweak:
    [User Browser] → [Web Server] → [Redis (Cache)] → [MySQL]
    

Phase 2: User Surge → Limits of a Single Monolithic Web Server

New Bottleneck

  • Traffic continues to rise, peak QPS jumps from hundreds to thousands, the single web server's CPU is maxed out.
  • Users start experiencing frequent access timeouts.

❓ Architect's Thinking Moment: Single point can't handle it, how to scale horizontally?

(Adding machines is certain, but how to make multiple machines work together? How to handle user state?)

✅ Evolution Direction: Load Balancing + Statelessness

  1. Horizontally Scale the Web Layer:
    • Deploy multiple web server instances.
    • Introduce Nginx at the front as a load balancer to distribute requests to multiple backend instances (common strategies include round-robin, least connections).
  2. Stateless Service is Key: For easy scaling, the web server itself should not store user session state.
    • Migrate session data to external shared storage, like Redis. This allows any web server to handle any user's request.
  3. Architecture Evolves To:
    [User Browser] → [Nginx (Load Balancer)] → [Web Server Instance x N] → [Redis (Cache+Session)] → [MySQL]
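The two Nginx distribution strategies named above reduce to very small selection rules. A toy Python sketch, with made-up server names (Nginx implements these natively; this only shows the logic):

```python
import itertools

# Hypothetical backend pool behind the load balancer.
servers = ["web-1", "web-2", "web-3"]

_rotation = itertools.cycle(servers)

def round_robin():
    # Hand requests to each server in turn, in a fixed rotation.
    return next(_rotation)

def least_connections(active):
    # Pick the server currently holding the fewest open connections.
    # `active` maps server name -> current connection count.
    return min(active, key=active.get)
```

Round-robin needs no state about the backends; least connections adapts to uneven request costs, which matters when some pages are much heavier than others. Either strategy only works cleanly because the web layer is stateless, as point 2 explains.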
    

Phase 3: Increased Write Pressure → Database Read-Write Splitting

Write Operations Become the New Focus

  • As user activity increases, blog publishing and editing become more frequent, putting visible write pressure on the MySQL primary; under heavy writes, master-slave replication lag also grows.
  • Users start complaining: "Just published a blog, refreshed several times but still can't see it!"

❓ Architect's Thinking Moment: Read performance solved, what about the write bottleneck?

(Is master-slave replication the standard answer? Is sharding too early? Is there an intermediate solution?)

✅ Evolution Direction: Implement Master-Slave Replication and Read-Write Splitting

  1. Enable MySQL Master-Slave Replication:
    • Configure one primary (Master) to handle all write operations.
    • Configure one or more replicas (Slaves) to handle read operations, distributing the read load.
    • Introduce database middleware (like ShardingSphere-JDBC/Proxy or ProxySQL) to automatically route read/write requests, transparently to the application layer.
  2. Addressing Master-Slave Lag:
    • After splitting reads/writes, master-slave lag needs attention. For read requests requiring high consistency (like viewing immediately after publishing), forcing routing to the master might be necessary.
    • Cache update strategies also need adjustment, e.g., invalidate cache after write instead of updating, reducing the inconsistency window.
  3. Architecture Adjusts Again:
    [User Browser] → [Nginx] → [Web Server x N] → [Redis]
                                               ↘ [DB Middleware/Proxy] → [MySQL Master] → [MySQL Slave x N]
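The routing decision the DB middleware makes can be sketched as follows. This is a simplification: connection handles are plain strings, and a real proxy would manage connection pools and parse SQL properly. The `force_master` flag models the "read immediately after publishing" case from point 2.

```python
# Minimal sketch of read/write routing as a DB middleware performs it.
class ReadWriteRouter:
    def __init__(self, master, replicas):
        self.master = master        # handles writes (and forced reads)
        self.replicas = replicas    # handle ordinary reads
        self._i = 0

    def route(self, sql, force_master=False):
        # Writes, and reads that need read-your-writes consistency,
        # always go to the master.
        is_write = sql.lstrip().split()[0].upper() in {"INSERT", "UPDATE", "DELETE"}
        if is_write or force_master:
            return self.master
        # Spread ordinary reads across replicas round-robin.
        conn = self.replicas[self._i % len(self.replicas)]
        self._i += 1
        return conn
```

This is also where the "just published but can't see it" complaint gets fixed: the request that renders the author's own fresh post is routed with `force_master=True`, sidestepping replication lag.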
    

Phase 4: Increasing Business Complexity → The Choice of Splitting the Monolith: Microservices

The "Growing Pains" of a Monolith

  • Business continues to grow, adding new modules like "comment system," "user recommendations," etc. The monolithic application's codebase becomes increasingly large and difficult to maintain.
  • Development and deployment of different features interfere with each other, reducing team collaboration efficiency and lengthening release cycles.

❓ Architect's Thinking Moment: Time to break up the monolith, but how to do it gracefully?

(Microservices are the trend, but how to define service boundaries? How should services communicate? How to introduce asynchronicity?)

✅ Evolution Direction: Embrace Microservices, Introduce Message Queues and API Gateway

  1. Split Services by Business Capability:
    • Decompose the monolith into independent services like Blog Service, User Service, Comment Service, etc. Each service can be developed, deployed, and scaled independently.
    • Service Communication: Initially, REST API can be used. For high performance or internal calls, consider RPC (gRPC/Dubbo).
  2. Introduce Message Queue for Asynchronous Decoupling:
    • For non-core, asynchronously processable flows (like "notify followers after blog publication," "trigger recommendation calculation"), introduce Kafka or RabbitMQ. Producers send messages, consumers process them asynchronously, improving system resilience and response speed.
  3. Build an API Gateway:
    • All external requests (from clients) enter through a unified API Gateway (like Kong, Spring Cloud Gateway, Nginx+Lua).
    • The gateway handles: routing, authentication/authorization, rate limiting/circuit breaking, logging/monitoring, and other common functions, simplifying backend services.
  4. Emerging Microservice Architecture Outline:
    graph TD
        UserBrowser --> APIGateway[API Gateway]
        APIGateway --> UserService[User Service] --> RedisCache[Redis]
        APIGateway --> BlogService[Blog Service] --> MySQLDB[MySQL]
        APIGateway --> CommentService[Comment Service] --> MySQLDB
        BlogService --> KafkaMQ[Kafka]
        CommentService --> KafkaMQ
        KafkaMQ --> RecommendationService[Recommendation Calc Service] --> Others[...]
        KafkaMQ --> NotificationService[Notification Service] --> Others[...]
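The asynchronous hand-off in point 2 works roughly like this sketch, where a `queue.Queue` stands in for Kafka and the follower notification is a hypothetical handler. The key property to notice: the publish path finishes without waiting for the notification work.

```python
import queue
import threading

# A queue.Queue stands in for Kafka; the event shape is made up.
events = queue.Queue()

def publish_blog(blog_id, log):
    # The synchronous path only persists the blog and emits an event.
    log.append(f"saved {blog_id}")
    events.put({"type": "blog_published", "blog_id": blog_id})

def notification_worker(log):
    # A consumer drains events off the critical path.
    while True:
        event = events.get()
        if event is None:           # shutdown sentinel
            break
        log.append(f"notified followers of {event['blog_id']}")

log = []
worker = threading.Thread(target=notification_worker, args=(log,))
worker.start()
publish_blog(42, log)
events.put(None)                    # tell the worker to stop
worker.join()
```

With a real broker the producer and consumer live in different services, and the broker additionally buffers bursts and retries failed deliveries, which is where the resilience claim comes from.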

Phase 5: Data Volume Explosion → Database Sharding and Search Engine

Challenge of Massive Data

  • Blog content and user data continue explosive growth. The core blogs table reaches TB levels. Even with read-write splitting, single-database or single-table query performance drops sharply.
  • User demand for content search increases; simple LIKE queries are no longer sufficient.

❓ Architect's Thinking Moment: Database capacity and query performance are critical again, what now?

(Is sharding inevitable? By what dimension? What technology for full-text search?)

✅ Evolution Direction: Data Sharding + Introduce Professional Search Engine

  1. Implement Database Sharding:
    • Horizontally split the largest tables (like blogs, users). Common strategies include sharding by User ID or Content ID hash.
    • Introduce database sharding middleware (like ShardingSphere, Vitess, MyCat) to manage sharding routing rules, shielding low-level details from the application layer.
  2. Introduce Elasticsearch for Full-Text Search:
    • Synchronize searchable blog content (title, body, etc.) into an Elasticsearch cluster.
    • Utilize ES's powerful inverted index and tokenization capabilities for efficient, accurate full-text search. Sync mechanisms could be CDC (Canal/Debezium) or dual writes.
  3. Consider Hot/Cold Data Separation:
    • For infrequently accessed old blog data, consider archiving from primary storage (MySQL/ES) to lower-cost object storage (like AWS S3, Alibaba Cloud OSS), reducing online storage pressure.
  4. Evolution of Data Storage Layer:
    Primary Online Storage: [MySQL (Sharded Cluster)] + [Elasticsearch (Search Cluster)]
    Archive Storage: [Object Storage (S3/OSS)]
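A minimal sketch of the hash-based sharding rule from point 1, assuming a fixed number of shards and made-up database names (middleware like ShardingSphere lets you declare the equivalent rule in configuration instead of code):

```python
# Hypothetical sharding rule: route a user's rows to one of
# NUM_SHARDS physical databases by user ID.
NUM_SHARDS = 4

def shard_for_user(user_id: int) -> str:
    # Simple modulo sharding: every row for a given user lands on
    # the same shard, so single-user queries stay on one database.
    return f"db_{user_id % NUM_SHARDS}"
```

The trade-off to keep in mind: plain modulo sharding makes changing `NUM_SHARDS` painful, since most keys remap; schemes like consistent hashing or range-based sharding exist largely to soften that resharding cost.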
    

Phase 6: Journey to Globalization → Challenge of Multi-Active Architecture

New Requirements for Business Expansion Abroad

  • The platform needs to serve global users, but access latency from overseas users to the original single-region data center is too high, hurting their experience.
  • Higher availability requirements: service must not be interrupted even if a single data center fails.

❓ Architect's Thinking Moment: How to achieve low global latency and cross-region disaster recovery?

(Multi-region deployment is necessary, but how to sync data? How to route user requests?)

✅ Evolution Direction: Build Multi-Active Data Centers + CDN Acceleration

  1. Multi-Region Deployment (Multi-Active Architecture):
    • Deploy independent, fully functional service clusters in different geographic regions (e.g., US East, Europe, Singapore).
    • The database layer needs solutions supporting cross-region replication and consistency, such as Global Databases (AWS Aurora Global Database, Google Spanner, CockroachDB) or self-built sync solutions (possibly sacrificing strong consistency).
  2. CDN Acceleration for Static Resources:
    • Deploy static assets like images, CSS, JavaScript to a CDN (Content Delivery Network). Utilize its global edge nodes to provide users with nearby access, significantly reducing latency.
  3. Global Traffic Management:
    • Use Intelligent DNS or Global Server Load Balancing (GSLB) services to route user requests to the nearest or healthiest regional data center based on user location, network latency, or service health.
  4. Final Architecture Form (Schematic):
    graph LR
        UserNA[North America User] --> GSLB --> DCNorthAmerica[NA Datacenter Cluster]
        UserAsia[Asia User] --> GSLB --> DCAsia[Asia Datacenter Cluster]
        UserEU[Europe User] --> GSLB --> DCEurope[Europe Datacenter Cluster]
        DCNorthAmerica <--> GlobalDB[(Global Database / Sync)]
        DCAsia <--> GlobalDB
        DCEurope <--> GlobalDB
        UserNA --> CDN[(CDN Edge)]
        UserAsia --> CDN
        UserEU --> CDN
        CDN --> ObjectStorage[Object Storage Origin]
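The GSLB decision in point 3 boils down to "nearest healthy datacenter." A toy sketch with made-up region and datacenter names; real GSLB services also weigh measured latency and capacity, not just a static mapping:

```python
# Hypothetical region -> nearest-datacenter mapping and failover order.
DATACENTERS = {"na": "us-east", "eu": "eu-west", "asia": "ap-southeast"}
FALLBACK_ORDER = ["us-east", "eu-west", "ap-southeast"]

def route_user(region, healthy):
    # Prefer the datacenter nearest the user's region, if it is healthy.
    primary = DATACENTERS.get(region)
    if primary in healthy:
        return primary
    # Otherwise fail over to the first healthy datacenter.
    for dc in FALLBACK_ORDER:
        if dc in healthy:
            return dc
    raise RuntimeError("no healthy datacenter")
```

The failover branch is what delivers the availability requirement above: when a whole region goes dark, its users are quietly routed to a surviving one.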

Summary: A Typical Architectural Evolution Path

| Phase | Core Problem | Key Solution | Representative Tech/Pattern |
| --- | --- | --- | --- |
| 0. Monolith | Simple business | Monolithic app + single DB | Flask/Spring Boot, MySQL |
| 1. Cache | Read performance bottleneck | Introduce cache + DB index optimization | Redis, Cache-Aside |
| 2. Horizontal Scale | Web server pressure | Load balancer + stateless services | Nginx, Redis (Session) |
| 3. R/W Split | DB write bottleneck | Master-slave replication + middleware | MySQL Replication, ProxySQL/ShardingSphere |
| 4. Microservices | Monolith complexity | Service split + async messaging | RPC/REST, Kafka/RabbitMQ, API Gateway |
| 5. Data Scale | Huge data / search needs | Sharding + search engine | ShardingSphere/Vitess, Elasticsearch |
| 6. Global | Low latency / high availability | Multi-active + CDN | Global DB, GSLB, CDN |

Learning Method Suggestions

  1. Hands-on Practice is Crucial: For each phase, try building a minimal demo using cloud services (AWS/Azure/Alibaba Cloud free tiers or small instances) or local Docker/K8s to experience configuration and effects firsthand.
  2. In-depth Comparative Thinking: Actively compare pros and cons of similar technologies, e.g., "What scenarios are Kafka vs. RabbitMQ suitable for?", "What's the difference between Redis Sentinel and Cluster modes?".
  3. Simulate Failure Scenarios: If possible, try using chaos engineering tools (like Chaos Mesh) or manual methods to simulate node failures, network latency, etc., and observe the system's reaction and recovery capabilities.

By simulating this real path of business growth and technological evolution, we can better understand the trade-offs behind various architectural design decisions, which is the core value of an architect.