Core Components
In-Depth Analysis of Core Service Components
1. Load Balancing and Reverse Proxy: Nginx
Nginx (pronounced "engine-x") is a high-performance open-source HTTP server, reverse proxy server, and generic TCP/UDP proxy server.
1.1 Nginx In-Depth Analysis
- Q1: What does it do?
- High-performance HTTP and reverse proxy server, also an email proxy server.
- Q2: Use Cases?
- Web server (serving static content), reverse proxy, load balancing (HTTP/TCP/UDP), caching, SSL/TLS termination, API gateway (with Lua/OpenResty).
- Q3: Problems Solved?
- Addresses the C10K bottleneck of traditional process/thread-per-connection servers like Apache, thanks to its high concurrency and low resource consumption.
- Q4: Pros & Cons
- Pros: Extremely high performance, stable and reliable, simple and flexible configuration, event-driven architecture.
- Cons: Weaker dynamic content processing capability (requires FastCGI/uWSGI), advanced features need third-party modules or OpenResty.
- Core Principles and Features
- Event-Driven Architecture: Nginx uses an asynchronous non-blocking event-driven model (based on epoll, kqueue, etc., I/O multiplexing technologies), handling numerous concurrent connections with fewer Worker processes, resulting in low resource consumption and high performance.
- Master-Worker Process Model: One Master process manages configuration loading, Worker process startup, and management; multiple Worker processes handle actual client requests.
- Modular Design: Functionality is implemented through modules; the core is very lightweight. Features can be extended by loading different modules (e.g., SSL module, Rewrite module, Proxy module, Gzip module).
- Simple and Flexible Configuration: Uses a clear and concise configuration file syntax.
- High Performance: Known for high concurrency and low memory consumption.
- Main Functions and Use Cases (with Course Examples)
- HTTP Server: Directly hosts static files (HTML, CSS, JS, images, etc.), enabling separation of static and dynamic content. (Courses 1, 3: Accelerate static resource access)
- Reverse Proxy: Acts as an entry point for backend application servers (like Node.js, Python Flask/Django, Java Tomcat/Spring Boot), forwarding client requests to backend applications and returning responses. Hides backend server details, provides a unified entry point.
- Load Balancing: Distributes requests to multiple backend application server instances, achieving horizontal scaling and high availability. (Courses 1, 3, 9: Load balancing for web servers, database proxies, TiDB SQL layer)
- Common Load Balancing Algorithms:
- Round Robin: Default algorithm, distributes sequentially.
- Least Connections: Assigns requests to the server with the fewest active connections.
- IP Hash: Assigns based on the hash of the client's IP address, ensuring requests from the same client always go to the same server (for session persistence).
- Weighted Round Robin / Weighted Least Connections: Allows setting different weights for servers with different performance capabilities.
- SSL/TLS Termination: Nginx handles HTTPS encryption/decryption; backend applications only need to handle HTTP requests, reducing backend server load and simplifying certificate management.
- HTTP Caching: Caches responses from backend servers to accelerate access to static or infrequently changing content.
- Compression (Gzip/Brotli): Compresses response content to reduce network traffic.
- Rate Limiting: Limits the request rate from a single IP or specific URL to prevent malicious attacks or abuse. (Courses 2, 3: API entry point rate limiting)
- Access Control: Controls access based on IP address, HTTP Basic Auth, etc.
- URL Rewrite: Modifies request URLs based on rules.
- API Gateway (with Lua/JavaScript/OpenResty): By extending Nginx, more complex API gateway functions like authentication, authorization, request/response transformation can be implemented. (Course 1)
- Nginx vs. Other Load Balancing Solutions
- HAProxy: Another high-performance open-source load balancer, very strong in TCP and HTTP load balancing, slightly more complex configuration than Nginx.
- LVS (Linux Virtual Server): Layer 4 (transport layer) load balancer working in the Linux kernel, extremely high performance, but relatively complex configuration and management.
- Cloud Provider Load Balancers (ELB/ALB/NLB, CLB, etc.): Offer managed services, elastic scaling, good integration with cloud ecosystems, but potentially higher cost and limited flexibility.
- Hardware Load Balancers (F5 BIG-IP, A10): Powerful features, high performance, but very expensive.
- Key Configuration Directives and Considerations
- Core Module Directives:
  - `worker_processes`: Number of Worker processes, usually set to the number of CPU cores or `auto`.
  - `worker_connections`: Maximum concurrent connections each Worker process can handle.
  - `events { use epoll; }`: Specify the event model.
- HTTP Module Directives (`http { ... }`):
  - `server { ... }`: Defines a virtual host.
  - `listen`: Listening port.
  - `server_name`: Virtual host name.
  - `location /path { ... }`: Configures handling rules for specific URL paths.
  - `root` / `alias`: Specifies the root directory for static files.
  - `index`: Default index file.
  - `proxy_pass http://backend_servers;`: Configures reverse proxy.
  - `upstream backend_servers { ... }`: Defines a group of backend servers for load balancing.
  - `server backend1.example.com weight=3;`: Defines a server and its weight within an upstream block.
  - `ssl_certificate`, `ssl_certificate_key`: Configure SSL certificates.
  - `gzip on;`: Enable Gzip compression.
  - `limit_req_zone`, `limit_req`: Configure rate limiting.
- Health Check: Configure `server` directive parameters in the `upstream` block (like `max_fails`, `fail_timeout`) or use the `health_check` directive (Nginx Plus or third-party module) to automatically remove faulty backend servers.
- High Availability: Typically achieved using Keepalived with a Virtual IP (VIP) for Nginx itself.
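As a rough illustration of how these directives combine, here is a minimal, hypothetical nginx.conf fragment (hostnames, paths, and numbers are placeholders, not recommendations) that wires together a weighted upstream with passive health-check parameters, static file serving, and a reverse-proxy location:

```nginx
worker_processes auto;

events {
    use epoll;
    worker_connections 10240;
}

http {
    # Backend pool for load balancing (weighted round robin + passive health checks)
    upstream backend_servers {
        server backend1.example.com weight=3 max_fails=3 fail_timeout=30s;
        server backend2.example.com weight=1 max_fails=3 fail_timeout=30s;
    }

    server {
        listen 80;
        server_name www.example.com;

        # Serve static files directly
        location /static/ {
            root /data/www;
        }

        # Reverse proxy dynamic requests to the upstream pool
        location / {
            proxy_pass http://backend_servers;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
        }
    }
}
```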
Nginx, with its high performance, stability, and rich features, is an indispensable frontend proxy and load balancing solution in modern web architectures.
2. Service Gateway: API Gateway
An API Gateway is a server or cluster of servers located between clients and backend services (especially in microservice architectures), acting as a unified entry point for all API requests.
2.1 API Gateway Concept Analysis
- Q1: What does it do?
- Acts as a unified entry point for backend services, handling API request routing, protocol translation, aggregation, and common cross-cutting concerns.
- Q2: Use Cases?
- Entry point for microservice architectures, unified access for mobile/web APIs, third-party open platform APIs.
- Q3: Problems Solved?
- Simplifies the complexity of client interaction with backend microservices, centralizes handling of common logic like authentication, rate limiting, monitoring.
- Q4: Pros & Cons
- Pros: Decouples clients and backends, centralizes common logic handling, simplifies backend services.
- Cons: Can become a performance bottleneck and single point of failure, requires additional development and operational costs.
- Core Principles and Features
- Single Entry Point: Clients only interact with the API Gateway, needing no knowledge of backend service addresses or protocol details.
- Request Routing: Forwards requests to the correct backend microservice instance based on request URL, HTTP method, Headers, etc.
- Protocol Translation: Can translate protocols used by external clients (e.g., HTTP/REST) to those used by internal services (e.g., RPC/gRPC), and vice versa.
- Aggregation & Orchestration: (Optional) The gateway can aggregate responses from multiple microservices or orchestrate calls to multiple microservices to simplify client logic (similar to the BFF pattern).
- Cross-Cutting Concerns Handling: Centralizes common functionalities (like authentication, authorization, rate limiting, circuit breaking, logging, monitoring, caching, security protection) at the gateway layer, avoiding repetitive implementation in each microservice.
- API Lifecycle Management: Provides features like API version control, publishing, documentation generation.
- Common Use Cases (with Course Examples)
- Microservice Architecture Entry Point: Unified entry for all external requests (from web, mobile apps, third-party apps) accessing backend microservices. (Course 1: Necessary component after splitting into microservices; Scenarios involving microservices in Courses 2, 3, 5, 7, etc.)
- Authentication & Authorization: Centralized user authentication (e.g., validating JWT Token) and permission checks at the gateway layer; only legitimate requests reach backend services. (Courses 1, 2)
- Traffic Control & Security Protection:
- Rate Limiting: Prevents malicious attacks or abuse by individual users. (Courses 2, 3)
- Circuit Breaking: Avoids continuous calls to faulty services. (Courses 4, 11)
- IP Blacklist/Whitelist / WAF Integration: Basic security protection.
- Logging & Monitoring: Centralized logging of all API requests, collection of metrics like request latency, status codes for monitoring and troubleshooting. (Course 4)
- Protocol Translation: E.g., External uses REST API, internal microservices communicate via gRPC.
- Gray Release/Canary Release: The gateway can direct partial traffic to new service versions based on rules (e.g., user ID, Header).
- Static Content Response / Mock: For certain requests (like OPTIONS probes, Mock APIs), the gateway can respond directly without forwarding to the backend.
- Implementation Technology Choices
- Nginx-based:
- Nginx + Lua (OpenResty): Very flexible, high performance, allows complex logic implementation via Lua scripts.
- Nginx's native reverse proxy and load balancing features.
- Dedicated Open Source Gateways:
- Kong: Based on OpenResty, feature-rich, plugin-based, provides Admin UI and API.
- Tyk: Developed in Go, comprehensive features.
- APISIX: Based on OpenResty, high performance, dynamic, real-time, highly available.
- Framework-Integrated Gateways (Common in Java ecosystem):
- Spring Cloud Gateway: Second-gen gateway in the Spring Cloud ecosystem, based on Netty and Reactor, asynchronous non-blocking, good performance.
- Zuul (1.x/2.x): Gateway developed by Netflix; Zuul 1.x is blocking IO, Zuul 2.x improved to async non-blocking based on Netty.
- Cloud Provider API Gateway Services:
- AWS API Gateway, Azure API Management, Google Cloud API Gateway: Offer managed services, elastic scaling, tight integration with cloud ecosystems, pay-as-you-go.
- Service Mesh Ingress Gateway (e.g., Istio Gateway): Service meshes often include an Ingress Gateway component as the entry point for external traffic into the mesh, implementing some gateway functions and coordinating with internal traffic management policies. (Course 11)
- Key Configuration and Considerations
- High Availability: The API Gateway is a critical entry point and must be highly available. Typically requires deploying multiple instances with load balancers (like LVS/Keepalived, Cloud Provider LB) for load distribution and failover.
- Performance & Scalability: Gateway performance is crucial. Choosing async non-blocking I/O based gateways (like Nginx, Spring Cloud Gateway, Kong) usually yields better performance. Must be horizontally scalable.
- Routing Rule Configuration: How to define and manage routing rules (static config files vs. dynamic config center).
- Plugin/Filter Management: How to develop, deploy, and manage plugins or filters for implementing cross-cutting concerns.
- Observability: Gateway needs to provide detailed logs, metrics, and tracing information.
- Security: Securing the gateway itself against attacks.
- Latency: As an extra hop, the gateway introduces some latency; its impact on overall performance needs attention.
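For the Nginx/OpenResty-based option, a minimal sketch of gateway-style path routing plus per-IP rate limiting might look like the following (service addresses, certificate paths, and limits are hypothetical); dedicated gateways such as Kong, APISIX, or Spring Cloud Gateway express the same routing and rate-limiting ideas through their own plugin or filter configuration:

```nginx
http {
    # One shared zone: track clients by IP, allow ~100 requests/second each
    limit_req_zone $binary_remote_addr zone=api_limit:10m rate=100r/s;

    upstream user_service  { server 10.0.0.11:8080; }
    upstream order_service { server 10.0.0.12:8080; }

    server {
        listen 443 ssl;
        server_name api.example.com;
        ssl_certificate     /etc/nginx/certs/api.crt;
        ssl_certificate_key /etc/nginx/certs/api.key;

        # Path-based routing to different backend microservices
        location /api/users/ {
            limit_req zone=api_limit burst=50 nodelay;
            proxy_pass http://user_service;
        }

        location /api/orders/ {
            limit_req zone=api_limit burst=50 nodelay;
            proxy_pass http://order_service;
        }
    }
}
```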
The API Gateway is a key component in microservice architectures, simplifying client interaction and centralizing common functions, but it can also become a new performance bottleneck and single point of failure risk, requiring careful design and operation.
3. Relational Databases and Middleware
3.1 MySQL In-Depth Analysis
- Q1: What does it do?
- The most popular open-source relational database management system (RDBMS), providing ACID transaction support.
- Q2: Use Cases?
- Backend storage for web applications (users, products, orders, etc.), core for Online Transaction Processing (OLTP).
- Q3: Problems Solved?
- Provides structured data storage, transactional consistency, SQL query capabilities, replacing simple file storage.
- Q4: Pros & Cons
- Pros: Mature and stable, large community, rich ecosystem, ACID support.
- Cons: Limited single-machine performance and capacity, complex horizontal scaling (requires sharding), weaker support for massive data analysis.
- Core Architecture and Mechanisms (from Appendices Two/Five)
- Design Philosophy: A classic implementation of ACID in relational databases.
- Key Mechanisms:
- B+ Tree Index: Balances query efficiency and write cost.
- Undo/Redo Log: Implements transaction rollback and crash recovery (WAL).
- Buffer Pool: Reduces disk IO via memory buffer pool.
- MVCC: Implements isolation levels like Repeatable Read, improving concurrency performance.
- Evolution Path:
```mermaid
graph LR
  A[Single MySQL] --> B[Master-Slave Replication]
  B --> C[Read-Write Splitting]
  C --> D["Semi-Sync/Group Replication (MGR)"]
  D --> E[Database/Table Sharding]
  E --> F["Cloud-Native DB (e.g., PolarDB/Aurora) / Distributed DB (e.g., TiDB)"]
```
- High Availability and Read-Write Splitting (from Appendices Four/Five)
- Problems: Single point of failure, read performance bottleneck.
- Solutions:
- Master-Slave Replication (Async/Semi-Sync/MGR).
- Read-write splitting middleware (ProxySQL, ShardingSphere-JDBC).
- Pros & Cons:
- Pros: Improves read performance and availability, failover (MGR) < 30 seconds.
- Cons: Risk of data inconsistency due to master-slave lag, async replication may lose data, split-brain risk.
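A minimal sketch of wiring up the asynchronous master-slave replication discussed above (classic MySQL 5.7/8.0 syntax; host names and passwords are placeholders, and GTID mode is assumed to be enabled):

```sql
-- On the master: create a replication account
CREATE USER 'repl'@'%' IDENTIFIED BY 'repl_password';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';

-- On the replica: point at the master using GTID auto-positioning, then start replication
-- (without GTID, specify MASTER_LOG_FILE / MASTER_LOG_POS instead)
CHANGE MASTER TO
  MASTER_HOST = 'mysql-master.example.com',
  MASTER_USER = 'repl',
  MASTER_PASSWORD = 'repl_password',
  MASTER_AUTO_POSITION = 1;
START SLAVE;

-- Check replication health; Seconds_Behind_Master indicates replication lag
SHOW SLAVE STATUS\G
```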
- Performance Tuning Recommendations (from Appendices Two/Five)
```ini
# my.cnf key parameters
innodb_buffer_pool_size =            # 50%-70% of physical memory
innodb_flush_log_at_trx_commit = 1   # 1 = highest safety; 2 = better performance, may lose ~1s of data
sync_binlog = 1                      # ensure binlog safety

# Parallel replication (MySQL 5.6+)
slave_parallel_workers =             # number of CPU cores
slave_parallel_type = LOGICAL_CLOCK  # or DATABASE
```
3.2 ShardingSphere
- Q1: What does it do?
- Apache Top-Level Project, positioned as Database Plus, aiming to build an ecosystem on top of databases, providing data sharding, read-write splitting, distributed transactions, database governance, etc.
- Q2: Use Cases?
- Addressing performance and capacity bottlenecks of single databases, implementing transparent database/table sharding and read-write splitting.
- Scenarios requiring distributed transaction support.
- Q3: Problems Solved?
- Avoids hardcoding complex sharding logic in the application layer, providing a unified SQL access entry point.
- Simplifies the implementation complexity of distributed transactions.
- Q4: Pros & Cons
- Pros: Feature-rich, active community, supports multiple databases, low intrusiveness to business code (JDBC/Proxy modes).
- Cons: Proxy mode has some performance overhead and additional operational costs, support for some complex SQL might be incomplete.
- Core Modes
- ShardingSphere-JDBC: Lightweight Java library, enhanced at the JDBC layer, application connects directly to the database.
- ShardingSphere-Proxy: Independently deployed database proxy, supports heterogeneous languages.
- ShardingSphere-Sidecar (Planned): Kubernetes Sidecar mode.
3.3 ProxySQL
- Q1: What does it do?
- High-performance MySQL protocol-aware proxy, offering connection pooling, read-write splitting, query routing, firewall capabilities.
- Q2: Use Cases?
- MySQL read-write splitting and load balancing.
- Restrict harmful SQL, protect the database.
- Failover (in conjunction with MHA/Orchestrator).
- Q3: Problems Solved?
- Replace complex read-write splitting logic in the application layer, providing a transparent proxy.
- Offers more powerful management capabilities than MySQL's built-in connection pool.
- Q4: Pros & Cons
- Pros: Extremely high performance, flexible configuration (SQL-based), lightweight.
- Cons: Does not provide data sharding capabilities itself, high availability requires additional components.
3.4 MyCat
- Q1: What does it do?
- Widely used open-source distributed database middleware in China, focusing on MySQL database/table sharding.
- Q2: Use Cases?
- Sharding management for large-scale MySQL clusters.
- Q3: Problems Solved?
- Simplifies data routing and management after sharding.
- Q4: Pros & Cons
- Pros: Relatively mature community (in China), supports rich routing rules.
- Cons: Slower feature iteration compared to ShardingSphere, configuration is relatively complex, performance might not be as good as ProxySQL.
3.5 Seata
- Q1: What does it do?
- Alibaba open-sourced distributed transaction solution, providing high-performance and easy-to-use distributed transaction services.
- Q2: Use Cases?
- Scenarios requiring atomicity across multiple services (or database shards) in a microservice architecture, like inter-bank transfers, order and inventory coordination.
- Q3: Problems Solved?
- Solves the problems of poor 2PC performance and complex Saga/TCC coding, provides a low-intrusive AT mode for business.
- Q4: Pros & Cons
- Pros: Supports multiple transaction modes (AT, TCC, Saga, XA), integrates well with frameworks like Spring Cloud, active community.
- Cons: AT mode has certain database requirements (needs global lock support), introduces additional TC Server operational costs.
- Core Modes
- AT (Automatic Transaction): Based on 2PC ideas, automatically generates compensation SQL, non-intrusive to business.
- TCC (Try-Confirm-Cancel): Business needs to implement Try, Confirm, Cancel interfaces.
- Saga: Long transaction solution, splits transaction into multiple local transactions and compensation operations.
- XA: Based on database XA protocol.
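As a sketch of how the AT mode stays low-intrusive, the snippet below marks a hypothetical order-creation method with Seata's `@GlobalTransactional` annotation; `StorageClient` and `OrderRepository` are illustrative stand-ins for a remote inventory client and a local DAO:

```java
import io.seata.spring.annotation.GlobalTransactional;
import org.springframework.stereotype.Service;

interface StorageClient { void deduct(String productId, int count); }   // hypothetical RPC/Feign client
interface OrderRepository { void save(String productId, int count); }   // hypothetical local DAO

@Service
public class OrderService {

    private final StorageClient storageClient;
    private final OrderRepository orderRepository;

    public OrderService(StorageClient storageClient, OrderRepository orderRepository) {
        this.storageClient = storageClient;
        this.orderRepository = orderRepository;
    }

    // If any step throws, Seata's TC coordinates rollback of all branch transactions.
    @GlobalTransactional(name = "create-order", rollbackFor = Exception.class)
    public void createOrder(String productId, int count) {
        storageClient.deduct(productId, count);  // remote branch transaction
        orderRepository.save(productId, count);  // local branch transaction
    }
}
```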
3.6 Vitess
- Q1: What does it do?
- CNCF graduated project, initially developed by YouTube, a database solution for deploying, scaling, and managing large clusters of MySQL instances. Essentially a MySQL sharding middleware, designed to run in a cloud environment.
- Q2: Use Cases?
- Large-scale MySQL clusters (e.g., million QPS level) requiring extremely high scalability, availability, and online DDL capabilities.
- Database deployment in cloud-native environments.
- Q3: Problems Solved?
- Provides seamless MySQL horizontal scaling capabilities, simplifies shard management, connection pooling, failover.
- Supports online, lock-free DDL operations.
- Q4: Pros & Cons
- Pros: Extremely strong scalability and availability, mature online DDL, cloud-native design, excellent performance.
- Cons: Complex architecture, higher learning and operational costs, more suitable for ultra-large-scale scenarios.
- Core Components
- VTGate: Lightweight stateless proxy, routes queries to the correct Vttablet.
- VTTablet: Proxy in front of each MySQL instance, manages queries, connection pools, DDL, etc.
- VTctld: Cluster management daemon.
- Topology Service: Stores topology metadata (usually using etcd/ZooKeeper).
3.7 TiDB (incl. TiKV, TiFlash)
- Q1: What does it do?
- Open-source distributed HTAP (Hybrid Transactional/Analytical Processing) database, compatible with MySQL protocol.
- Q2: Use Cases?
- Scenarios requiring elastic scaling, strong consistency, and support for both high-concurrency OLTP and real-time OLAP.
- Replaces traditional sharding solutions, simplifying business development.
- Q3: Problems Solved?
- Solves MySQL's horizontal scaling difficulties and application complexity caused by sharding.
- Solves data synchronization latency issues caused by separating traditional TP and AP systems.
- Q4: Pros & Cons
- Pros: Horizontal scaling, strong consistency (Raft), HTAP capability, compatible with MySQL ecosystem.
- Cons: More complex deployment and operation compared to single-node MySQL, higher resource consumption, optimization for some complex SQL might not be as good as mature RDBMS.
- Core Architecture
- TiDB Server: Stateless SQL computation layer, responsible for parsing SQL, generating execution plans.
- TiKV Server: Distributed Key-Value storage layer (row store), responsible for storing actual data, uses Raft for consistency.
- TiFlash: Distributed columnar storage engine, synchronizes data from TiKV in real-time via Raft Learner, accelerates OLAP queries.
- PD (Placement Driver): Cluster metadata management and scheduling center.
4. NoSQL Databases and Caching
4.1 Redis In-Depth Analysis
- Q1: What does it do?
- High-performance in-memory data structure store, supporting multiple data types (String, Hash, List, Set, Sorted Set, Bitmap, HyperLogLog, Stream), commonly used as cache, message queue, distributed lock, etc.
- Q2: Use Cases?
- Web application cache, session storage, leaderboards, counters, Publish/Subscribe, Feed streams, flash sale inventory.
- Q3: Problems Solved?
- Greatly improves data access speed (orders of magnitude faster than disk-based databases), alleviates database pressure.
- Provides rich data structures, simplifying development.
- Q4: Pros & Cons
- Pros: Extremely high read/write performance (100K+ QPS single node), rich data structures, atomic operation support.
- Cons: Data stored in memory, higher cost, limited capacity; persistence (RDB/AOF) impacts performance; some commands are limited in cluster mode (e.g., cross-slot transactions).
- Memory Management (from Appendices Two/Five)
- Progressive Rehashing: Non-blocking scaling.
- Eviction Policies: LRU, LFU, Random, TTL, etc.
- Memory Allocator: jemalloc (default) / tcmalloc.
- Persistence Schemes (from Appendix Five)

| Scheme | RPO (Recovery Point Objective) | RTO (Recovery Time Objective) | Performance Impact | Use Cases |
| --- | --- | --- | --- | --- |
| RDB | Minutes | Seconds | Low | Disaster recovery |
| AOF | Seconds | Minutes | Medium | Financial transactions |
| Hybrid | Seconds | Seconds | Medium-High | Critical business |

- Cluster Mode Comparison (from Appendix Two)
- Sentinel Mode: CP system, automatic master-slave failover, suitable for small to medium scale.
- Cluster Mode: AP system (eventual consistency), decentralized sharding (16384 slots), supports horizontal scaling.
- Typical Issues (from Appendices Two/Five)
- Cache Penetration/Breakdown/Avalanche: See Glossary.
- Large Key (Big Key): Value too large (e.g., >10MB) causes network congestion, uneven distribution. Solution: Split.
- Hot Key: Single key access frequency too high causes single shard/CPU bottleneck. Solutions: Local cache, multi-level cache, replicate with random suffix.
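To make the cache-penetration mitigation concrete, here is a minimal cache-aside sketch that caches a short-lived null placeholder on misses (assumes a Jedis-style client; `loadUserFromDb` is a hypothetical DAO call):

```java
import redis.clients.jedis.Jedis;

public class UserCache {

    private static final String NULL_PLACEHOLDER = "__NULL__";

    private final Jedis jedis = new Jedis("localhost", 6379);

    public String getUser(String userId) {
        String key = "user:" + userId;
        String cached = jedis.get(key);
        if (cached != null) {
            // Cache hit, possibly a previously cached miss
            return NULL_PLACEHOLDER.equals(cached) ? null : cached;
        }
        String fromDb = loadUserFromDb(userId);            // cache miss: go to the database
        if (fromDb == null) {
            jedis.setex(key, 60, NULL_PLACEHOLDER);        // cache the miss briefly to absorb repeated lookups
        } else {
            jedis.setex(key, 3600, fromDb);                // normal entry with TTL
        }
        return fromDb;
    }

    private String loadUserFromDb(String userId) {
        return null; // placeholder for a real DAO/ORM call
    }
}
```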
- Historical Version Evolution (from Appendix Two)
- 3.0: Cluster Mode.
- 4.0: Hybrid Persistence, `PSYNC 2.0`.
- 5.0: Stream data type.
- 6.0: Multi-threaded IO (improves network processing capability), ACL.
- 7.0: Function API (Server-side scripting).
4.2 Elasticsearch (ES)
- Q1: What does it do?
- Distributed, RESTful search and analytics engine based on the Lucene library.
- Q2: Use Cases?
- Full-text search (website search, product search), log analysis (core of the ELK stack), business monitoring, security intelligence analysis.
- Q3: Problems Solved?
- Replaces inefficient LIKE queries in relational databases, provides powerful full-text search and aggregation capabilities.
- Easier horizontal scaling and management compared to Solr.
- Q4: Pros & Cons
- Pros: Powerful search and aggregation capabilities, Near Real-Time (NRT), good horizontal scalability, easy-to-use RESTful API.
- Cons: Slower writes compared to databases, higher resource consumption (memory/CPU), risk of split-brain (requires proper configuration), not suitable for frequent update scenarios.
- Core Concepts
- Index: Similar to a database.
- Type (Deprecated): Similar to a table (after 7.x, an Index has only one Type, `_doc`).
- Document: Similar to a row.
- Field: Similar to a column.
- Mapping: Defines field types, analyzers, etc.
- Shard: Index shard, enables horizontal scaling.
- Replica: Shard replica, ensures high availability.
- Performance Optimization (from Appendix Five)
  - Index Design: Set an appropriate number of shards (empirical value: `nodes * cpu_cores * 1.5`), optimize the Mapping (avoid dynamic mapping with `dynamic: strict`, choose wisely between `keyword` and `text`).
  - Query Optimization: Use the `filter` context (leverages caching), avoid deep pagination (`from` + `size` vs `search_after`), use `_source` wisely to filter returned fields, use `routing`.
  - Write Optimization: Bulk writes (`bulk` API), adjust `refresh_interval`.
  - Cluster Optimization: Hot/cold data separation (hot on SSD, warm on HDD), JVM tuning, operating system tuning.
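To make the query-optimization bullets concrete, below is a sketch of a search request body (for a hypothetical `orders` index, e.g. `GET /orders/_search`) that combines the `filter` context, `_source` filtering, a stable sort, and `search_after` for deep paging; field names and values are illustrative:

```json
{
  "size": 20,
  "query": {
    "bool": {
      "filter": [
        { "term":  { "status": "PAID" } },
        { "range": { "created_at": { "gte": "now-7d/d" } } }
      ]
    }
  },
  "_source": ["order_id", "user_id", "amount"],
  "sort": [
    { "created_at": "desc" },
    { "order_id": "asc" }
  ],
  "search_after": [1714368000000, "ORD-10086"]
}
```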
4.3 Neo4j
- Q1: What does it do?
- Leading open-source graph database, uses the property graph model (nodes, relationships, properties).
- Q2: Use Cases?
- Social network analysis (friend recommendations, common follows), fraud detection (relationship analysis), knowledge graphs, recommendation systems (relationship-based recommendations).
- Q3: Problems Solved?
- Compared to relational databases, extremely high performance for complex, multi-level relationship queries (e.g., "find friends of friends of a user"), avoids numerous JOIN operations.
- Q4: Pros & Cons
- Pros: Efficiently handles graph data and relationship queries, intuitive data model, expressive Cypher query language.
- Cons: Not suitable for storing large amounts of unstructured data or performing aggregate statistical analysis (compared to document/time-series databases), single-point write performance might be a bottleneck.
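A small Cypher sketch of the "friends of friends" case mentioned above, assuming a hypothetical `User`/`FRIEND` property-graph schema:

```cypher
// Recommend 2nd-degree contacts who are not already direct friends, ranked by mutual friends
MATCH (me:User {id: 42})-[:FRIEND]-(friend:User)-[:FRIEND]-(fof:User)
WHERE fof <> me AND NOT (me)-[:FRIEND]-(fof)
RETURN fof.name AS suggestion, count(DISTINCT friend) AS mutualFriends
ORDER BY mutualFriends DESC
LIMIT 10;
```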
4.4 JanusGraph
- Q1: What does it do?
- An open-source, distributed graph database designed to store and query graphs containing billions of vertices and edges. Characterized by its pluggable storage and index backends.
- Q2: Use Cases?
- Scenarios requiring processing of ultra-large-scale graph data (tens or hundreds of billions), such as large social networks, knowledge graphs.
- Q3: Problems Solved?
- Offers better horizontal scalability and flexibility than single-node graph databases (like Neo4j Community Edition).
- Q4: Pros & Cons
- Pros: Extremely strong scalability (depends on underlying storage like Cassandra/HBase), supports multiple storage and index backends (high flexibility), compatible with TinkerPop Gremlin query language.
- Cons: Higher deployment and operational complexity, query performance might not match specifically optimized graph databases.
- Pluggable Backends
- Storage Backend: Cassandra, HBase, Google Cloud Bigtable, BerkeleyDB.
- Index Backend: Elasticsearch, Solr, Lucene.
4.5 InfluxDB
- Q1: What does it do?
- An open-source time-series database (TSDB) specifically for handling time-stamped data, like monitoring metrics, IoT sensor data, real-time analytics data.
- Q2: Use Cases?
- System monitoring (storing metrics collected by Prometheus, etc.), Application Performance Monitoring (APM), IoT data storage and analysis, real-time data dashboards.
- Q3: Problems Solved?
- Optimized for time-series data compared to general-purpose databases, offering high write performance, high compression ratio, fast time-range queries.
- Q4: Pros & Cons
- Pros: High-performance time-series data read/write, high compression ratio, built-in Data Retention Policies and Continuous Queries, SQL-like query languages (InfluxQL/Flux).
- Cons: Not suitable for storing non-time-series data, weak JOIN capabilities, high cardinality issues can lead to performance degradation.
- Core Concepts (v1.x)
- Database: Database.
- Measurement: Similar to a table.
- Tags: Indexed fields (key-value), used for grouping and filtering.
- Fields: Data fields (key-value), non-indexed.
- Point: A single data record, containing timestamp, measurement, tags, fields.
- Series: Unique time series determined by the combination of measurement and tags.
4.6 ClickHouse
- Q1: What does it do?
- Yandex open-sourced high-performance columnar database management system (DBMS), primarily used for Online Analytical Processing (OLAP).
- Q2: Use Cases?
- Real-time analytical queries on massive data, BI reporting, user behavior analysis, log analysis.
- Q3: Problems Solved?
- Offers extremely fast OLAP query speeds (often orders of magnitude faster) compared to traditional row-based databases or Hadoop ecosystem (like Hive/Impala).
- Q4: Pros & Cons
- Pros: Extremely fast query speed (utilizes vectorized execution, columnar compression), high data compression ratio, supports SQL, good horizontal scalability.
- Cons: Not suitable for high-concurrency point lookups or transactional updates (OLTP), JOIN operations are relatively weaker, higher hardware resource requirements.
- Core Features
- Columnar Storage: Data stored by column, beneficial for compression and queries (only need to read relevant columns).
- Vectorized Execution: Processes data in batches (vectors) instead of row-by-row, reducing function call overhead.
- Data Compression: Supports various efficient compression algorithms like LZ4, ZSTD.
- MergeTree Engine: Primary table engine family, supports data sorting, partitioning, primary key index, TTL, etc.
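A short sketch of these features in SQL, using a hypothetical page-view events table on the MergeTree engine (column names, partitioning, and TTL are illustrative):

```sql
-- Monthly partitions, sort key for fast range scans, TTL to expire old rows
CREATE TABLE page_views
(
    event_date  Date,
    event_time  DateTime,
    user_id     UInt64,
    url         String,
    duration_ms UInt32
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(event_date)
ORDER BY (event_date, user_id)
TTL event_date + INTERVAL 90 DAY;

-- Typical OLAP aggregation: only the referenced columns are read from disk
SELECT url, count() AS pv, uniq(user_id) AS uv
FROM page_views
WHERE event_date >= today() - 7
GROUP BY url
ORDER BY pv DESC
LIMIT 10;
```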
4.7 RocksDB
- Q1: What does it do?
- Facebook open-sourced high-performance embedded Key-Value store library, based on LSM-Tree architecture.
- Q2: Use Cases?
- Serves as the underlying storage engine for many distributed systems (like TiKV, Flink state backend, Kafka Streams, CockroachDB).
- Applications requiring high-performance local persistent storage.
- Q3: Problems Solved?
- Offers higher write performance and better space utilization on SSDs compared to traditional B-Tree databases (optimized for HDD).
- Q4: Pros & Cons
- Pros: Extremely high write performance, high compression ratio, optimized for flash storage, flexible configuration.
- Cons: Read performance (especially range queries) might not match B+Tree, background Compaction can impact performance stability, complex tuning.
- Core Architecture: LSM-Tree (Log-Structured Merge Tree)
- Writes go to the in-memory MemTable.
- When MemTable is full, it's flushed to disk as immutable SSTable (Sorted String Table) files (Level 0).
- Background Compaction tasks merge SSTables from different levels, eliminating redundant data and optimizing reads.
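A minimal embedded-usage sketch of the RocksDB Java binding (assumes the `rocksdbjni` artifact on the classpath; the path and keys are placeholders):

```java
import org.rocksdb.Options;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;

public class RocksDbExample {
    public static void main(String[] args) throws RocksDBException {
        RocksDB.loadLibrary();
        try (Options options = new Options().setCreateIfMissing(true);
             RocksDB db = RocksDB.open(options, "/tmp/rocksdb-demo")) {
            // Writes land in the MemTable (plus WAL) and are later flushed to SSTables
            db.put("user:1".getBytes(), "alice".getBytes());
            byte[] value = db.get("user:1".getBytes());
            System.out.println(new String(value));
        }
    }
}
```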
5. Distributed Storage Systems
5.1 HDFS (Hadoop Distributed File System)
- Q1: What does it do?
- One of the core components of the Hadoop project, a distributed file system designed to run on commodity hardware, highly fault-tolerant, suitable for storing very large datasets.
- Q2: Use Cases?
- Large-scale data storage (TB/PB level), especially as underlying storage for batch processing frameworks like MapReduce, Spark, Hive.
- Storage foundation for data warehouses, data lakes.
- Q3: Problems Solved?
- Addresses the bottlenecks of single-machine storage capacity and performance, provides reliable, scalable large-scale data storage capabilities.
- Q4: Pros & Cons
- Pros: High throughput (suitable for batch processing), high fault tolerance (multiple replicas), good scalability, relatively low cost (uses commodity hardware).
- Cons: High latency (not suitable for low-latency random read/write), not suitable for storing numerous small files (high metadata pressure), write-once-read-many model (does not support file modification).
- Core Architecture
- NameNode: Master node, manages the file system namespace (metadata), maintains file-to-block mapping, manages DataNodes. Is the single point of failure in HDFS (requires HA solution).
- DataNode: Worker node, responsible for storing actual data blocks (Block, default 128MB/256MB), executes NameNode's read/write instructions, sends heartbeats and block reports periodically to NameNode.
- Secondary NameNode: Assists NameNode with metadata checkpoint operations, reduces NameNode startup time, but is not a hot standby.
- Read/Write Flow (Simplified)
- Write: Client requests NameNode to create file -> NameNode returns list of writable DataNodes -> Client writes file blocks to the first DataNode -> First DataNode replicates block to the second, second to the third (Pipeline replication) -> DataNodes return confirmation -> Client notifies NameNode file write complete.
- Read: Client requests NameNode for file block locations -> NameNode returns DataNode list -> Client chooses the nearest DataNode to read the block.
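A minimal sketch of the client-side view of this flow using the Hadoop `FileSystem` API (paths are placeholders; cluster addresses are assumed to come from `core-site.xml`):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();      // picks up core-site.xml / hdfs-site.xml
        try (FileSystem fs = FileSystem.get(conf)) {
            Path file = new Path("/data/demo/hello.txt");

            // Write: the client streams blocks through the DataNode pipeline described above
            try (FSDataOutputStream out = fs.create(file, true)) {
                out.writeUTF("hello hdfs");
            }

            // Read: the client asks the NameNode for block locations, then reads from a nearby DataNode
            try (FSDataInputStream in = fs.open(file)) {
                System.out.println(in.readUTF());
            }
        }
    }
}
```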
6. Message Queues
6.1 Kafka In-Depth Analysis
- Q1: What does it do?
- Distributed, high-throughput, persistent publish-subscribe messaging system (or event streaming platform).
- Q2: Use Cases?
- Log collection, user activity tracking, metrics monitoring, stream processing data source, event-driven architecture, microservice decoupling.
- Q3: Problems Solved?
- Replaces performance bottlenecks of traditional message queues (like RabbitMQ) in high-throughput scenarios like logging.
- Provides reliable, horizontally scalable message storage and transport.
- Q4: Pros & Cons
- Pros: Extremely high throughput (millions QPS single node possible), persistent storage (allows re-consumption), high availability (replication mechanism), good horizontal scalability, rich ecosystem (Connect/Streams).
- Cons: Does not support complex message routing (compared to RabbitMQ Exchanges), relatively higher latency (ms level), depends on ZooKeeper (removable after 3.x).
- Core Concepts
- Broker: Kafka server node.
- Topic: Message category, logical concept.
- Partition: Physical partition of a Topic, enables concurrency and scalability, messages within each partition are ordered.
- Producer: Message producer.
- Consumer: Message consumer.
- Consumer Group: Multiple Consumers form a group to consume a Topic together; each Partition is consumed by only one Consumer within the group.
- Offset: Records the Consumer's consumption position in each Partition.
- Storage Design (from Appendices Two/Five)
  - Data within a Partition is stored as Segment files; each Segment contains a `.log` (data) file and an `.index` (index) file.
  - Sequential disk writes, utilizes the Page Cache.
  - Zero-copy: Uses the `sendfile` system call, so data is sent directly from the Page Cache to the network card, avoiding copies between kernel and user space.
- Reliability Mechanisms
- Replication: Each Partition can have multiple replicas, distributed across different Brokers.
- Leader/Follower: Each Partition has only one Leader responsible for reads/writes; Followers sync data from the Leader.
- ISR (In-Sync Replicas): Set of replicas that are in sync with the Leader. The Producer can configure the `acks` parameter to choose the write reliability level (`acks=0`, `1`, `all`).
- Producer Tuning (from Appendix Five)
```java
Properties props = new Properties();
props.put("bootstrap.servers", "kafka1:9092,...");
props.put("acks", "1");                 // or "all" for highest reliability
props.put("retries", 3);                // number of retries on failure
props.put("batch.size", 16384);         // batch size (bytes)
props.put("linger.ms", 5);              // delay time (ms) to accumulate a batch
props.put("compression.type", "lz4");   // compression type (none, gzip, snappy, lz4, zstd)
```
- Consumer Group Management (from Appendix Five)
- Rebalance: Triggered when the number of Consumers in a Group or the number of Topic Partitions changes, reassigns Partitions to Consumers.
- Rebalance Strategy: Range, RoundRobin, Sticky. Sticky strategy tries to maintain previous assignments, reducing state migration.
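As a consumer-side counterpart to the producer tuning above, here is a minimal consumer-group loop with manual offset commits (topic name and group id are placeholders):

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class OrderConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka1:9092");
        props.put("group.id", "order-service");       // consumers sharing this id split the partitions
        props.put("enable.auto.commit", "false");     // commit offsets manually after processing
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("orders"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
                consumer.commitSync();                // at-least-once: commit after processing
            }
        }
    }
}
```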
6.2 RabbitMQ
- Q1: What does it do?
- Open-source message broker software implementing AMQP (Advanced Message Queuing Protocol).
- Q2: Use Cases?
- Business notifications, task queues (asynchronous processing), scenarios requiring complex routing and message acknowledgment mechanisms.
- Q3: Problems Solved?
- Provides reliable message delivery, flexible routing mechanisms, application decoupling.
- Q4: Pros & Cons
- Pros: Feature-rich (various Exchange types, TTL, Dead Letter Queue, Priority Queue), user-friendly management interface, supports multiple protocols (AMQP, MQTT, STOMP).
- Cons: Lower throughput compared to Kafka, performance limited by memory (during persistence), Erlang stack might have higher maintenance costs.
- Core Concepts
- Broker: RabbitMQ server.
- Virtual Host: Virtual host, isolates environments.
- Connection: TCP connection.
- Channel: Lightweight virtual channel multiplexed over a single Connection.
- Exchange: Exchange, receives Producer messages and routes them to Queues based on rules. Types: Direct, Fanout, Topic, Headers.
- Queue: Queue, stores messages.
- Binding: Rule connecting an Exchange and a Queue.
6.3 Pulsar
- Q1: What does it do?
- Apache Top-Level Project, next-generation cloud-native distributed messaging and streaming platform.
- Q2: Use Cases?
- Scenarios requiring high throughput, low latency, strong consistency, multi-tenancy, geo-replication for messaging and streaming.
- Scenarios unifying message queuing and stream storage.
- Q3: Problems Solved?
- Addresses Kafka's ZooKeeper dependency, partition count limitations, operational complexity, etc.
- Offers a more flexible architecture and richer features.
- Q4: Pros & Cons
- Pros: Compute-storage separation architecture (Broker + BookKeeper), Segmented Stream storage (infinite storage), multi-tenancy support, geo-replication, supports multiple consumption modes (Exclusive/Shared/Failover), supports multiple protocols (Pulsar/Kafka/MQTT/AMQP).
- Cons: Community and ecosystem not as mature as Kafka's, architecture is relatively more complex.
- Core Architecture (from Appendix Two)
- Broker: Stateless compute layer, handles message routing, protocol processing.
- BookKeeper: Stateful storage layer, handles message persistence (based on Log segment storage).
- ZooKeeper: Stores metadata (may be removed in the future).
6.4 NATS
- Q1: What does it do?
- An open-source, high-performance, lightweight cloud-native messaging system. Designed for simplicity, speed, and security.
- Q2: Use Cases?
- Microservice communication, IoT device messaging, command and control systems, scenarios requiring extremely low latency.
- Q3: Problems Solved?
- Provides simpler and faster messaging mechanism than traditional MQs.
- Q4: Pros & Cons
- Pros: Extremely high performance (millions QPS, sub-millisecond latency), simple deployment, low resource consumption, supports multiple messaging patterns (Request/Reply, Pub/Sub, Queue).
- Cons: Core NATS does not provide persistence (requires NATS Streaming/JetStream), fewer features compared to Kafka/Pulsar.
- NATS JetStream: NATS' built-in persistent streaming storage layer, offering Kafka-like features.
7. Computation Frameworks
7.1 Spark In-Depth Analysis
- Q1: What does it do?
- Fast, general-purpose, scalable big data processing engine, supporting batch processing, interactive queries (SQL), stream processing, machine learning, and graph computation.
- Q2: Use Cases?
- Replaces Hadoop MapReduce for more efficient batch processing, ETL, data warehouse queries (Spark SQL), machine learning (MLlib), graph analysis (GraphX), real-time stream processing (Structured Streaming).
- Q3: Problems Solved?
- Greatly improved MapReduce's computation performance (especially for iterative computations), reduced disk IO through in-memory computation.
- Provides a unified API stack, simplifying big data processing.
- Q4: Pros & Cons
- Pros: High performance (in-memory computation), easy-to-use API (supports Scala/Java/Python/R), rich ecosystem, supports multiple deployment modes (Standalone/YARN/Mesos/K8s).
- Cons: Stream processing capability weaker than Flink (micro-batch model Structured Streaming), higher memory consumption, tuning is relatively complex.
- Core Concepts
- RDD (Resilient Distributed Dataset): Core abstraction, elastic distributed dataset.
- DataFrame/Dataset: Higher-level abstraction based on RDD, with Schema information, provides SQL-like operations.
- DAG (Directed Acyclic Graph): Directed Acyclic Graph for task execution.
- Driver: Process running the `main` function, responsible for creating the SparkContext, building the DAG, and scheduling Tasks.
- Executor: Process running on worker nodes, responsible for executing Tasks and storing data.
- Memory Management (from Appendices Two/Five)
- Unified Memory Management Model (Spark 1.6+): On-heap memory divided into Execution Memory (Shuffle/Join/Sort/Aggregation) and Storage Memory (RDD cache/broadcast variables), can borrow from each other.
- Off-Heap Memory: Used to reduce GC overhead.
- Optimization Cases (from Appendix Two)
- Broadcast Variables: Broadcast small tables or variables to all Executors, avoiding redundant fetching by each Task.
- Accumulators: Distributed counter or sum, readable only by the Driver.
- Data Skew Handling: Salting, Key aggregation, using Skew Join optimization.
- Serialization: Use Kryo serialization library (faster and more compact than Java serialization).
- `persist` / `cache`: Appropriately cache frequently reused RDDs/DataFrames.
- Shuffle Optimization: Adjust `spark.sql.shuffle.partitions`, use an appropriate Shuffle Manager.
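A short sketch combining two of the optimizations above, Kryo serialization and a broadcast (map-side) join of a small dimension table; table paths and column names are hypothetical:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.broadcast;

public class BroadcastJoinExample {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("broadcast-join-demo")
                .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
                .config("spark.sql.shuffle.partitions", "200")
                .getOrCreate();

        Dataset<Row> orders = spark.read().parquet("/warehouse/orders");         // large fact table
        Dataset<Row> products = spark.read().parquet("/warehouse/dim_product");  // small dimension table

        // broadcast() ships the small table to every Executor, avoiding a shuffle of the large one
        Dataset<Row> joined = orders.join(broadcast(products), "product_id");
        joined.cache();                                                          // reuse without recomputation
        joined.groupBy("category").count().show();

        spark.stop();
    }
}
```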
7.2 Flink In-Depth Analysis
- Q1: What does it do?
- Open-source stream processing framework and distributed computation engine, known for its low latency, high throughput, exactly-once semantics, and strong support for event time processing. Also supports batch processing (unified stream/batch).
- Q2: Use Cases?
- Real-time data analysis, real-time ETL, real-time risk control, real-time recommendation, Complex Event Processing (CEP), Internet of Things (IoT) data processing.
- Q3: Problems Solved?
- Provides true event-driven stream processing (compared to Spark's micro-batching), lower latency.
- Powerful state management and Checkpoint mechanism ensure exactly-once processing semantics.
- Excellent event time processing capabilities, accurately handles out-of-order data.
- Q4: Pros & Cons
- Pros: Low latency, high throughput, exactly-once semantics, powerful state management, excellent event time processing, unified stream/batch API (DataStream/Table API/SQL).
- Cons: Batch processing performance might be slightly weaker than Spark, community and ecosystem slightly smaller than Spark's, API learning curve might be slightly steeper.
- Core Concepts
- DataStream API: Core stream processing API.
- Table API / SQL: Higher-level declarative API, achieving stream/batch unity.
- Event Time / Processing Time / Ingestion Time: Time Semantics.
- State: Intermediate results stored by operators, supports multiple state types (ValueState, ListState, MapState, etc.).
- State Backend: Where state is stored (Memory, FsStateBackend (based on file system), RocksDBStateBackend).
- Checkpoint: Persistent state snapshot for fault recovery, achieving exactly-once/at-least-once.
- Savepoint: Manually triggered Checkpoint for task upgrades, migrations.
- Watermark: Marker for event time progress, used to trigger window computations.
- Window: Divides an unbounded stream into bounded chunks (Tumbling, Sliding, Session, Global).
- State Backend Selection (from Appendices Two/Five)

| Type | Characteristics | Use Cases |
| --- | --- | --- |
| MemoryStateBackend | Fast, but state size is limited and prone to loss | Local testing |
| FsStateBackend | Based on persistent storage like HDFS | Large state, HA |
| RocksDBStateBackend | Based on local RocksDB, supports incremental Checkpoints | Very large state (recommended) |

- Time Semantics and Watermark (from Appendices Two/Five)
- Event time is key to ensuring result accuracy.
- Watermark informs Flink about event time progress, handles out-of-order and late data.
- Watermark Generation Strategy: Periodic vs Punctuated (based on specific events).
- Unified Stream and Batch (from Appendix Ten)
- Flink treats batch processing as a special case of stream processing (bounded stream).
- Using Table API / SQL allows processing stream and batch sources with the same code.
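To tie the event-time, watermark, and window concepts together, here is a minimal DataStream sketch that sums order amounts per user in one-minute event-time tumbling windows, tolerating five seconds of out-of-orderness (the `OrderEvent` type and sample data are hypothetical):

```java
import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class OrderAmountPerMinute {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<OrderEvent> orders = env.fromElements(
                new OrderEvent("u1", 10.0, 1_000L),
                new OrderEvent("u1", 25.0, 61_000L),
                new OrderEvent("u2", 5.0, 3_000L));

        orders
            // Watermarks lag the maximum seen event time by 5s, tolerating that much out-of-orderness
            .assignTimestampsAndWatermarks(
                WatermarkStrategy.<OrderEvent>forBoundedOutOfOrderness(Duration.ofSeconds(5))
                    .withTimestampAssigner((event, ts) -> event.timestamp))
            .keyBy(event -> event.userId)
            .window(TumblingEventTimeWindows.of(Time.minutes(1)))
            .sum("amount")
            .print();

        env.execute("order-amount-per-minute");
    }

    // Simple POJO so Flink can access the "amount" field by name
    public static class OrderEvent {
        public String userId;
        public double amount;
        public long timestamp;
        public OrderEvent() {}
        public OrderEvent(String userId, double amount, long timestamp) {
            this.userId = userId; this.amount = amount; this.timestamp = timestamp;
        }
    }
}
```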
7.3 Hadoop MapReduce
- Q1: What does it do?
- The earliest, highly influential distributed computation framework, used for processing and generating large datasets.
- Q2: Use Cases?
- Large-scale offline batch data processing (e.g., log analysis, data mining, index building).
- Q3: Problems Solved?
- First to implement reliable, fault-tolerant large-scale parallel computation on commodity hardware clusters.
- Q4: Pros & Cons
- Pros: High reliability, good fault tolerance, capable of handling PB-scale data.
- Cons: Programming model is relatively complex, lower performance (heavy disk IO), high latency, not suitable for iterative computation and real-time processing.
- Core Flow (from Appendix Ten)
- Input: Read input split from HDFS.
- Map: User-written `map` function processes the input and outputs `<Key, Value>` pairs.
- Combine (Optional): Local Reduce on the Mapper side, reduces Shuffle data volume.
- Shuffle & Sort: Sort Map output by Key and transfer it to the Reducer nodes.
- Reduce: User-written `reduce` function processes the list of Values for the same Key and outputs the final result.
- Output: Write results to HDFS.
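The classic word-count example sketches the Map and Reduce steps of this flow (job/driver configuration omitted for brevity):

```java
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCount {

    // Map: emit <word, 1> for every token in the input split
    public static class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reduce: after shuffle & sort, sum all counts for the same word
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }
}
```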
8. Cloud-Native Infrastructure
8.1 Kubernetes (K8s) In-Depth Analysis
- Q1: What does it do?
- Open-source container orchestration system for automating deployment, scaling, and management of containerized applications. Has become the de facto standard for cloud-native applications.
- Q2: Use Cases?
- Microservice deployment and management, CI/CD, elastic scaling, service discovery, load balancing, state management (StatefulSet).
- Q3: Problems Solved?
- Solves the complexity of manually managing numerous containers, provides declarative API and automated operations capabilities.
- Abstracts underlying infrastructure differences, enabling cross-cloud, cross-environment application deployment.
- Q4: Pros & Cons
- Pros: Powerful features, rich ecosystem, active community, highly scalable, declarative configuration.
- Cons: Complex architecture, steep learning curve, high operational barrier, diverse and complex network and storage options.
- Core Principles and Features
- Declarative API: User describes the desired application state via YAML or JSON files (e.g., required replicas, image used, ports exposed); K8s continuously works to achieve and maintain this state.
- Master-Node Architecture:
  - Master Node (Control Plane): Responsible for managing the entire cluster. Core components include:
    - `kube-apiserver`: Provides the entry point for the K8s API, handles REST requests.
    - `etcd`: Highly available key-value store, saves all cluster configuration and state data.
    - `kube-scheduler`: Responsible for scheduling newly created Pods onto suitable Node nodes.
    - `kube-controller-manager`: Runs various controllers (e.g., replica controller, node controller), responsible for maintaining cluster state.
    - `cloud-controller-manager` (Optional): Interacts with the cloud provider API (e.g., creating load balancers, storage volumes).
  - Node Node (Worker Node): Responsible for running containerized applications. Core components include:
    - `kubelet`: Communicates with the Master, manages Pod and container lifecycle on this node.
    - `kube-proxy`: Maintains network rules, implements K8s Service network proxying and load balancing.
    - Container Runtime: Responsible for actually running containers, e.g., Docker, containerd, CRI-O.
- Core Objects:
- Pod: Smallest deployable unit in K8s, contains one or more closely related containers, sharing network and storage volumes.
- Service: Provides a stable IP address and DNS name for a set of Pods, and offers load balancing. Solves the problem of non-fixed Pod IPs.
- Deployment: Defines the desired state of Pods (e.g., replica count, image version), manages Pod creation, updates, and rolling upgrades.
- ReplicaSet: Ensures a specified number of Pod replicas are running (usually managed by Deployment).
- StatefulSet: Used for managing stateful applications (like databases), ensuring ordered, unique Pod deployment and stable network identifiers.
- DaemonSet: Ensures all (or some) Node nodes run a specific Pod replica (e.g., log collection Agent, monitoring Exporter).
- Namespace: Divides cluster resources into logically isolated groups.
- ConfigMap / Secret: Used to store configuration and sensitive information, decoupled from Pods.
- Volume: Provides persistent storage for Pods.
- Ingress: Manages external access to the cluster (HTTP/HTTPS routing).
- Autoscaling:
- HPA (Horizontal Pod Autoscaler): Automatically adjusts Pod count managed by controllers like Deployment based on CPU/memory usage or custom metrics. (Courses 3, 10)
- VPA (Vertical Pod Autoscaler): Automatically adjusts Pod resource requests and limits.
- CA (Cluster Autoscaler): Automatically adjusts the number of Node nodes in the cluster based on Pod scheduling needs (depends on cloud provider support).
- Service Discovery and Load Balancing: Service objects provide basic internal service discovery and load balancing. Ingress objects are used to expose HTTP/HTTPS services externally.
- Self-healing Capability: K8s continuously monitors node and Pod status, automatically restarts failed containers, replaces or reschedules failed Pods.
- Common Use Cases (with Course Examples)
- Microservice Deployment & Management: K8s is ideal for deploying, scaling, and managing microservices. (Later stages of numerous examples like Courses 1, 2, 3, 5, 6, 7, 9, 11)
- Stateless Application Deployment: Deploy stateless apps like web servers, API gateways, using Deployment and HPA for elastic scaling. (Course 3)
- Stateful Application Deployment (Use with caution): Deploy stateful apps like databases, message queues using StatefulSet and Persistent Volumes, but requires more complex operational management.
- Batch Processing Jobs: Run one-off or scheduled batch jobs using Job and CronJob objects.
- Big Data & AI Platforms: Run computation frameworks like Spark, Flink, TensorFlow, leveraging K8s resource scheduling and elasticity. (Courses 10, 12)
- CI/CD: Target platform for continuous integration and deployment pipelines.
- Serverless Platforms: Infrastructure for Serverless frameworks like Knative, OpenFaaS. (Course 11)
- Service Mesh: Istio, Linkerd, etc., are often deployed on top of K8s clusters. (Course 11)
- Kubernetes vs Other Orchestration Tools
- Docker Swarm: Docker's official orchestrator, simpler, but less feature-rich and smaller ecosystem than K8s.
- Apache Mesos: Lower-level cluster resource manager, can run K8s, Hadoop, Spark, etc., but complex to configure and use itself.
- Nomad (HashiCorp): Another flexible cluster scheduler, can orchestrate containerized and non-containerized applications.
- Key Concepts & Best Practices
- Resource Requests & Limits: Set CPU and memory request values (for scheduling) and limit values (enforced constraints) for containers in a Pod, ensuring reasonable and stable resource allocation.
- Health Checks (Probes):
- Liveness Probe: Detects if a container is alive; restarts if unhealthy.
- Readiness Probe: Detects if a container is ready to receive traffic; removes from Service Endpoints if unhealthy.
- Startup Probe: Detects if a container has finished starting (for slow-starting apps).
- Labels & Selectors: Group K8s objects via labels, filter objects via selectors; core mechanism for resource organization and association in K8s.
- Annotations: Used to attach arbitrary non-identifying metadata to objects.
- Network Policies: Control network access permissions between Pods, enabling network isolation.
- RBAC (Role-Based Access Control): Manages access permissions for users and service accounts to the K8s API.
- Helm: Package manager for K8s, simplifies application definition, installation, and upgrades.
- Operator Pattern: Encapsulates operational knowledge for a specific application into a K8s controller, enabling automated operations.
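A minimal Deployment sketch that applies several of these practices, resource requests/limits, readiness/liveness probes, and labels (image name, paths, and numbers are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-api
  labels:
    app: demo-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: demo-api
  template:
    metadata:
      labels:
        app: demo-api
    spec:
      containers:
        - name: demo-api
          image: registry.example.com/demo-api:1.0.0
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "250m"        # used by the scheduler
              memory: "256Mi"
            limits:
              cpu: "500m"        # enforced ceiling
              memory: "512Mi"
          readinessProbe:        # removed from Service endpoints when failing
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:         # container restarted when failing
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 20
```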
Kubernetes greatly simplifies the deployment and management of distributed applications, a key technology for building cloud-native applications and modern infrastructure.
8.2 Istio Service Mesh
- Q1: What does it do?
- Open-source service mesh platform that layers transparently onto existing distributed applications. Provides a uniform way to connect, secure, control, and observe services.
- Q2: Use Cases?
- Traffic management for microservice architectures (canary releases, A/B testing, timeout retries), inter-service security (mTLS encryption, authorization policies), policy enforcement (rate limiting, access control), observability (metrics, tracing, logs).
- Q3: Problems Solved?
- Decouples service governance capabilities from application code, moving them down to the infrastructure layer (Sidecar), allowing developers to focus on business logic.
- Provides a cross-language, unified service governance solution.
- Q4: Pros & Cons
- Pros: Powerful and comprehensive features, non-intrusive adoption, supports multiple protocols, integrates well with K8s.
- Cons: Complex architecture, introducing Sidecar adds performance overhead (latency, resource consumption), high operational barrier.
- Core Architecture (Simplified after v1.5+)
- Data Plane: Composed of a set of intelligent proxies (Envoy) deployed as Sidecars, mediates and controls all network communication between microservices.
- Control Plane: `istiod`, which manages and configures the proxies to route traffic and enforce policies. Consolidates Pilot (config push), Citadel (security certs), and Galley (config validation).
- Core Feature Implementation (from Appendix Eleven)
- Traffic Management: VirtualService (routing rules), DestinationRule (destination policies, e.g., load balancing, circuit breaking).
- Security: PeerAuthentication (mTLS), RequestAuthentication (JWT), AuthorizationPolicy (access control).
- Observability: Automatically generates Metrics (Prometheus), Traces (Jaeger/Zipkin), Access Logs.
- Envoy Proxy: Core of Istio data plane, high-performance C++ network proxy. Supports dynamic configuration (xDS API), hot restarts, rich L4/L7 protocol support, advanced load balancing, strong observability.
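A small sketch of weighted canary routing with a `DestinationRule` and `VirtualService`, assuming a hypothetical `reviews` service with `version: v1`/`v2` Pod labels:

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  subsets:
    - name: v1
      labels:
        version: v1
    - name: v2
      labels:
        version: v2
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
    - reviews
  http:
    - route:
        - destination:
            host: reviews
            subset: v1
          weight: 90          # stable version keeps 90% of traffic
        - destination:
            host: reviews
            subset: v2
          weight: 10          # canary gets 10%
```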
8.3 etcd
- Q1: What does it do?
- A strongly consistent, highly available distributed Key-Value store.
- Q2: Use Cases?
- Used as the core data store for Kubernetes (storing cluster state and configuration).
- Metadata storage, service discovery, distributed locking for distributed systems.
- Q3: Problems Solved?
- Provides a reliable, consistent way to store and access critical configuration information for distributed systems.
- Q4: Pros & Cons
- Pros: Strong consistency (Raft-based), high availability, simple API (gRPC/HTTP).
- Cons: Lower performance compared to in-memory KVs like Redis, limited storage capacity, sensitive to disk IO.
8.4 ZooKeeper
- Q1: What does it do?
- An open-source distributed coordination service, providing consistency services for distributed applications.
- Q2: Use Cases?
- Distributed locking, configuration management, cluster membership management, Master election, service discovery (Dubbo, Kafka early versions).
- Q3: Problems Solved?
- Solves common coordination problems in distributed systems, simplifying distributed application development.
- Q4: Pros & Cons
- Pros: Mature, stable, feature-rich (Watch mechanism), sequential consistency.
- Cons: Relatively complex architecture, performance can be a bottleneck, operational challenges, Java stack.
- Core Concepts: Znode (data node, similar to a file system path), Watch (one-time triggered event notification).
- Consistency Protocol: ZAB (ZooKeeper Atomic Broadcast).
9. Distributed Consistency Protocols: Paxos & Raft
Distributed consistency protocols are the foundation for building reliable distributed systems, used to ensure agreement (Consensus) on a value or state among a group of potentially failing nodes. Paxos and Raft are two of the most famous and widely used algorithms.
9.1 Consistency Protocol Overview
- Core Problem: Consensus
In distributed systems, multiple nodes need to coordinate. When a decision needs to be agreed upon (e.g., which node is Leader, what a value should be), consensus algorithms are needed. Challenges include node crashes, message loss or delay, network partitions.
Consensus algorithms must guarantee:
- Agreement/Consistency: All non-faulty nodes eventually agree on the same value.
- Termination/Liveness: All non-faulty nodes eventually decide on a value (within finite time).
- Validity/Integrity: If all nodes proposed the same value v, then the agreed-upon value must be v.
9.2 Paxos Protocol
- Proposed by: Leslie Lamport (Turing Award winner).
- Core Idea: Uses a "two-phase commit"-like protocol for Proposers to propose a value and Acceptors to accept it. Ingenious protocol design, tolerates minority node failures.
- Roles:
- Proposer: Proposes a Proposal, containing a proposal number (usually increasing) and a proposed value.
- Acceptor: Accepts proposals. Records the highest proposal number promised and the value accepted (if any).
- Learner: Learns the finally chosen value.
- Basic Flow (Basic Paxos):
    - Prepare Phase: The Proposer chooses a proposal number N and sends a Prepare(N) request to a quorum (more than half) of Acceptors.
    - Promise Phase: When an Acceptor receives Prepare(N):
        - If N is greater than the number of any Prepare request it has already responded to, it promises not to accept proposals numbered less than N and replies with the highest-numbered proposal it has accepted so far (value included), if any.
        - Otherwise, it ignores or rejects the request.
    - Accept Phase: If the Proposer receives Promise responses from a quorum:
        - It chooses a value V: if any responses carried accepted values, it takes the value of the highest-numbered accepted proposal; otherwise it may use its own initially proposed value.
        - It sends an Accept(N, V) request to the Acceptors that responded with a Promise.
    - Accepted Phase: When an Acceptor receives Accept(N, V):
        - If N is not less than the number it has promised, it accepts the proposal (N, V) and notifies the Learners.
        - Otherwise, it ignores the request.
    - (A minimal acceptor-side sketch of these rules appears after this subsection.)
- Characteristics: Theoretically important, basis for many systems. But Basic Paxos is complex to understand and implement, potential for livelock. Multi-Paxos optimizes by electing a stable Leader.
- Application: Google Chubby (distributed lock service), a classic Paxos implementation.
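To make the Prepare/Promise and Accept/Accepted exchanges concrete, the following is a deliberately minimal, single-process Python sketch of the acceptor-side rules and the proposer's value-selection rule. It ignores networking, retries, persistence, and Learners, so it illustrates the invariants rather than being a usable implementation:

```python
class Acceptor:
    """Acceptor state for Basic Paxos (illustrative only: no persistence, no RPC)."""
    def __init__(self):
        self.promised_n = -1        # highest proposal number promised so far
        self.accepted_n = -1        # number of the proposal accepted, if any
        self.accepted_value = None  # value of that accepted proposal

    def on_prepare(self, n):
        # Phase 1b (Promise): promise only if n beats every number promised so far,
        # and report any previously accepted proposal back to the proposer.
        if n > self.promised_n:
            self.promised_n = n
            return ("promise", self.accepted_n, self.accepted_value)
        return ("reject", None, None)

    def on_accept(self, n, value):
        # Phase 2b (Accepted): accept unless a higher-numbered Prepare was promised meanwhile.
        if n >= self.promised_n:
            self.promised_n = self.accepted_n = n
            self.accepted_value = value
            return ("accepted", n, value)
        return ("reject", None, None)


def choose_value(promises, my_value):
    """Proposer rule: reuse the value of the highest-numbered accepted proposal
    reported by the quorum; only propose our own value if none was accepted."""
    accepted = [(n, v) for _, n, v in promises if v is not None]
    return max(accepted)[1] if accepted else my_value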
9.3 Raft Protocol
- Proposed by: Diego Ongaro and John Ousterhout (Stanford University).
- Design Goal: Understandability, while providing fault tolerance and performance comparable to Paxos.
- Core Idea: Decomposes consensus into three relatively independent subproblems:
- Leader Election: Elects a single Leader to manage log replication.
- Log Replication: Leader accepts client requests, appends them as log entries, replicates to Followers, ensuring eventual log consistency.
- Safety: Ensures no incorrect Leader is elected and state machine executes logs in consistent order.
- Roles:
- Leader: Handles all client requests, manages log replication.
- Follower: Passively receives log replication requests and heartbeats from Leader. Becomes Candidate and starts election on timeout.
- Candidate: Initiates election, competes to become new Leader.
- Flow Overview:
    - Leader Election: A Follower becomes a Candidate after an Election Timeout and sends RequestVote RPCs. A Candidate that receives votes from a quorum becomes the Leader. The Leader then sends periodic heartbeats (empty AppendEntries RPCs) to maintain its leadership.
    - Log Replication: The Leader appends each client request to its own log and sends AppendEntries RPCs to the Followers in parallel. Once an entry is replicated on a quorum, it is Committed; the Leader responds to the client and informs Followers of committed entries.
    - Safety Guarantees: Raft ensures safety through its election restriction (a Candidate's log must be at least as up-to-date as that of any voter that grants it a vote) and its commit rule (a Leader only directly commits entries from its current term).
    - (A minimal election sketch appears after the comparison table below.)
- Characteristics: Compared to Paxos, Raft has clearer states, easier flow to understand, more engineer-friendly implementation.
- Application: Core consensus mechanism in etcd, Consul, TiKV, CockroachDB, NATS Streaming, and many other modern distributed systems. Often used for metadata management in distributed databases (Course 9) and distributed file systems (Course 8).
- Paxos vs Raft Comparison

| Feature | Paxos (Multi-Paxos) | Raft |
| --- | --- | --- |
| Core Goal | Fault tolerance | Understandability + fault tolerance |
| Leader | Usually has a Leader; complex election | Strong Leader model; clear election |
| Decomposition | Not explicitly decomposed | Decomposed into Election, Replication, Safety |
| Difficulty to Understand | Higher | Lower |
| Difficulty to Implement | Higher | Lower |
| Industry Use | Chubby, some earlier systems | etcd, Consul, TiKV, many modern systems |
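As a rough illustration of how election timeouts, terms, and the log up-to-dateness restriction interact, here is a deliberately simplified, single-threaded Python sketch of Raft leader election. Real implementations use asynchronous RPCs, persist `currentTerm`/`votedFor` to disk, and handle AppendEntries; the node structure and timeout values here are arbitrary:

```python
import random
import time

ELECTION_TIMEOUT = (0.15, 0.30)  # seconds, randomized per node as in the Raft paper

class Node:
    def __init__(self, node_id, peers):
        self.id, self.peers = node_id, peers
        self.state = "follower"
        self.current_term = 0
        self.voted_for = None
        self.log = []  # list of (term, command); an empty log has last term 0
        self.reset_election_timer()

    def reset_election_timer(self):
        self.deadline = time.monotonic() + random.uniform(*ELECTION_TIMEOUT)

    def last_log_term(self):
        return self.log[-1][0] if self.log else 0

    def on_election_timeout(self):
        # No heartbeat heard: become Candidate, bump the term, vote for self,
        # and ask every peer for a vote.
        self.state, self.current_term, self.voted_for = "candidate", self.current_term + 1, self.id
        votes = 1 + sum(
            peer.handle_request_vote(self.current_term, self.id,
                                     len(self.log), self.last_log_term())
            for peer in self.peers
        )
        if votes > (len(self.peers) + 1) // 2:   # quorum of the full cluster
            self.state = "leader"                # would now send empty AppendEntries heartbeats

    def handle_request_vote(self, term, candidate_id, cand_log_len, cand_log_term):
        if term < self.current_term:
            return False
        if term > self.current_term:             # newer term seen: step down to follower
            self.current_term, self.voted_for, self.state = term, None, "follower"
        # Election restriction: the candidate's log must be at least as up-to-date as ours.
        up_to_date = (cand_log_term, cand_log_len) >= (self.last_log_term(), len(self.log))
        if self.voted_for in (None, candidate_id) and up_to_date:
            self.voted_for = candidate_id
            self.reset_election_timer()
            return True
        return False
```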
10. Monitoring and Observability
10.1 Prometheus In-Depth Analysis
- Q1: What does it do?
- CNCF graduated project, open-source systems monitoring and alerting toolkit.
- Q2: Use Cases?
- Infrastructure monitoring (servers, network devices), container & K8s monitoring, application performance metrics monitoring, alerting.
- Q3: Problems Solved?
- Provides a powerful multi-dimensional data model and flexible query language (PromQL), addressing shortcomings of traditional monitoring systems (like Nagios, Zabbix) in dynamic, large-scale environments.
- Q4: Pros & Cons
- Pros: Powerful data model and PromQL, Pull model is easy to deploy and manage, rich ecosystem (numerous Exporters), integrates perfectly with Grafana.
- Cons: Does not natively provide long-term storage or clustering (requires extensions like Thanos/Cortex/VictoriaMetrics), weak support for logs and traces, limited support for push scenarios (requires Pushgateway).
- Core Architecture
- Prometheus Server: Core service, responsible for pulling and storing time-series data, provides query interface.
- Exporters: Agent program exposing metrics (e.g., Node Exporter, MySQLd Exporter).
- Pushgateway: Supports short-lived jobs pushing metrics.
- Alertmanager: Handles alerting rules, sends notifications (deduplication, grouping, inhibition).
- Client Libraries: Embedded in application, directly exposes metrics.
- Data Model: Time series data uniquely identified by metric name and a set of key-value labels.
- Storage (TSDB) Design (from Appendices Two/Five)
- Divided into blocks by time.
- Data within blocks organized by Series, uses compression algorithms (e.g., Gorilla TSDB's XOR compression, Snappy).
- Quickly locate Series via index.
- High Availability and Scaling Solutions (from Appendix Two)
- Prometheus HA: Run two identically configured Prometheus instances.
- Federation: Hierarchical aggregation, global Prometheus pulls metrics from lower-level Prometheus instances.
- Remote Storage (Remote Read/Write): Write data to long-term storage systems (Thanos, Cortex, M3DB, InfluxDB).
- Thanos: Provides global query view, long-term storage (integrates with object storage), downsampling.
- VictoriaMetrics: High-performance, high-compression Prometheus-compatible storage.
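Building on the data model above, an application typically exposes metrics through a client library and lets Prometheus scrape them. A minimal sketch with the official `prometheus_client` Python library; the metric names, labels, and port are illustrative:

```python
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

# One metric name plus a label set per series; each label combination is its own time series.
REQUESTS = Counter("http_requests_total", "Total HTTP requests",
                   ["method", "path", "status"])
LATENCY = Histogram("http_request_duration_seconds", "Request latency in seconds",
                    ["path"])

# Example PromQL over these series:
#   error ratio: rate(http_requests_total{status="500"}[5m]) / rate(http_requests_total[5m])
#   p95 latency: histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))

start_http_server(8000)  # exposes /metrics; Prometheus pulls (scrapes) it on its own schedule

while True:
    with LATENCY.labels(path="/api/orders").time():
        time.sleep(random.uniform(0.01, 0.2))            # stand-in for real request handling
    REQUESTS.labels(method="GET", path="/api/orders",
                    status=random.choice(["200", "500"])).inc()
```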
10.2 Grafana
- Q1: What does it do?
- Open-source data visualization and monitoring platform, supports connecting to multiple data sources.
- Q2: Use Cases?
- Create rich, interactive monitoring dashboards.
- Unify display of data from different sources like Prometheus, InfluxDB, Elasticsearch, MySQL.
- Alerting (limited capability, usually paired with Alertmanager).
- Q3: Problems Solved?
- Offers more powerful and flexible data visualization capabilities than Kibana/Prometheus UI.
- Q4: Pros & Cons
- Pros: Supports many data sources, good visualization, flexible dashboard config, active community (many plugins).
- Cons: Does not handle data collection or storage itself.
10.3 ELK Stack (Elasticsearch, Logstash, Kibana)
- Q1: What does it do?
- A popular open-source solution for log collection, analysis, and visualization.
- Elasticsearch: Stores and searches logs.
- Logstash (or Filebeat): log collection, processing, and forwarding.
- Kibana: Log visualization and query interface.
- Q2: Use Cases?
- Centralized log management, log search and analysis, application and system troubleshooting.
- Q3: Problems Solved?
- Solves the problem of managing and querying logs scattered across servers.
- Q4: Pros & Cons
- Pros: Powerful features, mature ecosystem, good community support.
- Cons: Higher resource consumption (especially ES), relatively complex deployment and maintenance, Logstash performance can be a bottleneck (can be replaced by Filebeat+Kafka).
- Architecture Evolution (from Appendix Four)
- Basic: Filebeat (Collect) -> Elasticsearch -> Kibana.
- Adding Logstash: Filebeat -> Logstash (Filter/Transform) -> Elasticsearch -> Kibana.
- Adding Kafka: Filebeat -> Kafka (Buffer/Decouple) -> Logstash -> Elasticsearch -> Kibana.
10.4 Filebeat
- Q1: What does it do?
- Part of the Elastic Stack (ELK), a lightweight log data shipper used for forwarding and centralizing log data.
- Q2: Use Cases?
- Collects log files (or other event data) in real-time from servers, VMs, containers, and sends them to Elasticsearch, Logstash, or Kafka.
- Q3: Problems Solved?
- Replaces heavyweight Logstash Agent, lower resource consumption, focuses on log collection and forwarding.
- Q4: Pros & Cons
- Pros: Lightweight, low resource usage, simple config, supports backpressure, integrates well with ELK ecosystem.
- Cons: Less data processing and transformation capability than Logstash.
- Core Features: Inputs, Processors, Outputs, Modules.
10.5 Alertmanager
- Q1: What does it do?
- Official Prometheus component for handling alerts.
- Q2: Use Cases?
- Receives alert information from Prometheus (or other clients), performs deduplication, grouping, silencing, inhibition, and routes alerts to the correct notification receivers (e.g., Email, PagerDuty, Slack, Webhook).
- Q3: Problems Solved?
- Separates alert rule definition (in Prometheus) from alert notification management (in Alertmanager), providing more flexible and powerful alert handling.
- Q4: Pros & Cons
- Pros: Powerful alert grouping, silencing, inhibition capabilities, flexible routing config, supports high availability.
- Cons: Does not generate alerts itself, relies on clients like Prometheus.
- Core Functions: Grouping, Inhibition, Silences, Routing.
10.6 OpenTelemetry Collector
- Q1: What does it do?
- Part of the OpenTelemetry project, a standalone service for receiving, processing, and exporting telemetry data (Traces, Metrics, Logs).
- Q2: Use Cases?
- Acts as a unified agent and processing pipeline for telemetry data, decoupling applications from backend monitoring systems.
- Data format conversion, data sampling, attribute addition, batch exporting.
- Q3: Problems Solved?
- Avoids direct integration of applications with multiple monitoring backends, simplifying configuration and management.
- Provides a standardized data processing pipeline.
- Q4: Pros & Cons
- Pros: Vendor-neutral, supports multiple protocols for receiving and exporting, flexible configuration (based on Pipelines and Processors).
- Cons: Introduces an additional component to deploy and operate.
- Data Pipeline (from Appendix Two)
```mermaid
graph LR
    App["Application / SDK"] -->|"OTLP / Jaeger / Zipkin etc."| Receiver
    subgraph Collector
        Receiver --> Processor["Processor (filter / sample / ...)"]
        Processor --> Exporter
    end
    Exporter --> Backend["Backend storage (Jaeger / Prometheus / ...)"]
```
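On the application side, the OpenTelemetry SDK only needs to know the Collector's OTLP endpoint; routing to Jaeger, Prometheus, or a logging backend stays in the Collector configuration. A minimal tracing sketch with the Python SDK (service name, endpoint, and span/attribute names are illustrative; 4317 is the default OTLP gRPC port):

```python
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# All spans are batched and shipped over OTLP/gRPC to a local Collector.
provider = TracerProvider(resource=Resource.create({"service.name": "checkout"}))
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="localhost:4317", insecure=True))
)
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span("charge-card") as span:
    span.set_attribute("order.id", "o-12345")
    # ... business logic; sampling, enrichment, and backend choice live in the Collector ...
```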
10.7 Jaeger
- Q1: What does it do?
- CNCF graduated project, open-source end-to-end distributed tracing system.
- Q2: Use Cases?
- Request tracing, performance analysis, troubleshooting, distributed transaction monitoring in microservice architectures.
- Q3: Problems Solved?
- Solves the difficulty of tracing the complete path and latency of a single request in complex distributed systems.
- Q4: Pros & Cons
- Pros: Adheres to OpenTracing standard (now moving to OpenTelemetry), clear architecture, intuitive UI, integrates well with K8s/Istio.
- Cons: Storage dependency (usually Elasticsearch or Cassandra), sampling strategy needs careful configuration under high traffic.
- Core Components: Agent, Collector, Query Service, UI.
11. AI and Machine Learning Platform Components
11.1 TensorFlow Serving
- Q1: What does it do?
- High-performance machine learning model serving system developed by Google, designed for production environments.
- Q2: Use Cases?
- Deploy models trained with TensorFlow (or other frameworks via conversion) as online prediction services.
- Q3: Problems Solved?
- Provides a more efficient, stable model deployment solution with version management and hot updates compared to simple Flask wrappers.
- Q4: Pros & Cons
- Pros: High performance, supports model version control & multi-model serving, supports A/B testing, supports batch requests.
- Cons: Primarily optimized for TensorFlow models, relatively complex deployment and configuration.
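Once a SavedModel is loaded by TensorFlow Serving, clients can call it over the REST API (default port 8501) or gRPC (8500). A rough sketch of a REST prediction call using `requests`; the host, model name, and input shape are illustrative and depend on the deployed model's signature:

```python
import requests

# TensorFlow Serving REST endpoint: /v1/models/<model_name>[/versions/<n>]:predict
url = "http://localhost:8501/v1/models/my_model:predict"

# "instances" is a batch of inputs whose shape must match the model's input signature.
payload = {"instances": [[1.0, 2.0, 5.0, 7.0]]}

resp = requests.post(url, json=payload, timeout=5)
resp.raise_for_status()
print(resp.json()["predictions"])  # one prediction per instance
```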
11.2 Triton Inference Server
- Q1: What does it do?
- Open-source inference serving software developed by NVIDIA, aimed at simplifying and accelerating AI model deployment in production.
- Q2: Use Cases?
- Scenarios requiring high performance, support for multiple frameworks (TensorFlow, PyTorch, TensorRT, ONNX, etc.), and multi-GPU inference.
- Q3: Problems Solved?
- Provides a unified interface to serve models from different frameworks, optimizes GPU utilization.
- Q4: Pros & Cons
- Pros: Extremely high performance (especially with TensorRT), broad framework support, Dynamic Batching, model management and version control.
- Cons: Relatively complex configuration and usage, more biased towards NVIDIA GPU environments.
11.3 KFServing (now KServe)
- Q1: What does it do?
- Platform built on Kubernetes for deploying and managing Serverless inference workloads.
- Q2: Use Cases?
- Deploy, scale, and manage machine learning models in a Serverless manner on Kubernetes.
- Provides advanced features like model explainability, Canary deployments, autoscaling.
- Q3: Problems Solved?
- Simplifies the complexity of deploying and operating models on K8s, provides a standardized Serverless inference solution.
- Q4: Pros & Cons
- Pros: Cloud-native design, Serverless autoscaling, supports multiple frameworks, feature-rich (Canary, Explainability).
- Cons: Depends on Knative or other Serverless frameworks, relatively heavyweight architecture.
11.4 Feast
- Q1: What does it do?
- Open-source Feature Store system for managing, discovering, and serving machine learning features.
- Q2: Use Cases?
- Unify management of online (low-latency serving) and offline (batch training/analysis) feature data.
- Solves problems like inconsistent feature computation, difficulty in discovering and reusing features.
- Q3: Problems Solved?
- Provides centralized feature management, ensuring consistency between training and serving features (avoiding training-serving skew).
- Promotes feature sharing and reuse.
- Q4: Pros & Cons
- Pros: Addresses feature management pain points, supports multiple data sources and online stores, provides point-in-time join.
- Cons: Relatively new project, ecosystem and best practices still evolving, deployment and integration have some complexity.
- Core Concepts: Feature View, Entity, Feature Service, Online/Offline Store.
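The same feature definitions serve both paths: `get_historical_features` performs point-in-time correct joins for training, while `get_online_features` reads the low-latency online store at serving time. A rough sketch with the Feast Python SDK, assuming a feature repo in the current directory with a `driver_hourly_stats` feature view and a `driver_id` entity (names are illustrative):

```python
from feast import FeatureStore

store = FeatureStore(repo_path=".")  # loads feature definitions from the local repo

# Online path: low-latency lookup at prediction time.
online = store.get_online_features(
    features=[
        "driver_hourly_stats:conv_rate",
        "driver_hourly_stats:avg_daily_trips",
    ],
    entity_rows=[{"driver_id": 1001}],
).to_dict()
print(online)

# Offline path (training): point-in-time join against an entity DataFrame that
# carries 'driver_id' and 'event_timestamp' columns, e.g.:
# training_df = store.get_historical_features(entity_df=entity_df,
#                                              features=[...]).to_df()
```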
11.5 MLflow
- Q1: What does it do?
- An open-source platform for managing the end-to-end machine learning lifecycle.
- Q2: Use Cases?
- Tracking: Track experiment parameters, code versions, metrics, and output files.
- Projects: Package code for reproducibility.
- Models: Manage and deploy models from various ML libraries.
- Registry: Model version management, stage transitions (Staging, Production), annotations.
- Q3: Problems Solved?
- Solves the difficulty of tracking, reproducing, and managing machine learning experiments.
- Q4: Pros & Cons
- Pros: Lightweight, easy to integrate, comprehensive features (covers key lifecycle stages), broad framework support.
- Cons: Does not provide compute resource scheduling or pipeline orchestration itself (can integrate with Kubeflow/Airflow).
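A rough sketch of the Tracking and Models/Registry pieces via the Python API; the experiment name, parameters, and scikit-learn model are illustrative, and registering a model version assumes a tracking server backed by a database:

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=500, random_state=0)
mlflow.set_experiment("demo-churn")          # Tracking: group runs into an experiment

with mlflow.start_run():
    params = {"n_estimators": 200, "max_depth": 5}
    model = RandomForestClassifier(**params).fit(X, y)

    mlflow.log_params(params)                                            # parameters
    mlflow.log_metric("train_auc",
                      roc_auc_score(y, model.predict_proba(X)[:, 1]))    # metrics

    # Models + Registry: persist the model and (optionally) register a new version;
    # registration requires a registry-enabled (DB-backed) tracking server.
    mlflow.sklearn.log_model(model, artifact_path="model",
                             registered_model_name="demo-churn-model")
```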
11.6 Kubeflow
- Q1: What does it do?
- Open-source project dedicated to making ML workflow deployment on Kubernetes simple, portable, and scalable. Goal is to provide a toolkit and services for ML to run on K8s.
- Q2: Use Cases?
- End-to-end MLOps platform, including interactive development (Jupyter Notebooks), pipeline orchestration (Kubeflow Pipelines), distributed training (TFJob, PyTorchJob), model serving (KFServing/KServe), metadata management.
- Q3: Problems Solved?
- Provides the full suite of tools and processes needed for ML on Kubernetes, lowering the MLOps implementation barrier.
- Q4: Pros & Cons
- Pros: Comprehensive features, cloud-native, componentized (use as needed), tightly integrated with K8s ecosystem.
- Cons: Complex deployment and operation, numerous components with complex dependencies, high K8s skill requirement.
11.7 Horovod
- Q1: What does it do?
- Uber open-sourced distributed deep learning training framework, supporting TensorFlow, Keras, PyTorch, MXNet.
- Q2: Use Cases?
- Utilizes multiple GPUs or machines for data-parallel training, accelerating deep learning model training.
- Q3: Problems Solved?
- Simplifies the implementation complexity of distributed training (compared to native framework APIs), provides efficient AllReduce communication.
- Q4: Pros & Cons
- Pros: Easy to use (few code changes), high performance (based on Ring-AllReduce), supports multiple frameworks.
- Cons: Primarily supports data parallelism, limited support for model parallelism.
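The typical pattern is one process per GPU: Horovod wraps the optimizer so gradients are averaged with Ring-AllReduce and broadcasts the initial state so every worker starts identically. A rough PyTorch sketch (the model, data, and learning-rate scaling are illustrative); it would be launched with something like `horovodrun -np 4 python train.py`:

```python
import torch
import horovod.torch as hvd

hvd.init()                                    # one process per GPU
torch.cuda.set_device(hvd.local_rank())       # pin this process to its local GPU

model = torch.nn.Linear(10, 1).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01 * hvd.size())  # common LR scaling

# Gradient averaging across all workers via Ring-AllReduce happens inside step().
optimizer = hvd.DistributedOptimizer(optimizer, named_parameters=model.named_parameters())

# Start every worker from identical weights and optimizer state.
hvd.broadcast_parameters(model.state_dict(), root_rank=0)
hvd.broadcast_optimizer_state(optimizer, root_rank=0)

for step in range(100):
    x, y = torch.randn(32, 10).cuda(), torch.randn(32, 1).cuda()  # stand-in for a data shard
    optimizer.zero_grad()
    torch.nn.functional.mse_loss(model(x), y).backward()
    optimizer.step()
```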
11.8 Milvus
- Q1: What does it do?
- An open-source, cloud-native vector database designed for large-scale vector similarity search and analysis.
- Q2: Use Cases?
- Scenarios requiring efficient similarity search on high-dimensional vectors, such as image/video retrieval, recommendation systems (Embedding-based recall), text semantic search, molecular structure analysis, audio analysis.
- Q3: Problems Solved?
- Solves the problem of traditional databases struggling to efficiently store and retrieve massive high-dimensional vector data. Provides more complete database management features than libraries like Faiss.
- Q4: Pros & Cons
- Pros: High-performance vector retrieval (supports multiple index types like HNSW, IVF_FLAT), cloud-native architecture (K8s-based deployment), high availability and scalability, supports multiple distance metrics, active community.
- Cons: As a database system, deployment and operation are relatively complex, mainly focuses on vector data, limited scalar data query capabilities.
- Core Concepts: Collection (like table), Partition (collection partition), Segment (data file), Index (vector index type).
- Architecture (v2.x): Compute-storage separation, includes Proxy (access), Query Node (query), Index Node (index build), Data Node (data management), Root Coord/Data Coord/Query Coord/Index Coord (coordination services), ETCD (metadata), MinIO (object storage).
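A rough end-to-end sketch with the `pymilvus` client: define a collection with a 128-dimensional vector field, insert a few vectors, build an HNSW index, and run a top-5 similarity search. The host/port, collection name, dimension, and index parameters are illustrative:

```python
import random

from pymilvus import (Collection, CollectionSchema, FieldSchema,
                      DataType, connections)

connections.connect(host="localhost", port="19530")   # default Milvus port

# A collection is like a table: a primary key plus a fixed-dimension vector field.
fields = [
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=False),
    FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=128),
]
collection = Collection("demo_items", CollectionSchema(fields))

ids = list(range(1000))
vectors = [[random.random() for _ in range(128)] for _ in ids]
collection.insert([ids, vectors])

# Build an ANN index, then load the collection into memory for querying.
collection.create_index("embedding", {"index_type": "HNSW", "metric_type": "L2",
                                      "params": {"M": 16, "efConstruction": 200}})
collection.load()

results = collection.search(data=[vectors[0]], anns_field="embedding",
                            param={"metric_type": "L2", "params": {"ef": 64}},
                            limit=5)
print(results[0].ids)   # IDs of the five nearest neighbours
```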
12. Other Important Components
12.1 JMeter
- Q1: What does it do?
- Apache open-source load testing tool written purely in Java, used for testing the performance and load of static and dynamic resources (Web services, databases, FTP, etc.).
- Q2: Use Cases?
- Web application performance testing, API load testing, database load testing.
- Q3: Problems Solved?
- Provides GUI and scripting to simulate large numbers of concurrent users, testing system bottlenecks.
- Q4: Pros & Cons
- Pros: Open-source, free, powerful features, supports many protocols, easy-to-use GUI, good extensibility.
- Cons: Limited single-machine concurrency (requires distributed testing), higher memory consumption, reporting not modern (can integrate with InfluxDB+Grafana).
12.2 Arthas
- Q1: What does it do?
- Alibaba open-sourced Java online diagnostic tool.
- Q2: Use Cases?
- Real-time viewing of application status, diagnosing CPU/memory issues, analyzing classloading conflicts, method call tracing, hot-swapping code without JVM restart.
- Q3: Problems Solved?
- Solves the inefficient and lagging problems of traditional Java troubleshooting requiring restarts, adding logs, analyzing dump files, etc.
- Q4: Pros & Cons
- Pros: Powerful features, non-intrusive diagnosis, high real-time capability, convenient command-line interaction.
- Cons: Mainly targets Java applications, requires some understanding of JVM internals.