Architecture notes - part 1

Topic: Distributed Systems Fundamentals
Description: Core principles behind designing systems that work across multiple nodes in a scalable and resilient way.
Key Concepts & Strategies:
  • Load balancing & scalability
  • Data partitioning (sharding)
  • Fault tolerance & recovery
  • CAP theorem & consistency models
Recommended Resources / Links:
  • Distributed Systems for Fun and Profit
  • CAP Twelve Years Later

Topic: Caching Strategies
Description: Techniques to improve performance by temporarily storing data closer to the application/user.
Key Concepts & Strategies:
  • In‑memory vs. distributed caching
  • Cache invalidation & expiration policies
  • Balancing consistency vs. performance
Recommended Resources / Links:
  • An Introduction to Caching – DigitalOcean
  • Redis Introduction

Topic: Architectural Patterns
Description: High‑level approaches to system organization for scalability, maintainability, and resilience.
Key Concepts & Strategies:
  • Microservices vs. monolithic architectures
  • Event‑driven architecture
  • Service-oriented patterns
  • When to centralize cross‑cutting concerns vs. decentralize them
Recommended Resources / Links:
  • Microservices vs Monolith: Which Architecture is Right for You
  • Event‑Driven Architecture: A Key to Modern Software

Topic: Rate Limiter Pattern
Description: Patterns for controlling the rate of incoming requests to prevent system overload.
Key Concepts & Strategies:
  • Fixed window, sliding window, and token bucket algorithms
  • Trade‑offs between simple implementation and smooth request distribution
Recommended Resources / Links:
  • How to Build a Rate Limiter
  • Implementing a Rate Limiter in Node.js

Topic: API Gateway & Service Mesh Patterns
Description: Patterns for managing and routing traffic to microservices and handling cross‑cutting concerns.
Key Concepts & Strategies:
  • Centralized API gateway vs. decentralized service mesh
  • Authentication, routing, and load balancing at the gateway
  • Observability and security for inter‑service calls
Recommended Resources / Links:
  • What is an API Gateway? – NGINX
  • What is a Service Mesh? – Istio

Topic: Distributed Cache Pattern
Description: Patterns for caching data across multiple nodes to improve system performance and reduce latency.
Key Concepts & Strategies:
  • Ensuring cache coherence
  • Invalidation strategies and expiration policies
  • Balancing data freshness with performance benefits
Recommended Resources / Links:
  • Distributed Cache in Spring – Baeldung
  • Redis Caching – Redis Documentation
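
To make the Rate Limiter Pattern entry above more concrete, here is a minimal, single-process sketch of the token bucket algorithm it mentions. The class name `TokenBucket`, its parameters (`capacity`, `refill_rate`, `clock`), and the `allow` method are hypothetical names chosen for illustration; they do not come from any of the listed resources. The sketch assumes one process and no concurrency, so a real service would need locking or a shared store.

```python
import time
from typing import Callable


class TokenBucket:
    """Toy token bucket (illustrative names, not a library API).

    Holds up to `capacity` tokens and refills at `refill_rate` tokens per second.
    """

    def __init__(self, capacity: float, refill_rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        if capacity <= 0 or refill_rate <= 0:
            raise ValueError("capacity and refill_rate must be positive")
        self.capacity = capacity
        self.refill_rate = refill_rate
        self._clock = clock              # injectable so the demo can control time
        self._tokens = capacity          # start full, so an initial burst is allowed
        self._last_refill = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last call, capped at capacity.
        now = self._clock()
        elapsed = now - self._last_refill
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_rate)
        self._last_refill = now

    def allow(self, cost: float = 1.0) -> bool:
        # A request is admitted only if enough tokens are available right now.
        self._refill()
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


if __name__ == "__main__":
    now = [0.0]                          # fake clock for a deterministic demo
    bucket = TokenBucket(capacity=3, refill_rate=1.0, clock=lambda: now[0])
    print([bucket.allow() for _ in range(4)])   # [True, True, True, False]
    now[0] += 2.0                        # two seconds pass, two tokens come back
    print([bucket.allow() for _ in range(3)])   # [True, True, False]
```

The capacity bounds how large a burst can be, while the refill rate bounds the sustained request rate. This is the "smooth request distribution" side of the trade-off listed above; a fixed window counter is simpler but can admit up to twice its limit across a window boundary.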

Distributed Systems

  • Levels of Abstraction: Distributed programming involves managing the consequences of distribution by finding abstractions that balance what is possible with what is understandable and performant. Abstractions are fundamentally “fake” but make the world manageable by simplifying problem statements.
  • System Model: A system model specifies the characteristics considered important in a distributed system, including node capabilities, communication link operations, and system properties like time and order. A robust system model makes weak assumptions to be tolerant of different environments, while a system model that makes strong assumptions is easier to reason about but harder to apply in practice.
  • Nodes: Each node in the system model can execute a program and store data, and has a clock. Nodes can fail by crashing and may recover later.
  • Communication Links: Communication links connect nodes and allow message sending. While some algorithms assume a reliable network, it is generally preferable to consider the network unreliable, with potential message loss and delays.
  • Timing/Ordering Assumptions: Because nodes are physically separate, each one observes events at different times and possibly in a different order; timing assumptions capture how far the model takes this into account. The two main alternatives are the synchronous system model (processes execute in lock-step, with a known upper bound on message transmission delay and accurate clocks) and the asynchronous system model (no timing assumptions).
  • Consensus Problem: The consensus problem, at the core of many commercial distributed systems, involves multiple computers agreeing on a value.
  • FLP Impossibility Result: This result states that in an asynchronous system, no deterministic algorithm can guarantee that consensus is reached, even if the network is reliable and at most one process fails by crashing.
  • CAP Theorem: This theorem states that at most two of the following three properties can be satisfied simultaneously: Consistency, Availability, and Partition Tolerance. This leads to three system types: CA (Consistency + Availability), CP (Consistency + Partition Tolerance), and AP (Availability + Partition Tolerance). Because a CA system gives up partition tolerance, the practical choice once a network partition occurs is between CP and AP, which is why the theorem highlights the tension between strong consistency and high availability during partitions.
  • Consistency Models: “Consistency” is not a singular property but a guarantee that a data store provides to the programs that use it. Strong consistency models (linearizable and sequential consistency) give programs the appearance of a single copy of the data, while weak consistency models (client-centric and eventual consistency) do not.
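
The CP versus AP distinction in the CAP Theorem bullet above can be illustrated with a toy two-replica register. This is only a sketch under strong simplifying assumptions: the names `TwoReplicaRegister`, `Replica`, and `Unavailable` are hypothetical, the partition is a boolean flag rather than a real network failure, and healing the partition and resolving conflicts are left out entirely.

```python
from dataclasses import dataclass, field
from typing import Optional


class Unavailable(Exception):
    """Raised when a request is refused in order to preserve consistency."""


@dataclass
class Replica:
    name: str
    value: Optional[str] = None


@dataclass
class TwoReplicaRegister:
    """Toy register replicated on two nodes, 'a' and 'b' (illustrative only)."""
    mode: str                                  # "CP" or "AP"
    partitioned: bool = False                  # True = replicas cannot talk
    a: Replica = field(default_factory=lambda: Replica("a"))
    b: Replica = field(default_factory=lambda: Replica("b"))

    def __post_init__(self) -> None:
        if self.mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")

    def _pick(self, via: str) -> tuple[Replica, Replica]:
        if via not in ("a", "b"):
            raise ValueError("via must be 'a' or 'b'")
        return (self.a, self.b) if via == "a" else (self.b, self.a)

    def write(self, via: str, value: str) -> None:
        local, remote = self._pick(via)
        if self.partitioned:
            if self.mode == "CP":
                # CP: refuse the write rather than let the replicas disagree.
                raise Unavailable("cannot reach the other replica")
            # AP: stay available and accept locally; replicas may now diverge.
            local.value = value
            return
        # No partition: update both replicas before returning.
        local.value = value
        remote.value = value

    def read(self, via: str) -> Optional[str]:
        local, _ = self._pick(via)
        return local.value


if __name__ == "__main__":
    cp = TwoReplicaRegister(mode="CP")
    cp.write("a", "x")
    cp.partitioned = True
    try:
        cp.write("a", "y")
    except Unavailable:
        print("CP: write refused during partition")
    print("CP reads:", cp.read("a"), cp.read("b"))   # both still "x"

    ap = TwoReplicaRegister(mode="AP")
    ap.write("a", "x")
    ap.partitioned = True
    ap.write("a", "y")
    ap.write("b", "z")
    print("AP reads:", ap.read("a"), ap.read("b"))   # "y" and "z": diverged
```

In CP mode the register sacrifices availability for writes while partitioned, so both replicas keep returning the last agreed value. In AP mode every request succeeds, but the two sides can return different values until some reconciliation step (not shown) runs.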
Written on February 10, 2025