Common Algorithms and Techniques Every Developer Should Know

As a software developer, having a good grasp of common algorithms and system design techniques is crucial for building efficient, scalable, and robust applications. These concepts appear in various forms across different domains. Here’s a look at some fundamental ones:

Load Management / Rate Limiting

Techniques to control traffic and prevent system overload.

Token Bucket Algorithm: Allows burst traffic up to the bucket size, then enforces an average rate by requiring tokens to process requests.
Leaky Bucket Algorithm: Smooths out bursts of traffic by processing requests at a fixed rate, like water leaking from a bucket.
Fixed Window Counter: Counts requests within fixed time intervals (e.g., per minute). Simple but can lead to spikes at window edges.
Sliding Window Log: Keeps a log of request timestamps. Offers high accuracy but can be memory-heavy.
Cost-Based Rate Limiting: Assigns different weights or costs to requests, useful in APIs where some operations are more resource-intensive.

Load Balancing Algorithms

Distribute incoming network traffic across multiple servers.

Round Robin: Rotates through servers, sending each new request to the next server in the list.
Least Connections: Sends traffic to the server currently handling the fewest active connections.
IP Hashing: Assigns a client to a specific server based on the client’s IP address, useful for maintaining sticky sessions.
Weighted Round Robin: Servers with more capacity (higher weight) receive proportionally more requests.

Caching Algorithms

Strategies for managing data in a cache to improve performance.

LRU (Least Recently Used): Evicts the least recently accessed item when the cache is full.
LFU (Least Frequently Used): Evicts the least frequently accessed item over time.
TTL (Time-to-Live): Items auto-expire from the cache based on a predefined time.
Write-Through / Write-Behind:
- Write-Through: Data is written to both cache and backing store simultaneously.
- Write-Behind: Data is written to cache first, and then asynchronously to the backing store. Improves write performance but risks data loss if cache fails before sync.

Networking & Distributed Systems

Core concepts for building systems that span multiple machines.

Consistent Hashing: A hashing technique that minimizes data remapping when nodes are added or removed in a distributed system (e.g., used in Redis Cluster, Cassandra).
Gossip Protocol: A peer-to-peer communication mechanism for sharing state or information across a distributed system.
Paxos / Raft: Consensus algorithms for achieving agreement on state among distributed servers (e.g., used in etcd, Consul).
Vector Clocks: A mechanism for ordering events in a distributed system to detect causality violations and inconsistencies.

Queueing & Scheduling

Managing tasks and their execution order.

Priority Queues: Tasks are processed based on their priority, not just arrival time.
Time-Partitioned Queues: Used for scheduled jobs, ensuring tasks run at specific times or intervals (e.g., cron tweets).
Exponential Backoff: A retry strategy for failed tasks where the delay between retries increases exponentially.
Circuit Breaker Algorithm: Halts retries to a service when it’s detected as down or failing, preventing further load and allowing it to recover.

Security / Auth Algorithms

Methods for authentication, authorization, and ensuring data integrity.

HMAC (Hash-based Message Authentication Code): Used for API key verification and ensuring message integrity and authenticity.
OAuth 2.0 Flow: An authorization delegation framework commonly used for third-party application access.
JWT (JSON Web Tokens): Used for stateless session validation and securely transmitting information between parties.
Captcha Challenge Algorithms: Techniques for distinguishing human users from bots to prevent abuse.

Data Sync & Consistency

Ensuring data is synchronized and consistent across replicas or systems.

Eventual Consistency: Guarantees that, if no new updates are made, all replicas will eventually converge to the same state (common in NoSQL databases).
Read Repair / Hinted Handoff: Techniques used in distributed databases (like Cassandra, Dynamo) to improve data consistency and availability during node failures.
CRDTs (Conflict-free Replicated Data Types): Data structures designed to be replicated across multiple computers in a network, where replicas can be updated independently and concurrently without coordination, and where it is always mathematically possible to resolve any inconsistencies that may result. Used in real-time collaboration apps.
Merkle Trees: Used to efficiently synchronize differences across replicas by comparing hashes of data blocks (e.g., in Git, Cassandra, Bitcoin).

Storage & Compression

Techniques for storing and reducing the size of data.

Delta Encoding: Storing only the differences between versions of data rather than full copies.
Run-Length Encoding / Huffman Coding: Basic lossless data compression algorithms.
Zstandard / Brotli: Modern, high-performance lossless compression algorithms.
Parquet / Avro / Protobuf: Efficient data serialization formats, often used in big data and distributed systems.

Monitoring / Metrics

Techniques for observing system health and performance.

Histogram Binning: Used in systems like Prometheus to group measurements into buckets for analyzing distributions.
Percentile Estimation (e.g., t-digest): Algorithms for tracking latency SLOs (Service Level Objectives) and understanding performance distributions.
EWMA (Exponential Weighted Moving Average): Used to smooth trend lines in time-series data, giving more weight to recent data points.

Build / Compilation / Infra

Algorithms and patterns used in software development and deployment infrastructure.

Topological Sorting: Used for dependency resolution in build tools (like Maven, Gradle) and task schedulers.
Dependency Graph Traversal: Algorithms for resolving npm or Maven dependencies, understanding project structure.
Blue-Green / Canary Deployments: Strategies for controlled software releases to minimize risk and downtime.
Rolling Update Algorithms: Gradually deploy new versions of an application across servers without downtime.

Familiarizing yourself with these algorithms and techniques will not only make you a better problem solver but also enable you to design more sophisticated and efficient software systems.