System Design of a URL Shortener: An Overview
1. Goal of the System
Convert a long URL like: https://example.com/articles/life-hacks-2025
into a short, shareable link like: short.ly/abc123
2. Core Backend Workflow
a. User Submits a Long URL
- A long URL is sent to the server via an API.
b. Generate a Unique Short Code
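Three common methods:
- Counter + Base62: Take an increasing counter (1, 2, 3…) and encode each value in Base62. With the alphabet 0–9, a–z, A–Z, the value 10 becomes a, 61 becomes Z, and 62 becomes 10.
- Random Base62 string: Generate a 6–8 character string like abc123 and check the database for uniqueness before saving it.
- Hashing the long URL: Hash the URL and truncate the digest to a short code, so the same URL always yields the same code.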
Method | Pros | Cons |
---|---|---|
Counter + Base62 | Simple, no duplicates | Needs centralized counter |
Random String | Easy to scale, looks clean | Collisions are rare but possible, so every code still needs a uniqueness check |
Hashing Long URL | Same input = same code | Collisions if truncated, not customizable |
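The three approaches in the table can be sketched in a few lines of Python. This is a minimal illustration, not a production generator: the alphabet ordering (0–9, a–z, A–Z), the default length of 7, the use of SHA-256, and the function names are all assumptions made for the example.

```python
import hashlib
import secrets
import string

# Assumed alphabet ordering; any fixed ordering of 62 symbols would work.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)  # 62


def encode_base62(n: int) -> str:
    """Counter + Base62: turn a non-negative counter value into a short code."""
    if n < 0:
        raise ValueError("counter must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, BASE)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def decode_base62(code: str) -> int:
    """Inverse of encode_base62."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n


def random_code(length: int = 7) -> str:
    """Random string: the caller must still check the database for uniqueness."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def hashed_code(long_url: str, length: int = 7) -> str:
    """Hashing: same input gives the same code; truncation allows collisions."""
    digest = hashlib.sha256(long_url.encode("utf-8")).digest()
    # Reduce the digest to the code space, then left-pad to a fixed length.
    n = int.from_bytes(digest, "big") % (BASE ** length)
    return encode_base62(n).rjust(length, ALPHABET[0])
```

With this alphabet, `encode_base62(61)` is `"Z"` and `encode_base62(62)` is `"10"`. A 7-character code gives 62^7 (about 3.5 trillion) possible values, which is why random collisions are rare but not impossible.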
c. Save to Database
Store mapping in a table:
ShortCode | LongURL | Clicks |
---|---|---|
abc123 | https://example.com/articles/life-hacks-2025 | 0 |
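As a rough sketch of the save step, the snippet below stores a mapping in an in-memory SQLite database and relies on the primary key to detect collisions, retrying with a fresh random code when one occurs. SQLite, the column names, the retry limit, and the helper names `open_store` and `save_mapping` are assumptions for illustration only.

```python
import secrets
import sqlite3
import string

ALPHABET = string.digits + string.ascii_letters


def open_store() -> sqlite3.Connection:
    """Create an in-memory table mirroring ShortCode | LongURL | Clicks."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE url_mappings ("
        " short_code TEXT PRIMARY KEY,"
        " long_url TEXT NOT NULL,"
        " clicks INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def save_mapping(conn: sqlite3.Connection, long_url: str,
                 length: int = 6, max_attempts: int = 5) -> str:
    """Insert a new mapping, retrying if the random code is already taken."""
    for _ in range(max_attempts):
        code = "".join(secrets.choice(ALPHABET) for _ in range(length))
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO url_mappings (short_code, long_url) VALUES (?, ?)",
                    (code, long_url),
                )
            return code
        except sqlite3.IntegrityError:
            continue  # primary key collision: try another code
    raise RuntimeError("could not find a free short code")
```

Letting the database enforce uniqueness avoids a separate "check, then insert" round trip, which could race when two servers pick the same code at the same time.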
3. Redirection Flow
When someone opens short.ly/abc123:
- Server extracts the abc123 code.
- Looks it up in the database.
- Increments the click counter (optional).
- Redirects the user to the long URL with an HTTP 301 (permanent) or 302 (temporary) response.
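The redirection steps can be sketched as a single framework-free function. The in-memory `STORE` dictionary stands in for the database, and `handle_redirect`, the 404 for unknown codes, and the choice of 302 are assumptions made for this example.

```python
from typing import Dict, Optional, Tuple

# Hypothetical in-memory stand-in for the database table.
STORE: Dict[str, dict] = {
    "abc123": {
        "long_url": "https://example.com/articles/life-hacks-2025",
        "clicks": 0,
    },
}


def handle_redirect(path: str, store: Dict[str, dict] = STORE) -> Tuple[int, Optional[str]]:
    """Return (HTTP status, Location header value) for a request path."""
    code = path.strip("/")       # 1. extract the short code
    record = store.get(code)     # 2. look it up
    if record is None:
        return 404, None         # unknown code (assumed behaviour)
    record["clicks"] += 1        # 3. count the click (optional)
    return 302, record["long_url"]  # 4. redirect
```

The sketch returns 302 because browsers may cache a 301 response, in which case repeat visits would bypass the server and go uncounted. If click counts do not matter, 301 reduces server load.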
4. Scaling the System (How to Handle Millions of Users Smoothly)
- Caching (e.g., Redis): Store popular short codes in memory for faster lookup. Reduces DB load and speeds up redirection.
- Database Sharding: Split large databases into smaller chunks (shards) based on short code ranges or user IDs to spread the load.
- Read Replicas: Use secondary databases for read-heavy operations like redirection, while writes go to the primary DB.
- Load Balancers: Distribute incoming traffic across multiple servers to avoid bottlenecks.
- Stateless Backend Servers: Keep servers lightweight and stateless so they can scale horizontally (just add more servers when needed).
- Asynchronous Processing: For analytics or logging (click tracking), use background jobs to avoid slowing down redirection.
- CDN Integration (Optional): If static redirection rules become common, use Content Delivery Networks to handle them globally with ultra-low latency.
- Rate Limiting: Prevent abuse by limiting how many short URLs a user can generate in a certain time.
- Monitoring & Alerts: Track system health (latency, error rates, traffic spikes) and set up alerts for anomalies.
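Two of the items above, caching and rate limiting, are small enough to illustrate in code. The sketch below uses an in-process LRU cache as a stand-in for Redis and a fixed-window counter as one possible rate-limiting scheme. The class names, the `db_lookup` callback, and the limits are all hypothetical.

```python
import time
from collections import OrderedDict
from typing import Callable, Dict, Optional, Tuple


class LRUCache:
    """Tiny in-process stand-in for an external cache such as Redis."""

    def __init__(self, capacity: int = 1024) -> None:
        self.capacity = capacity
        self._items: "OrderedDict[str, str]" = OrderedDict()

    def get(self, key: str) -> Optional[str]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: str, value: str) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict least recently used


def lookup(code: str, cache: LRUCache,
           db_lookup: Callable[[str], Optional[str]]) -> Optional[str]:
    """Cache-aside read: try the cache, fall back to the DB, then populate."""
    url = cache.get(code)
    if url is None:
        url = db_lookup(code)
        if url is not None:
            cache.put(code, url)
    return url


class FixedWindowLimiter:
    """Allow at most `limit` requests per user in each fixed time window."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self._counts: Dict[str, Tuple[int, int]] = {}  # user -> (window id, count)

    def allow(self, user_id: str) -> bool:
        window_id = int(self.clock() // self.window)
        seen_window, count = self._counts.get(user_id, (window_id, 0))
        if seen_window != window_id:
            count = 0  # a new window has started
        if count >= self.limit:
            return False
        self._counts[user_id] = (window_id, count + 1)
        return True
```

In a multi-server deployment both structures would need to live in a shared store rather than in process memory, since stateless servers cannot rely on local state.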
5. Data Schema
Main Table: url_mappings
CREATE TABLE url_mappings (
    short_code VARCHAR(10) PRIMARY KEY,
    original_url TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    user_id INTEGER, -- if user authentication is implemented
    expiry_date TIMESTAMP NULL, -- for temporary links
    click_count INTEGER DEFAULT 0
);
-- Index on user_id for faster lookup of a user's URLs
CREATE INDEX idx_user_id ON url_mappings (user_id);
Analytics Table (Optional): click_analytics
CREATE TABLE click_analytics (
    id SERIAL PRIMARY KEY,
    short_code VARCHAR(10) REFERENCES url_mappings(short_code),
    clicked_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    referrer VARCHAR(255) NULL,
    user_agent TEXT NULL,
    ip_address VARCHAR(45) NULL
);
-- Index on short_code for faster analytics queries
CREATE INDEX idx_short_code ON click_analytics (short_code);
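To show how the two tables might be used together, the sketch below resolves a short code while honouring `expiry_date`, increments `click_count`, and records a row in `click_analytics`. It runs against SQLite, so the column types are adapted (for example `INTEGER PRIMARY KEY` in place of `SERIAL`, and timestamps stored as ISO 8601 text). Those adaptations and the `resolve` function itself are assumptions for illustration.

```python
import sqlite3
from datetime import datetime, timedelta, timezone
from typing import Optional

# The article's schema, adapted to SQLite types (an assumption of this sketch).
SCHEMA = """
CREATE TABLE url_mappings (
    short_code   TEXT PRIMARY KEY,
    original_url TEXT NOT NULL,
    created_at   TEXT DEFAULT CURRENT_TIMESTAMP,
    user_id      INTEGER,
    expiry_date  TEXT,
    click_count  INTEGER DEFAULT 0
);
CREATE INDEX idx_user_id ON url_mappings (user_id);
CREATE TABLE click_analytics (
    id         INTEGER PRIMARY KEY,
    short_code TEXT REFERENCES url_mappings(short_code),
    clicked_at TEXT DEFAULT CURRENT_TIMESTAMP,
    referrer   TEXT,
    user_agent TEXT,
    ip_address TEXT
);
CREATE INDEX idx_short_code ON click_analytics (short_code);
"""


def resolve(conn: sqlite3.Connection, code: str, now: datetime,
            referrer: Optional[str] = None, user_agent: Optional[str] = None,
            ip_address: Optional[str] = None) -> Optional[str]:
    """Return the original URL, or None if the code is unknown or expired."""
    row = conn.execute(
        "SELECT original_url, expiry_date FROM url_mappings WHERE short_code = ?",
        (code,),
    ).fetchone()
    if row is None:
        return None
    original_url, expiry = row
    if expiry is not None and datetime.fromisoformat(expiry) <= now:
        return None  # temporary link has expired
    with conn:  # both writes commit together
        conn.execute(
            "UPDATE url_mappings SET click_count = click_count + 1 WHERE short_code = ?",
            (code,),
        )
        conn.execute(
            "INSERT INTO click_analytics (short_code, referrer, user_agent, ip_address)"
            " VALUES (?, ?, ?, ?)",
            (code, referrer, user_agent, ip_address),
        )
    return original_url
```

In a high-traffic deployment the two writes would more likely be handed to a background job so that the redirect itself only needs the read.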
6. Security and Other Considerations
- URL Validation: Verify that submitted URLs are well-formed and use an allowed scheme (http or https) before storing them, to prevent abuse.
- Custom Short URLs: Optionally allow users to customize their short codes.
- Expiration Dates: Support temporary links that expire after a certain time.
- Analytics: Track clicks, referrers, and geographic data for insights.
- Anti-Abuse Measures: Prevent creation of malicious or phishing URLs.
- Privacy: Consider what user data to store and how long to retain it.
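A basic form of URL validation can be sketched with the standard library. The length limit, the allowed schemes, the blocked host list (used here to stop the shortener from pointing at itself), and the function name `is_valid_url` are assumptions. Real anti-abuse checks, such as consulting a phishing blocklist, go well beyond this.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}
MAX_URL_LENGTH = 2048          # hypothetical limit
BLOCKED_HOSTS = {"short.ly"}   # hypothetical: avoid redirect loops to ourselves


def is_valid_url(url: str) -> bool:
    """Syntactic check only: scheme, host, and length."""
    if not url or len(url) > MAX_URL_LENGTH:
        return False
    try:
        parts = urlsplit(url)
        host = parts.hostname
    except ValueError:  # e.g. a malformed IPv6 literal
        return False
    if parts.scheme not in ALLOWED_SCHEMES:
        return False
    if not host or host in BLOCKED_HOSTS:
        return False
    return True
```

For example, `is_valid_url("javascript:alert(1)")` is rejected because of its scheme, while `is_valid_url("https://example.com/very/long/url")` passes.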
7. API Design
Create Short URL
POST /api/shorten
Body: { "url": "https://example.com/very/long/url" }
Response: { "shortUrl": "short.ly/abc123" }
Get Original URL Info
GET /api/info/abc123
Response: {
  "originalUrl": "https://example.com/very/long/url",
  "created": "2023-12-15T10:30:00Z",
  "clicks": 42
}
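The two endpoints can be sketched as plain functions that take the request data and return a status code plus a JSON-like dictionary, independent of any web framework. The in-memory storage, the counter-based placeholder codes, the status codes (201, 400, 404), and the error body shape are assumptions not specified above.

```python
import itertools
from datetime import datetime, timezone
from typing import Dict, Tuple

SHORT_DOMAIN = "short.ly"
_links: Dict[str, dict] = {}   # hypothetical in-memory storage
_next_id = itertools.count(1)  # placeholder for a real code generator


def post_shorten(body: dict) -> Tuple[int, dict]:
    """Handle POST /api/shorten."""
    url = body.get("url")
    if not isinstance(url, str) or not url.startswith(("http://", "https://")):
        return 400, {"error": "url must be an http(s) URL"}
    code = f"c{next(_next_id)}"
    _links[code] = {
        "originalUrl": url,
        "created": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "clicks": 0,
    }
    return 201, {"shortUrl": f"{SHORT_DOMAIN}/{code}"}


def get_info(code: str) -> Tuple[int, dict]:
    """Handle GET /api/info/<code>."""
    record = _links.get(code)
    if record is None:
        return 404, {"error": "unknown short code"}
    return 200, dict(record)  # copy so callers cannot mutate storage
```

Keeping the handlers free of framework types makes them easy to test and to mount behind whichever HTTP server the deployment uses.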
Conclusion
A URL shortener is a classic system design problem that touches on many aspects of scalable web architecture. The core logic is straightforward, but designing for scale introduces interesting challenges around code generation, database design, and caching strategies. When building a production system, choose the appropriate trade-offs based on your expected traffic patterns and required features.