URL Shortener
Questions
What
- What does the URL shortener do? (Shortens long URLs, redirects short URLs to the original URLs)
- What are the basic use cases?
- URL shortening: given a long URL, return a much shorter URL
- URL redirecting: given a short URL, redirect to the original URL
- What characters are allowed in the shortened URL? (Numbers 0-9, characters a-z, A-Z)
- What is the traffic volume? (100 million URLs generated per day)
How
- How long is the shortened URL? (As short as possible)
Why
- Why can’t shortened URLs be deleted or updated? (Assumption for simplicity)
Overview
API Endpoints for URL Shortener
-
URL Shortening:
-
Purpose: Create a new short URL.
-
Request Type: POST
-
Endpoint:
POST api/v1/data/shorten -
Request Parameter:
{ "longUrl": "longURLString" } -
Response:
{ "shortURL": "shortURLString" }
-
-
URL Redirecting:
- Purpose: Redirect a short URL to the corresponding long URL.
- Request Type: GET
- Endpoint:
GET api/v1/shortUrl/{shortURL} - Response: HTTP redirection to the long URL.
Hash considerations
-
High-Level Design:
- Storage: Use a relational database to store
<shortURL, longURL>mappings instead of a hash table due to memory constraints.
- Storage: Use a relational database to store
-
Hash Function:
- Purpose: Hash a long URL to a short URL (hashValue).
-
Hash Value Length:
- Characters used: [0-9, a-z, A-Z] (62 possible characters).
- Length Calculation: Find smallest n such that ( 62^n \geq 365 ) billion.
- Calculation: ( 62^7 \approx 3.5 ) trillion, which is sufficient for 365 billion URLs.
- Conclusion: Length of hashValue is 7 characters.
-
Hash + Collision Resolution:
- Hash Function: Use well-known hash functions like CRC32, MD5, or SHA-1.
- Shortening the Hash:
- Direct Approach: Take the first 7 characters of the hash value.
- Collision Resolution:
- If collision occurs, append a predefined string to the hash value and check again.
- Repeat until a unique hash is found.
-
Summary of Steps:
- Implement a hash function to generate a hash value.
- Truncate the hash value to the first 7 characters.
- Check for collisions in the database.
- If collision occurs, append a predefined string and recheck.
- Store the
<shortURL, longURL>mapping in the relational database.
Last updated on