Content Delivery Network
- chapter 6 of System Design Interview – An insider’s guide
What
- A content delivery network (CDN) is a group of geographically distributed nodes that deliver content efficiently. It caches static content such as images, videos, and other static files.
- The access pattern is similar to a read-through cache.
- A request for a static file is routed to the best CDN node for that client (usually the nearest one), for example using anycast.
- If the file isn't in that node's cache, the CDN requests it from the origin server and stores it.
- If the content already exists in the CDN, the CDN will return the file to the client.
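The read-through flow above can be sketched in a few lines. This is only an illustration, not how any particular CDN is implemented: `EdgeNode`, `origin`, and `ORIGIN_FILES` are hypothetical names, and the "origin server" is just an in-memory dictionary.

```python
from typing import Callable, Dict, List


class EdgeNode:
    """Hypothetical CDN edge node that behaves like a read-through cache."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch_from_origin = fetch_from_origin
        self._cache: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, path: str) -> bytes:
        # Cache hit: return the stored copy without contacting the origin.
        if path in self._cache:
            self.hits += 1
            return self._cache[path]
        # Cache miss: fetch from the origin, store the copy, then return it.
        self.misses += 1
        content = self._fetch_from_origin(path)
        self._cache[path] = content
        return content


# Stand-in for the origin server (assumption: a plain dict of static files).
ORIGIN_FILES: Dict[str, bytes] = {"/logo.png": b"png-bytes"}
origin_calls: List[str] = []


def origin(path: str) -> bytes:
    origin_calls.append(path)  # record every request that reaches the origin
    return ORIGIN_FILES[path]


edge = EdgeNode(origin)
first = edge.get("/logo.png")   # miss: goes to the origin
second = edge.get("/logo.png")  # hit: served from the edge cache
```

Only the first request reaches the origin. Every later request for the same path is answered by the edge node.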
Why
- Because the content is served from a node closer to the user, latency goes down.
- Because the CDN serves the content, far less data passes through the origin servers and routers. Lower origin bandwidth can save money, since the origin serves fewer bytes and needs less capacity.
- You can distribute a CDN across multiple geographical locations with redundancy. If a request to one CDN node fails, you can route it to another node, which enhances availability.
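The availability point can be sketched as a simple failover loop. This is a minimal sketch under stated assumptions: `fetch_with_failover` and `NodeUnavailable` are hypothetical names, each node is modelled as a plain function, and real CDNs usually handle failover in the routing layer (DNS or anycast) rather than in client code.

```python
from typing import Callable, Optional, Sequence


class NodeUnavailable(Exception):
    """Raised by a (hypothetical) CDN node that cannot serve the request."""


def fetch_with_failover(path: str, nodes: Sequence[Callable[[str], bytes]]) -> bytes:
    """Try each node in preference order and return the first successful response."""
    last_error: Optional[Exception] = None
    for node in nodes:
        try:
            return node(path)
        except NodeUnavailable as err:
            last_error = err  # remember the failure and try the next node
    raise NodeUnavailable(f"all nodes failed for {path}") from last_error


def failing_node(path: str) -> bytes:
    raise NodeUnavailable("node is down")


def healthy_node(path: str) -> bytes:
    return b"content of " + path.encode()


result = fetch_with_failover("/app.js", [failing_node, healthy_node])
```

The order of `nodes` encodes preference (for example, nearest first), so a failure costs latency but not availability.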
Potential challenges of CDN
- These are questions to consider if you want to show more depth about CDNs than just “CDNs are closer to the user, so it’s great.”
- The answers depend heavily on your application.
- We are caching static data in a CDN, and CDNs cost money. What should we store?
- What happens if there’s a CDN cache miss? How should the request be routed?
- Since there are many CDN nodes, how do you keep every node up to date when content changes?
- How do you pre-populate the static content in a CDN?
- If a CDN node fails, what should happen?
- Does data differ between different regional CDNs?
- Since CDNs are like a cache, what are the cache considerations we should think about?
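Two of the usual cache considerations behind these questions are expiry and invalidation. The sketch below shows one possible approach, a time-to-live (TTL) per entry plus explicit invalidation. `TtlCache` is a hypothetical class, and the injectable `clock` parameter is an assumption added so the behaviour can be checked without waiting for real time to pass.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TtlCache:
    """Hypothetical edge cache where every entry expires after a fixed TTL."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[bytes, float]] = {}

    def put(self, key: str, value: bytes) -> None:
        # Store the value together with the time at which it stops being valid.
        self._entries[key] = (value, self._clock() + self._ttl)

    def get(self, key: str) -> Optional[bytes]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so the caller refetches from the origin.
            del self._entries[key]
            return None
        return value

    def invalidate(self, key: str) -> None:
        # Explicit purge, e.g. after the file changes on the origin.
        self._entries.pop(key, None)
```

A short TTL keeps content fresh at the cost of more origin traffic, while a long TTL does the opposite. Explicit invalidation covers content that changes before its TTL runs out.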