Realtime data update
- Highlighted from System Design Interview Fundamentals, Chapter 5
- Highlighted from Code Karle blog
WebSocket
- WebSocket is a bidirectional connection between the client and the server. The client establishes the connection with the server over TCP.
- Because the underlying TCP connection persists, the WebSocket connection is stateful.
- The connection life-cycle is tied to a machine. When that machine crashes or restarts, the connection needs to be reestablished.
:brain: There are hardly any good reasons to pick the polling protocols in a system design interview unless it’s for a quick prototype or you need to deliver it fast.
- WebSocket is the most commonly used solution for real-time data.
- Since WebSocket is bidirectional, there isn't much the polling and SSE protocols can do that WebSocket can't. The question, then, is when you would pick SSE over WebSocket even though SSE is only unidirectional.
- Compared with WebSocket, SSE is less complex because it runs over traditional HTTP, which means it's less work to get SSE working than to stand up dedicated WebSocket servers to handle the connections.
- Opt for SSE when events flow only one way, from the server to the client; otherwise, use WebSocket. Choosing WebSocket for unidirectional communication would be fine as well.
How to Scale WebSockets
For WebSocket, the client needs to establish a stateful connection to a physical machine with an IP and port.
- A common mistake is thinking the connection is established with the app servers' load balancer.
- Terminating every connection at the load balancer doesn't scale because that single load balancer will run out of memory holding the state for every open connection.
- You might wonder why the connection can't just live on an app server behind the load balancer. The reason is that the client establishes its stateful connection to whatever machine owns the IP and port it dialed, which is the load balancer's machine, not the app server's.
- Also, maintaining long-lived open connections isn't what a load balancer is designed for.
The more common approach is to have a load balancer that hands out the endpoint of a WebSocket proxy farm.
- Each WebSocket server maintains a list of connections to the users.
- A connection consists of from_ip, from_port, to_ip, and to_port.
- The server can use that tuple to figure out which IP and port to forward events back to the client on.
- Since a user can have multiple sessions, you can consider each connection a session.
- Since the connection to the WebSocket servers is stateful, when a WebSocket server goes down, all the clients connected to it need to reconnect.
- This disconnection can cause a thundering herd onto the other servers. The more connections a server holds, the bigger the thundering herd when it goes down.
- Conversely, the fewer connections a server can handle, the more servers the system needs.
- So this is a trade-off discussion point you can potentially have.
- You need to maintain a mapping store from a user attribute to the WebSocket servers, so you know which server to forward a message to.
- For example, say you want to notify all users who live in New York. You can have a store where the key is a location and the value is the list of WebSocket servers holding connections for users from New York.
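To make the mapping concrete, here is a hypothetical in-memory sketch. ConnectionRegistry and LocationIndex are illustrative names, and a production system would keep the attribute-to-server mapping in a shared store (not in process memory) so every server can consult it:

```python
from collections import defaultdict

class ConnectionRegistry:
    """Per-WebSocket-server registry: one entry per session, keyed by the
    (from_ip, from_port, to_ip, to_port) tuple of the connection."""

    def __init__(self) -> None:
        self.sessions: dict[tuple[str, int, str, int], str] = {}

    def connect(self, conn: tuple[str, int, str, int], user_id: str) -> None:
        self.sessions[conn] = user_id

    def disconnect(self, conn: tuple[str, int, str, int]) -> None:
        self.sessions.pop(conn, None)

class LocationIndex:
    """Shared mapping store: location -> WebSocket servers holding
    connections for users in that location."""

    def __init__(self) -> None:
        self.servers_by_location: defaultdict[str, set[str]] = defaultdict(set)

    def register(self, location: str, server_id: str) -> None:
        self.servers_by_location[location].add(server_id)

    def servers_for(self, location: str) -> set[str]:
        return self.servers_by_location.get(location, set())

# To notify everyone in New York, look up which servers hold those sessions
# and fan the event out only to them.
index = LocationIndex()
index.register("new_york", "ws-server-1")
index.register("new_york", "ws-server-2")
print(sorted(index.servers_for("new_york")))  # ['ws-server-1', 'ws-server-2']
```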
Is Connection Still Alive?
When you have an open connection via WebSocket, it is important to determine if the client is still alive.
- There are graceful ways to close a connection by sending a close request, but sometimes, due to a network interruption, the close request never arrives. When that happens, the server doesn't know the client has disconnected.
- Also, sometimes the interruption is temporary and recovers quickly.
- In those situations, you don't want the server to tear the connection down and force the client to reconnect.
- If the client has gone away but the server still treats the connection as open, the server keeps sending data to the client unnecessarily.
- As a result, the connection servers may be holding onto dead connections and wasting memory.
- For some system design questions, like Facebook Chat's user online status, the open connection likely serves as the indicator of online and offline.
- In addition, for system design questions like designing a notification server, where the WebSocket connection is the core of the design, you need to think about the edge case of an unexpected client disconnection.
- The clients send heartbeats to the server to handle temporarily interrupted connections. That way, the server knows the client is still alive, and the client knows the server is still alive.
- As in any distributed system, the browser may temporarily hang or the network may have an interruption, such that the server doesn't receive pings as frequently as it expects, but that doesn't mean the client has disconnected. The way to handle this is a timeout setting on the server side.
- Every time the server receives a ping, it remembers the timestamp. If the server doesn't receive another ping from the client within the next buffer_seconds, it declares the connection dead.
:brain: For trade-off discussions in a system design interview, you can talk about the frequency of the ping and the size of buffer_seconds, and how they impact the end user and the design.
- A more frequent ping gives better accuracy of the connection status but puts more load on the server.
- On the other hand, a bigger buffer_seconds results in less toggling between online and offline while the client is still online, but more inaccuracy once the client has disconnected, because you have to wait for buffer_seconds to expire.
Other Realtime Data Update Protocols
Server-Sent Event (SSE)
The client requests an SSE connection from the server.
- The server keeps the connection open and streams events over it; the client only receives on that connection and doesn't send.
- SSE is appropriate for applications like a stock price feed, where only the server pushes data to the client and not the other way around.
Long Poll
In long polling, the client sends a request to the server, and the server keeps that connection open until there is data to respond with.
- Suppose there's no response after some time; then the connection times out.
- After the long-poll timeout, the client can decide to send another request to the server and wait for an update.
- :whale: Long polling has an advantage over short polling: the server notifies the client only when there is updated data.
- :weary: The downside is the additional overhead and complexity of being connection aware.
Short Poll
Short polling is the process during which the client sends a request to fetch updated information periodically.
- :weary: Short polling is not preferable since you may be issuing many requests that return no update, overloading the servers unnecessarily.
- The only real reason to do this is for a startup or prototype project that wants to avoid the overhead of maintaining server-side connections. But that is unlikely to be the justification in a scalable system design question.
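A short-poll loop is just a periodic fetch; this sketch (with an illustrative fetch stub and the sleep between polls elided) shows why most requests come back empty:

```python
import itertools

def short_poll(fetch, max_polls: int):
    """Poll fetch() up to max_polls times, collecting non-empty responses."""
    results = []
    for _ in range(max_polls):
        data = fetch()
        if data is not None:       # most polls return nothing new
            results.append(data)
    return results

# Fake server: only every third poll has an update.
counter = itertools.count(1)
fetch = lambda: "update" if next(counter) % 3 == 0 else None

print(short_poll(fetch, max_polls=6))
# ['update', 'update'] (4 of the 6 requests were wasted)
```

Those four empty round trips, repeated across every client on every interval, are the unnecessary server load the notes warn about.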