Deep Dive into WebSockets and Their Role in Client-Server Communication


Photo by Kelly from Unsplash

Real-time communication is everywhere: live chatbots, data streams, instant messaging. WebSockets are a powerful enabler of this, but when should you use them? How do they work, and how do they differ from traditional HTTP requests?

This article was inspired by a recent system design interview question, “design a real-time messaging app”, where I stumbled through some concepts. Now that I’ve dug deeper, I’d like to share what I’ve learned so you can avoid the same mistakes.

In this article, we’ll explore how WebSockets fit into the bigger picture of client-server communication. We’ll discuss what they do well, where they fall short, and, yes, how to design a real-time messaging app.

Client-server communication

At its core, client-server communication is the exchange of data between two entities: a client and a server.

The client requests data, and the server processes those requests and returns a response. These roles are not exclusive: a service can act as both a client and a server at the same time, depending on the context.

Before diving into the details of WebSockets, let’s take a step back and explore the bigger picture of client-server communication methods.

1. Short polling

Short polling is the simplest, most familiar approach.

The client repeatedly sends HTTP requests to the server at regular intervals (e.g., every few seconds) to check for new data. Each request is independent and one-directional (client → server).

This approach is easy to set up but can waste resources if the server rarely has fresh data. Use it for less time-sensitive applications where occasional polling is sufficient.
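As a rough illustration of the polling loop described above, here is a minimal sketch. The `fetch` callable stands in for an HTTP request and is an assumption of this example, as are the parameter names and defaults.

```python
import time
from typing import Callable, Optional


def short_poll(
    fetch: Callable[[], Optional[str]],
    interval_seconds: float = 2.0,
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call `fetch` at a fixed interval until it returns data or attempts run out.

    `fetch` is a hypothetical stand-in for one HTTP request; it returns None
    when the server has nothing new.
    """
    for attempt in range(max_attempts):
        data = fetch()  # one independent request per iteration
        if data is not None:
            return data
        if attempt < max_attempts - 1:
            sleep(interval_seconds)  # wait before asking again
    return None
```

Every iteration costs a full request even when nothing has changed, which is the wasted work mentioned above.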

2. Long polling

Long polling is an improvement over short polling, designed to reduce the number of unnecessary requests. Instead of responding immediately to a client request, the server keeps the connection open until new data is available. Once the server has data, it sends the response, and the client immediately sends a new request.

Long polling is also stateless and one-directional (client → server).

A common example is a ride-hailing app, where the client waits for a match or booking update.
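To make the "hold the request open" idea concrete, here is a small sketch of the server side. It uses a standard-library queue as a stand-in for whatever produces updates; the function name and the 30-second default are assumptions for illustration.

```python
import queue
from typing import Optional


def handle_long_poll(updates: "queue.Queue[str]", timeout_seconds: float = 30.0) -> Optional[str]:
    """Hold one request open until an update arrives or the timeout expires."""
    try:
        # Block here instead of answering "nothing new" right away.
        return updates.get(timeout=timeout_seconds)
    except queue.Empty:
        # Nothing arrived in time; the client would normally send a new request.
        return None
```

In the ride-hailing example, the queue would receive an item once a driver is matched, and the pending request would return at that moment.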

3. Webhooks

Webhooks flip the script by making the server the initiator. The server sends HTTP POST requests to a client-defined endpoint whenever specific events occur.

Each request is independent and doesn’t rely on a persistent connection. Webhooks are also one-directional (server → client).

Webhooks are widely used for asynchronous notifications, especially when integrating with third-party services. For example, payment systems use webhooks to notify clients when the status of a transaction changes.
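Here is a minimal sketch of what the receiving side of a payment webhook might do with the POST body. The JSON field names (`transaction_id`, `status`) and the in-memory `payment_status` dict are hypothetical; real providers define their own payload format.

```python
import json
from typing import Dict, Tuple


def handle_webhook(body: bytes, payment_status: Dict[str, str]) -> Tuple[int, str]:
    """Process one webhook POST body and return (HTTP status code, message)."""
    try:
        event = json.loads(body)
        transaction_id = str(event["transaction_id"])  # hypothetical field name
        status = str(event["status"])                  # hypothetical field name
    except (ValueError, KeyError, TypeError):
        return 400, "malformed event"
    payment_status[transaction_id] = status  # record the latest known state
    return 200, "ok"
```

Each call handles exactly one event and keeps no connection open, which matches the independent, one-directional nature of webhooks.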

4. Server-Sent Events (SSE)

SSE is a native, HTTP-based event streaming protocol that allows servers to push real-time updates to clients over a single, persistent connection.

SSE works using the EventSource API, making it simple to implement in modern web applications. It’s one-directional (server → client) and ideal for situations where the client only needs to receive updates.

SSE is well-suited for applications like trading platforms or live sports updates, where the server pushes data like stock prices or scores in real time. The client doesn’t need to send data back to the server in these scenarios.
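On the wire, an SSE stream is plain text served with the `text/event-stream` content type, where each event is a few `field: value` lines followed by a blank line. The helper below is a small sketch of that framing; the function name is my own.

```python
from typing import Optional


def format_sse(data: str, event: Optional[str] = None, event_id: Optional[str] = None) -> str:
    """Encode one event in the text/event-stream wire format."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # Multi-line payloads are sent as one "data:" line per line of text.
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    # A blank line marks the end of the event.
    return "\n".join(lines) + "\n\n"
```

A server would write the returned string to the open response for each update, and the browser's EventSource would surface it as a message event.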

However what about two-way communication?

All the methods above focus on one-directional flow. For true two-way, real-time exchanges, we need a different approach. That’s where WebSockets shine.

Let’s dive in.

How do WebSockets work?

WebSockets enable real-time, bidirectional communication, making them a good fit for applications like chat apps, live notifications, and online gaming. Unlike the traditional HTTP request-response model, WebSockets create a persistent connection, where both client and server can send messages independently without waiting for a request.

The connection starts as a regular HTTP request and is upgraded to a WebSocket connection through a handshake.

Once established, it uses a single TCP connection, operating on the same ports as HTTP (80 and 443). Messages travel as frames with only a few bytes of header rather than a full set of HTTP headers, which makes them efficient for low-latency, high-interactivity use cases.

WebSocket connections follow a specific URI scheme: ws:// for plain connections and wss:// for secure, TLS-encrypted connections.

What’s a handshake?

A handshake is the process of initialising a connection between two systems. For WebSockets, it starts with an HTTP GET request from the client, asking for a protocol upgrade. This ensures compatibility with HTTP infrastructure before transitioning to a persistent WebSocket connection.

1. Client sends a request

The request carries headers that look like:

GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Origin: http://example.com
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 13

  • Upgrade: signals that the client wants to switch protocols
  • Sec-WebSocket-Key: a randomly generated, base64-encoded 16-byte value used for handshake verification
  • Sec-WebSocket-Protocol (optional): lists the subprotocols the client supports, allowing the server to pick one

2. Server responds to the request

If the server supports WebSockets and agrees to the upgrade, it responds with a 101 Switching Protocols status. Example headers:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
Sec-WebSocket-Protocol: chat

  • Sec-WebSocket-Accept: the base64-encoded SHA-1 hash of the client’s Sec-WebSocket-Key concatenated with a fixed GUID. It lets the client confirm that the server actually understood the WebSocket handshake; it is not an authentication or encryption mechanism.

3. Handshake validation

Once the client receives the 101 Switching Protocols response and checks that Sec-WebSocket-Accept matches the key it sent, the WebSocket connection is established and both client and server can start exchanging messages in real time.

This connection remains open until it’s explicitly closed by either party.

If any status code other than 101 is returned, the handshake has not completed and no WebSocket connection is established.
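The Sec-WebSocket-Accept value from step 2 can be reproduced with a few lines of standard-library Python. This is a sketch of the calculation defined in RFC 6455; the function names are my own, and the GUID is the fixed constant from the specification.

```python
import base64
import hashlib
import os

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def generate_key() -> str:
    """Client side: a random 16-byte nonce, base64-encoded."""
    return base64.b64encode(os.urandom(16)).decode("ascii")


def compute_accept(sec_websocket_key: str) -> str:
    """Server side: SHA-1 of key + GUID, base64-encoded."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_accept(sent_key: str, received_accept: str) -> bool:
    """Client side: check the server's answer against the key that was sent."""
    return compute_accept(sent_key) == received_accept
```

Feeding in the key from the request above, `compute_accept("dGhlIHNhbXBsZSBub25jZQ==")` yields the `s3pPLMBiTxaQ9kYGzzhZRbK+xOo=` value shown in the response.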

Here’s a summary.

Summary of WebSockets (drawn by me)

WebSocket use cases

We’ve talked about how WebSockets enable real-time, bidirectional communication, but that’s still a fairly abstract term. Let’s nail down some real examples.

WebSockets are widely used in real-time collaboration tools and chat applications, such as Excalidraw, Telegram, WhatsApp, Google Docs, Google Maps and the live chat section during a YouTube or TikTok live stream.

Trade-offs

1. You need a fallback strategy if connections are terminated

WebSockets don’t automatically recover if the connection is terminated due to network issues, server crashes, or other failures. The client must explicitly detect the disconnection and attempt to re-establish the connection.

Long polling is often used as a backup while a WebSocket connection is being re-established.
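One common way to handle re-establishing the connection is to retry with exponential backoff. That technique is my addition rather than something this article prescribes, and the `connect` callable below is a hypothetical stand-in for opening a WebSocket.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential backoff: base * 2^attempt seconds, capped at `cap`."""
    return min(cap, base * (2 ** attempt))


def reconnect(
    connect: Callable[[], T],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Try `connect` until it succeeds or attempts run out.

    `connect` is assumed to raise ConnectionError on failure.
    """
    for attempt in range(max_attempts):
        try:
            return connect()
        except ConnectionError:
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(backoff_delay(attempt) * random.uniform(0.5, 1.0))
    return None  # caller can switch to a fallback such as long polling
```

Returning `None` after the last attempt gives the caller a clear point at which to fall back to long polling.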

2. Not optimised for streaming audio and video data

WebSockets are designed for sending small, structured messages. For streaming large media data, a technology like WebRTC is better suited.

3. WebSockets are stateful, hence horizontal scaling is not trivial

WebSockets are stateful, meaning the server must maintain an active connection for every client. This makes horizontal scaling more complex compared to stateless HTTP, where any server can handle a client request without maintaining persistent state.

To route messages between servers, you’ll need an additional pub/sub layer.

Design a real-time messaging app

Now let’s see how this is applied in system design. I’ve covered both the simple (unscalable) solution and a horizontally scaled one.

End-to-end flow for a horizontally scaled, real time 1–1 chat (drawn by me)

Non-scalable single-server app: how do two users chat in real time?

  1. All users connect via WebSocket to one server. The server holds an in-memory mapping of userID : WebSocket connection (for example, user1 : WebSocket conn 1).
  2. user1 sends the message over its WebSocket connection to the server.
  3. The server writes the message to the MessageDB (persistence first).
  4. The server then looks up user2 : WebSocket conn 2 in its in-memory map. If user2 is online, it delivers the message in real time.
  5. If user2 is offline, the server writes to InboxDB (a store of undelivered messages). When user2 comes back online, the server fetches all offline messages from InboxDB and delivers them.
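The five steps above can be sketched with plain in-memory structures. Everything here is a stand-in: `FakeConnection` replaces a real WebSocket, and a list and a dict replace MessageDB and InboxDB. The class and method names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class FakeConnection:
    """Stand-in for a WebSocket connection; records what was sent."""

    def __init__(self) -> None:
        self.sent: List[str] = []

    def send(self, message: str) -> None:
        self.sent.append(message)


class SingleChatServer:
    def __init__(self) -> None:
        self.connections: Dict[str, FakeConnection] = {}          # userID -> connection
        self.message_db: List[Tuple[str, str, str]] = []          # stand-in for MessageDB
        self.inbox_db: Dict[str, List[str]] = defaultdict(list)   # stand-in for InboxDB

    def connect(self, user_id: str) -> FakeConnection:
        conn = FakeConnection()
        self.connections[user_id] = conn
        # Step 5: deliver anything that queued up while the user was offline.
        for message in self.inbox_db.pop(user_id, []):
            conn.send(message)
        return conn

    def disconnect(self, user_id: str) -> None:
        self.connections.pop(user_id, None)

    def send_message(self, sender: str, recipient: str, text: str) -> None:
        self.message_db.append((sender, recipient, text))  # step 3: persistence first
        conn = self.connections.get(recipient)             # step 4: look up recipient
        if conn is not None:
            conn.send(text)
        else:
            self.inbox_db[recipient].append(text)          # step 5: store for later
```

Because the connection map lives in one process, this design stops working as soon as a second server is added, which is the problem the next section addresses.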

Horizontally scaled system: how do two users chat in real time?

A single server can only handle so many concurrent WebSockets. To serve more users, you need to horizontally scale your WebSocket servers.

The key challenge: if user1 is connected to server1 but user2 is connected to server2, how does the system know where to send the message?

Redis can be used as a global data store that maps userID : serverID for active WebSocket sessions. Each server updates Redis when a user connects (goes online) or disconnects (goes offline).

For example:

  • user1 connects to server1. server1’s in-memory map: user1 : WebSocket connection. server1 also writes to Redis: user1 : server1

  • user2 connects to server2. server2’s in-memory map: user2 : WebSocket connection. server2 also writes to Redis: user2 : server2

End-to-end chat flow: user1 sends a message to user2

  1. user1 sends a message through its WebSocket on server1.
  2. server1 passes the message to a Chat Service.
  3. Chat Service first writes the message to MessageDB for persistence.
  4. Chat Service then checks Redis to get the online/offline status of user2.
  5. If user2 is online, Chat Service publishes the message to a message broker, tagging it with: “user2: server2”.
  6. The broker then routes the message to server2.
  7. server2 looks up its local in-memory mapping to find the WebSocket connection of user2 and pushes the message in real time over that WebSocket.
  8. If user2 is offline (no entry in Redis), Chat Service writes the message to the InboxDB. When user2 comes back online, Chat Service fetches all the undelivered messages.
  9. Every time a WebSocket connection is opened or closed, the servers update Redis.
  10. When a user first loads the app or opens a chat, the Chat Service fetches historical messages (e.g., from the last 10 days) from MessageDB. A cache layer can reduce repeated DB queries.
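The routing logic in steps 3 to 8 can be sketched without any real infrastructure. In this simplified model a shared dict stands in for Redis, a direct method call stands in for the message broker, and lists stand in for the databases and the WebSocket itself. All names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class WebSocketServer:
    def __init__(self, server_id: str, presence: Dict[str, str]) -> None:
        self.server_id = server_id
        self.presence = presence                       # stand-in for Redis: userID -> serverID
        self.connections: Dict[str, List[str]] = {}    # userID -> messages pushed to that user

    def connect(self, user_id: str) -> None:
        self.connections[user_id] = []
        self.presence[user_id] = self.server_id        # step 9: mark the user online

    def disconnect(self, user_id: str) -> None:
        self.connections.pop(user_id, None)
        self.presence.pop(user_id, None)               # step 9: mark the user offline

    def deliver(self, user_id: str, text: str) -> None:
        self.connections[user_id].append(text)         # step 7: push over the local socket


class ChatService:
    def __init__(self, presence: Dict[str, str], servers: Dict[str, WebSocketServer]) -> None:
        self.presence = presence
        self.servers = servers                         # stand-in for broker routing
        self.message_db: List[Tuple[str, str, str]] = []
        self.inbox_db: Dict[str, List[str]] = defaultdict(list)

    def send(self, sender: str, recipient: str, text: str) -> str:
        self.message_db.append((sender, recipient, text))  # step 3: persist first
        server_id = self.presence.get(recipient)           # step 4: check presence
        if server_id is None:
            self.inbox_db[recipient].append(text)          # step 8: recipient offline
            return "queued"
        self.servers[server_id].deliver(recipient, text)   # steps 5-7: route and push
        return server_id
```

A real system would also have to handle the race where a user disconnects between the presence lookup and the delivery; this sketch ignores that case.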

Some important design considerations:

  1. Persistence first: all messages go to the DB before being delivered. If a push over the WebSocket fails, the message is still safe in the DB.

  2. Redis: stores only active connections to minimise overhead. A replica can be added to prevent a single point of failure.

  3. InboxDB: helps to handle offline cases cleanly.

  4. Chat Service abstraction: the WebSocket servers handle real-time connections and routing. The Chat Service layer handles HTTP requests and all DB writes. This separation of concerns makes it easier to scale or evolve each piece.

  5. Ensuring in-order delivery of messages: typical “real-time push” workflows can have network variations, leading to messages arriving out of order. Many message brokers also don’t guarantee strict ordering. To handle this, each message is assigned a timestamp at creation. Even if messages arrive out of order, the client can reorder them based on the timestamp.

  6. Load balancers: an L4 load balancer (TCP) for sticky WebSocket connections, and an L7 load balancer (HTTP) for regular requests (CRUD, login, etc.).
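For the in-order delivery point (item 5), client-side reordering can be as simple as a sort on the creation timestamp. The `Message` shape below is hypothetical, and the tie-break on `message_id` is my own addition so that messages with equal timestamps sort deterministically.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Message:
    message_id: str
    created_at_ms: int  # timestamp assigned when the message was created
    text: str


def in_display_order(messages: List[Message]) -> List[Message]:
    """Return messages sorted by creation time, regardless of arrival order."""
    return sorted(messages, key=lambda m: (m.created_at_ms, m.message_id))
```

This only works well if timestamps come from a consistent clock, so assigning them on the server side is usually the safer choice.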

Wrapping up

That’s all for now! There’s so much more we could explore, but I hope this gave you a solid starting point. Feel free to drop your questions in the comments below 🙂

I write regularly about Python, software development and the projects I build, so give me a follow to not miss out. See you in the next article.