
TCP — Transmission Control Protocol

  • the most common protocol when developing distributed systems
  • the backbone of the internet
  • Each TCP/IP packet has basically four fields for addressing. These are:
source_ip source_port destination_ip destination_port
<----- client ------> <--------- server ------------>

Inside the TCP stack, these four fields are used as a compound key to match up packets to connections (e.g. file descriptors).

If a client has many connections to the same port on the same destination, then three of those fields will be the same — only source_port varies to differentiate the connections. Ports are 16-bit numbers, so the maximum number of connections any given client can have to any given host port is ~64K (2¹⁶ = 65,536).

However, multiple clients can each have up to 64K connections to some server’s port, and if the server has multiple ports or either is multi-homed then you can multiply that further.

So the real limit is file descriptors (what is a file descriptor?). Each socket connection is given a file descriptor, so the limit is really the number of file descriptors the system has been configured to allow and has the resources to handle. The cap is typically well over 300K and is configurable, e.g. with sysctl.
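You can inspect the descriptor limit the current process runs under straight from the stdlib (Unix-only; the numbers depend on how the box is tuned):

```python
import resource

# Per-process cap on open file descriptors; every socket consumes one,
# so this bounds how many concurrent connections this process can hold.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft limit: {soft}, hard limit: {hard}")
```

An unprivileged process may raise its soft limit up to the hard limit with `resource.setrlimit`; raising the hard limit itself needs root (or a ulimit/sysctl change).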

IP address + port = Socket

Realistic limits reported for ordinary boxes are around 80K, e.g. for single-threaded Jabber messaging servers.

There are many ways to overcome the ~64K limit (or any other such limit), so it is not a hard ceiling

  • Pros:
    → Reliable ▶ message acknowledgement
    → Ordered ▶ messages are numbered; if a piece is missing it is retransmitted
    → Error-checked ▶ checksum
    → Retries
  • Cons:
    → Slower than some other protocols (e.g. UDP)
    → Duplication due to a failed “ack” and retry (handled using idempotency)
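The duplication con is commonly neutralized with an idempotency key: the receiver remembers which keys it has already processed and replays the stored result instead of re-running the side effect. A minimal in-memory sketch (names are illustrative; a real system would keep the key store shared and durable):

```python
processed = {}  # idempotency_key -> result of the first successful handling

def handle(idempotency_key, payload):
    # A retransmitted message carries the same key, so the side effect
    # (e.g. charging a card) runs at most once.
    if idempotency_key in processed:
        return processed[idempotency_key]
    result = f"charged {payload['amount']}"  # stand-in for the real side effect
    processed[idempotency_key] = result
    return result

first = handle("msg-1", {"amount": 100})
retry = handle("msg-1", {"amount": 100})  # duplicate after a lost ack
print(first == retry, len(processed))  # True 1 -> charged exactly once
```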

Maximum TCP connections?

On the TCP level the tuple (source ip, source port, destination ip, destination port) must be unique for each simultaneous connection. That means a single client cannot open more than 65535 simultaneous connections to a single server. But a server can (theoretically) serve 65535 simultaneous connections per client.

  • Max connections for multiple clients to the same server — A single listening port can accept more than one connection simultaneously.
    → Theoretically there is no limit (as long as the requests come from different IP addresses); in practice it depends on the memory and CPU available to actually process the TCP connections
    WhatsApp handles ~3 million TCP connections per server
  • Per Client Per Server Port — For each client connecting to a particular server, there is a limit — why? Because the TCP source port field is 16 bits, which gives 2¹⁶ = 65,536 ≈ 64K possible ports (some are reserved), so roughly a 64K-connection limit
    → The server is listening on a single port, what’s changing here is the client port
  • Max connections on the Reverse proxy:

Layer 4 Proxy → can open only ~64K connections to a single backend
→ solutions for scaling to more clients:
— 1️⃣ add more L4 proxies 2️⃣ listen on a different port on the same proxy

UDP — User Datagram Protocol

  • no acknowledgements (makes sense for real-time delivery)
  • messages can be lost; corrupted datagrams are dropped, not retransmitted (UDP’s checksum is optional in IPv4)
  • no packet numbering
  • faster than TCP
  • ✅ Used for sending streams of data — Monitoring metrics, Video streaming, Gaming, Stock exchange
  • ❓ Can you afford to lose some of the data?
  • ❓ Is speed > reliability for your use case?
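The whole UDP exchange fits in a few stdlib calls: no handshake, no acknowledgement, and if the datagram is lost nothing notices. A loopback sketch:

```python
import socket

# "Server": bind a datagram socket to an OS-assigned loopback port.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))
server.settimeout(2.0)
port = server.getsockname()[1]

# "Client": sendto needs no prior connection; fire and forget.
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.sendto(b"cpu_load 0.42", ("127.0.0.1", port))

data, addr = server.recvfrom(1024)
print(data)  # b'cpu_load 0.42'

client.close()
server.close()
```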

TCP vs UDP

HTTP — Hypertext Transfer protocol

  • Hypertext — text with links to other documents
  • Based on TCP
  • Request — Response protocol
  • Layer 7 — Application
  • Request
    → Method
    → URL
    → Headers
    → Body (optional)
  • Response
    → Status
    → Headers
    → Body (optional)
  • HTTP Methods
    GET — read
    DELETE — delete
    POST — create [can also be used for reading when the query is too large to fit in a GET URL]
    PUT — update
    PATCH — partial update
  • HTTP Status codes
    → 100–199 — Informational ⏩ 100 — Continue
    → 200–299 — Successful ⏩ 200 — OK, 201 — Created
    → 300–399 — Redirection ⏩ 301 — Moved Permanently
    → 400–499 — Client Error ⏩ 401 — Unauthorized, 403 — Forbidden, 404 — Not Found
    → 500–599 — Server Error ⏩ 500 — Internal Server Error, 503 — Service Unavailable

Statelessness means that every HTTP request happens in complete isolation. When the client makes an HTTP request, it includes all information necessary for the server to fulfill the request.

The server never relies on information from previous requests from the client. If any such information is important then the client will send that as part of the current request.

Problem with HTTP

  • Head-of-line blocking was a problem with HTTP/1.1; in HTTP/2.0 (which gRPC uses) this problem was resolved with multiplexed streams
Source — https://medium.com/globant/essential-guide-to-http-b7d6f7023b7f
  • BUT, it’s resolved only at the application layer. If we deliver different streams over a single TCP connection and one stream drops a packet, then all streams are blocked until it is retransmitted, so multiplexing over a single TCP connection doesn’t fully help → HTTP/3.0 solves this by running over QUIC on UDP, where streams are retransmitted and delivered independently

TODO

HTTP1.0 vs HTTP2.0 vs HTTP3.0

Source — https://ably.com/topic/http-2-vs-http-3

HTTPS | How SSL works —

REST

  • Build on HTTP
  • Standard URL structure
  • Standard use for HTTP verbs
  • HTTP status codes to indicate errors
  • JSON as body format
  • Stateless — rather than relying on the server remembering previous requests, REST applications require each request to contain all of the information necessary for the server to understand it.
  • Cacheable
  • Identifiers

RESTful

  • RESTful verbs
    → GET — read
    → DELETE — delete
    → POST — create
    → PUT — update
  • RESTful URLs
METHOD /[resource/id]
For example: GET /users/123, and not /user/123 or /getUser/123
  • RESTful URLs — nested resources
METHOD /[resource/id]/[resource/id]
For example:
DELETE /users/123/books/567 -- user 123 returns borrowed book 567
GET /users/123/books/ -- fetches all books borrowed by user 123
  • RESTful URLs — state
    → Use PUT for state change/switch as it’s idempotent [setting the same state twice is okay]
PUT /[resource/id]/[action]
For example:
PUT /users/567/enable
PUT /users/567/disable
  • RESTful URLs — safety
    GET shouldn’t change/mutate an entity in any way. Don’t do this:
GET /users/123/ban
  • RESTful URLs — idempotency
PUT /users/123/online vs POST /users/123/likes
(PUT is idempotent: repeating it leaves the same state; repeating the POST adds another like)
  • RESTful URLs — pagination
    → query params as it’s optional and can have a default value if not supplied
GET /[resource/id]?limit=X&offset=Y
Example:
GET /books?limit=50&offset=100
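On the server, limit/offset pagination maps directly onto a slice of the ordered result set; a sketch with defaults applied when the query params are omitted (BOOKS stands in for a database query):

```python
BOOKS = [f"book-{i}" for i in range(1, 251)]  # stand-in for a DB result set

def list_books(limit=50, offset=0):
    # GET /books?limit=X&offset=Y -> BOOKS[offset : offset + limit]
    return BOOKS[offset:offset + limit]

page = list_books(limit=50, offset=100)
print(page[0], page[-1], len(page))  # book-101 book-150 50
```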

REST is Stateless

  • The server does not store any state about the client session on the server side. This restriction is called statelessness.
  • Each request from the client to the server must contain all of the necessary information to understand the request. The server cannot take advantage of any stored context on the server.
  • The application’s session state is therefore kept entirely on the client. The client is responsible for storing and handling the session related information on its own side.
  • This also means that the client is responsible for sending any state information to the server whenever it is needed. There should not be any session affinity or sticky session between the client and the server.

Advantages of Stateless APIs

  1. Statelessness helps in scaling the APIs to millions of concurrent users by deploying them to multiple servers. Any server can handle any request because there is no session related dependency.
  2. Being stateless makes REST APIs less complex — by removing all server-side state synchronization logic.
  3. A stateless API is also easy to cache as well. Specific software can decide whether or not to cache the result of an HTTP request just by looking at that one request. There’s no nagging uncertainty that state from a previous request might affect the cacheability of this one. It improves the performance of applications.
  4. The server never loses track of “where” each client is in the application because the client sends all necessary information with each request.

More on the below articles →

WebSocket

⚠ TODO — Merge with Websocket — [Notes]

WebSockets are fundamentally long-lived TCP sockets with a HTTP-like handshake and minimal framing for messages.

A TCP (and hence WebSocket) connection established to a server, but not sending or receiving (sitting idle), does consume memory on the server, but no CPU cycles.

To keep the TCP connection alive (and also “responsive”) in certain network environments, like mobile, may require periodically sending/receiving small amounts of data. E.g. WebSocket has built-in ping/pong (non-app-data) messages for that. Doing so consumes some CPU cycles, but not many.
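Those pings are cheap on the wire too. A sketch of how a client-side ping frame is laid out per RFC 6455 (FIN bit plus opcode 0x9, and clients must mask their payload):

```python
import os

def client_ping_frame(payload: bytes = b"") -> bytes:
    # Control frames may carry at most 125 bytes of payload.
    assert len(payload) <= 125
    header = bytes([0x89, 0x80 | len(payload)])  # 0x89 = FIN|ping, 0x80 = mask bit
    mask_key = os.urandom(4)
    masked = bytes(b ^ mask_key[i % 4] for i, b in enumerate(payload))
    return header + mask_key + masked

frame = client_ping_frame(b"hi")
print(len(frame))  # 2 header + 4 mask key + 2 payload = 8 bytes
```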

WebSockets require the server to store state associated with each connection; they require maintenance (such as keep-alive packets or WebSocket pings) and monitoring (to detect state changes or arriving information). So you spend some memory and CPU resources per connection.

BUT they save a lot of time, and often resources, on connection re-initializations; once established, they allow both parties to send and receive information as opposed to non-persistent client-server systems like classic HTTP.

So it really depends on the system you’re building. If your system has millions of users that need connectivity to the server only once in a while, then the benefit of keeping these connections open is probably not worth the extra resources. But if you’re designing something like a chat server for hundred people, then the additional responsiveness is probably worth it.

  • 🔄 duplex (2 way) protocol — built on TCP, same as HTTP
  • Persistent connection
  • More efficient than polling with HTTP
  • 💚 Connection is established only once.
  • 💚 Real-time message delivery to the client
  • ⛔ More complicated to implement than HTTP — on a network error the WebSocket connection is dropped and must be reconnected in application code
  • ⛔ May not have the best support in some languages
  • ⛔ Load balancers may have trouble with long-lived connections
  • ⛔ Unlike REST, need to reinvent the “protocol” every time
  • ⛔ The app needs a persistent connection with a particular server.
    With websockets’ defaults, on the server side, a single connection uses about 70 KiB of memory.
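That per-connection figure makes capacity planning a one-liner; back-of-the-envelope math for a million idle connections at the 70 KiB default:

```python
PER_CONN_KIB = 70            # default per-connection memory quoted above
connections = 1_000_000

total_gib = connections * PER_CONN_KIB / (1024 * 1024)  # KiB -> GiB
print(f"{total_gib:.1f} GiB")  # ~66.8 GiB just to hold idle connection state
```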

How many system resources are needed to keep 1,000,000 WebSockets open?

On today’s systems, handling 1 million concurrent TCP connections is not an issue.

We had to demonstrate several times, to some of our customers, that 1 million connections can be reached on a single box (and not necessarily a super-monster machine). But let me recap the configuration where we tested 500K concurrent connections, as this is a much more recent test performed on Amazon EC2.

We installed Lightstreamer Server (which is a WebSocket server, among other things) on a m2.4xlarge instance. This means 8 cores and 68.4 GiB memory.

We launched 11 client machines to create 500,000 concurrent connections to the Lightstreamer Server. The test was configured so that the total outbound throughput from the server was 90,000 updates/s, resulting in peaks of 450 Mbit/s outbound bandwidth.

The server never used more than 13 GiB of RAM and the CPU was stable around 60%.

With at least 30 GiB RAM you can handle 1 million concurrent sockets. The CPU needed depends on the data throughput you need.

Short answer: yes, but it’s expensive.

Long answer:

This question is not unique to WebSockets since WebSockets are fundamentally long-lived TCP sockets with a HTTP-like handshake and minimal framing for messages.

The real question is: could a single server handle 1,000,000 simultaneous socket connections and what server resources would this consume? The answer is complicated by several factors, but 1,000,000 simultaneous active socket connections is possible for a properly sized system (lots of CPU, RAM and fast networking) and with a tuned server system and optimized server software.

The number of connections is not the primary problem (that’s mostly just a question of kernel tuning and enough memory), it is the processing and sending/receiving data to/from each of those connections. If the incoming connections are spread out over a long period, and they are mostly idle or infrequently sending small chunks of static data then you could probably get much higher than even 1,000,000 simultaneous connections. However, even under those conditions (slow connections that are mostly idle) you will still run into problems with networks, server systems and server libraries that aren’t configured and designed to handle large numbers of connections.

Here are some older but still applicable resources to read on how you would configure your server and write your server software to support large numbers of connections:

What is the theoretical maximum number of open TCP connections that a modern Linux box can have

http://www.kegel.com/c10k.html

Apparently 12 million socket connections are possible on a single JVM. See how they did it — https://mrotaru.wordpress.com/2013/10/10/scaling-to-12-million-concurrent-connections-how-migratorydata-did-it/

I think the total count of websocket connections alone is not a problem; the kernel can handle 10M+ just fine. The problem is buffering (e.g. if you need to push lots of data to many sockets and a client doesn’t drain its socket, you end up with lots of RAM reserved for outgoing TCP/IP buffers) and per-socket data on the server. For example, if you run Node.js on the server, you pay RAM per connection for any objects related to that connection. In theory one could optimize that too, but it would be hugely expensive because you would need code quality similar to the Linux kernel’s.

Advance — TODO

Long Polling

⚠ Integrate — https://ably.com/blog/websockets-vs-long-polling

  • With constant polling (HTTP requests at regular intervals) resources are wasted: the connection is opened and closed over and over until we get what we are looking for
  • The connection will be open until there is a response or timeout
  • 🟢 Real-time message delivery
  • 🟢 Good alternative if WebSockets are not an option
  • 🔴 May be hard to implement in some languages and frameworks
    → e.g. in Ruby on Rails, connections are controlled by the framework
    → if the web server is process- or thread-based, holding many connections open is difficult to manage
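At its core a long-poll handler is a blocking wait with a timeout; a sketch where a stdlib queue stands in for whatever produces events (names and status codes are illustrative):

```python
import queue

messages = queue.Queue()  # stand-in for the server-side event source

def long_poll(timeout=30.0):
    # Hold the request open until a message arrives or the timeout fires;
    # on timeout the client simply issues the next long-poll request.
    try:
        return {"status": 200, "body": messages.get(timeout=timeout)}
    except queue.Empty:
        return {"status": 204, "body": None}

messages.put("new chat message")
print(long_poll(timeout=1.0))   # returns immediately: a message was waiting
print(long_poll(timeout=0.1))   # blocks ~0.1 s, then times out with 204
```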

RPC — Remote Procedure Call

How do microservices communicate internally?

  • Invoke another service as if it is a local function
  • A function is described in an abstract language:
    IDL → Interface Description Language
  • A generator takes the description as input and creates an implementation in a particular language
  • RPC takes care of communication, marshalling, and unmarshalling
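Marshalling just means turning the call (method name plus arguments) into bytes for the wire and back again. A toy version using JSON; real RPC frameworks generate a binary codec from the IDL instead:

```python
import json

def marshal(method, *args):
    # Client stub: encode the call for transport.
    return json.dumps({"method": method, "args": list(args)}).encode()

def dispatch(wire_bytes, handlers):
    # Server stub: decode (unmarshal) and invoke the local implementation.
    call = json.loads(wire_bytes)
    return handlers[call["method"]](*call["args"])

handlers = {"add": lambda a, b: a + b}  # the "remote" procedure
print(dispatch(marshal("add", 2, 3), handlers))  # 5
```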

gRPC

gRPC is a modern open source high performance Remote Procedure Call (RPC) framework that can run in any environment.

  • developed by Google
  • Uses Protobuf as IDL
  • Uses HTTP2.0 as transport
  • 🔴 Cannot be used directly in the browser
  • Language agnostic
  • Binary, more efficient than JSON but not human readable
  • Great for communicating between services
  • Fast and typesafe
  • Every service is a client, when it’s sending a request.
    Every service is a Server, when it’s responding to a request.

TODO
https://medium.com/@yangli907/grpc-learning-part-1-cdcf59e52707

Protobuf

  • Binary protocol
  • Not human-readable
  • The description is stored in *.proto files
  • Messages are smaller/faster than JSON or XML
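The size win is easy to demonstrate with the stdlib `struct` module standing in for a fixed binary layout (Protobuf itself uses varints and field tags, so its exact sizes differ):

```python
import json
import struct

user = {"id": 123, "age": 30, "score": 4.5}

json_bytes = json.dumps(user).encode()
# Fixed binary layout: uint32, uint8, float32; no field names on the wire.
binary_bytes = struct.pack("<IBf", user["id"], user["age"], user["score"])

print(len(json_bytes), len(binary_bytes))  # 36 9
```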

Thrift

GraphQL

  • Released by FB in 2015
  • Comes to solve two issues REST has
    1️⃣ Overfetching
    2️⃣ Underfetching
  • QL stands for “Query Language”
  • Based on HTTP
  • Requests and Responses are in JSON format
  • Lets you define which fields to return
  • Lets you define which nested entities to return
  • 🟢 Allows fetching nested entities with a single request
  • 🟢 Great for reporting systems
  • 🟢 Saves bandwidth
  • 🔴 Results are less cacheable (uses POST requests)
  • 🔴 Support outside of Javascript ecosystem is not great
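The core field-selection idea can be mimicked in a few lines: the client names the fields, the server returns only those (hypothetical data, no real GraphQL parser involved):

```python
user = {  # full entity as stored server-side
    "id": 123,
    "name": "Ada",
    "email": "ada@example.com",
    "friends": [{"id": 7, "name": "Alan"}],
}

def select(entity, fields):
    # Return only the requested fields: the client controls the response shape.
    return {f: entity[f] for f in fields if f in entity}

print(select(user, ["name", "email"]))  # {'name': 'Ada', 'email': 'ada@example.com'}
```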

[Important] Steps/Process for choosing a protocol

  • mentioning TCP and HTTP is useless as many protocols are already built on top of these
  • However, mentioning UDP can help sometimes

XMPP

DASH — Dynamic Adaptive Streaming over HTTP

  • Dynamically adapting to bandwidth
  • control signals tell server about the connection quality, bandwidth etc.
  • over TCP
  • ordering
  • continuous playback is preferred over stalling to buffer, so quality adapts downward when bandwidth drops
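The adaptation rule itself is simple: pick the highest rendition whose bitrate fits within the measured bandwidth, keeping some headroom so playback never stalls (the bitrate ladder and headroom factor here are illustrative):

```python
RENDITIONS_KBPS = [250, 500, 1000, 2500, 5000]  # illustrative bitrate ladder

def pick_rendition(measured_kbps, headroom=0.8):
    # Spend only part of the measured bandwidth so a dip doesn't cause a stall.
    budget = measured_kbps * headroom
    fitting = [r for r in RENDITIONS_KBPS if r <= budget]
    return fitting[-1] if fitting else RENDITIONS_KBPS[0]

print(pick_rendition(4000))  # 2500: highest rung within 4000 * 0.8 = 3200
print(pick_rendition(100))   # 250: floor rendition, keep playing at low quality
```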

HLS

  • control signals tell server about the connection quality, bandwidth etc.
  • over TCP
