Back of Envelope Numbers [Notes]
--
· How to Decide on RAM, HDD
· Byte Conversion
∘ Useful Calculations
· Time Conversion
· Throughput
∘ Updates per second
∘ Maximum effective capacity
· Object Sizes
∘ Data
∘ Objects
∘ Lengths
· Per Period Numbers
· References
TODO — why is scale estimation/Back of envelope important in system design?
TODO | 3 OCT 2024
World population = 8 billion
How to Decide on RAM, HDD
- Access speed in a layered memory hierarchy is normally measured in CPU (Central Processing Unit) cycles, and the number of cycles increases almost exponentially from the top of the hierarchy to the bottom.
- For instance, roughly 3 = 3*10⁰ cycles are enough to access an on-chip register/cache,
- about 200 = 2*10² cycles are needed to reach main memory,
- and about 2M = 2*10⁶ cycles are needed to access network/storage resources.
I/O is expensive and needs to be managed carefully at design time, according to the context and the main use cases.
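To get a feel for these cycle counts, the sketch below converts them to wall-clock time. The 3 GHz clock speed is an assumption made for illustration, not a figure from these notes, and `cycles_to_ns` is a hypothetical helper name.

```python
# Assumed clock speed for illustration only (not from the notes).
CLOCK_HZ = 3_000_000_000  # 3 GHz

# Rough cycle counts quoted in the notes above.
ACCESS_CYCLES = {
    "register/cache": 3,
    "main memory": 200,
    "network/storage": 2_000_000,
}


def cycles_to_ns(cycles: int, clock_hz: int = CLOCK_HZ) -> float:
    """Convert a CPU cycle count to nanoseconds at the given clock speed."""
    return cycles / clock_hz * 1e9


for level, cycles in ACCESS_CYCLES.items():
    print(f"{level}: {cycles:,} cycles ~ {cycles_to_ns(cycles):,.0f} ns")
```

Under that assumption, a register access costs about a nanosecond while a network/storage access costs well over half a millisecond, which is why I/O dominates most estimates.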
Best RAM on AWS = 128GB
Best HDD on AWS ≈ PB
-----------------------
Best to use RAM ≥ 16GB
Best to use HDD ≥ 1TB
----------------------
The number of virtual CPUs will be limited, so the work cannot be parallelized.
Byte Conversion
Useful Calculations
x million users * y KB = xy GB
Example: 1M users * one 100KB document per day = 100GB per day.
x million users * y MB = xy TB
Example: 200M users * one 2MB short video per day = 400TB per day.
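The two shortcuts above can be checked with a few lines of arithmetic. This is a minimal sketch using decimal units (1KB = 10³ bytes); the function name `daily_storage_bytes` is hypothetical.

```python
# Decimal byte units, matching the rounded numbers used in these notes.
KB = 10**3
MB = 10**6
GB = 10**9
TB = 10**12


def daily_storage_bytes(users: int, bytes_per_user_per_day: int) -> int:
    """Total bytes written per day if every user produces the given amount."""
    return users * bytes_per_user_per_day


docs = daily_storage_bytes(1_000_000, 100 * KB)
videos = daily_storage_bytes(200_000_000, 2 * MB)

print(docs / GB, "GB per day")    # millions of users * KB lands in GB
print(videos / TB, "TB per day")  # millions of users * MB lands in TB
```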
Bit = 0/1
Byte = 8 bits = 2^8 = 256 possible values
1KB = 1024 bytes = 2^10 ~ 1000 bytes
1MB = 10⁶ bytes
1GB = 10⁹ bytes
1TB = 10¹² bytes
1PB = 10¹⁵ bytes
Power of 2 Table
+-------+-------------+------------+-------+
| Power | Exact Value | Approx Value | Bytes |
+-------+-------------+------------+-------+
| 7 | 128 | | |
+-------+-------------+------------+-------+
| 8 | 256 | | |
+-------+-------------+------------+-------+
| 10 | 1024 | 1000 | 1KB |
+-------+-------------+------------+-------+
| 16 | 65,536 | | 64KB |
+-------+-------------+------------+-------+
| 20 | 1,048,576 | 1 million | 1MB |
+-------+-------------+------------+-------+
| 30 | 1,073,741,824 | 1 billion | 1GB |
+-------+-------------+------------+-------+
| 40 | 1,099,511,627,776 | 1 trillion | 1TB |
+-------+-------------+------------+-------+
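The approximate values in the table treat every 10 powers of two as 3 powers of ten. The sketch below shows how far off that shortcut is at each size; `approx_error_pct` is a hypothetical helper written for this illustration.

```python
def approx_error_pct(power: int) -> float:
    """Percent by which 2^power exceeds its decimal shortcut 10^(3 * power / 10)."""
    exact = 2 ** power
    approx = 10 ** (3 * power // 10)
    return (exact / approx - 1) * 100


for power in (10, 20, 30, 40):
    print(f"2^{power} = {2 ** power:,} ({approx_error_pct(power):.1f}% above the rounded value)")
```

The error grows from roughly 2% at 1KB to roughly 10% at 1TB, which is usually acceptable for a back-of-envelope estimate.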
Time Conversion
1ns = 1e-9 second
1µs = 1e-6 second = 1000ns
1ms = 1e-3 second = 1000µs = 1e6 ns
Number of seconds in a day = 24*3600 = 86400 ~ 100,000 ~ 10⁵
Number of seconds in a month = 2.628e+6 ~ 25*10⁵
Number of minutes in a day = 1440
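These constants can be derived rather than memorized. The sketch below assumes an "average month" of one twelfth of a 365-day year, which is one way to arrive at the 2.628e+6 figure; the constant names are made up for this example.

```python
SECONDS_PER_DAY = 24 * 3600                       # 86,400
MINUTES_PER_DAY = 24 * 60                         # 1,440
# Assumption: an average month is 1/12 of a 365-day year.
SECONDS_PER_MONTH = 365 * SECONDS_PER_DAY // 12   # 2,628,000

NS_PER_US = 1_000
NS_PER_MS = 1_000_000

print(f"seconds/day   = {SECONDS_PER_DAY:,}")
print(f"seconds/month = {SECONDS_PER_MONTH:,}")
print(f"minutes/day   = {MINUTES_PER_DAY:,}")
```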
Throughput
| Component | Reads/Sec | Writes/Sec |
|-------------------|------------------------|---------------|
| RDBMS | 10,000 | 5,000 |
| Distributed Cache | 100,000 | 100,000 |
| Message Queue | 100,000 | 100,000 |
| NoSQL | 20,000-50,000 | 10,000-25,000 |
| DynamoDB | Millions of reads/sec | 10,000-25,000 |
Updates per second
An app server can handle about 500 updates/sec.
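Per-node throughput figures like these are typically used to size a tier. A minimal sketch, assuming a hypothetical load of 12,000 updates per second (that load and the `nodes_needed` helper are illustrative, not from the notes):

```python
import math

# Per-node figures taken from the notes above.
APP_SERVER_UPDATES_PER_SEC = 500
RDBMS_WRITES_PER_SEC = 5_000


def nodes_needed(load_per_sec: int, capacity_per_node: int) -> int:
    """Smallest number of nodes whose combined capacity covers the load."""
    return math.ceil(load_per_sec / capacity_per_node)


target_updates_per_sec = 12_000  # hypothetical load for illustration

print("app servers:", nodes_needed(target_updates_per_sec, APP_SERVER_UPDATES_PER_SEC))
print("RDBMS write nodes:", nodes_needed(target_updates_per_sec, RDBMS_WRITES_PER_SEC))
```

In practice you would add headroom for spikes and failures on top of this minimum.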
Maximum effective capacity
+-------------------+----------------------------+
| Component | Maximum effective capacity |
+-------------------+----------------------------+
| RDBMS | 3TB |
+-------------------+----------------------------+
| Distributed Cache | 16GB-128GB |
+-------------------+----------------------------+
| NoSQL | Depends |
+-------------------+----------------------------+
Object Sizes
Data
The numbers vary depending on the language and implementation.
- char: 1B (8 bits)
- char (Unicode): 2B (16 bits)
- Short: 2B (16 bits)
- Int: 4B (32 bits)
- Long: 8B (64 bits)
- UUID/GUID: 16B
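The primitive sizes above can be confirmed with Python's `struct` module, using the `<` prefix to request standard (platform-independent) sizes. The record layout at the end is a hypothetical example, not something defined in these notes.

```python
import struct
import uuid

# "<" selects standard sizes, independent of the host platform.
SIZES = {
    "char": struct.calcsize("<c"),     # 1 byte
    "short": struct.calcsize("<h"),    # 2 bytes
    "int": struct.calcsize("<i"),      # 4 bytes
    "long": struct.calcsize("<q"),     # 8 bytes
    "uuid": len(uuid.uuid4().bytes),   # 16 bytes
}

# Hypothetical record: a UUID key, a long timestamp, and an int counter.
record_bytes = SIZES["uuid"] + SIZES["long"] + SIZES["int"]
rows = 1_000_000_000

print(SIZES)
print(f"{rows:,} rows * {record_bytes}B = {rows * record_bytes / 10**9:.0f} GB (raw, no overhead)")
```

Real storage engines add per-row and index overhead, so treat the result as a lower bound.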
Objects
- File: 100 KB
- Web Page: 100 KB (not including images)
- Picture: 200 KB
- Short Posted Video: 2MB
- Streaming Video: 50MB per minute
- Long/Lat: 8B
Lengths
- Maximum URL Size: ~2000 characters (depends on browser)
- ASCII charset: 128 characters
- Unicode charset: 143,859 characters
Per Period Numbers
The following numbers are heavily rounded and help determine how often something needs to happen over a period of time. For example, if a server has a million requests per day, it will need to handle 12 requests per second.
More complex example:
100M photos (200KB) are uploaded daily to a server.
- 100 (number of millions) * 12 (the number per second for 1M) = 1200 uploads a second.
- 1200 (uploads) * 200KB (size of photo) = 240MB per second.
The web servers will need to handle a network bandwidth of 240MB per second (roughly 2 Gbps), so you will need a machine with high network performance. In AWS this would translate to at least an m4.4xlarge, but it would be better to use multiple smaller servers for fault tolerance.
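The photo example can be reproduced both with the rounded "12 per second per million" shortcut and with the exact 86,400 seconds per day, to see how much the rounding costs. This is a sketch; the function names are made up for this example.

```python
SECONDS_PER_DAY = 86_400
KB = 10**3
MB = 10**6

PHOTOS_PER_DAY = 100_000_000
PHOTO_BYTES = 200 * KB


def exact_uploads_per_sec(per_day: int) -> float:
    """Uploads per second using the exact number of seconds in a day."""
    return per_day / SECONDS_PER_DAY


def rounded_uploads_per_sec(per_day: int) -> int:
    """Uploads per second using the rounded rule: 1M per day ~ 12 per second."""
    return per_day // 1_000_000 * 12


def bandwidth_mb_per_sec(uploads_per_sec: float, object_bytes: int) -> float:
    """Inbound bandwidth in MB/s for the given upload rate and object size."""
    return uploads_per_sec * object_bytes / MB


exact = bandwidth_mb_per_sec(exact_uploads_per_sec(PHOTOS_PER_DAY), PHOTO_BYTES)
rounded = bandwidth_mb_per_sec(rounded_uploads_per_sec(PHOTOS_PER_DAY), PHOTO_BYTES)

print(f"exact:   {exact:.0f} MB/s")
print(f"rounded: {rounded:.0f} MB/s ({rounded * 8 / 1000:.2f} Gbps)")
```

The rounded estimate overshoots by a few percent, which errs on the safe side for capacity planning.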
SQL
Redis
Single Node -
Storage max: 300GB
Connections: 10k
Requests: 100k per sec
No-SQL
Single Node -
Read Requests: 20k-50k per sec
Write Requests: 10k-25k per sec
Web servers
Requests: 5k-10k requests per sec
Queues/Streams
Requests: 1000-3000 requests/s (FIFO)
Throughput: 1MB - 50MB/s write
2MB - 100MB/s read
Network
What’s the difference? Gigabit vs Gigabyte
ISPs do not simply measure their plans or storage in bits and bytes. These are the most basic, smallest units, and we require much higher data caps in our day-to-day life. So, instead, we use unit prefixes from the metric system to communicate quantity more efficiently (see table below for examples).
Mega as a prefix denotes a million.
- One Megabit (Mb) is one million bits. We measure internet speed with megabits per second (Mbps).
- ISPs offer packages ranging from speeds of 200 Mbps to 5,000 Mbps (or 5 Gbps).
- A Megabyte (MB) is one million bytes (or eight million bits). We use it to measure file sizes and storage.
Following the same logic, giga means a billion.
- One gigabit (Gb) is 1,000 Megabits (Mb).
- One gigabyte (GB) is 1,000 Megabytes (MB).
Megabit speeds allow for a lower number of users to browse the internet and complete simple tasks such as light streaming and email correspondence.
Gigabit speeds on the other hand allow for more users to use the available broadband with no risk of significant lagging, making these speeds ideal for heavy use such as online gaming and multiple concurrent streams.
How fast is 1000 Mbps (1 Gbps) or 125MB/s in terms of usage?
- 100-page PDF document (size of 1 MB) = 0.008 seconds download time,
- Streaming services in high definition (approx. 3 GB/hour) = 24 seconds to download one hour of content.
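The two figures above follow from dividing the link speed in bits by 8 to get bytes per second. A minimal sketch, ignoring protocol overhead and latency; `download_seconds` is a hypothetical helper name.

```python
MB = 10**6
GB = 10**9


def download_seconds(size_bytes: float, link_mbps: float) -> float:
    """Ideal transfer time: link speed is in megabits/s, so divide by 8 for bytes."""
    bytes_per_sec = link_mbps * 10**6 / 8
    return size_bytes / bytes_per_sec


print(download_seconds(1 * MB, 1000), "s for a 1 MB PDF at 1 Gbps")
print(download_seconds(3 * GB, 1000), "s for 3 GB of HD video at 1 Gbps")
print(download_seconds(3 * GB, 200), "s for the same video at 200 Mbps")
```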