System Design Overview

Typical system design workflow

What are the problem constraints

What’s the amount of traffic the system should handle

Num of operation per day * Number of daily active users / 86400 (~100,000)

Peak QPS = Average QPS * 3

What’s the amount of data the system should handle

New data written per year: DAU * 365 (~400) * 5

What features the system needs to support

List features

Common features

Abstract design

Application service layer

Data cache

Data storage

Message queue + notification center

Logs + storage + analytics

Search

Scale

System design evaluation standards

OO design principles

SRP: The Single Responsibility Principle

OCP: The Open-Closed Principle

LSP: The Liskov Substitution Principle

DIP: The Dependency-Inversion Principle

ISP: The Interface-Segregation Principle

DRY: Don’t repeat yourself

Distributed system concepts

CAP theorem

Consistency

Update consistency

Read consistency

Replication Consistency

Message queue

Benefits

Components

Routing methods

Protocols

Metrics to decide which message broker to use

Challenges

Networking

TCP vs UDP

| TCP | UDP | |——————————————————————————————————————————————————————————————————————————————–|———————————————————————————————————————————————————————————————————————————-| | Reliable: TCP is connection-oriented protocol. When a file or message send it will get delivered unless connections fails. If connection lost, the server will request the lost part. There is no corruption while transferring a message. | Not Reliable: UDP is connectionless protocol. When you a send a data or message, you don’t know if it’ll get there, it could get lost on the way. There may be corruption while transferring a message. | | Ordered: If you send two messages along a connection, one after the other, you know the first message will get there first. You don’t have to worry about data arriving in the wrong order. | Not Ordered: If you send two messages out, you don’t know what order they’ll arrive in i.e. no ordered | | Heavyweight: – when the low level parts of the TCP “stream” arrive in the wrong order, resend requests have to be sent, and all the out of sequence parts have to be put back together, so requires a bit of work to piece together. | Lightweight: No ordering of messages, no tracking connections, etc. It’s just fire and forget! This means it’s a lot quicker, and the network card / OS have to do very little work to translate the data back from the packets. | | Streaming: Data is read as a “stream,” with nothing distinguishing where one packet ends and another begins. There may be multiple packets per read call. | Datagrams: Packets are sent individually and are guaranteed to be whole if they arrive. One packet per one read call. | | Examples: World Wide Web (Apache TCP port 80), e-mail (SMTP TCP port 25 Postfix MTA), File Transfer Protocol (FTP port 21) and Secure Shell (OpenSSH port 22) etc. | Examples: Domain Name System (DNS UDP port 53), streaming media applications such as IPTV or movies, Voice over IP (VoIP), Trivial File Transfer Protocol (TFTP) and online multiplayer games |

HTTP

Status code

Groups

| Status code | Meaning | Examples | |————-|————–|——————————————————————————-| | 5XX | Server error | 500 Server Error | | 4XX | Client error | 401 Authentication failure; 403 Authorization failure; 404 Resource not found | | 3XX | Redirect | 301 Resource moved permanently; 302 Resource moved temporarily | | 2XX | Success | 200 OK; 201 Created; 203 Object marked for deletion |

HTTP 4XX status codes

| Status code | Meaning | Examples | |————-|———————-|—————| | 400 | Malformed request | Frequently a problem with parameter formatting or missing headers | | 401 | Authentication error | The system doesn’t know who the request if from. Authentication signature errors or invalid credentials can cause this | | 403 | Authorization error | The system knows who you are but you don’t have permission for the action you’re requesting | | 404 | Page not found | The resource doesn’t exist | | 405 | Method not allowed | Frequently a PUT when it needs a POST, or vice versa. Check the documentation carefully for the correct HTTP method |

Verbs

CRUD example with Starbucks

| Action | System call | HTTP verb address | Request body | Successful response code + Response body | |—————–|—————————-|——————-|————————————————————————————|————————————————————————————————-| | Order iced team | Add order to system | Post /orders/ | {“name” : “iced tea”, “size” : “trenta”} | 201 Created Location: /orders/1 | | Update order | Update existing order item | PUT /orders/1 | {“name” : “iced tea”, “size” : “trenta”, “options” : [“extra ice”, “unsweetened”]} | 204 No Content or 200 Success | | Check order | Read order from system | GET /orders/1 | | 200 Success { “name” : “iced tea”, “size” : “trenta”, “options” : [“extra ice”, “unsweetened”]} | | Cancel order | Delete order from system | DELETE /orders/1 | | 202 Item Marked for Deletion or 204 No Content |

Others

Headers

Request

| Header | Example value | Meaning | |—————–|—————————–|————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————–| | Accept | Text/html, application/json | The client’s preferred format for the response body. Browsers tend to prefer text/html, which is a human-friendly format. Applications using an API are likely to request JSON, which is structured in a machine-parseable way. This can be a list, and if so, the list is parsed in priority order: the first entry is the most desired format, all the way down to the last one. | | Accept-language | en-US | The preferred written language for the response. This is most often used by browsers indicating the language the user has specified as a preference | | User-agent | Mozilla/5.0 | This header tells the server what kind of client is making the request. This is an important header because sometimes responses or JavaScript actions are performed differently for different browsers. This is used less frequently for this purpose by API clients, but it’s a friendly practice to send a consistent user-agent for the server to use when determining how to send the information back. | | Content-length | size of the content body | When sending a PUT or POST, this can be sent so the server can verify that the request body wasn’t truncated on the way to the server. | | Content-type | application/json | When a content body is sent, the client can indicate to the server what the format is for that content in order to help the server respond to the request correctly. |

Response

| Header | Example value | Meaning | |——————————|————————————-|————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————–| | Content-Type | application/json | As with the request, when the content body is sent back to the client, the Content-Type is generally set to help the client know how best to process the request. Note that this is tied somewhat indirectly to the Accept header sent by the client. The server will generally do its best to send the first type of content from the list sent by the client but may not always provide the first choice. | | Access-Control-Allow-Headers | Content-Type, Authorization, Accept | This restricts the headers that a client can use for the request to a particular resource | | Access-Control-Allow-Methods | GET, PUT, POST, DELETE, OPTIONS | What HTTP methods are allowed for this resource | | Access-Control-Allow-Origin | * or http://www.example.com | This restricts the locations that can refer requests to the resource |

Compression

Parameters

| Action | System call | HTTP verb address | Successful response code / Response body | |—————————————|—————————————————————-|—————————————–|————————————————————————————————–| | Get order list, only Trenta iced teas | Retrieve list with a filter | Get /orders?name=iced%20tea&size=trenta | [{ “id” : 1, “name” : “iced tea”, “size” : “trenta”, “options” : [“extra ice”, “unsweetened”] }] | | Get options and size for the order | Retrieve order with a filter specifying which pieces to return | Get /orders/1?fields=options,size | { “size” : “trenta”, “options” : [“extra ice”, “unsweetened”]} |

HTTP session

Stateless applications

Structure of a session

Server-side session vs client-side cookie

| Category | Session | Cookie | |—————-|—————————————————————————————|—————————————————| | Location | User ID on server | User ID on web browser | | Safeness | Safer because data cannot be viewed or edited by the client | A hacker could manipulate cookie data and attack | | Amount of data | Big | Limited | | Efficiency | Save bandwidth by passing only a reference to the session (sessionID) each pageload. | Must pass all data to the webserver each pageload | | Scalability | Need efforts to scale because requests depend on server state | Easier to implement |

Store session state in client-side cookies

Cookie Def
Cookie typical workflow
// Browser request example:

GET /index.html HTTP/1.1
Host: www.example.com

// Example answer from the server:


HTTP/1.1 200 OK
Content-type: text/html
Set-Cookie: foo=10
Set-Cookie: bar=20; Expires=Fri, 30 Sep 2011 11:48:00 GMT
... rest  of the response

// Here two cookies foo=10 and bar=20 are stored on the browser. The second one will expire on 30 September. In each subsequent request the browser will send the cookies back to the server.


GET /spec.html HTTP/1.1
Host: www.example.com
Cookie: foo=10; bar=20
Accept: */*
Cookie Pros and cons

Store session state in server-side

Typical server-side session workflow
  1. Every time an internet user visits a specific website, a new session ID (a unique number that a web site’s server assigns a specific user for the duration of that user’s visit) is generated. And an entry is created inside server’s session table

| Columns | Type | Meaning | |————|————-|——————————-| | sessionID | string | a global unique hash value | | userId | Foreign key | pointing to user table | | expireAt | timestamp | when does the session expires |

  1. Server returns the sessionID as a cookie header to client
  2. Browser sets its cookie with the sessionID
  3. Each time the user sends a request to the server. The cookie for that domain will be automatically attached.
  4. The server validates the sessionID inside the request. If it is valid, then the user has logged in before.
Use a load balancer that supports sticky sessions:

DNS

Design

Initial design

A distributed, hierarchical database

Internals

DNS records

Insert records into DNS DB

DNS query parsing

Types

Round-robin DNS

GeoDNS

Functionality

DNS Caching

Load balancing

Host alias

DNS prefetching

Def

Control prefetching

X-DNS-Prefetch-Control: off

Load balancers

Benefits

Round-robin algorithm

Security

SSL

Definition

How does HTTPS work

How to avoid public key being modified?

How to avoid computation consumption from PKI

NoSQL

NoSQL vs SQL

| Database | SQL | NoSQL |
| ——————— |:————-:| ————:| | Data uniformness | Uniform data. Best visualized as a set of tables. Each table has rows, with each row representing an entity of interest. Each row is described through columns. One row cannot be nested inside another. | Non-uniform data. NoSQL databases recognize that often, it is common to operate on data in units that have a more complex structure than a set of rows. This is particularly useful in dealing with nonuniform data and custom fields. NoSQL data model can be put into four categories: key-value, document, column-family and graph. | | Schema change | Define what table exists, what column exists, what data types are. Although actually relational schemas can be changed at any time with standard SQL commands, it is of hight cost. | Changing schema is casual and of low cost. Essentially, a schemaless database shifts the schema into the application code. | | Query flexibility | Low cost on changing query. It allows you to easily look at the data in different ways. Standard SQL supports things like joins and subqueries. | High cost in changing query. It does not allow you to easily look at the data in different ways. NoSQL databases do not have the flexibility of joins or subqueries. | | Transactions | SQL has ACID transactions (Atomic, Consistent, Isolated, and Durable). It allows you to manipulate any combination of rows from any tables in a single transaction. This operation either succeeds or fails entirely, and concurrent operations are isolated from each other so they cannot see a partial update. | Graph database supports ACID transactions. Aggregate-oriented databases do not have ACID transactions that span multiple aggregates. Instead, they support atomic manipulation of a single aggregate at a time. If we need to manipulate multiple aggregates in an atomic way, we have to manage that ourselves in application code. An aggregate structure may help with some data interactions but be an obstacle for others. | | Consistency | Strong consistency | Trade consistency for availability or partition tolerance. Eventual consistency | | Scalability | elational database use ACID transactions to handle consistency across the whole database. This inherently clashes with a cluster environment | Aggregate structure helps greatly with running on a cluster. It we are running on a cluster, we need to minize how many nodes we need to query when we are gathering data. By using aggregates, we give the database important information about which bits of data (an aggregate) will be manipulated together, and thus should live on the same node. | | Performance | MySQL/PosgreSQL ~ 1k QPS | MongoDB/Cassandra ~ 10k QPS. Redis/Memcached ~ 100k ~ 1M QPS | | Maturity | Over 20 years. Integrate naturally with most web frameworks. For example, Active Record inside Ruby on Rails | Usually less than 10 years. Not great support for serialization and secondary index |

NoSQL flavors

Key-value

Document

Column-Family

CREATE COLUMN FAMILY visit_counter
WITH default_validation_class=CounterColumnType
AND key_validation_class=UTF8Type AND comparator=UTF8Type

// Once a column family is created, you can have arbitrary columns for each page visited within the web application for every user. 
INCR visit_counter['mfowler'][home] BY 1;
INCR visit_counter['mfowler'][products] BY 1;
INCR visit_counter['mfowler'][contactus] BY 1;

// expiring columns
SET Customer['mfowler']['demo_access'] = 'allowed' WITH ttl=2592000;

Graph

Scaling

Functional partitioning

REST best practices

Consistency

Endpoint naming conventions
HTTP verbs and CRUD consistency

| Verb | Endpoint | Description | |——–|———————–|————————————————————————| | GET | /products | Gets a list of products | | GET | /products/:id | Gets a single product by ID | | GET | /products/:id/parts | Gets a list of parts in a single product | | PUT | /products/:id/parts | Inserts a new part for a particular product | | DELETE | /products/:id | Deletes a single product by ID | | PUT | /products | Inserts a new product | | HEAD | /products/:id | Returns whether the product exists through a status code of 200 or 404 | | PATCH | /products/:id | Edits an existing product by ID | | POST | /authentication/login | Most other API methods should use POST requests |

Versioning
Data transfer format
{
    "data" : {}  // actual response
}
HTTP status codes and error handling
HTTP/1.1 200 OK
{
    "data": {
        "id" : "baeb-b001",
        "name" : "Angry Pirate Plush Toy",
        "description" : "Batteries not included",
        "price" : "$39.99",
        "categories": ["plushies", "kids"]
    }
}
// if input validation fails on a form while attempting to create a product, you could return a response using a 400 bad request status code, as shown in the following listing.
HTTP/1.1 400 Bad Request
{
    "error": {
        "code": "bf-400",
        "message": "Some required fields were invalid.",
        "context": {
            "validation": [
                "The product name must be 6-20 alphanumeric characters",
                "The price cann't be negative",
                "At least one product category should be selected. "
            ]
        }
    }
}

// server side error
{
    "error": {
        "code": "bf-500",
        "message": "An unexpected error occurred while accessing the database",
        "context": {
            "id": "baeb-b001"
        }
    }
}
Paging
 curl 'https://api.github.com/user/repos?page=2&per_page=100'
{
  "results": [ ... actual results ... ],
  "pagination": {
    "count": 2340,
    "page": 4,
    "per_page": 20
  }
}
 Link: <https://api.github.com/user/repos?page=3&per_page=100>; rel="next",
   <https://api.github.com/user/repos?page=50&per_page=100>; rel="last"

| Name | Description | |——-|—————————————————————| | next | The link relation for the immediate next page of results. | | last | The link relation for the last page of results. | | first | The link relation for the first page of results. | | prev | The link relation for the immediate previous page of results. |

Scaling REST web services

Keeping service machine stateless
Benefits
Common use cases needing share state
Caching service responses
Cache-Control header
Expires
Last-Modified/If-Modified-Since/Max-age

Cache-Control: private, max-age=86400 Last-Modified: Thu, 3 Jul 2014 18:31:12 GMT

If-Modified-Since: Thu, 3 Jul 2014 18:31:12 GMT

ETag

Cache-Control: private, max-age=86400 ETag: “d5jiodjiojiojo”

If-None-Match: “d5jiodjiojiojo”

Vary: Authorization
Functional partitioning

Security

Throttling

X-RateLimit-Limit: 2000 X-RateLimit-Remaining: 1999 X-RateLimit-Reset: 1404429213925

Use OAuth2 with HTTPS for authorization, authentication and confidentiality.

Documentation

Others

Data partitioning - Sharding

Sharding benefits

Sharding key

Sharding function

Static sharding

Dynamic sharding

Challenges

Cross-shard joins

Using AUTO_INCREMENT

Distributed transactions

Clones - Replication

Replication purpose

High availability by creating redundancy

Planning for failures

Replication for scaling read

When to use
When not to use

Replication Topology

Master-slave vs peer-to-peer

| Types | Strengths | Weakness | | ———— |:—————-:|:——————-:| | Master-slave |

| | | p2p: Master-master | | Not a viable scalability technique. | | p2p: Ring-based | Chain three or more masters together to create a ring. | |

Master-slave replication

Number of slaves

Peer-to-peer replication

Replication mode

Synchronous and Asynchronous

Synchronous vs Asynchronous

Cache

Why does cache work

Cache hit ratio

How much will cache benefit

Access pattern

Write through cache

Write around cache

Write back cache

How to handle cache failure

Typical caching scenarios

HTTP Cache

Headers

Types

Browser cache
Caching proxies
Reverse proxy
Content delivery networks

Scaling

Application objects cache

Types

Client-side web storage
Caches co-located with code: One located directly on your web servers.
Distributed cache store

Scaling

Caching rules of thumb

Cache priority

Cache reuse

Cache invalidation

Pains

Pain of large data sets - When cache memory is full

Pain of stale data

Pain of loading

Pain of duplication

Thundering herd problem

Def

/* read some data, check cache first, otherwise read from SoR */
public V readSomeData(K key) {
  Element element;
  if ((element = cache.get(key)) != null) {
    return element.getValue();
  }

  // note here you should decide whether your cache
  // will cache 'nulls' or not
  if (value = readDataFromDataStore(key)) != null) {
    cache.put(new Element(key, value));
  }

  return value;
}

Solutions

Scaling Memcached at Facebook

Architecture

Lambda architecture

Building blocks

Load balancer

Hardware vs software

| Category | Software | Hardware | |———-|———————————————————————————————————————————————————————————————————|———————————————————————————————————————————————————| | Def | Run on standard PC hardware, using applications like Nginx and HAProxy | Run on special hardware and contain any software pre-installed and configured by the vendor. | | Model | Operate on Application Layer | Operate on network and transport layer and work with TCP/IP packets. Route traffic to backend servers and possibly handling network address translation | | Strength/Weakness | More intelligent because can talk HTTP (can perform the compression of resources passing through and routing-based on the presence of cookies) and more flexible for hacking in new features or changes | Higher throughput and lower latency. High purchase cost. Hardware load balancer prices start from a few thousand dollars and go as high as over 100,000 dollars per device. Specialized training and harder to find people with the work experience necessary to operate them. |

HAProxy vs Nginx

| Category | Nginx | HAProxy | |———–|———————————————|——————————————————————————————————————————-| | Strengths | Can cache HTTP responses from your servers. | A little faster than Nginx and a wealth of extra features. It can be configured as either a layer 4 or layer 7 load balancer. |

Web server

Apache and Nginx

Apache vs Nginx

Category Apache Nginx
History Invented around 1990s when web traffic is low and web pages are really simple. Apache’s heavyweight, monolithic model has its limit. Tunning Apache to cope with real-world traffic efficiently is a complex art. Heavy traffic and web pages. Designed for high concurrency. Provides 12 features including which make them appropriate for microservices.
Architecture One process/threads per connection. Each requests to be handled as a separate child/thread. Asynchronous event-driven model. There is a single master process with one or more worker processes.
Performance To decrease page-rendering time, web browsers routinely open six or more TCP connections to a web server for each user session so that resources can download in parallel. Browsers hold these connections open for a period of time to reduce delay for future requests the user might make during the session. Each open connection exclusively reserves an httpd process, meaning that at busy times, Apache needs to create a large number of processes. Each additional process consumes an extra 4MB or 5MB of memory. Not to mention the overhead involved in creating and destroying child processes. Can handle a huge number of concurrent requests
Easier development Very easy to insert additional code at any point in Apache’s web-serving logic. Developers could add code securely in the knowledge that if newly added code is blocked, ran slowly, leaked resources, or even crashed, only the worker process running the code would be affected. Processing of all other connections would continue undisturbed Developing modules for it isn’t as simple and easy as with Apache. Nginx module developers need to be very careful to create efficient and accurate code, without any resource leakage, and to interact appropriately with the complex event-driven kernel to avoid blocking operations.

Cache

In-memory cache - Guava cache

Standalone cache

Memcached

Redis

Database

DynamoDB

Cassandra

Queue

ActiveMQ

RabbitMQ

SQS

Kafka

Data Processing

Hadoop

Spark

EMR

Stream Processing

Samza

Storm

References