translate mint
parent
a525d50c83
commit
2bc9be30ee
|
@ -1,102 +1,102 @@
|
||||||
# Design Mint.com
|
# 设计 Mint.com
|
||||||
|
|
||||||
*Note: This document links directly to relevant areas found in the [system design topics](https://github.com/donnemartin/system-design-primer#index-of-system-design-topics) to avoid duplication. Refer to the linked content for general talking points, tradeoffs, and alternatives.*
|
*注意:这个文档中的链接会直接指向[系统设计主题索引](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#系统设计主题索引)中的有关部分,以避免重复的内容。您可以参考链接的相关内容,来了解其总的要点、方案的权衡取舍以及可选的替代方案。*
|
||||||
|
|
||||||
## Step 1: Outline use cases and constraints
|
## 第一步:简述用例与约束条件
|
||||||
|
|
||||||
> Gather requirements and scope the problem.
|
> 搜集需求与问题的范围。
|
||||||
> Ask questions to clarify use cases and constraints.
|
> 提出问题来明确用例与约束条件。
|
||||||
> Discuss assumptions.
|
> 讨论假设。
|
||||||
|
|
||||||
Without an interviewer to address clarifying questions, we'll define some use cases and constraints.
|
我们将在没有面试官明确说明问题的情况下,自己定义一些用例以及限制条件。
|
||||||
|
|
||||||
### Use cases
|
### 用例
|
||||||
|
|
||||||
#### We'll scope the problem to handle only the following use cases
|
#### 我们将把问题限定在仅处理以下用例的范围中
|
||||||
|
|
||||||
* **User** connects to a financial account
|
* **用户** 连接到一个财务账户
|
||||||
* **Service** extracts transactions from the account
|
* **服务** 从账户中提取交易
|
||||||
* Updates daily
|
* 每日更新
|
||||||
* Categorizes transactions
|
* 分类交易
|
||||||
* Allows manual category override by the user
|
* 允许用户手动分类
|
||||||
* No automatic re-categorization
|
* 不自动重新分类
|
||||||
* Analyzes monthly spending, by category
|
* 按类别分析每月支出
|
||||||
* **Service** recommends a budget
|
* **服务** 推荐预算
|
||||||
* Allows users to manually set a budget
|
* 允许用户手动设置预算
|
||||||
* Sends notifications when approaching or exceeding budget
|
* 当接近或者超出预算时,发送通知
|
||||||
* **Service** has high availability
|
* **服务** 具有高可用性
|
||||||
|
|
||||||
#### Out of scope
|
#### 非用例范围
|
||||||
|
|
||||||
* **Service** performs additional logging and analytics
|
* **服务** 执行附加的日志记录和分析
|
||||||
|
|
||||||
### Constraints and assumptions
|
### 限制条件与假设
|
||||||
|
|
||||||
#### State assumptions
|
#### 提出假设
|
||||||
|
|
||||||
* Traffic is not evenly distributed
|
* 网络流量非均匀分布
|
||||||
* Automatic daily update of accounts applies only to users active in the past 30 days
|
* 自动账户日更新只适用于30天内活跃的用户
|
||||||
* Adding or removing financial accounts is relatively rare
|
* 添加或者移除财务账户相对较少
|
||||||
* Budget notifications don't need to be instant
|
* 预算通知不需要及时
|
||||||
* 10 million users
|
* 1000万用户
|
||||||
* 10 budget categories per user = 100 million budget items
|
* 每个用户10个预算类别= 1亿个预算项
|
||||||
* Example categories:
|
* 示例类别:
|
||||||
* Housing = $1,000
|
* Housing = $1,000
|
||||||
* Food = $200
|
* Food = $200
|
||||||
* Gas = $100
|
* Gas = $100
|
||||||
* Sellers are used to determine transaction category
|
* 卖方确定交易类别
|
||||||
* 50,000 sellers
|
* 50,000 个卖方
|
||||||
* 30 million financial accounts
|
* 3000万财务账户
|
||||||
* 5 billion transactions per month
|
* 每月50亿交易
|
||||||
* 500 million read requests per month
|
* 每月5亿读请求
|
||||||
* 10:1 write to read ratio
|
* 10:1 写读比
|
||||||
* Write-heavy, users make transactions daily, but few visit the site daily
|
* Write-heavy, 用户每天都进行交易,但是每天很少访问该网站
|
||||||
|
|
||||||
#### Calculate usage
|
#### 计算用量
|
||||||
|
|
||||||
**Clarify with your interviewer if you should run back-of-the-envelope usage calculations.**
|
**如果你需要进行粗略的用量计算,请向你的面试官说明。**
|
||||||
|
|
||||||
* Size per transaction:
|
* 每次交易的用量:
|
||||||
* `user_id` - 8 bytes
|
* `user_id` - 8 字节
|
||||||
* `created_at` - 5 bytes
|
* `created_at` - 5 字节
|
||||||
* `seller` - 32 bytes
|
* `seller` - 32 字节
|
||||||
* `amount` - 5 bytes
|
* `amount` - 5 字节
|
||||||
* Total: ~50 bytes
|
* Total: ~50 字节
|
||||||
* 250 GB of new transaction content per month
|
* 每月产生 250 GB 新的交易内容
|
||||||
* 50 bytes per transaction * 5 billion transactions per month
|
* 50 bytes per transaction * 5 billion transactions per month
|
||||||
* 9 TB of new transaction content in 3 years
|
* 9 TB of new transaction content in 3 years
|
||||||
* Assume most are new transactions instead of updates to existing ones
|
* Assume most are new transactions instead of updates to existing ones
|
||||||
* 2,000 transactions per second on average
|
* 平均每秒产生 2,000 次交易
|
||||||
* 200 read requests per second on average
|
* 平均每秒产生 200 读请求
|
||||||
|
|
||||||
Handy conversion guide:
|
便利换算指南:
|
||||||
|
|
||||||
* 2.5 million seconds per month
|
* 每个月有 250 万秒
|
||||||
* 1 request per second = 2.5 million requests per month
|
* 每秒一个请求 = 每个月 250 万次请求
|
||||||
* 40 requests per second = 100 million requests per month
|
* 每秒 40 个请求 = 每个月 1 亿次请求
|
||||||
* 400 requests per second = 1 billion requests per month
|
* 每秒 400 个请求 = 每个月 10 亿次请求
|
||||||
|
|
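The usage figures above can be re-derived with a few lines of arithmetic. This is only a sketch: the constant names are made up for illustration, and it assumes decimal units (1 GB = 10^9 bytes) and the rounded 2.5 million seconds per month from the conversion guide.

```python
# Back-of-the-envelope check of the stated usage figures.
# Assumptions: decimal units and a rounded 2.5 million seconds per month.
BYTES_PER_TRANSACTION = 50
TRANSACTIONS_PER_MONTH = 5_000_000_000
READS_PER_MONTH = 500_000_000
SECONDS_PER_MONTH = 2_500_000

# Storage: 50 bytes * 5 billion transactions per month
monthly_gb = BYTES_PER_TRANSACTION * TRANSACTIONS_PER_MONTH / 10**9
three_year_tb = monthly_gb * 36 / 1000

# Throughput: monthly volume divided by seconds per month
writes_per_second = TRANSACTIONS_PER_MONTH / SECONDS_PER_MONTH
reads_per_second = READS_PER_MONTH / SECONDS_PER_MONTH

print(monthly_gb, three_year_tb, writes_per_second, reads_per_second)
```

The results line up with the list: 250 GB per month, 9 TB in 3 years, 2,000 transaction writes per second, and 200 read requests per second.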
||||||
## Step 2: Create a high level design
|
## 第二步:高层设计
|
||||||
|
|
||||||
> Outline a high level design with all important components.
|
> 列出所有重要组件以规划高层设计。
|
||||||
|
|
||||||

|

|
||||||
|
|
||||||
## Step 3: Design core components
|
## 第三步:设计核心组件
|
||||||
|
|
||||||
> Dive into details for each core component.
|
> 深入每个核心组件的细节。
|
||||||
|
|
||||||
### Use case: User connects to a financial account
|
### 用例:用户连接到一个财务账户
|
||||||
|
|
||||||
We could store info on the 10 million users in a [relational database](https://github.com/donnemartin/system-design-primer#relational-database-management-system-rdbms). We should discuss the [use cases and tradeoffs between choosing SQL or NoSQL](https://github.com/donnemartin/system-design-primer#sql-or-nosql).
|
我们可以将1000万用户的信息存储在一个[关系数据库](https://github.com/donnemartin/system-design-primer#relational-database-management-system-rdbms)中。我们应该讨论一下[选择SQL或NoSQL之间的用例和权衡](https://github.com/donnemartin/system-design-primer#sql-or-nosql)了。
|
||||||
|
|
||||||
* The **Client** sends a request to the **Web Server**, running as a [reverse proxy](https://github.com/donnemartin/system-design-primer#reverse-proxy-web-server)
|
* **客户端** 发送请求到作为[反向代理](https://github.com/donnemartin/system-design-primer#reverse-proxy-web-server)运行的 **Web 服务器**
|
||||||
* The **Web Server** forwards the request to the **Accounts API** server
|
* **Web 服务器** 转发请求到 **账户API** 服务器
|
||||||
* The **Accounts API** server updates the **SQL Database** `accounts` table with the newly entered account info
|
* **账户API** 服务器将新输入的账户信息更新到 **SQL数据库** 的`accounts`表
|
||||||
|
|
||||||
**Clarify with your interviewer how much code you are expected to write**.
|
**告知你的面试官你准备写多少代码**。
|
||||||
|
|
||||||
The `accounts` table could have the following structure:
|
`accounts`表应该具有如下结构:
|
||||||
|
|
||||||
```
|
```
|
||||||
id int NOT NULL AUTO_INCREMENT
|
id int NOT NULL AUTO_INCREMENT
|
||||||
|
@ -110,9 +110,9 @@ PRIMARY KEY(id)
|
||||||
FOREIGN KEY(user_id) REFERENCES users(id)
|
FOREIGN KEY(user_id) REFERENCES users(id)
|
||||||
```
|
```
|
||||||
|
|
||||||
We'll create an [index](https://github.com/donnemartin/system-design-primer#use-good-indices) on `id`, `user_id`, and `created_at` to speed up lookups (log-time instead of scanning the entire table) and to keep the data in memory. Reading 1 MB sequentially from memory takes about 250 microseconds, while reading from SSD takes 4x and from disk takes 80x longer.<sup><a href=https://github.com/donnemartin/system-design-primer#latency-numbers-every-programmer-should-know>1</a></sup>
|
我们将在`id`,`user_id`和`created_at`等字段上创建一个[索引](https://github.com/donnemartin/system-design-primer#use-good-indices)以加速查找(对数时间而不是扫描整个表)并保持数据在内存中。从内存中顺序读取1 MB数据花费大约250微秒,而从SSD读取是其4倍,从磁盘读取是其80倍。<sup><a href=https://github.com/donnemartin/system-design-primer#latency-numbers-every-programmer-should-know>1</a></sup>
|
||||||
|
|
||||||
We'll use a public [**REST API**](https://github.com/donnemartin/system-design-primer#representational-state-transfer-rest):
|
我们将使用公开的[**REST API**](https://github.com/donnemartin/system-design-primer#representational-state-transfer-rest):
|
||||||
|
|
||||||
```
|
```
|
||||||
$ curl -X POST --data '{ "user_id": "foo", "account_url": "bar", \
|
$ curl -X POST --data '{ "user_id": "foo", "account_url": "bar", \
|
||||||
|
@ -120,35 +120,35 @@ $ curl -X POST --data '{ "user_id": "foo", "account_url": "bar", \
|
||||||
https://mint.com/api/v1/account
|
https://mint.com/api/v1/account
|
||||||
```
|
```
|
||||||
|
|
||||||
For internal communications, we could use [Remote Procedure Calls](https://github.com/donnemartin/system-design-primer#remote-procedure-call-rpc).
|
对于内部通信,我们可以使用[远程过程调用](https://github.com/donnemartin/system-design-primer#remote-procedure-call-rpc)。
|
||||||
|
|
||||||
Next, the service extracts transactions from the account.
|
接下来,服务从账户中提取交易。
|
||||||
|
|
||||||
### Use case: Service extracts transactions from the account
|
### 用例:服务从账户中提取交易
|
||||||
|
|
||||||
We'll want to extract information from an account in these cases:
|
如下几种情况下,我们会想要从账户中提取信息:
|
||||||
|
|
||||||
* The user first links the account
|
* 用户首次链接账户
|
||||||
* The user manually refreshes the account
|
* 用户手动更新账户
|
||||||
* Automatically each day for users who have been active in the past 30 days
|
* 为过去30天内活跃的用户自动日更新
|
||||||
|
|
||||||
Data flow:
|
数据流:
|
||||||
|
|
||||||
* The **Client** sends a request to the **Web Server**
|
* **客户端**向 **Web服务器** 发送请求
|
||||||
* The **Web Server** forwards the request to the **Accounts API** server
|
* **Web服务器** 将请求转发到 **帐户API** 服务器
|
||||||
* The **Accounts API** server places a job on a **Queue** such as Amazon SQS or [RabbitMQ](https://www.rabbitmq.com/)
|
* **帐户API** 服务器将job放在 **队列** 中,如 Amazon SQS 或者 [RabbitMQ](https://www.rabbitmq.com/)
|
||||||
* Extracting transactions could take a while, so we'd probably want to do this [asynchronously with a queue](https://github.com/donnemartin/system-design-primer#asynchronism), although this introduces additional complexity
|
* 提取交易可能需要一段时间,我们可能希望[与队列异步](https://github.com/donnemartin/system-design-primer#asynchronism)地来做,虽然这会引入额外的复杂度。
|
||||||
* The **Transaction Extraction Service** does the following:
|
* **交易提取服务** 执行如下操作:
|
||||||
* Pulls from the **Queue** and extracts transactions for the given account from the financial institution, storing the results as raw log files in the **Object Store**
|
* 从 **Queue** 中拉取并从金融机构中提取给定用户的交易,将结果作为原始日志文件存储在 **对象存储区**。
|
||||||
* Uses the **Category Service** to categorize each transaction
|
* 使用 **分类服务** 来分类每个交易
|
||||||
* Uses the **Budget Service** to calculate aggregate monthly spending by category
|
* 使用 **预算服务** 来按类别计算每月总支出
|
||||||
* The **Budget Service** uses the **Notification Service** to let users know if they are nearing or have exceeded their budget
|
* **预算服务** 使用 **通知服务** 让用户知道他们是否接近或者已经超出预算
|
||||||
* Updates the **SQL Database** `transactions` table with categorized transactions
|
* 更新具有分类交易的 **SQL数据库** 的`transactions`表
|
||||||
* Updates the **SQL Database** `monthly_spending` table with aggregate monthly spending by category
|
* 按类别更新 **SQL数据库** `monthly_spending`表的每月总支出
|
||||||
* Notifies the user the transactions have completed through the **Notification Service**:
|
* 通过 **通知服务** 提醒用户交易完成
|
||||||
* Uses a **Queue** (not pictured) to asynchronously send out notifications
|
* 使用一个 **队列** (没有画出来) 来异步发送通知
|
||||||
|
|
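The queue-based flow above could be sketched as follows. This is a minimal single-process illustration, not the actual service: Python's standard `queue.Queue` stands in for Amazon SQS or RabbitMQ, and `enqueue_extraction`, `drain_queue`, `fetch_transactions`, and `categorize` are hypothetical names introduced here.

```python
import queue


def enqueue_extraction(job_queue, account_id):
    """Accounts API side: place an extraction job on the queue and return."""
    job_queue.put({"account_id": account_id})


def drain_queue(job_queue, fetch_transactions, categorize):
    """Worker side: pull jobs until the queue is empty and process each one."""
    results = []
    while True:
        try:
            job = job_queue.get_nowait()
        except queue.Empty:
            break
        # Extract the raw transactions, then attach a category to each.
        for tx in fetch_transactions(job["account_id"]):
            results.append({**tx, "category": categorize(tx["seller"])})
        job_queue.task_done()
    return results


# Usage with stubbed-out collaborators (hypothetical data).
jobs = queue.Queue()
enqueue_extraction(jobs, account_id=1)
rows = drain_queue(
    jobs,
    fetch_transactions=lambda account_id: [{"seller": "Exxon", "amount": 50}],
    categorize=lambda seller: "gas",
)
```

The point of the split is that the API call returns as soon as the job is queued, while the slower extraction and categorization run separately in the worker.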
||||||
The `transactions` table could have the following structure:
|
`transactions`表应该具有如下结构:
|
||||||
|
|
||||||
```
|
```
|
||||||
id int NOT NULL AUTO_INCREMENT
|
id int NOT NULL AUTO_INCREMENT
|
||||||
|
@ -160,9 +160,9 @@ PRIMARY KEY(id)
|
||||||
FOREIGN KEY(user_id) REFERENCES users(id)
|
FOREIGN KEY(user_id) REFERENCES users(id)
|
||||||
```
|
```
|
||||||
|
|
||||||
We'll create an [index](https://github.com/donnemartin/system-design-primer#use-good-indices) on `id`, `user_id`, and `created_at`.
|
我们将在 `id`,`user_id`,和 `created_at`字段上创建[索引](https://github.com/donnemartin/system-design-primer#use-good-indices)。
|
||||||
|
|
||||||
The `monthly_spending` table could have the following structure:
|
`monthly_spending`表应该具有如下结构:
|
||||||
|
|
||||||
```
|
```
|
||||||
id int NOT NULL AUTO_INCREMENT
|
id int NOT NULL AUTO_INCREMENT
|
||||||
|
@ -174,13 +174,13 @@ PRIMARY KEY(id)
|
||||||
FOREIGN KEY(user_id) REFERENCES users(id)
|
FOREIGN KEY(user_id) REFERENCES users(id)
|
||||||
```
|
```
|
||||||
|
|
||||||
We'll create an [index](https://github.com/donnemartin/system-design-primer#use-good-indices) on `id` and `user_id`.
|
我们将在`id`,`user_id`字段上创建[索引](https://github.com/donnemartin/system-design-primer#use-good-indices)。
|
||||||
|
|
||||||
#### Category service
|
#### 分类服务
|
||||||
|
|
||||||
For the **Category Service**, we can seed a seller-to-category dictionary with the most popular sellers. If we estimate 50,000 sellers and estimate each entry to take less than 255 bytes, the dictionary would only take about 12 MB of memory.
|
对于 **分类服务**, 我们可以生成一个带有最受欢迎卖家的卖家-类别字典。如果我们估计50000个卖家,并估计每个条目占用少于255个字节,该字典只需要大约12MB内存。
|
||||||
|
|
||||||
**Clarify with your interviewer how much code you are expected to write**.
|
**告知你的面试官你准备写多少代码**。
|
||||||
|
|
||||||
```
|
```
|
||||||
class DefaultCategories(Enum):
|
class DefaultCategories(Enum):
|
||||||
|
@ -197,7 +197,7 @@ seller_category_map['Target'] = DefaultCategories.SHOPPING
|
||||||
...
|
...
|
||||||
```
|
```
|
||||||
|
|
||||||
For sellers not initially seeded in the map, we could use a crowdsourcing effort by evaluating the manual category overrides our users provide. We could use a heap to quickly look up the top manual override per seller in O(1) time.
|
对于一开始没有在映射中的卖家,我们可以通过评估用户提供的手动类别来进行众包。在O(1)时间内,我们可以用堆来快速查找每个卖家的顶端的手动覆盖。
|
||||||
|
|
||||||
```
|
```
|
||||||
class Categorizer(object):
|
class Categorizer(object):
|
||||||
|
@ -217,7 +217,7 @@ class Categorizer(object):
|
||||||
return None
|
return None
|
||||||
```
|
```
|
||||||
|
|
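One possible way to realize the heap-based lookup of the top manual override per seller is sketched below. `OverrideTracker` and its methods are hypothetical and not part of the code in this diff. It assumes override counts only ever increase, so the heap root always holds the current leader, and peeking at it is O(1).

```python
import heapq
from collections import defaultdict


class OverrideTracker:
    """Tracks manual category overrides per seller (hypothetical helper)."""

    def __init__(self):
        # seller -> category -> number of manual overrides seen
        self.counts = defaultdict(lambda: defaultdict(int))
        # seller -> min-heap of (-count, category); the root is the top override
        self.heaps = defaultdict(list)

    def record_override(self, seller, category):
        """Record one manual override in O(log n)."""
        self.counts[seller][category] += 1
        count = self.counts[seller][category]
        # Older entries for this category stay in the heap, but they always
        # sort after the newer, larger count, so they never reach the root.
        heapq.heappush(self.heaps[seller], (-count, category))

    def top_override(self, seller):
        """Return the most frequent override for the seller in O(1), or None."""
        heap = self.heaps.get(seller)
        if not heap:
            return None
        return heap[0][1]


tracker = OverrideTracker()
tracker.record_override("Trader Joes", "shopping")
tracker.record_override("Trader Joes", "food")
tracker.record_override("Trader Joes", "food")
top = tracker.top_override("Trader Joes")
```

The trade-off in this sketch is memory: each override adds a heap entry, so a production version would probably compact the heap periodically or cap it per seller.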
||||||
Transaction implementation:
|
交易实现:
|
||||||
|
|
||||||
```
|
```
|
||||||
class Transaction(object):
|
class Transaction(object):
|
||||||
|
@ -228,9 +228,10 @@ class Transaction(object):
|
||||||
self.amount = amount
|
self.amount = amount
|
||||||
```
|
```
|
||||||
|
|
||||||
### Use case: Service recommends a budget
|
### 用例:服务推荐预算
|
||||||
|
|
||||||
To start, we could use a generic budget template that allocates category amounts based on income tiers. Using this approach, we would not have to store the 100 million budget items identified in the constraints, only those that the user overrides. If a user overrides a budget category, we could store the override in the `TABLE budget_overrides`.
|
首先,我们可以使用根据收入等级分配每类别金额的通用预算模板。使用这种方法,我们不必存储在约束中标识的1亿个预算项目,只需存储用户覆盖的预算项目。如果用户覆盖预算类别,我们可以在
|
||||||
|
`TABLE budget_overrides`中存储此覆盖。
|
||||||
|
|
||||||
```
|
```
|
||||||
class Budget(object):
|
class Budget(object):
|
||||||
|
@ -252,26 +253,26 @@ class Budget(object):
|
||||||
self.categories_to_budget_map[category] = amount
|
self.categories_to_budget_map[category] = amount
|
||||||
```
|
```
|
||||||
|
|
||||||
For the **Budget Service**, we can potentially run SQL queries on the `transactions` table to generate the `monthly_spending` aggregate table. The `monthly_spending` table would likely have far fewer rows than the total 5 billion transactions, since users typically have many transactions per month.
|
对于 **预算服务** 而言,我们可以在`transactions`表上运行SQL查询以生成`monthly_spending`聚合表。由于用户通常每个月有很多交易,所以`monthly_spending`表的行数可能会少于总共50亿次交易的行数。
|
||||||
|
|
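The SQL aggregation mentioned above might look like the following sketch, which uses Python's built-in `sqlite3` module and an in-memory database. The column names (`user_id`, `created_at`, `category`, `amount`) are assumptions for illustration, since the full table definitions are not shown in this diff.

```python
import sqlite3

# In-memory stand-in for the SQL Database; column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    " user_id INTEGER, created_at TEXT, category TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?)",
    [
        (1, "2016-01-03", "shopping", 25),
        (1, "2016-01-17", "shopping", 100),
        (1, "2016-01-20", "gas", 50),
    ],
)

# Collapse individual transactions into one row per user, month, and category.
monthly_spending = conn.execute(
    "SELECT user_id, strftime('%Y-%m', created_at) AS month_year,"
    "       category, SUM(amount) AS amount"
    " FROM transactions"
    " GROUP BY user_id, month_year, category"
    " ORDER BY user_id, month_year, category"
).fetchall()
```

Three transactions collapse into two aggregate rows here, which illustrates why the aggregate table stays much smaller than the `transactions` table.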
||||||
As an alternative, we can run **MapReduce** jobs on the raw transaction files to:
|
作为替代,我们可以在原始交易文件上运行 **MapReduce** 作业来:
|
||||||
|
|
||||||
* Categorize each transaction
|
* 分类每个交易
|
||||||
* Generate aggregate monthly spending by category
|
* 按类别生成每月总支出
|
||||||
|
|
||||||
Running analyses on the transaction files could significantly reduce the load on the database.
|
对交易文件的运行分析可以显著减少数据库的负载。
|
||||||
|
|
||||||
We could call the **Budget Service** to re-run the analysis if the user updates a category.
|
如果用户更新类别,我们可以调用 **预算服务** 重新运行分析。
|
||||||
|
|
||||||
**Clarify with your interviewer how much code you are expected to write**.
|
**告知你的面试官你准备写多少代码**.
|
||||||
|
|
||||||
Sample log file format, tab delimited:
|
日志文件格式样例,以tab分割:
|
||||||
|
|
||||||
```
|
```
|
||||||
user_id timestamp seller amount
|
user_id timestamp seller amount
|
||||||
```
|
```
|
||||||
|
|
||||||
**MapReduce** implementation:
|
**MapReduce** 实现:
|
||||||
|
|
||||||
```
|
```
|
||||||
class SpendingByCategory(MRJob):
|
class SpendingByCategory(MRJob):
|
||||||
|
@ -282,26 +283,25 @@ class SpendingByCategory(MRJob):
|
||||||
...
|
...
|
||||||
|
|
||||||
def calc_current_year_month(self):
|
def calc_current_year_month(self):
|
||||||
"""Return the current year and month."""
|
"""返回当前年月"""
|
||||||
...
|
...
|
||||||
|
|
||||||
def extract_year_month(self, timestamp):
|
def extract_year_month(self, timestamp):
|
||||||
"""Return the year and month portions of the timestamp."""
|
"""返回时间戳的年,月部分"""
|
||||||
...
|
...
|
||||||
|
|
||||||
def handle_budget_notifications(self, key, total):
|
def handle_budget_notifications(self, key, total):
|
||||||
"""Call notification API if nearing or exceeded budget."""
|
"""如果接近或超出预算,调用通知API"""
|
||||||
...
|
...
|
||||||
|
|
||||||
def mapper(self, _, line):
|
def mapper(self, _, line):
|
||||||
"""Parse each log line, extract and transform relevant lines.
|
"""解析每个日志行,提取和转换相关行。
|
||||||
|
|
||||||
Argument line will be of the form:
|
参数行应为如下形式:
|
||||||
|
|
||||||
user_id timestamp seller amount
|
user_id timestamp seller amount
|
||||||
|
|
||||||
Using the categorizer to convert seller to category,
|
使用分类器来将卖家转换成类别,生成如下形式的key-value对:
|
||||||
emit key value pairs of the form:
|
|
||||||
|
|
||||||
(user_id, 2016-01, shopping), 25
|
(user_id, 2016-01, shopping), 25
|
||||||
(user_id, 2016-01, shopping), 100
|
(user_id, 2016-01, shopping), 100
|
||||||
|
@ -314,7 +314,7 @@ class SpendingByCategory(MRJob):
|
||||||
yield (user_id, period, category), amount
|
yield (user_id, period, category), amount
|
||||||
|
|
||||||
def reducer(self, key, values):
|
def reducer(self, key, values):
|
||||||
"""Sum values for each key.
|
"""将每个key对应的值求和。
|
||||||
|
|
||||||
(user_id, 2016-01, shopping), 125
|
(user_id, 2016-01, shopping), 125
|
||||||
(user_id, 2016-01, gas), 50
|
(user_id, 2016-01, gas), 50
|
||||||
|
@ -323,119 +323,117 @@ class SpendingByCategory(MRJob):
|
||||||
yield key, sum(values)
|
yield key, sum(values)
|
||||||
```
|
```
|
||||||
|
|
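To see the map and reduce steps without a MapReduce framework, here is a plain-Python sketch of the same idea. It is not the `MRJob` implementation above. `SELLER_TO_CATEGORY`, `map_line`, and `reduce_pairs` are hypothetical names, and the timestamps are assumed to start with `YYYY-MM`.

```python
from collections import defaultdict

# Hypothetical seed data standing in for the Category Service.
SELLER_TO_CATEGORY = {"Target": "shopping", "Exxon": "gas"}


def map_line(line):
    """Turn one tab-delimited log line into ((user_id, period, category), amount)."""
    user_id, timestamp, seller, amount = line.split("\t")
    period = timestamp[:7]  # assumes timestamps that start with 'YYYY-MM'
    category = SELLER_TO_CATEGORY.get(seller, "unknown")
    return (user_id, period, category), float(amount)


def reduce_pairs(pairs):
    """Sum the amounts for each (user_id, period, category) key."""
    totals = defaultdict(float)
    for key, amount in pairs:
        totals[key] += amount
    return dict(totals)


lines = [
    "42\t2016-01-03\tTarget\t25",
    "42\t2016-01-17\tTarget\t100",
    "42\t2016-01-20\tExxon\t50",
]
totals = reduce_pairs(map_line(line) for line in lines)
```

In a real MapReduce job the framework groups the mapper output by key before calling the reducer; the dictionary in `reduce_pairs` plays that role here.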
||||||
## Step 4: Scale the design
|
## 第四步:设计扩展
|
||||||
|
|
||||||
> Identify and address bottlenecks, given the constraints.
|
> 根据限制条件,找到并解决瓶颈。
|
||||||
|
|
||||||

|

|
||||||
|
|
||||||
**Important: Do not simply jump right into the final design from the initial design!**
|
**重要提示:不要从最初设计直接跳到最终设计中!**
|
||||||
|
|
||||||
State you would 1) **Benchmark/Load Test**, 2) **Profile** for bottlenecks, 3) address bottlenecks while evaluating alternatives and trade-offs, and 4) repeat. See [Design a system that scales to millions of users on AWS](../scaling_aws/README.md) for an example of how to iteratively scale the initial design.
|
现在你要 1) **基准测试、负载测试**。2) **分析、描述**性能瓶颈。3) 在解决瓶颈问题的同时,评估替代方案、权衡利弊。4) 重复以上步骤。请阅读[「设计一个系统,并将其扩大到为数以百万计的 AWS 用户服务」](../scaling_aws/README.md) 来了解如何逐步扩大初始设计。
|
||||||
|
|
||||||
It's important to discuss what bottlenecks you might encounter with the initial design and how you might address each of them. For example, what issues are addressed by adding a **Load Balancer** with multiple **Web Servers**? **CDN**? **Master-Slave Replicas**? What are the alternatives and **Trade-Offs** for each?
|
讨论初始设计可能遇到的瓶颈及相关解决方案是很重要的。例如加上一个配置多台 **Web 服务器**的**负载均衡器**是否能够解决问题?**CDN**呢?**主从复制**呢?它们各自的替代方案和需要**权衡**的利弊又有什么呢?
|
||||||
|
|
||||||
We'll introduce some components to complete the design and to address scalability issues. To reduce clutter, internal load balancers are not shown.
|
我们将会介绍一些组件来完成设计,并解决架构扩张问题。内置的负载均衡器将不做讨论以节省篇幅。
|
||||||
|
|
||||||
*To avoid repeating discussions*, refer to the following [system design topics](https://github.com/donnemartin/system-design-primer#index-of-system-design-topics) for main talking points, tradeoffs, and alternatives:
|
**为了避免重复讨论**,请参考[系统设计主题索引](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#系统设计主题的索引)相关部分来了解其要点、方案的权衡取舍以及可选的替代方案。
|
||||||
|
|
||||||
* [DNS](https://github.com/donnemartin/system-design-primer#domain-name-system)
|
* [DNS](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#域名系统)
|
||||||
* [CDN](https://github.com/donnemartin/system-design-primer#content-delivery-network)
|
* [负载均衡器](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#负载均衡器)
|
||||||
* [Load balancer](https://github.com/donnemartin/system-design-primer#load-balancer)
|
* [水平拓展](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#水平扩展)
|
||||||
* [Horizontal scaling](https://github.com/donnemartin/system-design-primer#horizontal-scaling)
|
* [反向代理(web 服务器)](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#反向代理web-服务器)
|
||||||
* [Web server (reverse proxy)](https://github.com/donnemartin/system-design-primer#reverse-proxy-web-server)
|
* [API 服务(应用层)](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#应用层)
|
||||||
* [API server (application layer)](https://github.com/donnemartin/system-design-primer#application-layer)
|
* [缓存](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#缓存)
|
||||||
* [Cache](https://github.com/donnemartin/system-design-primer#cache)
|
* [关系型数据库管理系统 (RDBMS)](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#关系型数据库管理系统rdbms)
|
||||||
* [Relational database management system (RDBMS)](https://github.com/donnemartin/system-design-primer#relational-database-management-system-rdbms)
|
* [SQL 故障主从切换](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#故障切换)
|
||||||
* [SQL write master-slave failover](https://github.com/donnemartin/system-design-primer#fail-over)
|
* [主从复制](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#主从复制)
|
||||||
* [Master-slave replication](https://github.com/donnemartin/system-design-primer#master-slave-replication)
|
* [一致性模式](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#一致性模式)
|
||||||
* [Asynchronism](https://github.com/donnemartin/system-design-primer#aysnchronism)
|
* [可用性模式](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#可用性模式)
|
||||||
* [Consistency patterns](https://github.com/donnemartin/system-design-primer#consistency-patterns)
|
|
||||||
* [Availability patterns](https://github.com/donnemartin/system-design-primer#availability-patterns)
|
|
||||||
|
|
||||||
We'll add an additional use case: **User** accesses summaries and transactions.
|
我们将增加一个额外的用例:**用户** 访问摘要和交易数据。
|
||||||
|
|
||||||
User sessions, aggregate stats by category, and recent transactions could be placed in a **Memory Cache** such as Redis or Memcached.
|
用户会话,按类别统计的统计信息,以及最近的事务可以放在 **内存缓存**(如 Redis 或 Memcached )中。
|
||||||
|
|
||||||
* The **Client** sends a read request to the **Web Server**
|
* **客户端** 发送读请求给 **Web 服务器**
|
||||||
* The **Web Server** forwards the request to the **Read API** server
|
* **Web 服务器** 转发请求到 **读 API** 服务器
|
||||||
* Static content can be served from the **Object Store** such as S3, which is cached on the **CDN**
|
* 静态内容可通过 **对象存储** 比如缓存在 **CDN** 上的 S3 来服务
|
||||||
* The **Read API** server does the following:
|
* **读 API** 服务器做如下动作:
|
||||||
* Checks the **Memory Cache** for the content
|
* 检查 **内存缓存** 的内容
|
||||||
* If the url is in the **Memory Cache**, returns the cached contents
|
* 如果URL在 **内存缓存**中,返回缓存的内容
|
||||||
* Else
|
* 否则
|
||||||
* If the url is in the **SQL Database**, fetches the contents
|
* 如果URL在 **SQL 数据库**中,获取该内容
|
||||||
* Updates the **Memory Cache** with the contents
|
* 以其内容更新 **内存缓存**
|
||||||
|
|
||||||
Refer to [When to update the cache](https://github.com/donnemartin/system-design-primer#when-to-update-the-cache) for tradeoffs and alternatives. The approach above describes [cache-aside](https://github.com/donnemartin/system-design-primer#cache-aside).
|
参考 [何时更新缓存](https://github.com/donnemartin/system-design-primer#when-to-update-the-cache) 中权衡和替代的内容。以上方法描述了 [cache-aside缓存模式](https://github.com/donnemartin/system-design-primer#cache-aside).
|
||||||
|
|
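The cache-aside read path described above can be sketched in a few lines. Plain dictionaries stand in for the **Memory Cache** and the **SQL Database**, and the `ReadAPI` class and its key format are assumptions made for illustration.

```python
class ReadAPI:
    """Cache-aside reads: check the cache first, fall back to the database."""

    def __init__(self, cache, db):
        self.cache = cache  # stand-in for Redis or Memcached
        self.db = db        # stand-in for the SQL Database

    def get(self, key):
        value = self.cache.get(key)
        if value is not None:
            return value  # cache hit
        value = self.db.get(key)  # cache miss: read from the database
        if value is not None:
            self.cache[key] = value  # populate the cache for later reads
        return value


cache = {}
db = {"user:1:summary": {"shopping": 125, "gas": 50}}
api = ReadAPI(cache, db)
first = api.get("user:1:summary")   # miss, loaded from db and cached
second = api.get("user:1:summary")  # hit, served from cache
```

A real deployment would also set a time-to-live on cached entries and invalidate them when the underlying rows change, which this sketch leaves out.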
||||||
Instead of keeping the `monthly_spending` aggregate table in the **SQL Database**, we could create a separate **Analytics Database** using a data warehousing solution such as Amazon Redshift or Google BigQuery.
|
我们可以使用诸如 Amazon Redshift 或者 Google BigQuery 等数据仓库解决方案,而不是将`monthly_spending`聚合表保留在 **SQL 数据库** 中。
|
||||||
|
|
||||||
We might only want to store a month of `transactions` data in the database, while storing the rest in a data warehouse or in an **Object Store**. An **Object Store** such as Amazon S3 can comfortably handle the constraint of 250 GB of new content per month.
|
我们可能只想在数据库中存储一个月的`交易`数据,而将其余数据存储在数据仓库或者 **对象存储区** 中。**对象存储区** (如Amazon S3) 能够舒服地解决每月250GB新内容的限制。
|
||||||
|
|
||||||
To address the 200 *average* read requests per second (higher at peak), traffic for popular content should be handled by the **Memory Cache** instead of the database. The **Memory Cache** is also useful for handling the unevenly distributed traffic and traffic spikes. The **SQL Read Replicas** should be able to handle the cache misses, as long as the replicas are not bogged down with replicating writes.
|
为了解决每秒 *平均* 200 次读请求数(峰值时更高),受欢迎的内容的流量应由 **内存缓存** 而不是数据库来处理。 **内存缓存** 也可用于处理不均匀分布的流量和流量尖峰。 只要副本不陷入重复写入的困境,**SQL 读副本** 应该能够处理高速缓存未命中。
|
||||||
|
|
||||||
2,000 *average* transaction writes per second (higher at peak) might be tough for a single **SQL Write Master-Slave**. We might need to employ additional SQL scaling patterns:
|
*平均* 2000次交易写入每秒(峰值时更高)对于单个 **SQL 写入主-从服务** 来说可能是棘手的。我们可能需要考虑其它的 SQL 性能拓展技术:
|
||||||
|
|
||||||
* [Federation](https://github.com/donnemartin/system-design-primer#federation)
|
* [联合](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#联合)
|
||||||
* [Sharding](https://github.com/donnemartin/system-design-primer#sharding)
|
* [分片](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#分片)
|
||||||
* [Denormalization](https://github.com/donnemartin/system-design-primer#denormalization)
|
* [非规范化](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#非规范化)
|
||||||
* [SQL Tuning](https://github.com/donnemartin/system-design-primer#sql-tuning)
|
* [SQL 调优](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#sql-调优)
|
||||||
|
|
||||||
We should also consider moving some data to a **NoSQL Database**.
|
我们也可以考虑将一些数据移至 **NoSQL 数据库**。
|
||||||
|
|
||||||
## Additional talking points
|
## 其它要点
|
||||||
|
|
||||||
> Additional topics to dive into, depending on the problem scope and time remaining.
|
> 是否深入这些额外的主题,取决于你的问题范围和剩下的时间。
|
||||||
|
|
||||||
#### NoSQL
|
#### NoSQL
|
||||||
|
|
||||||
* [Key-value store](https://github.com/donnemartin/system-design-primer#key-value-store)
|
* [键-值存储](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#键-值存储)
|
||||||
* [Document store](https://github.com/donnemartin/system-design-primer#document-store)
|
* [文档类型存储](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#文档类型存储)
|
||||||
* [Wide column store](https://github.com/donnemartin/system-design-primer#wide-column-store)
|
* [列型存储](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#列型存储)
|
||||||
* [Graph database](https://github.com/donnemartin/system-design-primer#graph-database)
|
* [图数据库](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#图数据库)
|
||||||
* [SQL vs NoSQL](https://github.com/donnemartin/system-design-primer#sql-or-nosql)
|
* [SQL vs NoSQL](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#sql-还是-nosql)
|
||||||
|
|
||||||
### Caching
|
### 缓存
|
||||||
|
|
||||||
* Where to cache
|
* 在哪缓存
|
||||||
* [Client caching](https://github.com/donnemartin/system-design-primer#client-caching)
|
* [客户端缓存](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#客户端缓存)
|
||||||
* [CDN caching](https://github.com/donnemartin/system-design-primer#cdn-caching)
|
* [CDN 缓存](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#cdn-缓存)
|
||||||
* [Web server caching](https://github.com/donnemartin/system-design-primer#web-server-caching)
|
* [Web 服务器缓存](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#web-服务器缓存)
|
||||||
* [Database caching](https://github.com/donnemartin/system-design-primer#database-caching)
|
* [数据库缓存](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#数据库缓存)
|
||||||
* [Application caching](https://github.com/donnemartin/system-design-primer#application-caching)
|
* [应用缓存](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#应用缓存)
|
||||||
* What to cache
|
* 什么需要缓存
|
||||||
* [Caching at the database query level](https://github.com/donnemartin/system-design-primer#caching-at-the-database-query-level)
|
* [数据库查询级别的缓存](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#数据库查询级别的缓存)
|
||||||
* [Caching at the object level](https://github.com/donnemartin/system-design-primer#caching-at-the-object-level)
|
* [对象级别的缓存](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#对象级别的缓存)
|
||||||
* When to update the cache
|
* 何时更新缓存
|
||||||
* [Cache-aside](https://github.com/donnemartin/system-design-primer#cache-aside)
|
* [缓存模式](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#缓存模式)
|
||||||
* [Write-through](https://github.com/donnemartin/system-design-primer#write-through)
|
* [直写模式](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#直写模式)
|
||||||
* [Write-behind (write-back)](https://github.com/donnemartin/system-design-primer#write-behind-write-back)
|
* [回写模式](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#回写模式)
|
||||||
* [Refresh ahead](https://github.com/donnemartin/system-design-primer#refresh-ahead)
|
* [刷新](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#刷新)
|
||||||
|
|
||||||
### Asynchronism and microservices
|
### 异步与微服务
|
||||||
|
|
||||||
* [Message queues](https://github.com/donnemartin/system-design-primer#message-queues)
|
* [消息队列](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#消息队列)
|
||||||
* [Task queues](https://github.com/donnemartin/system-design-primer#task-queues)
|
* [任务队列](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#任务队列)
|
||||||
* [Back pressure](https://github.com/donnemartin/system-design-primer#back-pressure)
|
* [背压](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#背压)
|
||||||
* [Microservices](https://github.com/donnemartin/system-design-primer#microservices)
|
* [微服务](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#微服务)
|
||||||
|
|
||||||
### Communications
|
### 通信
|
||||||
|
|
||||||
* Discuss tradeoffs:
|
* 可权衡选择的方案:
|
||||||
* External communication with clients - [HTTP APIs following REST](https://github.com/donnemartin/system-design-primer#representational-state-transfer-rest)
|
* 与客户端的外部通信 - [使用 REST 作为 HTTP API](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#表述性状态转移rest)
|
||||||
* Internal communications - [RPC](https://github.com/donnemartin/system-design-primer#remote-procedure-call-rpc)
|
* 服务器内部通信 - [RPC](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#远程过程调用协议rpc)
|
||||||
* [Service discovery](https://github.com/donnemartin/system-design-primer#service-discovery)
|
* [服务发现](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#服务发现)
|
||||||
|
|
||||||
### Security
|
### 安全性
|
||||||
|
|
||||||
Refer to the [security section](https://github.com/donnemartin/system-design-primer#security).
|
请参阅[「安全」](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#安全)一章。
|
||||||
|
|
||||||
### Latency numbers
|
### 延迟数值
|
||||||
|
|
||||||
See [Latency numbers every programmer should know](https://github.com/donnemartin/system-design-primer#latency-numbers-every-programmer-should-know).
|
请参阅[「每个程序员都应该知道的延迟数」](https://github.com/donnemartin/system-design-primer/blob/master/README-zh-Hans.md#每个程序员都应该知道的延迟数)。
|
||||||
|
|
||||||
### Ongoing
|
### 持续探讨
|
||||||
|
|
||||||
* Continue benchmarking and monitoring your system to address bottlenecks as they come up
|
* 持续进行基准测试并监控你的系统,以解决他们提出的瓶颈问题。
|
||||||
* Scaling is an iterative process
|
* 架构拓展是一个迭代的过程。
|
||||||
|
|