diff --git a/README-zh-Hant.md b/README-zh-Hant.md index b9ea2b3b..90a08dce 100755 --- a/README-zh-Hant.md +++ b/README-zh-Hant.md @@ -1364,7 +1364,7 @@ TCP 对于需要高可靠性但时间紧迫的应用程序很有用。比如包 - 你需要数据完好无损。 - 你想对网络吞吐量自动进行最佳评估。 -### User datagram protocol (UDP) +### 用户数据报协议(UDP)
@@ -1372,28 +1372,28 @@ TCP 对于需要高可靠性但时间紧迫的应用程序很有用。比如包 Source: How to make a multiplayer game
-UDP is connectionless. Datagrams (analogous to packets) are guaranteed only at the datagram level. Datagrams might reach their destination out of order or not at all. UDP does not support congestion control. Without the guarantees that TCP support, UDP is generally more efficient. +UDP 是无连接的。数据报(类似于数据包)只在数据报级别有保证。数据报可能会无序的到达目的地,也有可能会遗失。UDP 不支持拥塞控制。虽然不如 TCP 那样有保证,但 UDP 通常效率更高。 -UDP can broadcast, sending datagrams to all devices on the subnet. This is useful with [DHCP](https://en.wikipedia.org/wiki/Dynamic_Host_Configuration_Protocol) because the client has not yet received an IP address, thus preventing a way for TCP to stream without the IP address. +UDP 可以通过广播将数据报发送至子网内的所有设备。这对 [DHCP](https://en.wikipedia.org/wiki/Dynamic_Host_Configuration_Protocol) 很有用,因为子网内的设备还没有分配 IP 地址,而 IP 对于 TCP 是必须的。 -UDP is less reliable but works well in real time use cases such as VoIP, video chat, streaming, and realtime multiplayer games. +UDP 可靠性更低但适合用在网络电话、视频聊天,流媒体和实时多人游戏上。 -Use UDP over TCP when: +以下情况使用 UDP 代替 TCP: -* You need the lowest latency -* Late data is worse than loss of data -* You want to implement your own error correction +* 你需要低延迟 +* 相对于数据丢失更糟的是数据延迟 +* 你想实现自己的错误校正方法 -#### Source(s) and further reading: TCP and UDP +#### 来源及延伸阅读:TCP 与 UDP -* [Networking for game programming](http://gafferongames.com/networking-for-game-programmers/udp-vs-tcp/) -* [Key differences between TCP and UDP protocols](http://www.cyberciti.biz/faq/key-differences-between-tcp-and-udp-protocols/) -* [Difference between TCP and UDP](http://stackoverflow.com/questions/5970383/difference-between-tcp-and-udp) -* [Transmission control protocol](https://en.wikipedia.org/wiki/Transmission_Control_Protocol) -* [User datagram protocol](https://en.wikipedia.org/wiki/User_Datagram_Protocol) -* [Scaling memcache at Facebook](http://www.cs.bu.edu/~jappavoo/jappavoo.github.com/451/papers/memcache-fb.pdf) +* [游戏编程的网络](http://gafferongames.com/networking-for-game-programmers/udp-vs-tcp/) +* [TCP 与 UDP 的关键区别](http://www.cyberciti.biz/faq/key-differences-between-tcp-and-udp-protocols/) +* [TCP 与 UDP 的不同](http://stackoverflow.com/questions/5970383/difference-between-tcp-and-udp) +* [传输控制协议](https://en.wikipedia.org/wiki/Transmission_Control_Protocol) +* [用户数据报协议](https://en.wikipedia.org/wiki/User_Datagram_Protocol) +* [Memcache 在 Facebook 的扩展](http://www.cs.bu.edu/~jappavoo/jappavoo.github.com/451/papers/memcache-fb.pdf) -### Remote procedure call (RPC) +### 远程过程调用协议(RPC)@@ -1401,18 +1401,17 @@ Use UDP over TCP when: Source: Crack the system design interview
-In an RPC, a client causes a procedure to execute on a different address space, usually a remote server. The procedure is coded as if it were a local procedure call, abstracting away the details of how to communicate with the server from the client program. Remote calls are usually slower and less reliable than local calls so it is helpful to distinguish RPC calls from local calls. Popular RPC frameworks include [Protobuf](https://developers.google.com/protocol-buffers/), [Thrift](https://thrift.apache.org/), and [Avro](https://avro.apache.org/docs/current/). +在 RPC 中,客户端会去调用另一个地址空间(通常是一个远程服务器)里的方法。调用代码看起来就像是调用的是一个本地方法,客户端和服务器交互的具体过程被抽象。远程调用相对于本地调用一般较慢而且可靠性更差,因此区分两者是有帮助的。热门的 RPC 框架包括 [Protobuf](https://developers.google.com/protocol-buffers/), [Thrift](https://thrift.apache.org/) 和 [Avro](https://avro.apache.org/docs/current/)。 -RPC is a request-response protocol: +RPC 是一个“请求-响应”协议: -* **Client program** - Calls the client stub procedure. The parameters are pushed onto the stack like a local procedure call. -* **Client stub procedure** - Marshals (packs) procedure id and arguments into a request message. -* **Client communication module** - OS sends the message from the client to the server. -* **Server communication module** - OS passes the incoming packets to the server stub procedure. -* **Server stub procedure** - Unmarshalls the results, calls the server procedure matching the procedure id and passes the given arguments. -* The server response repeats the steps above in reverse order. +* **客户端程序** ── 调用客户端存根程序。就像调用本地方法一样,参数会被压入栈中。 +* **客户端 stub 程序** ── 将请求过程的 id 和参数打包进请求信息中。 +* **客户端通信模块** ── 将信息从客户端发送至服务端。 +* **服务端通信模块** ── 将接受的包传给服务端存根程序。 +* **服务端 stub 程序** ── 将结果解包,依据过程 id 调用服务端方法并将参数传递过去。 -Sample RPC calls: +RPC 调用示例: ``` GET /someoperation?data=anId @@ -1424,36 +1423,37 @@ POST /anotheroperation } ``` -RPC is focused on exposing behaviors. RPCs are often used for performance reasons with internal communications, as you can hand-craft native calls to better fit your use cases. +RPC 专注于暴露方法。RPC 通常用于处理内部通讯的性能问题,这样你可以手动处理本地调用以更好的适应你的情况。 -Choose a Native Library aka SDK when: -* You know your target platform. -* You want to control how your "logic" is accessed -* You want to control how error control happens off your library -* Performance and end user experience is your primary concern +当以下情况时选择本地库(也就是 SDK): -HTTP APIs following **REST** tend to be used more often for public APIs. +* 你知道你的目标平台。 +* 你想控制如何访问你的“逻辑”。 +* 你想对发生在你的库中的错误进行控制。 +* 性能和终端用户体验是你最关心的事。 -#### Disadvantage(s): RPC +遵循 **REST** 的 HTTP API 往往更适用于公共 API。 -* RPC clients become tightly coupled to the service implementation. -* A new API must be defined for every new operation or use case. -* It can be difficult to debug RPC. -* You might not be able to leverage existing technologies out of the box. For example, it might require additional effort to ensure [RPC calls are properly cached](http://etherealbits.com/2012/12/debunking-the-myths-of-rpc-rest/) on caching servers such as [Squid](http://www.squid-cache.org/). +#### 缺点:RPC -### Representational state transfer (REST) +* RPC 客户端与服务实现捆绑地很紧密。 +* 一个新的 API 必须在每一个操作或者用例中定义。 +* RPC 很难调试。 +* 你可能没办法很方便的去修改现有的技术。举个例子,如果你希望在 [Squid](http://www.squid-cache.org/) 这样的缓存服务器上确保 [RPC 被正确缓存](http://etherealbits.com/2012/12/debunking-the-myths-of-rpc-rest/)的话可能需要一些额外的努力了。 -REST is an architectural style enforcing a client/server model where the client acts on a set of resources managed by the server. The server provides a representation of resources and actions that can either manipulate or get a new representation of resources. All communication must be stateless and cacheable. +### 表述性状态转移(REST) -There are four qualities of a RESTful interface: +REST 是一种强制的客户端/服务端架构设计模型,客户端基于服务端管理的一系列资源操作。服务端提供修改或获取资源的接口。所有的通信必须是无状态和可缓存的。 -* **Identify resources (URI in HTTP)** - use the same URI regardless of any operation. -* **Change with representations (Verbs in HTTP)** - use verbs, headers, and body. -* **Self-descriptive error message (status response in HTTP)** - Use status codes, don't reinvent the wheel. -* **[HATEOAS](http://restcookbook.com/Basics/hateoas/) (HTML interface for HTTP)** - your web service should be fully accessible in a browser. +RESTful 接口有四条规则: -Sample REST calls: +* **标志资源(HTTP 里的 URI)** ── 无论什么操作都使用同一个 URI。 +* **表示的改变(HTTP 的动作)** ── 使用动作, headers 和 body。 +* **可自我描述的错误信息(HTTP 中的 status code)** ── 使用状态码,不要重新造轮子。 +* **[HATEOAS](http://restcookbook.com/Basics/hateoas/)(HTTP 中的HTML 接口)** ── 你的 web 服务器应该能够通过浏览器访问。 + +REST 请求的例子: ``` GET /someresources/anId @@ -1462,63 +1462,63 @@ PUT /someresources/anId {"anotherdata": "another value"} ``` -REST is focused on exposing data. It minimizes the coupling between client/server and is often used for public HTTP APIs. REST uses a more generic and uniform method of exposing resources through URIs, [representation through headers](https://github.com/for-GET/know-your-http-well/blob/master/headers.md), and actions through verbs such as GET, POST, PUT, DELETE, and PATCH. Being stateless, REST is great for horizontal scaling and partitioning. +REST 关注于暴露数据。它减少了客户端/服务端的耦合程度,经常用于公共 HTTP API 接口设计。REST 使用更通常与规范化的方法来通过 URI 暴露资源,[通过 header 来表述](https://github.com/for-GET/know-your-http-well/blob/master/headers.md)并通过 GET, POST, PUT, DELETE 和 PATCH 这些动作来进行操作。因为无状态的特性,REST 易于横向扩展和隔离。 -#### Disadvantage(s): REST +#### 缺点:REST -* With REST being focused on exposing data, it might not be a good fit if resources are not naturally organized or accessed in a simple hierarchy. For example, returning all updated records from the past hour matching a particular set of events is not easily expressed as a path. With REST, it is likely to be implemented with a combination of URI path, query parameters, and possibly the request body. -* REST typically relies on a few verbs (GET, POST, PUT, DELETE, and PATCH) which sometimes doesn't fit your use case. For example, moving expired documents to the archive folder might not cleanly fit within these verbs. -* Fetching complicated resources with nested hierarchies requires multiple round trips between the client and server to render single views, e.g. fetching content of a blog entry and the comments on that entry. For mobile applications operating in variable network conditions, these multiple roundtrips are highly undesirable. -* Over time, more fields might be added to an API response and older clients will receive all new data fields, even those that they do not need, as a result, it bloats the payload size and leads to larger latencies. +* 由于 REST 将重点放在暴露数据,所以当资源不是自然组织的或者结构复杂的时候它可能无法很好的适应。举个例子,返回过去一小时中与特定事件集匹配的更新记录这种操作就很难表示为路径。使用 REST,可能会使用 URI 路径,查询参数和可能的请求体来实现。 +* REST 一般依赖几个动作(GET, POST, PUT, DELETE 和 PATCH),但有时候仅仅这些没法满足你的需要。举个例子,将过期的文档移动到归档文件夹里去,这样的操作可能没法简单的用上面这几个 verbs 表达。 +* 为了渲染单个页面,获取被嵌套在层级结构中的复杂资源需要客户端,服务器之间多次往返通信。例如,获取博客内容及其关联评论。对于使用不确定网络环境的移动应用来说,这些多次往返通信是非常麻烦的。 +* 随着时间的推移,更多的字段可能会被添加到 API 响应中,较旧的客户端将会接收到所有新的数据字段,即使是那些它们不需要的字段,结果它会增加负载大小并引起更大的延迟。 -### RPC and REST calls comparison +### RPC 与 REST 比较 -| Operation | RPC | REST | +| 操作 | RPC | REST | | ------------------------------- | ---------------------------------------- | ---------------------------------------- | -| Signup | **POST** /signup | **POST** /persons | -| Resign | **POST** /resignSource: Do you really know why you prefer REST over RPC
-#### Source(s) and further reading: REST and RPC +#### 来源及延伸阅读:REST 与 RPC -* [Do you really know why you prefer REST over RPC](https://apihandyman.io/do-you-really-know-why-you-prefer-rest-over-rpc/) -* [When are RPC-ish approaches more appropriate than REST?](http://programmers.stackexchange.com/a/181186) +* [你知道你为什么更喜欢 REST 而不是 RPC 吗](https://apihandyman.io/do-you-really-know-why-you-prefer-rest-over-rpc/) +* [什么时候 RPC 比 REST 更合适?](http://programmers.stackexchange.com/a/181186) * [REST vs JSON-RPC](http://stackoverflow.com/questions/15056878/rest-vs-json-rpc) -* [Debunking the myths of RPC and REST](http://etherealbits.com/2012/12/debunking-the-myths-of-rpc-rest/) -* [What are the drawbacks of using REST](https://www.quora.com/What-are-the-drawbacks-of-using-RESTful-APIs) -* [Crack the system design interview](http://www.puncsky.com/blog/2016/02/14/crack-the-system-design-interview/) +* [揭开 RPC 和 REST 的神秘面纱](http://etherealbits.com/2012/12/debunking-the-myths-of-rpc-rest/) +* [使用 REST 的缺点是什么](https://www.quora.com/What-are-the-drawbacks-of-using-RESTful-APIs) +* [破解系统设计面试](http://www.puncsky.com/blog/2016/02/14/crack-the-system-design-interview/) * [Thrift](https://code.facebook.com/posts/1468950976659943/) -* [Why REST for internal use and not RPC](http://arstechnica.com/civis/viewtopic.php?t=1190508) +* [为什么在内部使用 REST 而不是 RPC](http://arstechnica.com/civis/viewtopic.php?t=1190508) -## Security +## 安全 -This section could use some updates. Consider [contributing](#contributing)! +这一部分需要更多内容。[一起来吧](#contributing)! -Security is a broad topic. Unless you have considerable experience, a security background, or are applying for a position that requires knowledge of security, you probably won't need to know more than the basics: +安全是一个宽泛的话题。除非你有相当的经验、安全方面背景或者正在申请的职位要求安全知识,你不需要了解安全基础知识以外的内容: -* Encrypt in transit and at rest. -* Sanitize all user inputs or any input parameters exposed to user to prevent [XSS](https://en.wikipedia.org/wiki/Cross-site_scripting) and [SQL injection](https://en.wikipedia.org/wiki/SQL_injection). -* Use parameterized queries to prevent SQL injection. -* Use the principle of [least privilege](https://en.wikipedia.org/wiki/Principle_of_least_privilege). +* 在运输和等待过程中加密 +* 对所有的用户输入和从用户那里发来的参数进行处理以防止 [XSS](https://en.wikipedia.org/wiki/Cross-site_scripting) 和 [SQL 注入](https://en.wikipedia.org/wiki/SQL_injection)。 +* 使用参数化的查询来防止 SQL 注入。 +* 使用[最小权限原则](https://en.wikipedia.org/wiki/Principle_of_least_privilege)。 -### Source(s) and further reading +### 来源及延伸阅读 -* [Security guide for developers](https://github.com/FallibleInc/security-guide-for-developers) +* [为开发者准备的安全引导](https://github.com/FallibleInc/security-guide-for-developers) * [OWASP top ten](https://www.owasp.org/index.php/OWASP_Top_Ten_Cheat_Sheet) -## Appendix +## 附录 -You'll sometimes be asked to do 'back-of-the-envelope' estimates. For example, you might need to determine how long it will take to generate 100 image thumbnails from disk or how much memory a data structure will take. The **Powers of two table** and **Latency numbers every programmer should know** are handy references. +一些时候你会被要求做出保守估计。比如,你可能需要估计从磁盘中生成 100 张图片的缩略图需要的时间或者一个数据结构需要多少的内存。**2 的次方表**和**每个开发者都需要知道的一些时间数据**(译注:OSChina 上有这篇文章的[译文](https://www.oschina.net/news/30009/every-programmer-should-know))都是一些很方便的参考资料。 -### Powers of two table +### 2 的次方表 ``` Power Exact Value Approx Value Bytes @@ -1533,11 +1533,11 @@ Power Exact Value Approx Value Bytes 40 1,099,511,627,776 1 trillion 1 TB ``` -#### Source(s) and further reading +#### 来源及延伸阅读 -* [Powers of two](https://en.wikipedia.org/wiki/Power_of_two) +* [2 的次方](https://en.wikipedia.org/wiki/Power_of_two) -### Latency numbers every programmer should know +### 每个程序员都应该知道的延迟数 ``` Latency Comparison Numbers @@ -1565,58 +1565,57 @@ Notes 1 ms = 10^-3 seconds = 1,000 us = 1,000,000 ns ``` -Handy metrics based on numbers above: +基于上述数字的指标: +* 从磁盘以 30 MB/s 的速度顺序读取 +* 以 100 MB/s 从 1 Gbps 的以太网顺序读取 +* 从 SSD 以 1 GB/s 的速度读取 +* 以 4 GB/s 的速度从主存读取 +* 每秒能绕地球 6-7 圈 +* 数据中心内每秒有 2,000 次往返 -* Read sequentially from disk at 30 MB/s -* Read sequentially from 1 Gbps Ethernet at 100 MB/s -* Read sequentially from SSD at 1 GB/s -* Read sequentially from main memory at 4 GB/s -* 6-7 world-wide round trips per second -* 2,000 round trips per second within a data center - -#### Latency numbers visualized +#### 延迟数可视化 ![](https://camo.githubusercontent.com/77f72259e1eb58596b564d1ad823af1853bc60a3/687474703a2f2f692e696d6775722e636f6d2f6b307431652e706e67) -#### Source(s) and further reading +#### 来源及延伸阅读 -* [Latency numbers every programmer should know - 1](https://gist.github.com/jboner/2841832) -* [Latency numbers every programmer should know - 2](https://gist.github.com/hellerbarde/2843375) -* [Designs, lessons, and advice from building large distributed systems](http://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf) -* [Software Engineering Advice from Building Large-Scale Distributed Systems](https://static.googleusercontent.com/media/research.google.com/en//people/jeff/stanford-295-talk.pdf) +* [每个程序员都应该知道的延迟数 — 1](https://gist.github.com/jboner/2841832) +* [每个程序员都应该知道的延迟数 — 2](https://gist.github.com/hellerbarde/2843375) +* [关于建设大型分布式系统的的设计方案、课程和建议](http://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf) +* [关于建设大型可拓展分布式系统的软件工程咨询](https://static.googleusercontent.com/media/research.google.com/en//people/jeff/stanford-295-talk.pdf) -### Additional system design interview questions +### 额外的系统设计面试问题 -> Common system design interview questions, with links to resources on how to solve each. +> 常见的系统设计面试问题,给出了如何解决的方案链接 -| Question | Reference(s) | -| ---------------------------------------- | ---------------------------------------- | -| Design a file sync service like Dropbox | [youtube.com](https://www.youtube.com/watch?v=PE4gwstWhmc) | -| Design a search engine like Google | [queue.acm.org](http://queue.acm.org/detail.cfm?id=988407)@@ -1624,13 +1623,13 @@ Handy metrics based on numbers above: Source: Twitter timelines at scale
-**Don't focus on nitty gritty details for the following articles, instead:** +** 不要专注于以下文章的细节,专注于以下方面: ** -* Identify shared principles, common technologies, and patterns within these articles -* Study what problems are solved by each component, where it works, where it doesn't -* Review the lessons learned +* 发现这些文章中的共同的原则、技术和模式。 +* 学习每个组件解决哪些问题,什么情况下使用,什么情况下不适用 +* 复习学过的文章 -| Type | System | Reference(s) | +| 类型 | 系统 | 引用 | | --------------- | ---------------------------------------- | ---------------------------------------- | | Data processing | **MapReduce** - Distributed data processing from Google | [research.google.com](http://static.googleusercontent.com/media/research.google.com/zh-CN/us/archive/mapreduce-osdi04.pdf) | | Data processing | **Spark** - Distributed data processing from Databricks | [slideshare.net](http://www.slideshare.net/AGrishchenko/apache-spark-architecture) | @@ -1638,7 +1637,7 @@ Handy metrics based on numbers above: | | | | | Data store | **Bigtable** - Distributed column-oriented database from Google | [harvard.edu](http://www.read.seas.harvard.edu/~kohler/class/cs239-w08/chang06bigtable.pdf) | | Data store | **HBase** - Open source implementation of Bigtable | [slideshare.net](http://www.slideshare.net/alexbaranau/intro-to-hbase) | -| Data store | **Cassandra** - Distributed column-oriented database from Facebook | [slideshare.net](http://www.slideshare.net/planetcassandra/cassandra-introduction-features-30103666) +| Data store | **Cassandra** - Distributed column-oriented database from Facebook | [slideshare.net](http://www.slideshare.net/planetcassandra/cassandra-introduction-features-30103666) | | Data store | **DynamoDB** - Document-oriented database from Amazon | [harvard.edu](http://www.read.seas.harvard.edu/~kohler/class/cs239-w08/decandia07dynamo.pdf) | | Data store | **MongoDB** - Document-oriented database | [slideshare.net](http://www.slideshare.net/mdirolf/introduction-to-mongodb) | | Data store | **Spanner** - Globally-distributed database from Google | [research.google.com](http://research.google.com/archive/spanner-osdi2012.pdf) | @@ -1649,12 +1648,12 @@ Handy metrics based on numbers above: | File system | **Hadoop File System (HDFS)** - Open source implementation of GFS | [apache.org](https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html) | | | | | | Misc | **Chubby** - Lock service for loosely-coupled distributed systems from Google | [research.google.com](http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/archive/chubby-osdi06.pdf) | -| Misc | **Dapper** - Distributed systems tracing infrastructure | [research.google.com](http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/36356.pdf) +| Misc | **Dapper** - Distributed systems tracing infrastructure | [research.google.com](http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/36356.pdf) | | Misc | **Kafka** - Pub/sub message queue from LinkedIn | [slideshare.net](http://www.slideshare.net/mumrah/kafka-talk-tri-hug) | | Misc | **Zookeeper** - Centralized infrastructure and services enabling synchronization | [slideshare.net](http://www.slideshare.net/sauravhaloi/introduction-to-apache-zookeeper) | | | Add an architecture | [Contribute](#contributing) | -### Company architectures +### 公司的系统架构 | Company | Reference(s) | | -------------- | ---------------------------------------- | @@ -1681,11 +1680,11 @@ Handy metrics based on numbers above: | WhatsApp | [The WhatsApp architecture Facebook bought for $19 billion](http://highscalability.com/blog/2014/2/26/the-whatsapp-architecture-facebook-bought-for-19-billion.html) | | YouTube | [YouTube scalability](https://www.youtube.com/watch?v=w5WVu624fY8)