mirror of
https://github.com/donnemartin/system-design-primer.git
synced 2025-12-16 01:48:56 +03:00
Merge 70aba94582 into b02784ffeb
@@ -77,7 +77,7 @@ Handy conversion guide:
 
 ### Use case: Service crawls a list of urls
 
-We'll assume we have an initial list of `links_to_crawl` ranked initially based on overall site popularity. If this is not a reasonable assumption, we can seed the crawler with popular sites that link to outside content such as [Yahoo](https://www.yahoo.com/), [DMOZ](http://www.dmoz.org/), etc.
+We'll assume we have an initial list of `links_to_crawl` ranked initially based on overall site popularity. If this is not a reasonable assumption, we can seed the crawler with popular sites that link to outside content such as [Yahoo](https://www.yahoo.com/), [DMOZ](https://dmoz-odp.org/), etc.
 
 We'll use a table `crawled_links` to store processed links and their page signatures.
 