links/DMOZ: url updated
parent 7a094cec63
commit 70aba94582
@@ -77,7 +77,7 @@ Handy conversion guide:
 
 ### Use case: Service crawls a list of urls
 
-We'll assume we have an initial list of `links_to_crawl` ranked initially based on overall site popularity. If this is not a reasonable assumption, we can seed the crawler with popular sites that link to outside content such as [Yahoo](https://www.yahoo.com/), [DMOZ](http://www.dmoz.org/), etc.
+We'll assume we have an initial list of `links_to_crawl` ranked initially based on overall site popularity. If this is not a reasonable assumption, we can seed the crawler with popular sites that link to outside content such as [Yahoo](https://www.yahoo.com/), [DMOZ](https://dmoz-odp.org/), etc.
 
 We'll use a table `crawled_links` to store processed links and their page signatures.
 