Why search can't return more than a thousand pages of results

Source: Internet · Editor: 程序博客网 · Time: 2024/05/15 17:45

Deep Paging in Distributed Systems
To understand why deep paging is problematic, let’s imagine that we are searching within a single index with five primary shards. When we request the first page of results (results 1 to 10), each shard produces its own top 10 results and returns them to the coordinating node, which then sorts all 50 results in order to select the overall top 10.

Now imagine that we ask for page 1,001, that is, results 10,001 to 10,010. Everything works in the same way except that each shard has to produce its top 10,010 results. The coordinating node then sorts through all 50,050 results and discards 50,040 of them!
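The scatter-gather step described above can be sketched in a few lines of Python. This is only an illustrative simulation, not Elasticsearch code: the function names `shard_top_k` and `fetch`, the `offset`/`size` parameters, and the random relevance scores are all assumptions made for the example.

```python
import heapq
import random


def shard_top_k(scores, k):
    """Return the k highest scores held by one shard, best first."""
    return heapq.nlargest(k, scores)


def fetch(shards, offset, size=10):
    """Simulate a coordinating node returning results offset+1 .. offset+size.

    Returns the requested hits and the number of candidates that had to be
    collected and sorted to find them.
    """
    k = offset + size  # every shard must return its own top (offset + size)
    candidates = []
    for scores in shards:
        candidates.extend(shard_top_k(scores, k))
    candidates.sort(reverse=True)  # global ordering across all shards
    return candidates[offset:offset + size], len(candidates)


if __name__ == "__main__":
    random.seed(0)
    # Five hypothetical shards, 20,000 scored documents each.
    shards = [[random.random() for _ in range(20_000)] for _ in range(5)]

    _, sorted_first = fetch(shards, offset=0)
    _, sorted_deep = fetch(shards, offset=10_000)
    print(sorted_first)  # 50
    print(sorted_deep)   # 50050
```

Both calls return the same number of hits (10), but the deep request forces the coordinating node to handle roughly a thousand times as many candidates.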

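You can see that, in a distributed system, the cost of sorting results keeps growing the deeper we page: each shard must return every result up to the end of the requested page, so the number of results to sort is proportional to the page depth multiplied by the number of shards. There is a good reason that web search engines don’t return more than 1,000 results for any query.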

Baidu shows only 76 pages of results.
Sogou provides 100 pages.

The number of results that must be sorted is roughly:
page number × results per page × number of shards
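As a worked check of this estimate against the numbers in the quoted passage, the symbols below are introduced here for illustration only: p is the page number, n the results per page, and S the number of shards.

```latex
\[
  N_{\text{sorted}} \approx p \cdot n \cdot S,
  \qquad
  N_{\text{discarded}} = p \cdot n \cdot S - n
\]
% Deep request from the passage: p \cdot n = 10{,}010 results per shard, S = 5 shards, n = 10
\[
  N_{\text{sorted}} = 10{,}010 \times 5 = 50{,}050,
  \qquad
  N_{\text{discarded}} = 50{,}050 - 10 = 50{,}040
\]
```

For the first page (p = 1) the same formula gives 1 × 10 × 5 = 50 results, matching the 50 results sorted in the first example.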
