
Question: Behavior of totalEstimatedMatches in the Microsoft (Bing) Cognitive Search API (v5)

Stack Overflow user

Asked 2017-01-06 20:39:23

Answers: 2 | Views: 707 | Followers: 0 | Votes: 1

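I recently converted some Bing Search API v2 code to v5. It works, but I'm curious about the behavior of "totalEstimatedMatches". Here is an example to illustrate my question:

A user searches for a particular word on our site. The API query returns 10 results (our page size setting) and sets totalEstimatedMatches to 21. So we indicate 3 pages of results and let the user page through them.

When they reach page 3, totalEstimatedMatches returns 22 instead of 21. It seems odd that, with such a small result set, it wouldn't already know the total is 22, but okay, I can live with that. All the results are displayed correctly.

Now, if the user goes back from page 3 to page 2, the value of totalEstimatedMatches is 21 again. This surprises me a little, because once the result set has been paged through, the API probably ought to know that there are 22 results, not 21.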
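To make the arithmetic in this scenario concrete, here is a minimal sketch of the page-count calculation. The helper name `page_count` is hypothetical and not part of the Bing API; it only assumes that the number of pages is the estimated total divided by the page size, rounded up.

```python
import math


def page_count(total_estimated_matches: int, page_size: int) -> int:
    """Number of pages needed to show the estimated total (hypothetical helper)."""
    return math.ceil(total_estimated_matches / page_size)


# Both estimates from the scenario map to the same number of pages.
print(page_count(21, 10))  # 3
print(page_count(22, 10))  # 3
```

With a page size of 10, the shift from 21 to 22 does not change the number of pages shown, so in this particular case only a displayed result count would be affected.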

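I've been a professional software developer since the 80s, so I get that this is one of those details that comes down to API design. Apparently it is not caching the exact number of results, or something like that. I just don't remember that kind of behavior from the V2 search API (which I realize was third-party code). It was pretty reliable about the number of results.

Does anyone besides me find this a bit unexpected?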

bing-api

microsoft-cognitive

2 Answers

Stack Overflow user

Accepted answer

Posted 2017-01-09 22:51:01

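It turns out this is why the response JSON field totalEstimatedMatches includes the word ...Estimated... and isn't simply called totalMatches:

"...the search engine index does not support an accurate estimation of total match."

Taken from: https://stackoverflow.com/questions/39752665/news-search-api-v5-paging-results-with-offset-and-count?rq=1

As one might expect, the fewer results you get back, the larger the % error you're likely to see in the totalEstimatedMatches value. Similarly, the more complex your query is (for example, a compound query such as ../search?q=(foo OR bar OR foobar)&..., which is really 3 searches packed into 1), the more this value seems to vary.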
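As an illustration of the compound query mentioned above, the following sketch builds only the query string with Python's standard library. The endpoint path is left out because the original elides it, and the `count` and `offset` values shown are example assumptions for a first page of 10 results.

```python
from urllib.parse import urlencode

# Three OR-ed terms: effectively three searches packed into one request.
params = {
    "q": "(foo OR bar OR foobar)",
    "count": 10,   # page size (example value)
    "offset": 0,   # index of the first result to return (example value)
}

query_string = urlencode(params)
print(query_string)  # q=%28foo+OR+bar+OR+foobar%29&count=10&offset=0
```

The parentheses and spaces are percent-encoded by `urlencode`, so the compound expression is sent as a single `q` value.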

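That said, I've managed (at least preliminarily) to compensate for this by setting offset == totalEstimatedMatches and creating a simple equivalency check function.

Here's a trivial example in Python: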

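# set_new_offset_and_call_api() is a func that does what it says: it is
# expected to return the totalEstimatedMatches value from the new response.
while True:
    if original_totalEstimatedMatches < new_totalEstimatedMatches:
        # Plain assignment is enough here; ints have no .copy() method.
        original_totalEstimatedMatches = new_totalEstimatedMatches
        new_totalEstimatedMatches = set_new_offset_and_call_api()
    else:
        break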
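The fragment above depends on variables and a helper that are not shown. A self-contained sketch of the same idea might look like the following. Here `settle_total`, `fetch_estimate`, `fake_estimate` and the `max_calls` safety cap are all hypothetical names introduced for illustration, and `fake_estimate` is a stand-in that mimics the 21-then-22 behavior from the question rather than a real API call.

```python
from typing import Callable


def settle_total(fetch_estimate: Callable[[int], int], max_calls: int = 20) -> int:
    """Re-query with offset set to the latest estimate until it stops growing."""
    best = fetch_estimate(0)
    for _ in range(max_calls):  # safety cap so a drifting estimate cannot loop forever
        newer = fetch_estimate(best)
        if newer <= best:
            break
        best = newer
    return best


def fake_estimate(offset: int) -> int:
    # Stand-in for the real API: reports 21 on early pages, 22 once offset >= 20.
    return 21 if offset < 20 else 22


print(settle_total(fake_estimate))  # 22
```

In real use, `fetch_estimate` would wrap the actual request and return the totalEstimatedMatches field from its response. Each probe costs one API call, which is worth weighing against any quota.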

Votes: 1

Stack Overflow user

Posted 2019-03-08 01:32:20

Revisiting the API, I've come up with a way to paginate efficiently without having to use the "totalEstimatedMatches" return value:

class ApiWorker:
    def __init__(self, q):
        self.q = q
        self.offset = 0
        self.result_hashes = set()
        self.finished = False

    def calc_next_offset(self, resp_urls):
        before_adding = len(self.result_hashes)
        # Abuse of set operations: adding already-seen hashes leaves the size unchanged.
        self.result_hashes.update(hash(i) for i in resp_urls)
        after_adding = len(self.result_hashes)
        if after_adding == before_adding:
            # We either got a bunch of duplicates or we're getting very few results back.
            self.finished = True
        else:
            self.offset += len(resp_urls)

    def page_through_results(self, *args, **kwargs):
        while not self.finished:
            new_resp_urls = ...<call_logic>...
            self.calc_next_offset(new_resp_urls)
            ...<save logic>...
        print(f'All unique results for q={self.q} have been obtained.')

This ^ will stop paginating as soon as a full response of duplicates has been obtained.
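Because the class above leaves the call and save logic elided, here is a minimal runnable sketch of the same stopping rule as a plain function. The names `page_until_duplicates`, `fetch_page`, `fake_fetch`, `DATA` and `PAGE_SIZE` are hypothetical. The fake source assumes that requesting an offset past the end returns the last page again, which is a simplification chosen only to exercise the duplicate check.

```python
from typing import Callable, List

PAGE_SIZE = 10
DATA = [f"https://example.com/{i}" for i in range(22)]  # 22 made-up result URLs


def fake_fetch(offset: int) -> List[str]:
    # Stand-in for the real API call: past the end, repeat the final page.
    if offset >= len(DATA):
        return DATA[-PAGE_SIZE:]
    return DATA[offset:offset + PAGE_SIZE]


def page_until_duplicates(fetch_page: Callable[[int], List[str]]) -> List[str]:
    """Collect unique URLs, stopping when a page adds nothing new."""
    seen = set()
    ordered = []
    offset = 0
    while True:
        urls = fetch_page(offset)
        added = 0
        for url in urls:
            if url not in seen:
                seen.add(url)
                ordered.append(url)
                added += 1
        if added == 0:  # all duplicates, or an empty response
            break
        offset += len(urls)
    return ordered


results = page_until_duplicates(fake_fetch)
print(len(results))  # 22
```

The sketch stores the URLs themselves rather than their hashes, which avoids any chance of a hash collision hiding a new result at the cost of a little more memory.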

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/41513978
