Elasticsearch: accessing values from buckets in an aggregation

Stack Overflow user

Asked on 2014-05-13 23:02:36

Answers: 1 · Views: 2.1K · Followers: 0 · Votes: 8

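I want to create a word cloud to visualize the results of an Elasticsearch query. The word cloud should show all terms that occur in the documents matching the query, so I need the term frequency of every term across an arbitrary set of documents. The problem is that I need the actual frequency of each term within the documents, not just the number of documents in which a term occurs (which is easy to get with a terms aggregation or a facet).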

Given the following test index:

curl -XPOST localhost:9200/test -d '{
    "mappings": {
        "testdoc" : {
            "properties" : {
                "text" : {
                    "type" : "string",
                    "term_vector": "yes"
                }
            }
         }
    }
}'

and the following data:

curl -XPOST "http://localhost:9200/_bulk" -d'
{"index":{"_index":"test","_type":"testdoc"}}
{"text":"bike bike car"}
{"index":{"_index":"test","_type":"testdoc"}}
{"text":"car"}
{"index":{"_index":"test","_type":"testdoc"}}
{"text":"car car bus bus"}
{"index":{"_index":"test","_type":"testdoc"}}
{"text":"bike car bus"}
'
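To make the distinction between document count and total term frequency concrete, here is a minimal Python sketch over the four sample documents. It runs outside Elasticsearch and assumes plain whitespace tokenization, which is a simplification of what an analyzer does; the variable names are illustrative only.

```python
from collections import Counter

# The four sample documents from the bulk request above.
docs = ["bike bike car", "car", "car car bus bus", "bike car bus"]

doc_count = Counter()  # number of documents that contain each term
total_tf = Counter()   # number of occurrences of each term over all documents

for text in docs:
    tokens = text.split()          # whitespace tokenization (an assumption)
    doc_count.update(set(tokens))  # count each term once per document
    total_tf.update(tokens)        # count every single occurrence

print(dict(doc_count))
print(dict(total_tf))
```

For this data, "car" appears in 4 documents but 5 times in total, while "bike" and "bus" each appear in 2 documents and 3 times in total. The second set of numbers is what the word cloud needs.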

The following query returns, for each bucket, the summed term frequency of the term 'bike':

curl -XPOST "http://localhost:9200/test/testdoc/_search" -d'
{
    "query": {
        "match_all": {}
    },
    "aggs": {
        "words": {
            "terms": {
                "field": "text"
            },
            "aggs": {
                "tf_sum": {
                     "sum": {
                         "script": "_index[\"text\"][\"bike\"].tf()"
                     }
                }
            }
        }
    }
}'

Result:

{
   "took": 3,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 4,
      "max_score": 0,
      "hits": []
   },
   "aggregations": {
      "words": {
         "buckets": [
            {
               "key": "car",
               "doc_count": 4,
               "tf_sum": {
                  "value": 3
               }
            },
            {
               "key": "bike",
               "doc_count": 2,
               "tf_sum": {
                  "value": 3
               }
            },
            {
               "key": "bus",
               "doc_count": 2,
               "tf_sum": {
                  "value": 1
               }
            }
         ]
      }
   }
}
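The values in this response can be reproduced by hand, which helps show why every bucket reports a number for 'bike' rather than for its own key. The sketch below is plain Python, not Elasticsearch scripting; `tf` and `tf_sum_for_fixed_term` are hypothetical helper names, and whitespace tokenization is assumed.

```python
# The four sample documents indexed above.
docs = ["bike bike car", "car", "car car bus bus", "bike car bus"]


def tf(text, term):
    """Number of occurrences of `term` in one whitespace-tokenized document."""
    return text.split().count(term)


def tf_sum_for_fixed_term(docs, term):
    """For each bucket key, sum tf(term) over the documents in that bucket."""
    keys = sorted({token for text in docs for token in text.split()})
    result = {}
    for key in keys:
        # A bucket holds every document that contains the bucket key.
        bucket_docs = [text for text in docs if key in text.split()]
        # The summed term is always the same one, regardless of the key.
        result[key] = sum(tf(text, term) for text in bucket_docs)
    return result


fixed = tf_sum_for_fixed_term(docs, "bike")
print(fixed)
```

The "car" bucket contains all four documents, so it sums every occurrence of 'bike' (3). The "bus" bucket contains only the last two documents, where 'bike' occurs once.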

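However, instead of computing the aggregation only for 'bike', I would like to compute the tf_sum aggregation for every term returned by the words aggregation. Is there a way to access the key field of the bucket inside the tf_sum aggregation script, so that I can calculate the total term frequency of all terms returned by the words aggregation?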

elasticsearch

aggregation

1 Answer

Stack Overflow user

Answered on 2014-12-24 06:07:43

We can achieve this by using a script in the terms aggregation. The term value can be accessed through the _value variable:

curl -XPOST "http://localhost:9200/test/testdoc/_search" -d'
{
    "query": {
        "match_all": {}
    },
    "aggs": {
        "words": {
            "terms": {
                "field": "text",
                "script" : "_index[\"text\"][_value].tf()"
            }
        }
    }
}'
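The idea behind this answer, replacing the hard-coded term with the term currently being processed, can be illustrated outside Elasticsearch. The following is a plain Python simulation under the assumption of whitespace tokenization; `per_key_tf_sum` is a hypothetical helper, and the sketch makes no claim about how a particular Elasticsearch version evaluates `_value` or buckets the script's output.

```python
# The four sample documents from the question.
docs = ["bike bike car", "car", "car car bus bus", "bike car bus"]


def per_key_tf_sum(docs):
    """Build one bucket per term and sum the frequency of that same term."""
    tokenized = [text.split() for text in docs]
    keys = {token for tokens in tokenized for token in tokens}
    buckets = []
    for key in keys:
        # Documents that contain the bucket's own key.
        bucket_docs = [tokens for tokens in tokenized if key in tokens]
        buckets.append({
            "key": key,
            "doc_count": len(bucket_docs),
            # The summed term is the bucket key itself, not a fixed term.
            "tf_sum": sum(tokens.count(key) for tokens in bucket_docs),
        })
    # Order by descending doc_count, then by key, for a stable result.
    return sorted(buckets, key=lambda b: (-b["doc_count"], b["key"]))


buckets = per_key_tf_sum(docs)
for bucket in buckets:
    print(bucket)
```

Here each bucket's `tf_sum` is the total term frequency of its own key, which is the weight a word cloud would use.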

Votes: 0

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/23634902
