Batch queries in Elasticsearch (bulk operations)

2024-12-18 10:29:26 # Problem summary # elasticsearch # ES bulk operations #elasticsearch

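Fetching 100 documents one at a time and then assembling the results takes 100 separate requests, and that overhead adds up quickly.

Batch retrieval with mget

mget is an ES API that retrieves multiple documents in one request, from a single index or from several indices.

First, index some data: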

PUT /ecommerce/product/1
{
  "a": "1",
  "b": "2"
}
PUT /ecommerce/product/2
{
  "a": "3",
  "b": "4"
}

Batch query:

GET /_mget
{
  "docs": [
    {
      "_index": "ecommerce",
      "_type": "product",
      "_id": 1
    },
    {
      "_index": "ecommerce",
      "_type": "product",
      "_id": 2
    }
  ]
}

Result:

{
  "docs" : [
    {
      "_index" : "ecommerce",
      "_type" : "product",
      "_id" : "1",
      "_version" : 1,
      "_seq_no" : 0,
      "_primary_term" : 1,
      "found" : true,
      "_source" : {
        "a" : "1",
        "b" : "2"
      }
    },
    {
      "_index" : "ecommerce",
      "_type" : "product",
      "_id" : "2",
      "_version" : 1,
      "_seq_no" : 0,
      "_primary_term" : 1,
      "found" : true,
      "_source" : {
        "a" : "3",
        "b" : "4"
      }
    }
  ]
}
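
On the client side, a response like this usually has to be turned into a lookup by id. The following is a minimal Python sketch, assuming the response has already been parsed into a dict. The helper name `sources_by_id` is made up for illustration and is not part of any Elasticsearch client, and the third entry (a document that was not found) is added here only to show how misses are skipped.

```python
def sources_by_id(mget_response):
    """Map each found document's _id to its _source.

    Entries with "found": false are skipped.
    """
    result = {}
    for doc in mget_response.get("docs", []):
        if doc.get("found"):
            result[doc["_id"]] = doc.get("_source", {})
    return result


# Trimmed copy of the response above, plus one hypothetical miss (id 3).
response = {
    "docs": [
        {"_index": "ecommerce", "_id": "1", "found": True,
         "_source": {"a": "1", "b": "2"}},
        {"_index": "ecommerce", "_id": "2", "found": True,
         "_source": {"a": "3", "b": "4"}},
        {"_index": "ecommerce", "_id": "3", "found": False},
    ]
}

print(sources_by_id(response))
```

Keying by `_id` alone only works when all documents come from one index; for a multi-index mget the key would need to include `_index` as well.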

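In the example above every document lives in the same index. In that case the index can go in the URL instead of being repeated in each entry: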

GET ecommerce/_mget
{
  "docs":[
      {
        "_type":"product",
        "_id":1
      },
      {
        "_type":"product",
        "_id":2
      }
  ]
}

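If every entry in docs shares the same index and the same type, you can pass just the ids:

GET /ecommerce/product/_mget
{
  "ids": [1, 2, 3, 4]
}

Ids that do not exist (3 and 4 here, since only 1 and 2 were indexed) come back with "found": false.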

Of course, documents can also be fetched from different indices in one request:

GET /_mget
{
  "docs": [
    {
      "_index": "test-index-1",
      "_id": "1",
      "_source": ["field1", "field2"]
    },
    {
      "_index": "test-index-2",
      "_id": "2",
      "_source": "field3"
    }
  ]
}
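
When the list of documents comes from application code, the request body is usually generated rather than written by hand. Here is a small Python sketch of that step, using only the standard library. The function name `build_mget_body` is hypothetical, and the index names are the ones from the example above.

```python
import json


def build_mget_body(targets):
    """Build an _mget request body from (index, id) pairs.

    Ids are converted to strings, matching how they appear in responses.
    """
    return {
        "docs": [
            {"_index": index, "_id": str(doc_id)}
            for index, doc_id in targets
        ]
    }


body = build_mget_body([("test-index-1", 1), ("test-index-2", 2)])
print(json.dumps(body, indent=2))
```

Note that "docs" is always a JSON array, even when it holds a single entry.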

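Batch create, update, and delete with bulk

The bulk API is strict about format. Every operation except delete takes two JSON objects: an action line followed by a data line. Each JSON object must sit on a single line with no line breaks inside it, and separate JSON objects must be separated by a newline.

Basic format: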

POST /<index>/_bulk
{"action": {"metadata"}}
{"data"}

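Create

POST /_bulk
{"create":{"_index":"test","_id":2}}
{"field1":0,"field2":"value"}

This creates a document with id 2 in the test index, with two fields, field1 and field2. Unlike index, create fails if a document with that id already exists.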

Delete

POST /_bulk
{"delete":{"_index":"test","_id":2}}
{"delete":{"_index":"test","_id":3}}

Update

POST /_bulk
{ "update" : { "_index" : "test", "_id" : "1" } }
{ "doc" : { "field1" : "new_value1", "field2" : "new_value2" }}
{ "update" : { "_index" : "test", "_id" : "2" } }
{ "doc" : { "field1" : "new_value3", "field2" : "new_value4" }}
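
Because the one-object-per-line rule is easy to break by hand, bulk bodies are often generated. The sketch below shows one way to do that in Python with only the standard library. The function name `to_bulk_ndjson` and the tuple layout are assumptions made for this example, not part of any Elasticsearch client.

```python
import json


def to_bulk_ndjson(operations):
    """Serialize (action, metadata, source) tuples into a _bulk body.

    source is ignored for delete, which has no second line.
    """
    lines = []
    for action, metadata, source in operations:
        # json.dumps without indent never emits a raw newline,
        # so each object stays on a single line.
        lines.append(json.dumps({action: metadata}))
        if action != "delete":
            if source is None:
                raise ValueError(f"{action} needs a data line")
            lines.append(json.dumps(source))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


payload = to_bulk_ndjson([
    ("create", {"_index": "test", "_id": "2"}, {"field1": 0, "field2": "value"}),
    ("delete", {"_index": "test", "_id": "3"}, None),
    ("update", {"_index": "test", "_id": "1"}, {"doc": {"field1": "new_value1"}}),
])
print(payload, end="")
```

The resulting string is what would be sent as the body of POST /_bulk, with the Content-Type header set to application/x-ndjson.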

filter_path

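filter_path is a request parameter in ES that filters what the response contains, which cuts down the amount of data ES sends back.

Usage:

filter_path=took: returns only the time the request took, in milliseconds.
filter_path=items.*._id,items.*._index: returns only the _id and _index fields of each bulk item.
filter_path=items.*.error: returns only the error field of each item that has one.
filter_path=hits.hits._source: returns only the source documents of the search hits.
filter_path=_shards,hits.total: returns the shard information and the total hit count.
filter_path=aggregations.*.value: returns only the value of each aggregation.
Note that multiple fields in filter_path must be separated by commas, with no spaces.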
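
Since filter_path is an ordinary query parameter, it can be appended to any request path. A minimal Python sketch of building such a URL follows. The helper name `with_filter_path` is hypothetical; `urlencode` and its `safe` argument come from the standard library.

```python
from urllib.parse import urlencode


def with_filter_path(path, *filters):
    """Append a filter_path query parameter to a request path."""
    # Join the filters with commas and no spaces. "," and "*" are kept
    # unescaped so the URL matches the hand-written form.
    query = urlencode({"filter_path": ",".join(filters)}, safe=",*")
    return f"{path}?{query}"


print(with_filter_path("/_bulk", "items.*.error"))
print(with_filter_path("/my_index/_search", "_shards", "hits.total"))
```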

Examples

The following Elasticsearch REST API examples show each of these filter_path values in use:

filter_path=took
Returns only the time the request took.

Request:

GET /my_index/_search?filter_path=took

Response:

{
  "took": 15
}

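filter_path=items.*._id,items.*._index
Returns only the _id and _index fields of each item. Each bulk item is wrapped in its action name (index, create, update, or delete), so the path needs a * wildcard for that level.

Request:

POST /_bulk?filter_path=items.*._id,items.*._index
{ "index": { "_index": "test", "_id": "1" } }
{ "field1": "value1" }
{ "index": { "_index": "test", "_id": "2" } }
{ "field1": "value2" }

Response:

{
  "items": [
    { "index": { "_index": "test", "_id": "1" } },
    { "index": { "_index": "test", "_id": "2" } }
  ]
}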

filter_path=items.*.error
Returns only the error field of each item that has one.

Request:

POST /_bulk?filter_path=items.*.error
{ "update": { "_id": "1", "_index": "nonexistent_index" } }
{ "doc": { "field1": "value1" } }

Response:

{
  "items": [
    {
      "update": {
        "error": {
          "type": "index_not_found_exception",
          "reason": "no such index [nonexistent_index]"
        }
      }
    }
  ]
}
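
A bulk request can succeed as a whole while individual items fail, so the client normally walks the items and collects the errors. The following Python sketch does that over a parsed response. The helper name `failed_items` is made up for this example, and the sample dict mirrors the response above.

```python
def failed_items(bulk_response):
    """Return (action, error) pairs for every item that carries an error."""
    failures = []
    for item in bulk_response.get("items", []):
        # Each item is a single-key dict keyed by the action name.
        for action, result in item.items():
            if "error" in result:
                failures.append((action, result["error"]))
    return failures


response = {
    "items": [
        {
            "update": {
                "error": {
                    "type": "index_not_found_exception",
                    "reason": "no such index [nonexistent_index]",
                }
            }
        }
    ]
}

for action, error in failed_items(response):
    print(action, error["type"], error["reason"])
```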

filter_path=hits.hits._source
Returns only the source documents of the search hits.

Request:

GET /my_index/_search?filter_path=hits.hits._source
{
  "query": { "match_all": {} }
}

Response:

{
  "hits": {
    "hits": [
      { "_source": { "field1": "value1" } },
      { "_source": { "field2": "value2" } }
    ]
  }
}

filter_path=_shards,hits.total
Returns the shard information and the total hit count.

Request:

GET /my_index/_search?filter_path=_shards,hits.total
{
  "query": { "match": { "field1": "value" } }
}

Response:

{
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 10,
      "relation": "eq"
    }
  }
}

filter_path=aggregations.*.value
Returns only the value of each aggregation.

Request:

GET /my_index/_search?filter_path=aggregations.*.value
{
  "aggs": {
    "avg_price": { "avg": { "field": "price" } },
    "max_price": { "max": { "field": "price" } }
  },
  "size": 0
}

Response:

{
  "aggregations": {
    "avg_price": { "value": 100.5 },
    "max_price": { "value": 200 }
  }
}
