Has anyone tried using an ingest pipeline with reindex?

Elasticsearch | Author: God_lockin | Posted 2019-03-08 | Views: 3648

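We need to add some fields to our data, but these fields are missing from the original data structure, so I plan to fill in default values with a pipeline. The index is created like this: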

{
  "mappings": {
    "_doc": {
      "properties": {
        "sentiment": {
          "type": "keyword"
        },
        "ingest_timestamp": {
          "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis",
          "type": "date"
        },
        ……
      }
    }
  },
  "settings": {
    "number_of_replicas": 0,
    "number_of_shards": 9,
    "default_pipeline": "defaultSentimentConditionPipeline"
  }
}
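Since everything here depends on the default pipeline being picked up, it may help to first confirm that the setting actually landed on the destination index. A minimal check, assuming the destination index is called new_index (a hypothetical name, the post does not give one):

```console
# "new_index" is a hypothetical name for the destination index.
GET new_index/_settings/index.default_pipeline
```

If the response does not list index.default_pipeline for that index, the pipeline was never attached and reindex has nothing to run.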

Then a group of pipelines sets the values:

{
  "defaultSentimentConditionPipeline": {
    "description": "set default sentiment condition pipeline",
    "processors": [
      {
        "pipeline": {
          "if": "ctx.sentiment == '' || ctx.sentiment == null",
          "name": "defaultSentimentPipeline"
        }
      },
      {
        "pipeline": {
          "if": "ctx.ingest_timestamp == '' || ctx.ingest_timestamp == null",
          "name": "timestamp_pipeline"
        }
      }
    ]
  },
  "defaultSentimentPipeline": {
    "description": "set default sentiment",
    "processors": [
      {
        "set": {
          "field": "sentiment",
          "value": "sentiment"
        }
      }
    ]
  },
  "timestamp_pipeline": {
    "description": "Add insert timestamp",
    "processors": [
      {
        "set": {
          "field": "ingest_timestamp",
          "value": "{{_ingest.timestamp}}"
        }
      }
    ]
  }
}
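The pipelines can be checked on their own, independently of reindex, with the simulate API. A minimal sketch follows; the two test documents, their _index and _id values, and the title field are made up for illustration, and only the pipeline and field names come from the definitions above:

```console
# Hypothetical test documents: the first lacks both fields,
# the second has an empty sentiment and an existing timestamp.
POST _ingest/pipeline/defaultSentimentConditionPipeline/_simulate
{
  "docs": [
    {
      "_index": "test_index",
      "_id": "1",
      "_source": { "title": "no sentiment, no timestamp" }
    },
    {
      "_index": "test_index",
      "_id": "2",
      "_source": { "sentiment": "", "ingest_timestamp": "2019-03-08 12:00:00" }
    }
  ]
}
```

If the simulated output shows both fields filled in, the pipelines themselves are fine and the problem is in how they get invoked. It may also be worth comparing the rendered ingest_timestamp value against the date format in the mapping, since I believe {{_ingest.timestamp}} renders in ISO 8601 form rather than yyyy-MM-dd HH:mm:ss.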

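However, the data that comes in through reindex does not seem to carry the timestamp or the default value. In theory, reindex is just a scroll plus a bulk insert inside ES, so does that flow not go through the pipeline?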

1 reply

rochy - rochy_he

Upvoted by: God_lockin

You can do this by adding a script to the reindex request:

POST _reindex
{
  "source": {
    "index": "twitter"
  },
  "dest": {
    "index": "new_twitter",
    "version_type": "external"
  },
  "script": {
    "source": "if (ctx._source.foo == 'bar') {ctx._version++; ctx._source.remove('foo')}",
    "lang": "painless"
  }
}
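The example above is generic. Adapted to the fields in the question, a script-based version might look roughly like the sketch below. The index names old_index and new_index are hypothetical, and params.now is a timestamp the caller supplies in a format the mapping accepts; the default value "sentiment" is the one the question's set processor uses:

```console
# Sketch only: "old_index" and "new_index" are hypothetical names.
# params.now is supplied by the caller, it is not computed by ES here.
POST _reindex
{
  "source": { "index": "old_index" },
  "dest": { "index": "new_index" },
  "script": {
    "lang": "painless",
    "source": "if (ctx._source.sentiment == null || ctx._source.sentiment == '') { ctx._source.sentiment = params.default_sentiment; } if (ctx._source.ingest_timestamp == null || ctx._source.ingest_timestamp == '') { ctx._source.ingest_timestamp = params.now; }",
    "params": {
      "default_sentiment": "sentiment",
      "now": "2019-03-08 12:00:00"
    }
  }
}
```

Passing the values through params keeps the script text constant between runs, and every document in one run gets the same timestamp rather than a per-document one.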

Or you can get the same effect by specifying the pipeline name in the request:

POST _reindex
{
  "source": {
    "index": "source"
  },
  "dest": {
    "index": "dest",
    "pipeline": "some_ingest_pipeline"
  }
}
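For the setup in the question, that would presumably mean naming the top-level pipeline explicitly in dest instead of relying on default_pipeline. A minimal sketch, again assuming the hypothetical index names old_index and new_index, followed by a query that counts documents still missing the timestamp:

```console
# "old_index" and "new_index" are hypothetical index names; the pipeline
# name is the one defined in the question.
POST _reindex
{
  "source": { "index": "old_index" },
  "dest": {
    "index": "new_index",
    "pipeline": "defaultSentimentConditionPipeline"
  }
}

# Count documents in the destination that still lack ingest_timestamp.
GET new_index/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must_not": { "exists": { "field": "ingest_timestamp" } }
    }
  }
}
```

If the pipeline ran for every document, the total hit count of the second request should be zero.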
