Full-text search got noticeably slower after I added synonyms. What is going on?

Elasticsearch | Author: Yu Tao | Posted on January 21, 2019 | Views: 3184

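I ran a test on my cluster: after adding a synonym dictionary, full-text search queries became noticeably slower.
The test data set is only about 450,000 documents, a little over 5 GB. Production has more than 150 GB of data.
My mapping is defined like this: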

{
  "settings": {
    "number_of_replicas": 0,
    "number_of_shards": 3,
    "index": {
      "analysis": {
        "analyzer": {
          "ik_mword_analyzer": {
            "type": "custom",
            "tokenizer": "ik_max_word",
            "filter": [
              "my_synonym_filter"
            ]
          },
          "ik_smart_analyzer": {
            "type": "custom",
            "tokenizer": "ik_smart",
            "filter": [
              "my_synonym_filter"
            ]
          }
        },
        "filter": {
          "my_synonym_filter": {
            "type": "synonym",
            "synonyms_path": "analysis-ik/synonyms.txt"
          }
        }
      }
    }
  },
  "mappings": {
    "news": {
      "properties": {
        "title": {
          "type": "text",
          "similarity": "BM25",
          "index_options": "offsets",
          "store": true,
          "copy_to": "fullcontent",
          "analyzer": "ik_max_word"
        },
        "author_level": {
          "type": "text",
          "similarity": "BM25",
          "store": true,
          "fielddata": true,
          "copy_to": "fullcontent",
          "analyzer": "ik_smart"
        },
        "article_content": {
          "type": "text",
          "store": true,
          "fielddata": true,
          "copy_to": "fullcontent",
          "analyzer": "ik_max_word"
        },
        "fullcontent": {
          "type": "text",
          "store": true,
          "fielddata": true,
          "copy_to": "fullcontent",
          "index_options": "offsets",
          "analyzer": "ik_max_word"
        }
      }
    }
  }
}

The synonym dictionary is:
西红柿,番茄
土豆,马铃薯,薯片
蜂蜜,蜜蜂
and further entries in the same format.

Then I ran the simplest possible query:

{
  "query": {
    "match_phrase": {
      "fullcontent": "西红柿"
    }
  }
}

I noticed a problem:
on index_A, which has synonyms, one query takes 6 s;

on index_B, which has no synonyms, one query takes 0.8 s.

How can I optimize this?

3 replies

rochy - rochy_he

It is best to look at the analysis (tokenization) output first and track down the cause from there.
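One way to inspect the analysis output is Elasticsearch's `_analyze` API. The Python sketch below only builds the request and summarizes a response; it assumes the index is `index_A` and that the synonym-enabled analyzer is `ik_mword_analyzer` from the settings in the question. The helper names are made up for illustration, and actually sending the request is left to whatever HTTP client you use.

```python
import json


def build_analyze_request(index, analyzer, text):
    """Return (method, path, body) for an Elasticsearch _analyze call."""
    path = "/{}/_analyze".format(index)
    body = {"analyzer": analyzer, "text": text}
    return "GET", path, body


def tokens_by_position(analyze_response):
    """Group the tokens of an _analyze response by their position.

    More than one token at the same position usually means a synonym
    was injected there, which a phrase query then has to consider.
    """
    grouped = {}
    for tok in analyze_response.get("tokens", []):
        grouped.setdefault(tok["position"], []).append(tok["token"])
    return grouped


if __name__ == "__main__":
    # Index and analyzer names are assumptions taken from the question.
    method, path, body = build_analyze_request(
        "index_A", "ik_mword_analyzer", "西红柿"
    )
    print(method, path)
    print(json.dumps(body, ensure_ascii=False))
```

Running the same text through the analyzer with and without the synonym filter, and comparing the grouped tokens, should show how many extra terms the phrase query has to handle.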

JackGe

Each shard takes 2 s+ to answer the query. Look at the query profile to see where that time is being spent.
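A minimal Python sketch of that profiling step: it adds `"profile": true` to the search body from the question and ranks the profiled query nodes by time. It assumes the response follows the usual `profile.shards[].searches[].query[]` layout with a `time_in_nanos` field on each node; older versions may report time differently, so treat the field name as an assumption. The function names are hypothetical.

```python
def build_profiled_query(field, phrase):
    """Search body for a match_phrase query with profiling enabled."""
    return {"profile": True, "query": {"match_phrase": {field: phrase}}}


def iter_query_nodes(nodes):
    """Yield every node of a profile query tree, parents before children."""
    for node in nodes:
        yield node
        yield from iter_query_nodes(node.get("children", []))


def slowest_components(profile_response, top_n=5):
    """Return the top_n (nanos, shard_id, type, description) tuples."""
    rows = []
    shards = profile_response.get("profile", {}).get("shards", [])
    for shard in shards:
        for search in shard.get("searches", []):
            for node in iter_query_nodes(search.get("query", [])):
                rows.append((
                    node.get("time_in_nanos", 0),  # assumed field name
                    shard.get("id"),
                    node.get("type"),
                    node.get("description"),
                ))
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:top_n]
```

Send `build_profiled_query("fullcontent", "西红柿")` to the `_search` endpoint of both index_A and index_B and pass each JSON response to `slowest_components` to compare which query types dominate.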

liubin

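1. The index gets larger. Since you already use a copy_to field, the other fields do not take part in search, so there is no need to analyze them at all. 2. You use ik_max_word, so the result depends on your dictionary: check what 西红柿 and 番茄 are actually tokenized into. Also, match_phrase (phrase search) is expensive.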
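The first point could be sketched in Python as follows: derive a leaner `properties` block in which only the combined field is analyzed and the source fields are marked `"index": false`. This assumes the source fields are never queried, sorted, or aggregated on directly, and that `copy_to` still forwards values from non-indexed fields on your Elasticsearch version; both are worth verifying before reindexing. `lean_properties` is a hypothetical helper, not part of any library.

```python
import copy

# Settings that only make sense for an indexed, analyzed text field.
ANALYSIS_KEYS = ("analyzer", "similarity", "index_options", "fielddata")


def lean_properties(properties, search_field):
    """Return a copy of `properties` where only `search_field` is indexed."""
    lean = {}
    for name, spec in properties.items():
        spec = copy.deepcopy(spec)
        if name != search_field:
            for key in ANALYSIS_KEYS:
                spec.pop(key, None)
            spec["index"] = False  # assumption: field is never searched
        lean[name] = spec
    return lean


if __name__ == "__main__":
    original = {
        "title": {
            "type": "text",
            "store": True,
            "copy_to": "fullcontent",
            "analyzer": "ik_max_word",
        },
        "fullcontent": {
            "type": "text",
            "store": True,
            "index_options": "offsets",
            "analyzer": "ik_max_word",
        },
    }
    print(lean_properties(original, "fullcontent"))
```

Note that this drops `fielddata` from the source fields, so it only fits if those fields are not needed for sorting or aggregations.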
