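My local ES version is 1.5.2. When I indexed the data, I did not set index_options on the field (so, by default, the plain highlighter should be in use).
I PUT a document locally that looks roughly like this:
{"id": "20001",
"name": "江苏大学附属医院"}
Query:
curl localhost:9200/**-index/**/_search?pretty -d '{"query":{"bool":{"should":[{"match":{"name":{"query":"江苏人民医院","type":"boolean","boost":12.0}}}]}},"highlight":{"fields":{"name":{}}},"from":0,"size":2}'
For the highlight part of the response, I expected: <em>江苏</em>省<em>人民医院</em>.
The actual result is: <em>江苏省人民医院</em>. The character "省" is highlighted as well, which is not what I expected. Where does the problem come from?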
3 replies
qinpengfei - a goofball who can't even figure out computers
2. You can also put a dedicated highlighting query inside highlight; see whether highlight_query meets your needs (https://www.elastic.co/guide/e ... g.html)
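As a rough illustration of the highlight_query suggestion, the sketch below builds the search body from the question with a highlight_query attached to the name field. The function name build_search_body and its highlight_text parameter are hypothetical helpers, not part of any Elasticsearch client, and whether this changes the highlighting of "省" depends on the analyzer, which the thread does not state.

```python
import json


def build_search_body(query_text, highlight_text=None):
    """Build the search body from the question.

    highlight_text is a hypothetical parameter: when given, it is used
    for a separate highlight_query on the name field.
    """
    match = {"match": {"name": {"query": query_text, "type": "boolean", "boost": 12.0}}}
    name_highlight = {}
    if highlight_text is not None:
        # The highlighter uses this query instead of the main search query.
        name_highlight["highlight_query"] = {
            "match": {"name": {"query": highlight_text}}
        }
    return {
        "query": {"bool": {"should": [match]}},
        "highlight": {"fields": {"name": name_highlight}},
        "from": 0,
        "size": 2,
    }


search_body = build_search_body("江苏人民医院", highlight_text="江苏人民医院")
# The printed JSON can be passed to curl with -d against the _search endpoint.
print(json.dumps(search_body))
```

Keeping the highlight query separate from the search query makes it possible to experiment with highlighting without affecting which documents match.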
flank
zhouliang
When creating the index, set index_options on the name field in the mapping, i.e. "index_options":"offsets".
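A minimal sketch of what such a mapping could look like for ES 1.x follows. The type name "doc" and the helper build_name_mapping are assumptions made for illustration, since the thread elides the real index and type names. Offsets are recorded at index time, so documents indexed before the change would most likely need to be reindexed.

```python
import json


def build_name_mapping(type_name="doc"):
    """Return an index-creation body (ES 1.x style) for a hypothetical type.

    The name field stores offsets in the postings list, which is what the
    postings highlighter relies on.
    """
    return {
        "mappings": {
            type_name: {
                "properties": {
                    "name": {"type": "string", "index_options": "offsets"}
                }
            }
        }
    }


mapping_body = build_name_mapping()
# The printed JSON can be sent with curl -XPUT when creating the index.
print(json.dumps(mapping_body, indent=2))
```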
If index_options is set to offsets in the mapping the postings highlighter will be used instead of the plain highlighter. The postings highlighter:
Is faster since it doesn’t require to reanalyze the text to be highlighted: the larger the documents the better the performance gain should be
Requires less disk space than term_vectors, needed for the fast vector highlighter
Breaks the text into sentences and highlights them. Plays really well with natural languages, not as well with fields containing for instance html markup
Treats the document as the whole corpus, and scores individual sentences as if they were documents in this corpus, using the BM25 algorithm