The official docs describe norms as follows:
if you don’t need scoring on a specific field, you should disable norms on that field. In particular, this is the case for fields that are used solely for filtering or aggregations.
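So when you don't need to score or rank on a field, you can disable norms. In other words, it only makes sense to set the norms property on fields of type "text", right? (norms defaults to true.)
And keyword fields don't actually have a norms property at all, do they? Here is how the official docs describe keyword: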
they are typically used for filtering (Find me all blog posts where status is published), for sorting, and for aggregations. Keyword fields are only searchable by their exact value.
I ran a quick test on ES 6.3.2:
PUT test
{
  "settings": {
    "index": {
      "number_of_shards": 2,
      "number_of_replicas": 0
    }
  },
  "mappings": {
    "_doc": {
      "properties": {
        "title": {"type": "text", "norms": false},
        "overview": {"type": "keyword", "norms": false}
      }
    }
  }
}
GET test/_mapping returns:
{
  "test": {
    "mappings": {
      "_doc": {
        "properties": {
          "overview": {
            "type": "keyword"
          },
          "title": {
            "type": "text",
            "norms": false
          }
        }
      }
    }
  }
}
2 replies
Ombres
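Upvoted by: ridethewind, Esmmmmmmmm
Your understanding is correct.
2. And keyword fields don't actually have a norms property at all, do they?
The keyword type does have a norms property, and it defaults to false, which would explain why GET test/_mapping does not echo it back for overview. The default is set when the field type is initialized; part of the source code is quoted below.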
hapjin
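It says here that norms store various normalization factors. Judging from bm25-the-next-generation-of-lucene-relevation, these should be factors related to the average length of a document, right?
And how do norms actually affect search results? For example, compare norms with term frequency: in general, the larger the tf, the higher the document's score. So how do norms affect the score?
title-search-when-relevancy-is-only-skin-deep/ says that norms tend to give short documents higher scores, but I don't quite understand why. Could someone explain?
After reading the description at https://lucene.472066.n3.nabbl ... .html: the Lucene scoring model tends to give short texts higher scores. Is that precisely because norms are enabled?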
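Not an authoritative answer, but here is a minimal sketch of how I understand it, assuming the textbook BM25 formula with the usual defaults k1 = 1.2 and b = 0.75. The norm stored for a field is essentially an encoding of its length; at query time that length is compared with the average field length and ends up in the denominator next to tf. The function name `bm25_term_score` and all of its parameters are made up for illustration and are not an Elasticsearch or Lucene API. The `norms_enabled=False` branch reflects my assumption that, without norms, scoring behaves as if every document had average length.

```python
def bm25_term_score(tf, doc_len, avg_doc_len, idf=1.0, k1=1.2, b=0.75,
                    norms_enabled=True):
    """Score contribution of one query term in one document (textbook BM25).

    Hypothetical helper for illustration only, not a real ES/Lucene call.
    """
    if norms_enabled:
        # Length normalization: greater than 1 for long fields,
        # less than 1 for fields shorter than average.
        length_factor = 1.0 - b + b * (doc_len / avg_doc_len)
    else:
        # Assumption: with norms disabled, field length is ignored,
        # which is the same as treating doc_len == avg_doc_len.
        length_factor = 1.0
    return idf * tf * (k1 + 1.0) / (tf + k1 * length_factor)


# Same term frequency, different field lengths.
short_doc = bm25_term_score(tf=1, doc_len=5, avg_doc_len=50)
long_doc = bm25_term_score(tf=1, doc_len=500, avg_doc_len=50)
print("short:", short_doc, "long:", long_doc)

# With norms disabled, both lengths get the same score.
short_no_norms = bm25_term_score(tf=1, doc_len=5, avg_doc_len=50,
                                 norms_enabled=False)
long_no_norms = bm25_term_score(tf=1, doc_len=500, avg_doc_len=50,
                                norms_enabled=False)
print("no norms:", short_no_norms, long_no_norms)
```

So tf and the norm pull in opposite directions: a larger tf raises the score, while a longer field inflates the denominator and lowers it. That is why, with norms enabled, a one-word match in a short title beats the same match buried in a long body, and why disabling norms removes that preference.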