When using doc values on a not_analyzed string field, you may still see some field data usage from global ordinals. Global ordinals are a data structure that assigns a number (an ordinal) to each term in the index for that field, so that calculations can work with compact numbers instead of holding multiple copies of the field's string values in memory. Global ordinals cannot be included in doc values because they must be computed at query time by running over all the terms currently in the field and assigning each a unique number. This explains why you still see a small amount of field data usage even when you are using doc values for a not_analyzed string field.
1 reply
rochy - rochy_he
Upvoted by: kennywu76
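Global ordinals (global_ordinals) are a data structure that assigns a number (an ordinal) to each term in the index for the field, in order to save memory.
Global ordinals cannot be included in doc values, because they must be computed at query time by running over all the terms currently in the field and assigning each one a unique number.
As a result, you will still see a small amount of field data usage even when using doc values on a not_analyzed string field.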
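To make the idea concrete, here is a minimal Python sketch of how global ordinals could be built and used. It is a simplified model, not Elasticsearch's or Lucene's actual implementation: it assumes each segment already stores its own sorted list of unique terms (so a term's position is its per-segment ordinal), and the function names `build_global_ordinals` and `count_terms` are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def build_global_ordinals(
    segments: List[List[str]],
) -> Tuple[List[str], List[List[int]]]:
    """Build a global ordinal table from per-segment term lists.

    Assumption: each segment is a sorted list of its unique terms, so the
    index of a term within its segment is that term's segment ordinal.
    """
    # Run over all terms currently in the field and give each a unique number.
    global_terms = sorted({term for seg in segments for term in seg})
    global_ord = {term: i for i, term in enumerate(global_terms)}
    # For every segment, map segment ordinal -> global ordinal.
    seg_to_global = [[global_ord[term] for term in seg] for seg in segments]
    return global_terms, seg_to_global


def count_terms(
    segment_docs: List[List[int]],
    seg_to_global: List[List[int]],
    global_terms: List[str],
) -> Dict[str, int]:
    """Count documents per term using only integers during the calculation.

    segment_docs[i] holds one segment ordinal per document in segment i.
    """
    counts: Counter = Counter()
    for seg_idx, doc_ords in enumerate(segment_docs):
        for seg_ord in doc_ords:
            counts[seg_to_global[seg_idx][seg_ord]] += 1
    # Strings are looked up only once per distinct term, at the very end.
    return {global_terms[g]: n for g, n in counts.items()}


if __name__ == "__main__":
    segments = [["apple", "cherry"], ["banana", "cherry", "date"]]
    terms, mapping = build_global_ordinals(segments)
    docs = [[0, 1, 1], [0, 2, 1]]  # segment ordinals, one per document
    print(terms, mapping)
    print(count_terms(docs, mapping, terms))
```

In this sketch the mapping has to be rebuilt whenever the set of segments changes, which is one way to see why such a structure is computed at query time and held in memory rather than written once alongside the doc values.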