
How can force_merge keep segments balanced?

Elasticsearch | Author: caster_QL | Posted 2021-10-04 | Views: 1897

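The index receives a constant stream of writes and deletes, but the cluster is under heavy load, so I have to run force merge manually to relieve the pressure.
After one force merge, any further writes or deletes to the index leave the segment sizes extremely uneven, as shown in the figure, and running force merge again does not fix it.
Do the merge settings below apply only to the merges Elasticsearch runs on its own, with force merge not bound by them? Is there a way to use the max_merged_segment setting to keep large segments out of a force merge?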
 
"merge" : {
  "scheduler" : {
    "max_thread_count" : "4",
    "auto_throttle" : "true",
    "max_merge_count" : "9"
  },
  "policy" : {
    "reclaim_deletes_weight" : "2.0",
    "floor_segment" : "2mb",
    "max_merge_at_once_explicit" : "30",
    "max_merge_at_once" : "10",
    "max_merged_segment" : "5gb",
    "expunge_deletes_allowed" : "10.0",
    "segments_per_tier" : "10.0",
    "deletes_pct_allowed" : "33.0"
  }
},

 
 
1.jpg
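To put a number on the imbalance instead of reading it off a screenshot, the per-segment sizes can be pulled from the `_cat/segments` API (for example `GET _cat/segments/my-index?format=json&bytes=b`) and summarized. The following is only a sketch: `my-index` and the sample rows are made up, the helper name `summarize_segments` is hypothetical, and it assumes each row carries `shard`, `prirep`, and `size` fields with sizes in bytes.

```python
from collections import defaultdict
from statistics import median


def summarize_segments(rows):
    """Group _cat/segments rows by shard copy and report size skew.

    `rows` is assumed to be the parsed JSON returned by
    GET _cat/segments/<index>?format=json&bytes=b
    """
    sizes = defaultdict(list)
    for row in rows:
        key = (row["shard"], row["prirep"])
        sizes[key].append(int(row["size"]))

    summary = {}
    for key, values in sizes.items():
        largest = max(values)
        mid = median(values)
        summary[key] = {
            "segments": len(values),
            "largest_bytes": largest,
            "median_bytes": mid,
            # Ratio of the largest segment to the median one;
            # a large value indicates uneven segment sizes.
            "skew": largest / mid if mid else float("inf"),
        }
    return summary


# Made-up sample rows, for illustration only.
sample = [
    {"shard": "0", "prirep": "p", "segment": "_a", "size": "32000000000"},
    {"shard": "0", "prirep": "p", "segment": "_b", "size": "40000000"},
    {"shard": "0", "prirep": "p", "segment": "_c", "size": "10000000"},
]

for shard_copy, stats in summarize_segments(sample).items():
    print(shard_copy, stats)
```

Comparing the `skew` value before and after a force merge gives a quick check on whether the merge actually evened things out.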

pzw9696

Upvoted by: caster_QL FFFrp

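Which sets of segments get merged is decided by the MergePolicy, and the settings above belong to TieredMergePolicy. I checked which MergePolicy force_merge uses:
it is ElasticsearchMergePolicy (a wrapper that is still ultimately backed by TieredMergePolicy), so the settings above do still apply to force_merge. As I recall, TieredMergePolicy
has maxMergedSegmentBytes, which defaults to (5GB/2), and deletesPctAllowed, which defaults to 33%. These two parameters restrict the merging of large segments, and they should correspond to max_merged_segment and deletes_pct_allowed here. You can look into it yourself.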
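As a rough illustration of how these two settings could interact, here is a simplified model in Python. It is not Lucene's code: the function name `is_merge_candidate` is hypothetical, and the rule it encodes (a segment larger than half of `max_merged_segment` is left alone unless its share of deleted documents exceeds `deletes_pct_allowed`) is an assumption based on the description above, so it should be checked against the TieredMergePolicy source for the Lucene version in use.

```python
GB = 1024 ** 3


def is_merge_candidate(size_bytes, deleted_pct,
                       max_merged_segment_bytes=5 * GB,
                       deletes_pct_allowed=33.0):
    """Simplified, assumed model of the large-segment rule (not Lucene code)."""
    too_large = size_bytes > max_merged_segment_bytes / 2
    if not too_large:
        # Segments under the size threshold are always eligible here.
        return True
    # In this model a large segment only becomes eligible again
    # once its deleted-document percentage passes the allowed limit.
    return deleted_pct > deletes_pct_allowed


print(is_merge_candidate(1 * GB, 5.0))    # small segment
print(is_merge_candidate(30 * GB, 5.0))   # large segment, few deletes
print(is_merge_candidate(30 * GB, 50.0))  # large segment, many deletes
```

Under this model, a segment produced by an earlier force merge that is far above the size limit stays out of later merges until enough of its documents have been deleted.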
 

Charele - Cisco4321

Upvoted by: caster_QL

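You are right:
max_merged_segment only takes effect for the merges the system runs on its own; it has no effect on a manual force merge.

I don't know what kind of pressure your system is under that makes you "run force merge manually to relieve the pressure".
And why do you also need "balanced segments"?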

flyingpot

Upvoted by:

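Running force_merge on an index that is still receiving writes is not recommended. The reason is given in the official Elasticsearch force merge documentation: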
 


Force merge should only be called against an index after you have finished writing to it. Force merge can cause very large (>5GB) segments to be produced, and if you continue to write to such an index then the automatic merge policy will never consider these segments for future merges until they mostly consist of deleted documents. This can cause very large segments to remain in the index which can result in increased disk usage and worse search performance.
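In line with that guidance, one possible sequence is to block writes to the index first and only then force merge it. The sketch below merely builds the REST calls and sends nothing; the index name `logs-2021.10` and the helper `force_merge_plan` are hypothetical, while `index.blocks.write`, `max_num_segments`, and `only_expunge_deletes` are existing Elasticsearch settings and request parameters. The two request parameters are treated as mutually exclusive here.

```python
from urllib.parse import urlencode


def force_merge_plan(index, max_num_segments=None, only_expunge_deletes=False):
    """Return (method, path, body) tuples; nothing is sent to a cluster."""
    params = {}
    if only_expunge_deletes:
        # Restrict the merge to segments that contain deleted documents.
        params["only_expunge_deletes"] = "true"
    elif max_num_segments is not None:
        params["max_num_segments"] = str(max_num_segments)

    merge_path = f"/{index}/_forcemerge"
    if params:
        merge_path += "?" + urlencode(params)

    return [
        # Block writes so no new small segments appear after the merge.
        ("PUT", f"/{index}/_settings", {"index.blocks.write": True}),
        ("POST", merge_path, None),
    ]


for step in force_merge_plan("logs-2021.10", max_num_segments=1):
    print(step)
```

For an index that has to keep accepting writes, the write block is not an option, which is exactly the situation the quoted documentation warns about.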



 
