Storage path


With multiple paths configured in path.data, how does ES store and retrieve data?

Elasticsearch • rojay replied • 7 followers • 4 replies • 20792 views • 2018-08-29 09:14 • from related topics

Storing Elasticsearch indices on HDFS

Elasticsearch • ybtsdst replied • 5 followers • 4 replies • 11492 views • 2017-05-27 15:06 • from related topics

rojay answered the question • 2018-08-29 09:14 • 4 replies

With multiple paths configured in path.data, how does ES store and retrieve data?

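I ran into the first question recently as well. Nothing I could find online gave a satisfactory answer, so I had to bite the bullet and read the source code. I eventually worked out how it behaves, so here it is for everyone.

How ES allocates shards across multiple disks
Suppose a single-node setup has two disks, and elasticsearch.yml sets path.data: /index/data,/data2/index/data
That is two paths, one on each disk. If I now create the index hrecord1 with 2 primary shards, they are placed as follows:
shard1 is created first (my guess is that ES creates the higher-numbered shard first, but the order makes little difference). ES compares the usable space of the disks behind the two paths and puts shard1 under the path with more usable space.
When shard0 is created, ES adds up the usable space on the /index and /data2 disks and takes 5% of that sum. Per the code below, the estimate actually used is the larger of this 5% figure and the average shard size (avgShardSizeInBytes).
That estimate is subtracted from the usable space of the disk that already holds shard1. The result is then compared with the usable space on the other disk (/data2 if shard1 landed on /index), and shard0 goes to whichever value is larger.
Put simply, the estimate is space that ES reserves for the shard1 it has already created.
If anything here is wrong, please point it out. Happy to discuss!
The main code is in ShardPath.java: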
[code]public static ShardPath selectNewPathForShard(NodeEnvironment env, ShardId shardId, IndexSettings indexSettings,
                                               long avgShardSizeInBytes, Map<Path,Integer> dataPathToShardCount) throws IOException {

    final Path dataPath;
    final Path statePath;

    if (indexSettings.hasCustomDataPath()) {
        dataPath = env.resolveCustomLocation(indexSettings, shardId);
        statePath = env.nodePaths()[0].resolve(shardId);
    } else {
        // Sum the usable space of every configured data path on this node.
        BigInteger totFreeSpace = BigInteger.ZERO;
        for (NodeEnvironment.NodePath nodePath : env.nodePaths()) {
            totFreeSpace = totFreeSpace.add(BigInteger.valueOf(nodePath.fileStore.getUsableSpace()));
        }

        // TODO: this is a hack!!  We should instead keep track of incoming (relocated) shards since we know
        // how large they will be once they're done copying, instead of a silly guess for such cases:

        // Very rough heuristic of how much disk space we expect the shard will use over its lifetime, the max of current average
        // shard size across the cluster and 5% of the total available free space on this node:
        BigInteger estShardSizeInBytes = BigInteger.valueOf(avgShardSizeInBytes).max(totFreeSpace.divide(BigInteger.valueOf(20)));

        // TODO - do we need something more extensible? Yet, this does the job for now...
        final NodeEnvironment.NodePath[] paths = env.nodePaths();
        NodeEnvironment.NodePath bestPath = null;
        BigInteger maxUsableBytes = BigInteger.valueOf(Long.MIN_VALUE);
        for (NodeEnvironment.NodePath nodePath : paths) {
            FileStore fileStore = nodePath.fileStore;

            BigInteger usableBytes = BigInteger.valueOf(fileStore.getUsableSpace());
            assert usableBytes.compareTo(BigInteger.ZERO) >= 0;

            // Deduct estimated reserved bytes from usable space:
            Integer count = dataPathToShardCount.get(nodePath.path);
            if (count != null) {
                usableBytes = usableBytes.subtract(estShardSizeInBytes.multiply(BigInteger.valueOf(count)));
            }
            // Keep the path with the most usable space after the deduction; on a tie the earlier path wins.
            if (bestPath == null || usableBytes.compareTo(maxUsableBytes) > 0) {
                maxUsableBytes = usableBytes;
                bestPath = nodePath;
            }
        }

        statePath = bestPath.resolve(shardId);
        dataPath = statePath;
    }
    return new ShardPath(indexSettings.hasCustomDataPath(), dataPath, statePath, shardId);
}[/code]
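
To make the two-disk walkthrough concrete, here is a minimal standalone sketch of the same selection logic using plain arrays instead of the ES classes. The class name PathSelectionSketch, the method selectPath and the disk sizes are made up for illustration and are not part of Elasticsearch. It also assumes that the reported free space does not change between the two shard creations, since a freshly created shard is nearly empty.

```java
import java.math.BigInteger;

public class PathSelectionSketch {

    /**
     * Hypothetical stand-in for the non-custom-path branch of selectNewPathForShard.
     *
     * @param usableBytes  usable bytes reported for each data path
     * @param shardCounts  number of shards already counted against each data path
     * @param avgShardSize average shard size in bytes
     * @return index of the data path that would receive the new shard
     */
    public static int selectPath(long[] usableBytes, int[] shardCounts, long avgShardSize) {
        // Total usable space across all data paths.
        BigInteger totalFree = BigInteger.ZERO;
        for (long free : usableBytes) {
            totalFree = totalFree.add(BigInteger.valueOf(free));
        }

        // Reserve per existing shard: max(average shard size, 5% of total free space).
        BigInteger estShardSize = BigInteger.valueOf(avgShardSize)
                .max(totalFree.divide(BigInteger.valueOf(20)));

        int best = -1;
        BigInteger maxUsable = null;
        for (int i = 0; i < usableBytes.length; i++) {
            // Deduct the reserve once for every shard already on this path.
            BigInteger usable = BigInteger.valueOf(usableBytes[i])
                    .subtract(estShardSize.multiply(BigInteger.valueOf(shardCounts[i])));
            // Strict comparison: on a tie the earlier path is kept.
            if (best == -1 || usable.compareTo(maxUsable) > 0) {
                maxUsable = usable;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        long gb = 1024L * 1024 * 1024;
        // Illustrative numbers: path 0 has 100 GB usable, path 1 has 98 GB usable.
        long[] free = {100 * gb, 98 * gb};

        // shard1: nothing placed yet, so the path with more usable space wins.
        int first = selectPath(free, new int[]{0, 0}, 0L);

        // shard0: one shard is now counted against the path chosen above.
        int[] counts = new int[2];
        counts[first] = 1;
        int second = selectPath(free, counts, 0L);

        System.out.println("shard1 -> path " + first + ", shard0 -> path " + second);
    }
}
```

With these numbers the reserve is 5% of 198 GB, about 9.9 GB, so path 0 is treated as having roughly 90.1 GB once it holds shard1, and shard0 goes to path 1. If the average shard size were larger than 9.9 GB, that value would be used as the reserve instead.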
