timeout

elasticsearch collector [index-stats] timed out when collecting data

Elasticsearch • dingxiaocao replied to the question • 10 followers • 16 replies • 24265 views • 2021-05-15 19:06 • from related topics

A record of a filebeat i/o timeout problem caused by ES memory

Beats • ziyou published the article • 2 comments • 9528 views • 2019-01-07 16:17 • from related topics

Problem: Kibana showed that data collection had stopped, with no new log data coming in. To troubleshoot, check the logstash log:

[2019-01-07T14:59:27,435][INFO ][org.logstash.beats.BeatsHandler] Exception: Connection reset by peer
[2019-01-07T14:59:29,870][INFO ][org.logstash.beats.BeatsHandler] Exception: Connection reset by peer
[2019-01-07T14:59:29,870][INFO ][org.logstash.beats.BeatsHandler] Exception: Connection reset by peer
[2019-01-07T14:59:41,719][INFO ][org.logstash.beats.BeatsHandler] Exception: Connection reset by peer
[2019-01-07T14:59:42,777][INFO ][org.logstash.beats.BeatsHandler] Exception: Connection reset by peer
[2019-01-07T14:59:48,227][INFO ][org.logstash.beats.BeatsHandler] Exception: Connection reset by peer

Check the filebeat log:

2019-01-07T15:00:13+08:00 INFO No non-zero metrics in the last 30s
2019-01-07T15:00:43+08:00 INFO Non-zero metrics in the last 30s: libbeat.logstash.call_count.PublishEvents=1 libbeat.logstash.publish.write_bytes=241120
2019-01-07T15:00:48+08:00 ERR Failed to publish events (host: 10.68.24.138:5044:10200), caused by: read tcp 10.68.24.46:59310->10.68.24.138:5044: i/o timeout
2019-01-07T15:00:48+08:00 INFO Error publishing events (retrying): read tcp 10.68.24.46:59310->10.68.24.138:5044: i/o timeout
2019-01-07T15:01:13+08:00 INFO Non-zero metrics in the last 30s: libbeat.logstash.publish.read_errors=1 libbeat.logstash.published_but_not_acked_events=2034
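When the filebeat log is long, it can help to count the i/o timeout lines per target address instead of reading them by eye. The following is a minimal sketch, not part of the original write-up: the function name `count_timeouts` and the regular expression are assumptions based only on the line format shown above, and the sample lines are copied from that log.

```python
import re
from collections import Counter

# Matches the "read tcp <src>-><dst>: i/o timeout" fragment seen in the
# filebeat log above. The pattern is an assumption derived from those lines.
TIMEOUT_RE = re.compile(r"read tcp (?P<src>\S+)->(?P<dst>[^\s:]+:\d+): i/o timeout")


def count_timeouts(lines):
    """Count i/o timeout log lines per destination host:port."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[match.group("dst")] += 1
    return counts


# Lines copied from the filebeat log in this article.
SAMPLE_LOG = """\
2019-01-07T15:00:13+08:00 INFO No non-zero metrics in the last 30s
2019-01-07T15:00:48+08:00 ERR Failed to publish events (host: 10.68.24.138:5044:10200), caused by: read tcp 10.68.24.46:59310->10.68.24.138:5044: i/o timeout
2019-01-07T15:00:48+08:00 INFO Error publishing events (retrying): read tcp 10.68.24.46:59310->10.68.24.138:5044: i/o timeout
""".splitlines()

timeouts = count_timeouts(SAMPLE_LOG)
```

Here both matching lines point at the logstash beats port, so `timeouts` holds a single key. A real log file could be passed in with `count_timeouts(open(path))`, where `path` is whatever location filebeat logs to on the machine in question.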

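The initial finding was that filebeat could not reach logstash and logstash kept resetting filebeat's connections, yet neither machine showed any problem. The logs revealed nothing obvious, so I went through the checks step by step:

1. Start with the basics: check whether elasticsearch, logstash, and filebeat are running.
2. Network. The network environment was configured earlier and has not changed since, so a network fault is less likely, but still use telnet to test whether each port is reachable.
3. Logstash failure. To see whether an unknown logstash fault is the cause, save the logstash log, restart logstash, and check whether the restart resolves the problem.
4. Logs. Check whether the source logs are still being updated, i.e. whether they changed within the last 5 minutes. Because this is a live environment the logs rarely stop, so I put this check fourth.
5. Check the disk and memory of ES.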
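The telnet test in step 2 can also be scripted when several ports need checking. This is a rough sketch using only the Python standard library; the helper names `port_is_open` and `check_endpoints` and the 3-second default timeout are illustrative choices, not something from the original article.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


def check_endpoints(endpoints, timeout=3.0):
    """Map "host:port" to reachability for each (host, port) pair."""
    return {
        f"{host}:{port}": port_is_open(host, port, timeout)
        for host, port in endpoints
    }
```

For example, `check_endpoints([("10.68.24.138", 5044)])` would test the logstash beats port that appears in the filebeat log. Note that a successful connect only shows the port accepts connections; as this case shows, the service behind it can still be stalled.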

GET /_cat/allocation?v
GET _cat/nodes?v
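Both requests return a plain-text table whose first row is a header (that is what `?v` turns on). As a hedged sketch, the table can be parsed and scanned for nodes whose `heap.percent` or `disk.percent` is high. The sample output below is invented for illustration (node names, addresses, and numbers are not from the original cluster), and the threshold of 90 is an arbitrary assumption.

```python
def parse_cat_table(text):
    """Parse whitespace-separated _cat output (with ?v header) into dicts.

    Rows whose field count differs from the header (for example an
    UNASSIGNED row in _cat/allocation) are skipped rather than guessed at.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()
    rows = []
    for ln in lines[1:]:
        fields = ln.split()
        if len(fields) == len(header):
            rows.append(dict(zip(header, fields)))
    return rows


def over_threshold(rows, column, threshold):
    """Return the rows whose integer value in `column` is >= threshold."""
    flagged = []
    for row in rows:
        value = row.get(column, "")
        if value.isdigit() and int(value) >= threshold:
            flagged.append(row)
    return flagged


# Hypothetical _cat/nodes?v output, for illustration only.
SAMPLE_NODES = """\
ip          heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
192.0.2.11            41          87   3    0.10    0.12     0.15 mdi       *      node-a
192.0.2.12            97          99   9    2.40    2.10     1.90 mdi       -      node-b
"""

hot_nodes = over_threshold(parse_cat_table(SAMPLE_NODES), "heap.percent", 90)
```

The same `over_threshold` call with `"disk.percent"` would apply to the `_cat/allocation?v` table. Splitting on whitespace assumes node names contain no spaces.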

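By step 5 the cause was found: memory on one of the ES machines was full.

How it happened: when this ELK environment was deployed, the two ES machines supplied by the server provider had different amounts of memory, one 8G and one 4G, so I set the ES heap to 4G on one and 2G on the other. The ES cluster has only these two machines, and I did not configure dedicated data nodes or client nodes. In fact I would not do so with three or four machines either: split a small cluster into roles and there are no servers left. At first everything worked, but once the log volume reached a certain level, the heap on the 2G machine was exhausted, and the i/o timeout problem that stopped log collection appeared.

Lessons learned: while using ELK I have run into every one of the five causes above as a source of filebeat collection failures. The easiest to overlook is whether ES memory or disk is full: when the heap or the disk on any one machine in the ES cluster fills up, log collection breaks. So when configuring an ES cluster, it is best to give all data nodes the same memory and disk configuration.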
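The advice to keep data nodes uniform can be checked mechanically. The sketch below assumes per-node sizes have already been collected as strings such as `4gb`, for instance from `GET _cat/nodes?v&h=name,heap.max,disk.total`. The node names `es-node-1` and `es-node-2` are placeholders, and the 4G and 2G values are the heap sizes described above.

```python
import re

# Binary multipliers for the lowercase size suffixes used here.
_UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3,
          "tb": 1024**4, "pb": 1024**5}


def parse_size(text):
    """Convert a size string such as '4gb' or '3.9gb' to bytes."""
    match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*([a-z]+)\s*", text.lower())
    if not match or match.group(2) not in _UNITS:
        raise ValueError(f"unrecognized size: {text!r}")
    return int(float(match.group(1)) * _UNITS[match.group(2)])


def group_by_value(per_node):
    """Group node names by their configured value."""
    groups = {}
    for node, value in per_node.items():
        groups.setdefault(value, []).append(node)
    return groups


def is_uniform(per_node):
    """True when every node reports the same value."""
    return len(group_by_value(per_node)) <= 1


# Placeholder node names with the heap sizes from this article.
HEAP_MAX = {"es-node-1": parse_size("4gb"), "es-node-2": parse_size("2gb")}
heap_is_uniform = is_uniform(HEAP_MAX)
```

With the two heap sizes from this cluster, `heap_is_uniform` is false, and `group_by_value(HEAP_MAX)` shows which node sits in the smaller group. Reported heap values can differ slightly from the configured ones, so a tolerance may be needed in practice.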

With timeout set to 1ms, the actual took is 4ms or more, yet normal data is returned with timed_out: false

elasticsearch hangs; after restarting elasticsearch, ".kibana" reports an error

8 people follow this topic