
A .txt file written to ES via Logstash cannot be imported more than once

Logstash | Author: Lincoln | Published: 2017-04-20 | Views: 9062

input {
  file {
    path => "/var/opt/software/logstash/data/disease_data.txt"
    start_position => "beginning"
  }
}
filter {
  csv {
    columns => ["var01","var02","var03","var04","var05","var06","var07"]
    separator => ","
  }
}
output {
  elasticsearch {
    hosts => ["10.20.112.229:9200"]
    index => "temp"
    document_type => "disease"
  }
}
The log output from running bin/logstash is as follows:
Sending Logstash's logs to /var/opt/software/logstash/logs which is now configured via log4j2.properties
[2017-04-20T18:01:18,034][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://10.20.112.229:9200/]}}
[2017-04-20T18:01:18,048][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://10.20.112.229:9200/, :path=>"/"}
[2017-04-20T18:01:18,347][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>#<URI::HTTP:0x635bff5b URL:http://10.20.112.229:9200/>}
[2017-04-20T18:01:18,352][INFO ][logstash.outputs.elasticsearch] Using mapping template from {:path=>nil}
[2017-04-20T18:01:18,435][INFO ][logstash.outputs.elasticsearch] Attempting to install template {:manage_template=>{"template"=>"logstash-*", "version"=>50001, "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"_all"=>{"enabled"=>true, "norms"=>false}, "dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword"}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date", "include_in_all"=>false}, "@version"=>{"type"=>"keyword", "include_in_all"=>false}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}}
[2017-04-20T18:01:18,447][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>[#<URI::Generic:0x18c9c0bc URL://10.20.112.229:9200>]}
[2017-04-20T18:01:18,454][INFO ][logstash.pipeline ] Starting pipeline {"id"=>"main", "pipeline.workers"=>1, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>125}
[2017-04-20T18:01:18,998][INFO ][logstash.pipeline ] Pipeline main started
[2017-04-20T18:01:19,107][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9601}
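The startup log above only shows that the pipeline started; it does not say whether any documents were actually indexed. One way to check whether the first import wrote anything is to query the document count of the temp index directly (host and index name are taken from the config above):

curl 'http://10.20.112.229:9200/temp/_count?pretty'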
 
The results of searching for .sincedb* files are as follows:
[root@SHB-L0055020 logs]# cd  
[root@SHB-L0055020 ~]# ls -a
. .bash_logout .cshrc .pki .viminfo
.. .bash_profile install.log slogs wls81
anaconda-ks.cfg .bashrc install.log.syslog .ssh yypasswd
backup bin .oracle_jre_usage tarlog.sh.out
.bash_history clean.sh osbackup.sh.out .tcshrc
[root@SHB-L0055020 ~]# su lg522
[lg522@SHB-L0055020 root]$ cd
[lg522@SHB-L0055020 ~]$ ls -a
. .bash_history .bash_profile .gnome2 .oracle_jre_usage
.. .bash_logout .bashrc .kshrc .viminfo
[lg522@SHB-L0055020 ~]$ ps aux | grep logstash
lg522 125766 4.7 1.4 6148680 465320 ? Sl 11:27 6:31 /usr/bin/java -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+DisableExplicitGC -Djava.awt.headless=true -Dfile.encoding=UTF-8 -XX:+HeapDumpOnOutOfMemoryError -Xmx1g -Xms256m -Xss2048k -Djffi.boot.library.path=/opt/elk/logstash/vendor/jruby/lib/jni -Xbootclasspath/a:/opt/elk/logstash/vendor/jruby/lib/jruby.jar -classpath : -Djruby.home=/opt/elk/logstash/vendor/jruby -Djruby.lib=/opt/elk/logstash/vendor/jruby/lib -Djruby.script=jruby -Djruby.shell=/bin/sh org.jruby.Main /opt/elk/logstash/lib/bootstrap/environment.rb logstash/runner.rb -f config/ofbiz_log_import.conf
lg522 140015 0.0 0.0 105364 848 pts/0 R+ 13:43 0:00 grep logstash
[lg522@SHB-L0055020 ~]$
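No .sincedb file shows up under either home directory here. Depending on the Logstash version, the file input may keep its sincedb under the Logstash data directory (path.data/plugins/inputs/file) instead of $HOME, so it can be worth searching there as well. A quick check, using the /opt/elk/logstash install path visible in the ps output above (adjust the path if your data directory is configured elsewhere):

# look for sincedb files under the Logstash install and in the home directory
find /opt/elk/logstash -name ".sincedb*" 2>/dev/null
find "$HOME" -maxdepth 1 -name ".sincedb*" 2>/dev/null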

 

medcl - Tonight, I fight tigers.


So the issue is that after importing the file once, you want to import it again, right?
If so, you probably need to delete the sincedb file; its path is usually $HOME/.sincedb*
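A minimal sketch of that cleanup, assuming the sincedb files do live under $HOME (stop Logstash before deleting them, otherwise the offset can be written back on shutdown):

# stop the running Logstash instance first, then remove the recorded read offsets
rm -f $HOME/.sincedb*
# rerun the import; the config file name here is only an example
bin/logstash -f disease_import.conf

Alternatively, for a one-off import that you want to repeat, setting sincedb_path => "/dev/null" in the file input makes Logstash discard the recorded offset, so the file is read from the beginning on every run.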
