ELK version 7.3.2, Kafka version 2.12.
Our business needs to feed Kafka data into ES through Logstash. The Kafka data is GBK-encoded, and no matter how we set the charset, what lands in ES is garbled.
- Verified the Kafka data source: opened CRT with the default encoding set to GB2312, ran the Kafka console consumer command, and the Chinese text displayed correctly with no garbling.
- Tested Logstash's handling of GBK data: created a GBK-encoded test file; with the following configuration it outputs a correct UTF-8 result file.
input {
  file {
    path => "/home/wangyu1/app/logstash/gbk2.txt"
    start_position => "beginning"
    codec => plain {
      charset => "GBK"
    }
  }
}
output {
  file {
    path => "/home/wangyu1/app/logstash/result_gbk1715.txt"
  }
  stdout {
  }
}
- With the kafka input, however, the output is garbled. The configuration is below; I have tried GBK, UTF-8, and GB2312, on both the input and the output side, and the result is garbled every time, each time in a different way.
input {
  kafka {
    bootstrap_servers => "localhost:9092"
    topics => ["jiake-alarm"]
    type => "jiake-alarm"
    codec => plain {
      charset => "GBK"
    }
  }
}
output {
  kafka {
    bootstrap_servers => "localhost:9092"
    # topic_id expects a string, not an array
    topic_id => "jiake_test1"
  }
  file {
    path => "/home/wangyu1/app/logstash/result_1837.txt"
  }
}
2 replies
doom
Upvoted by: supolu, yj7778826
org.apache.kafka.common.serialization.StringDeserializer decodes messages as UTF-8 by default, and the kafka input plugin does not expose that charset as a configurable option.
So switch the deserializers to raw byte arrays:

  key_deserializer_class => "org.apache.kafka.common.serialization.ByteArrayDeserializer"
  value_deserializer_class => "org.apache.kafka.common.serialization.ByteArrayDeserializer"

Then add the codec to do the transcoding:

  codec => plain { charset => "GBK" }

The full configuration is as follows:
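A minimal sketch of that input configuration, reusing the broker and topic from the question; the stdout output is added here only for verification:

input {
  kafka {
    bootstrap_servers => "localhost:9092"
    topics => ["jiake-alarm"]
    # hand raw bytes to the codec instead of letting the Kafka client decode UTF-8
    key_deserializer_class => "org.apache.kafka.common.serialization.ByteArrayDeserializer"
    value_deserializer_class => "org.apache.kafka.common.serialization.ByteArrayDeserializer"
    # the codec then decodes those bytes as GBK
    codec => plain {
      charset => "GBK"
    }
  }
}
output {
  stdout {
  }
}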
doom
Upvoted by: supolu
In my setup the encoding is fixed at the data source: everything is converted to UTF-8 there before it is written to Kafka. If you have a different data source, it is best to likewise convert to UTF-8 at the source and then send it into Kafka.
Here is the Flume configuration:
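A minimal sketch of such an agent, assuming a spooling-directory source that reads GBK files into a Kafka sink; the agent name, spool directory, and topic below are placeholders, not values from the original post:

# read GBK-encoded files and re-emit event bodies as UTF-8
a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /path/to/gbk/logs
a1.sources.r1.inputCharset = GBK
a1.sources.r1.deserializer.outputCharset = UTF-8
a1.sources.r1.channels = c1

a1.channels.c1.type = memory

# events written to Kafka are now UTF-8
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.bootstrap.servers = localhost:9092
a1.sinks.k1.kafka.topic = jiake-alarm
a1.sinks.k1.channel = c1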
Logstash then just uses its normal configuration; no transcoding is needed:
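For example, a plain kafka input straight into ES with no codec or charset settings; the ES host is an assumption based on the goal stated in the question:

input {
  kafka {
    bootstrap_servers => "localhost:9092"
    topics => ["jiake-alarm"]
  }
}
output {
  elasticsearch {
    # assumed local ES instance
    hosts => ["localhost:9200"]
  }
}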