Using @metadata to keep Logstash configs cleaner

Since Logstash 1.5, events can carry metadata. Metadata is not serialized by outputs, so we can stash temporary, intermediate data in it without having to delete it afterwards.

Metadata fields are referenced like this:

[@metadata][foo]

Use case

Suppose we have a log line like this:

[2017-04-01 22:21:21] production.INFO: this is a test log message by leon

We can parse it with grok in the filter stage. GREEDYDATA is used for the last field because DATA is non-greedy and would capture an empty string at the end of an unanchored pattern:

grok {
    match => { "message" => "\[%{TIMESTAMP_ISO8601:timestamp}\] %{DATA:env}\.%{DATA:log_level}: %{GREEDYDATA:content}" }
}

The parsed result is:

{
      "env" => "production",
      "timestamp" => "2017-04-01 22:21:21",
      "log_level" => "INFO",
      "content" => "this is a test log message by leon"
}

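Suppose we want to:

1. Drop logs whose log_level is INFO, without having that field show up in the final output.
2. Include env in the output index name, again without having that field show up in the output.

One option is to remove the unwanted fields with the mutate filter just before output, but once such cleanups pile up, the configuration file becomes cluttered.

With metadata, both problems are easy to handle: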

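filter {
    grok {
        # Capture env and log_level into @metadata so they never reach the output
        match => { "message" => "\[%{TIMESTAMP_ISO8601:timestamp}\] %{DATA:[@metadata][env]}\.%{DATA:[@metadata][log_level]}: %{GREEDYDATA:content}" }
    }

    # Drop INFO events; the level lives only in @metadata
    if [@metadata][log_level] == "INFO" {
        drop {}
    }
}

output {
    elasticsearch {
        hosts => ["127.0.0.1:9200"]
        # env is taken from @metadata and used in the index name only
        index => "%{[@metadata][env]}-log-%{+YYYY.MM}"
        document_type => "_doc"
    }
}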
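To make the idea concrete outside Logstash, here is a minimal Python sketch of the same flow. It is only an illustration under stated assumptions: the regular expression approximates the grok pattern rather than reproducing it, the function names parse and route are hypothetical, and the index date is taken from the parsed timestamp, whereas Logstash's %{+YYYY.MM} uses the event's @timestamp.

```python
import json
import re
from datetime import datetime

# Rough stand-in for the grok pattern (an approximation, not grok itself).
LINE_RE = re.compile(
    r"\[(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] "
    r"(?P<env>[^.]+)\.(?P<log_level>[^:]+): (?P<content>.*)"
)


def parse(line):
    """Split a log line into (event, metadata); metadata is kept apart from the event."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    event = {"timestamp": m["timestamp"], "content": m["content"]}
    metadata = {"env": m["env"], "log_level": m["log_level"]}
    return event, metadata


def route(line):
    """Return (index_name, serialized_event), or None when the line is dropped."""
    parsed = parse(line)
    if parsed is None:
        return None
    event, metadata = parsed
    if metadata["log_level"] == "INFO":
        return None  # mirrors drop {}
    # Assumption: use the parsed timestamp for the index date.
    ts = datetime.strptime(event["timestamp"], "%Y-%m-%d %H:%M:%S")
    index = f"{metadata['env']}-log-{ts:%Y.%m}"
    # Only the event is serialized; the metadata never reaches the output.
    return index, json.dumps(event, sort_keys=True)
```

The metadata dictionary drives both the drop decision and the index name, yet the serialized document contains only timestamp and content, which is the behaviour the configuration relies on.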

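Besides simplifying the configuration file and trimming redundant fields, this may also speed up Logstash processing, since no extra remove_field step has to run.

Elasticsearch input plugin

Some plugins make use of metadata themselves, for example the elasticsearch input plugin: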

input {
  elasticsearch {
    hosts => ["127.0.0.1"]
    # Store the ES document metadata (_index, _type, _id) in @metadata
    docinfo => true
  }
}

filter {
    # ...
}

output {
  elasticsearch {
    document_id => "%{[@metadata][_id]}"
    index => "transformed-%{[@metadata][_index]}"
    document_type => "%{[@metadata][_type]}"
  }
}

Debugging

Normally metadata never appears in the output, unless you use the rubydebug codec with metadata enabled:

output { 
  stdout { 
    codec  => rubydebug {
      metadata => true
    }
  }
}

After processing, the output for the log line will include:

{
    ....,
    "@metadata" => {
        "env" => "production",
        "log_level" => "INFO"
    }
}

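Summary

As shown above, metadata offers a simple, convenient place to keep intermediate data. It reduces the complexity of the Logstash configuration by avoiding remove_field calls, and it keeps unnecessary data out of the output. I hope this introduction to metadata is helpful.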

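Community Daily No. 384 (2018-09-04)

1. Tracking down an Elasticsearch memory leak triggered by bulk exceptions.
http://t.cn/RFBHC1p
2. Monitoring Elasticsearch machines with ElasticHQ.
http://t.cn/RFBHHLy
3. Analyzing application events and logs with ELK.
http://t.cn/RFBHnLN

Upcoming events
1. Tickets for the Elastic China Developer Conference are on sale
https://conf.elasticsearch.cn/2018/shenzhen.html
2. Free registration is open for the Elastic Meetup in Beijing on September 8
https://elasticsearch.cn/article/759

Editor: 叮咚光军
Archive: https://elasticsearch.cn/article/785
Subscribe: https://tinyletter.com/elastic-daily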

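[Hiring] Community Advocate - China

Elasticsearch has climbed the rankings again. Would you like to join a thriving, world-leading open source software company?

The job link and description follow: https://boards.greenhouse.io/elastic/jobs/1272161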

At Elastic, we have a simple goal: to solve the world's data problems with products that delight and inspire. As the company behind the popular open source projects — Elasticsearch, Kibana, Logstash, and Beats — we help people around the world do great things with their data. From stock quotes to Twitter streams, Apache logs to WordPress blogs, our products are extending what's possible with data, delivering on the promise that good things come from connecting the dots. The Elastic family unites employees across 32 countries into one coherent team, while the broader community spans across over 100 countries.

For all of us at Elastic, community matters. Our users and contributors have helped to ensure that Elasticsearch, Kibana, Logstash, and Beats are more than just code — they are open source projects that people love to use, and love to talk about! As our Community Advocate you will champion our Elastic community.

What You Will Be Doing:

Are you that kind of person who is invigorated by sharing juicy technology goodness with the world? Do you feel at home connecting with the community members: in person, on blogs, in forums, via social channels, and at events? Is presenting at local meetups your jam and are you passionate about the Elastic Stack?

Well, this might just be your dream job.

As a Community Advocate at Elastic, you will be based in China. You will wake up each morning eager to design and deliver presentations at a wide variety of events, from customer meetings and meetups to tradeshows, to help showcase technology. You will do this while traveling the region and, at times, the world, representing Elastic. Maintaining the trust of our community, as well as the respect and trust within the team, is foundational.

What You Bring Along:

Bachelor’s degree in a technical field (e.g. CS, CSE, EE) or relevant work experience as a software developer (mandatory)
Demonstrated ability to craft compelling content - including speaking engagements, blog posts, demos, messaging, etc. (mandatory)
You are comfortable presenting, whether it's at a local meetup or to the office of a C-suite member
Familiarity with, and real passion for, the Elastic Stack
Comfort working with a globally distributed team
Fluency or high working proficiency in Mandarin (mandatory)
Excellent spoken and written English communication skills, since this is our company's language (mandatory)

Please send us your CV in English. Things We'd Be Stoked to See on Your CV:

Conversations in person, on blogs, in forums, via social channels, at events give you energy and you have a proven publication history to show that
Experience working for a startup or an early stage company
Experience with open source software and/or commercial open source companies
Technical background and abilities in APM, PHP, node.js, JS, and/or security (nice-to-have, not mandatory)
Other languages

Additional Information:

Competitive pay based on the work you do here and not your previous salary
Stock options
Global minimum of 16 weeks of paid parental leave (moms and dads)
Generous vacation time and one week of volunteer time off
An environment in which you can balance great work with a great life
Your age is only a number. It doesn't matter if you're just out of college or your children are; we need you for what you can do.
Distributed-first company with Elasticians in over 30 countries, spread across 18 time zones, and speaking over 30 languages!

Target locations: Beijing, China; Shanghai, China; Hangzhou, China

Elastic is an Equal Employment employer committed to the principles of equal employment opportunity and affirmative action for all applicants and employees. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender perception or identity, national origin, age, marital status, protected veteran status, or disability status or any other basis protected by federal, state or local law, ordinance or regulation. Elastic also makes reasonable accommodations for disabled employees consistent with applicable law.

Community Daily No. 383 (2018-09-03)

1. A Prometheus monitoring plugin for Kibana
http://t.cn/RFT1agw

2. A brief look at new features in Logstash 6.4
http://t.cn/RFYDKng

3. A REST read-only plugin for Elasticsearch
http://t.cn/RZpF03g

Upcoming events
1. Tickets for the Elastic China Developer Conference are on sale
https://conf.elasticsearch.cn/2018/shenzhen.html
2. Free registration is open for the Elastic Meetup in Beijing on September 8
https://elasticsearch.cn/article/759

Editor: cyberdak
Archive: https://elasticsearch.cn/article/783
Subscribe: https://tinyletter.com/elastic-daily

Community Daily No. 382 (2018-09-02)

1. How to ship Heroku logs to Logsene / Managed ELK Stack.
http://t.cn/RFSfTz6
2. Shipping CoreOS logs to ELK in 5 minutes.
http://t.cn/RFSxz0S
3. Apple Loop: iPhone XS launch confirmed for September 12, new iPhone SE 2 leaks, Apple's panic plan.
http://t.cn/RFS9bWj

Upcoming events:
1. Tickets for the Elastic China Developer Conference are on sale
https://conf.elasticsearch.cn/2018/shenzhen.html
2. Free registration is open for the Elastic Meetup in Beijing on September 8
https://elasticsearch.cn/article/759

Editor: 至尊宝
Archive: https://elasticsearch.cn/article/782
Subscribe: https://tinyletter.com/elastic-daily

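Community Daily No. 381 (2018-09-01)

1. Using field collapsing in Elasticsearch 5.x. http://t.cn/RFodkB2
2. Analysis of a performance problem caused by the query cache. http://t.cn/RnOejt6
3. Elasticsearch performance tuning. http://t.cn/RFoFaRM

Upcoming events

1. The final round of early-bird tickets for the Elastic China Developer Conference is on sale https://conf.elasticsearch.cn/2018/shenzhen.html

2. Registration is open for the Elastic Meetup in Beijing on September 8 https://elasticsearch.cn/article/759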

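Community Daily No. 380 (2018-08-31)

1. Elasticsearch storage explained in detail
http://t.cn/RFcyAtp
2. An Elastic Stack monitoring solution for the Azure cloud
http://t.cn/RFc4ew4
3. Integrating Elasticsearch with Spring Boot
http://t.cn/RFcyKM2

Upcoming events:
1. The final round of early-bird tickets for the Elastic China Developer Conference is on sale
https://conf.elasticsearch.cn/2018/shenzhen.html
2. Registration is open for the Elastic Meetup in Beijing on September 8
https://elasticsearch.cn/article/759

Editor: 铭毅天下
Archive: https://elasticsearch.cn/article/780
Subscribe: https://tinyletter.com/elastic-daily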

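Curator: From Getting Started to Hands-on Practice

Curator is the official Elasticsearch index management tool. Driven by configuration files, it helps us create/delete, open/close, and snapshot/restore a chosen set of indices.

Scenario

For read and write performance, we usually split time-based data into indices by time period.

Once the data reaches a certain volume, we often close or delete indices older than some cutoff to save memory or disk space. A common approach is a script that calls the Elasticsearch API, run periodically from crontab. That works, but as the scripts multiply they become hard to maintain.

How does Curator solve this kind of problem? Let's go step by step:

Installation

Curator is implemented in Python, so the simplest way to install it is with pip: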

pip install elasticsearch-curator

Basic configuration

Next, configure Curator's connection to Elasticsearch:

# ~/.curator/curator.yml

client:
  hosts:
    - 127.0.0.1
  port: 9200

logging:
  loglevel: INFO

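hosts accepts multiple addresses, but they must all belong to the same cluster.

Only the most basic settings are listed here; the official documentation covers the configuration in more detail.

Action configuration

Next we configure the actions to run; they are executed in order: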

# /etc/curator/actions/maintain_log.yml

actions:
  1:
    # Create tomorrow's index
    action: create_index
    description: "create new time-based index for log-*"
    options:
      name: '<log-{now/d+1d}>'
  2:
    # Delete indices older than 3 days
    action: delete_indices
    description: "delete outdated indices for log-*"
    filters:
    - filtertype: pattern
      kind: prefix
      value: log
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 3

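action names the operation to perform. Curator supports more than ten actions; the official documentation has the complete list.

options holds the parameters for the action. They differ from action to action and are described in the documentation for each one.

filters selects the indices the action applies to. Filters under the same action are combined with AND. In the definition above, delete_indices has two filters:

Pattern match: indices whose name starts with log
Age match: using the "%Y.%m.%d" date in the index name, indices older than 3 days

Curator supports more than ten filter types; the official documentation has the complete list.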

Execution

Finally, run it with the curator command-line tool:

curator --config ~/.curator/curator.yml /etc/curator/actions/maintain_log.yml

The command prints:

2018-08-30 12:31:26,829 INFO      Preparing Action ID: 1, "create_index"
2018-08-30 12:31:26,841 INFO      Trying Action ID: 1, "create_index": create new time-based index for log-*
2018-08-30 12:31:26,841 INFO      "<log-{now/d+1d}>" is using Elasticsearch date math.
2018-08-30 12:31:26,841 INFO      Creating index "<log-{now/d+1d}>" with settings: {}
2018-08-30 12:31:27,049 INFO      Action ID: 1, "create_index" completed.
2018-08-30 12:31:27,050 INFO      Preparing Action ID: 2, "delete_indices"
2018-08-30 12:31:27,058 INFO      Trying Action ID: 2, "delete_indices": delete outdated indices for log-*
2018-08-30 12:31:27,119 INFO      Deleting selected indices: ['log-2018.08.24', 'log-2018.08.25', 'log-2018.08.27', 'log-2018.08.26', 'log-2018.08.23']
2018-08-30 12:31:27,119 INFO      ---deleting index log-2018.08.24
2018-08-30 12:31:27,120 INFO      ---deleting index log-2018.08.25
2018-08-30 12:31:27,120 INFO      ---deleting index log-2018.08.27
2018-08-30 12:31:27,120 INFO      ---deleting index log-2018.08.26
2018-08-30 12:31:27,120 INFO      ---deleting index log-2018.08.23
2018-08-30 12:31:27,282 INFO      Action ID: 2, "delete_indices" completed.
2018-08-30 12:31:27,283 INFO      Job completed.

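The log shows that the next day's index was created and that the indices dated before August 28 were deleted.

Scheduled execution

With Curator configured, we still need a scheduled job.

Edit the crontab with crontab -e

and add a line: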

0 23 * * * /usr/local/bin/curator --config /root/.curator/curator.yml /etc/curator/actions/maintain_log.yml >> /var/curator.log 2>&1

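The first part of a crontab entry is the schedule. Its five fields are minute, hour, day of month, month, and day of week, and * means every value. This entry therefore runs the command every day at 23:00.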
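As a quick way to check how such an entry breaks down, the following Python sketch splits a crontab line into its five schedule fields and the command. The helper name split_cron_line and the field labels are illustrative choices, not part of cron or Curator, and the sketch does not validate the field values.

```python
FIELD_NAMES = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron_line(line):
    """Split a crontab entry into a dict of schedule fields and the command string."""
    parts = line.split(None, 5)  # five schedule fields, then the rest of the line
    if len(parts) < 6:
        raise ValueError("expected five schedule fields followed by a command")
    schedule = dict(zip(FIELD_NAMES, parts[:5]))
    return schedule, parts[5]


schedule, command = split_cron_line(
    "0 23 * * * /usr/local/bin/curator --config /root/.curator/curator.yml "
    "/etc/curator/actions/maintain_log.yml >> /var/curator.log 2>&1"
)
print(schedule)
```

Here minute is 0 and hour is 23, with the remaining three fields left as *, which is what makes the job run once a day at 23:00.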

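Running a single action

Besides scheduled jobs, Curator can also run ad hoc bulk operations without an action file. The curator_cli command executes a single action. For example, to snapshot every index whose name starts with log, one command is enough (the JSON filter is single-quoted so the shell passes it through unchanged):

curator_cli snapshot --repository repo_name --filter_list '{"filtertype": "pattern", "kind": "prefix", "value": "log"}'

Convenient, isn't it?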

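Execution flow

While a command runs, Curator goes through these steps:

Fetch the information for all indices from Elasticsearch
Apply the configured filters to select the indices to operate on
Execute the specified action on the selected indices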
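The filtering step can be sketched in plain Python without a cluster. This is a simplified illustration, not Curator's implementation: select_indices is a hypothetical helper, it assumes the date follows the prefix directly (Curator itself searches the name for the timestring), and the index names below are sample data modelled on the log output shown earlier.

```python
from datetime import datetime, timedelta


def select_indices(names, prefix, timestring, unit_count, now):
    """Return names that match the prefix AND whose embedded date is older than unit_count days."""
    cutoff = now - timedelta(days=unit_count)
    selected = []
    for name in names:
        # Filter 1: pattern / prefix
        if not name.startswith(prefix):
            continue
        # Filter 2: age, derived from the date in the index name
        suffix = name[len(prefix):].lstrip("-")
        try:
            index_time = datetime.strptime(suffix, timestring)
        except ValueError:
            continue  # no parsable date right after the prefix
        if index_time < cutoff:
            selected.append(name)
    return selected


names = ["log-2018.08.%02d" % day for day in range(23, 32)] + ["metrics-2018.08.01"]
now = datetime(2018, 8, 30, 12, 31, 26)
to_delete = select_indices(names, "log", "%Y.%m.%d", 3, now)
print(to_delete)
```

With the reference time set to 2018-08-30 12:31:26, the cutoff is 2018-08-27 12:31:26, so the indices for August 23 through 27 are selected and August 28 onward are kept. An action such as delete_indices would then be applied to the selected names only.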

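Complex requirements

Real production environments bring more complex requirements that simple combinations of actions and filters cannot cover. Curator also ships as a Python package, so our own scripts can call its actions and filters and save us development work.

This article used a practical scenario to introduce Curator, but it touched only a small part of what the tool can do. The official documentation describes many more ways to use it. I hope this is helpful!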

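Community Daily No. 379 (2018-08-30)

1. OTTO Motors: scaling an IoT environment with the Elastic Stack
http://t.cn/RF5H9OG
2. Building a MySQL slow-log collection platform with ELK, in detail
http://t.cn/Rk7zKT7
3. Choosing a technical architecture for a log analysis system at the hundred-billion-record scale
http://t.cn/RF5HEhS

Upcoming events:
1. The final round of early-bird tickets for the Elastic China Developer Conference is on sale
https://conf.elasticsearch.cn/2018/shenzhen.html
2. Registration is open for the Elastic Meetup in Beijing on September 8
https://elasticsearch.cn/article/759

Editor: 金桥
Archive: https://elasticsearch.cn/article/778
Subscribe: https://tinyletter.com/elastic-daily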

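Community Daily No. 378 (2018-08-29)

1. A talent search architecture based on Elasticsearch
http://t.cn/RKYPGL3
2. Understanding Elasticsearch in depth (series)
http://t.cn/RF2LdiG
http://t.cn/RF2zPF9
http://t.cn/RF2zABQ
http://t.cn/RF22Efd
3. Building a logging platform for microservices with ELK
http://t.cn/Rkb1wdM

Upcoming events:
1. The final round of early-bird tickets for the Elastic China Developer Conference is on sale
https://conf.elasticsearch.cn/2018/shenzhen.html
2. Registration is open for the Elastic Meetup in Beijing on September 8
https://elasticsearch.cn/article/759

Editor: 江水
Archive: https://elasticsearch.cn/article/777
Subscribe: https://tinyletter.com/elastic-daily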

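Elastic China Developer Conference 2018 Is Coming!

On October 5, Elastic was officially listed on the New York Stock Exchange under the ticker ESTC. The stock rose more than 100% on its first day, topping Alibaba's 2014 debut, and the opening price set a record.

Elastic's popularity means your Elastic skills are worth more, so the annual developer conference in China is not to be missed. Discounted tickets are available now at https://www.bagevent.com/event/1654662?discountCode=50OFF while they last!

Do you know ELK? Do you know Elasticsearch? It is among the most popular open source database and analytics software today. It has recently climbed to seventh place in the database rankings and has long held first place among search engines. To learn more about what it can do, come to its official user conference: the Elastic China Developer Conference, on Saturday, November 10, 2018, at the JW Marriott Hotel Shenzhen (深圳金茂 JW 万豪酒店). Twenty-five experts from Elastic, eBay, Blizzard, Grab, Huawei, Alibaba, SF Express and other companies will share their practical experience with Elastic's open source technology.

As the most popular data search and real-time analytics suite worldwide, the Elastic Stack has been downloaded more than 350 million times in total, and Elasticsearch is in use everywhere from top internet companies to traditional industries. Elastic's open source technology is favored by more and more developers and has become a leading choice among analytics tools for big data.

[Latest overall ranking from db-engines.com]

Elastic China Developer Conference 2018 (Elastic Developers China 2018) is the second developer conference Elastic has held in China. It centers on Elastic's open source products, Elasticsearch, Logstash, Kibana and Beats, and explores their practice and application in search, real-time data analytics, log analysis, security and other areas.

The conference aims to give Elastic developers across China a place to exchange ideas and learn from one another, to bring together success stories from across the industry, and to help the community and the industry move forward.

Whether you are new to Elasticsearch or a seasoned user, you should attend!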

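Conference highlights

01 Selected talks

A three-part take on operations analytics from Beats creator Monica Sarbu
The debut of the new product Codesearch (formerly Insight.io)
A case study of how Elastic uses Elastic products internally
Blizzard China on running games with the help of ELK
The migration story of the POI search platform at Grab, the Southeast Asian ride-hailing app
Lessons from the team with the most Kibana customization work in the Eastern Hemisphere
Elasticsearch in depth at the scale of hundreds of billions of records
Deep integration of Elasticsearch with AI, and a user profiling system

There are more talks from the internet, securities, express delivery, new retail, security and other sectors.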

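02 Selected speakers

03 Elastic AMA booth

AMA stands for Ask Me Anything. On the day of the conference you can ask Elastic staff any question on site, whether it is a technical query, an explanation of a feature, a best practice, or a business partnership. Don't miss the AMA booth.

04 Elastic Demo booth

If you want to know what Elastic can do for you, the most direct way is to visit the Elastic Demo booth, where Elastic engineers walk through a range of demos and show how to use the Elastic Stack for specific tasks. The Demo booth is fun, so remember to stop by.

05 Lightning talks

You can speak too. The last session of the conference is the lightning talks. Attendees can sign up on site, and each speaker gets 5 minutes on any related topic: a demo, a technical stand-up routine, or an open source project of your own. Slots are limited and given out first come, first served.

Finally, sign up now! https://www.bagevent.com/event/1654662?discountCode=50OFF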

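Still haven't mastered the Normalizer?

When handling string data in Elasticsearch, if we want to store the whole string as a single term, we usually set the field type to keyword. Sometimes this causes trouble: if the data was not cleaned properly at write time, the same value may be stored with inconsistent casing. For example, apple and Apple both mean apple, yet a search for apple does not return the Apple document. Solving this is the Normalizer's job. Enough talk, let's try it!

1. Hands-on

First, let's reproduce the problem described above: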

PUT test_normalizer
{
  "mappings": {
    "doc":{
      "properties": {
        "type":{
          "type":"keyword"
        }
      }
    }
  }
}

PUT test_normalizer/doc/1
{
  "type":"apple"
}

PUT test_normalizer/doc/2
{
  "type":"Apple"
}

# Query 1
GET test_normalizer/_search
{
  "query": {
    "match":{
      "type":"apple"
    }
  }
}

# Query 2
GET test_normalizer/_search
{
  "query": {
    "match":{
      "type":"aPple"
    }
  }
}

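After running these you will find that Query 1 returns document 1, while Query 2 returns nothing. The reason is as follows:

When the docs are written to Elasticsearch, type is a keyword field, so the analysis result is the original string.
At query time, the query text is analyzed with the same settings as the field by default, so it is also treated as a keyword and the result is again the original string.
The two sides are then compared, which produces the result above.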

2. Normalizer

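normalizer is a property of keyword fields that further processes the single term a keyword produces, for example lowercase, which converts it to lower case. Like a custom analyzer, it has to be defined explicitly, as shown below: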

DELETE test_normalizer
# Define a custom normalizer
PUT test_normalizer
{
  "settings": {
    "analysis": {
      "normalizer": {
        "lowercase": {
          "type": "custom",
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "type": {
          "type": "keyword"
        },
        "type_normalizer": {
          "type": "keyword",
          "normalizer": "lowercase"
        }
      }
    }
  }
}

PUT test_normalizer/doc/1
{
  "type": "apple",
  "type_normalizer": "apple"
}

PUT test_normalizer/doc/2
{
  "type": "Apple",
  "type_normalizer": "Apple"
}

# Query 3
GET test_normalizer/_search
{
  "query": {
    "term":{
      "type":"aPple"
    }
  }
}

# Query 4
GET test_normalizer/_search
{
  "query": {
    "term":{
      "type_normalizer":"aPple"
    }
  }
}

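First we define a custom normalizer named lowercase. Its filter setting works like the filter in a custom analyzer, but far fewer filter types are available; see the official documentation for details. We then attach it to the type_normalizer field through the normalizer property and index the same two documents. Running the queries, Query 3 returns nothing, while Query 4 returns both documents.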

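Problem solved! Here is how it works:

At write time, the normalizer lowercases every term.
At query time, the search text goes through the same normalizer, so the resulting term is also lower case.
The two sides are then compared, which produces the result above.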
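The same mechanism can be mimicked with a toy inverted index in Python. This is only a sketch of the idea, not how Elasticsearch is implemented; index_terms and term_query are hypothetical helpers, and str.lower stands in for the lowercase filter.

```python
def index_terms(docs, normalizer=None):
    """Build a tiny inverted index mapping each stored term to a list of doc ids."""
    index = {}
    for doc_id, value in docs.items():
        term = normalizer(value) if normalizer else value
        index.setdefault(term, []).append(doc_id)
    return index


def term_query(index, text, normalizer=None):
    """Look up the (optionally normalized) search text as a single exact term."""
    term = normalizer(text) if normalizer else text
    return index.get(term, [])


docs = {1: "apple", 2: "Apple"}
plain = index_terms(docs)                             # like the plain keyword field
normalized = index_terms(docs, normalizer=str.lower)  # like the field with a normalizer

print(term_query(plain, "aPple"))                  # []
print(term_query(normalized, "aPple", str.lower))  # [1, 2]
```

Without a normalizer the stored terms are apple and Apple, so aPple matches neither. With the normalizer applied on both the write side and the query side, everything collapses to apple and both documents match.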

3. Summary

This article used a worked example to show a practical use of the Normalizer. I hope it helps!

Community Daily No. 377 (2018-08-28)

1. Deploying a hot-warm logging cluster on Elasticsearch Service.
http://t.cn/RksKnp8
2. Trying out Elasticsearch 6.4, and a restless Redis.
http://t.cn/RksKey6
3. LEAISTIC: a microservice library for managing Elasticsearch.
http://t.cn/Rks9Pw7

Upcoming events:
1. The final round of early-bird tickets for the Elastic China Developer Conference is on sale
https://conf.elasticsearch.cn/2018/shenzhen.html
2. Registration is open for the Elastic Meetup in Beijing on September 8
https://elasticsearch.cn/article/759

Editor: 叮咚光军
Archive: https://elasticsearch.cn/article/774
Subscribe: https://tinyletter.com/elastic-daily

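Community Daily No. 376 (2018-08-27)

1. The official Elastic Korean analyzer
http://t.cn/RkdWBXP
2. (Proxy required) Running and scaling huge Elasticsearch clusters
http://t.cn/RkgA6dF
3. Learn Elastic Stack architecture best practices from members of the Elastic solutions architecture team
http://t.cn/RkdT4CC

Upcoming events:
1. Call for talks: Elastic Meetup in Beijing
https://elasticsearch.cn/article/759
2. Elastic China Developer Conference 2018 is now accepting talk proposals and sponsorships
https://conf.elasticsearch.cn/2018/shenzhen.html

Editor: cyberdak
Archive: https://elasticsearch.cn/article/773
Subscribe: https://tinyletter.com/elastic-daily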

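Community Daily No. 375 (2018-08-26)

1. Getting started with advanced search in Kibana.
http://t.cn/Rk1SYUC
2. (Proxy required) The four major NoSQL databases.
http://t.cn/Rk1anR2
3. (Proxy required) You must choose between "software delivered on time" and "good software".
http://t.cn/Rk1abPX

Upcoming events:
1. The final round of early-bird tickets for the Elastic China Developer Conference is on sale
https://conf.elasticsearch.cn/2018/shenzhen.html
2. Registration is open for the Elastic Meetup in Beijing on September 8
https://elasticsearch.cn/article/759

Editor: 至尊宝
Archive: https://elasticsearch.cn/article/772
Subscribe: https://tinyletter.com/elastic-daily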
