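The query is below. The data range is limited to two weeks, and I want each seven days of data to go into one bucket. However, the result contains three buckets.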
{
  "size": 0,
  "query": {
    "range": {
      "timestamp": {
        "time_zone": "+08:00",
        "gte": "now-13d/d",
        "lt": "now/d"
      }
    }
  },
  "aggs": {
    "secondAggs": {
      "date_histogram": {
        "field": "timestamp",
        "interval": "7d",
        "time_zone": "+08:00"
      }
    }
  }
}
result:
"aggregations" : {
  "secondAggs" : {
    "buckets" : [
      {
        "key_as_string" : "2018-05-17T00:00:00.000+08:00",
        "key" : 1526486400000,
        "doc_count" : 1035155
      },
      {
        "key_as_string" : "2018-05-24T00:00:00.000+08:00",
        "key" : 1527091200000,
        "doc_count" : 1370881
      },
      {
        "key_as_string" : "2018-05-31T00:00:00.000+08:00",
        "key" : 1527696000000,
        "doc_count" : 198188
      }
    ]
  }
}
The actual data range is 2018-05-19 00:00:00 ~ 2018-06-01 00:00:00. Shouldn't that split exactly into two buckets? Why are there three?
2 replies
strglee
Upvoted by: guoxiaoguo
https://www.elastic.co/guide/e ... nse_3
A multi-bucket aggregation similar to the histogram except it can only be applied on date values. Since dates are represented in Elasticsearch internally as long values, it is possible to use the normal histogram on dates as well, though accuracy will be compromised. The reason for this is in the fact that time based intervals are not fixed (think of leap years and on the number of days in a month). For this reason, we need special support for time based data. From a functionality perspective, this histogram supports the same features as the normal histogram. The main difference is that the interval can be specified by date/time expressions.
The documentation explains this at the start: Elasticsearch stores dates internally as long integers, that is, as timestamps.
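As a quick check of that statement, the sketch below converts the first bucket key from the result (epoch milliseconds) back into a date string in the +08:00 zone used by the query. The helper name `key_to_local` is made up for illustration and is not part of any Elasticsearch client.

```python
from datetime import datetime, timedelta, timezone

# Fixed +08:00 offset, matching the "time_zone" used in the query.
CST = timezone(timedelta(hours=8))

def key_to_local(key_ms: int) -> str:
    """Convert a bucket key (epoch milliseconds) to an ISO-8601 string in +08:00.

    Hypothetical helper for illustration only.
    """
    return datetime.fromtimestamp(key_ms / 1000, tz=CST).isoformat(timespec="milliseconds")

# First bucket key taken from the result in the question.
print(key_to_local(1526486400000))
```

The printed string should match the `key_as_string` of the first bucket in the result.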
So how is the bucket_key computed?
https://www.elastic.co/guide/e ... .html
A multi-bucket values source based aggregation that can be applied on numeric values extracted from the documents. It dynamically builds fixed size (a.k.a. interval) buckets over the values.
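The formula Elasticsearch uses to compute the bucket_key of a histogram bucket is:
bucket_key = Math.floor((value - offset) / interval) * interval + offset
The timestamp of 2018-05-19 00:00:00 (+08:00) is 1526659200. Plugging it into the formula with interval = 604800 (7 days in seconds) and offset = 0 gives 1526515200, which is 2018-05-17 00:00:00 (UTC); with time_zone +08:00 the key is reported as 2018-05-17T00:00:00.000+08:00, as in the result above. Buckets are therefore aligned to multiples of 7 days counted from the epoch, not to the start of the queried range, so data from 05-19 through 05-31 falls into the three buckets starting on 05-17, 05-24, and 05-31.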
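The arithmetic can be reproduced with a small Python sketch. It assumes the histogram formula `bucket_key = floor((value - offset) / interval) * interval + offset` with offset 0, and it assumes the `+08:00` time zone is handled by shifting to local time, rounding, and shifting back. That second assumption is not taken from the thread, but it reproduces the keys in the posted result. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

INTERVAL = 7 * 24 * 3600   # "7d" expressed in seconds
TZ_SHIFT = 8 * 3600        # +08:00 expressed in seconds
CST = timezone(timedelta(hours=8))

def bucket_key(value: int, interval: int = INTERVAL, offset: int = 0) -> int:
    """Histogram formula: floor((value - offset) / interval) * interval + offset."""
    return (value - offset) // interval * interval + offset

def bucket_key_local(value: int, tz_shift: int = TZ_SHIFT) -> int:
    """Assumed time zone handling: shift to local time, round, shift back."""
    return bucket_key(value + tz_shift) - tz_shift

# Data range from the question: [2018-05-19 00:00, 2018-06-01 00:00) in +08:00.
start = int(datetime(2018, 5, 19, tzinfo=CST).timestamp())
end = int(datetime(2018, 6, 1, tzinfo=CST).timestamp())

# One sample per day is enough to see which buckets the range touches.
keys = sorted({bucket_key_local(t) for t in range(start, end, 24 * 3600)})

print(start, bucket_key(start))   # timestamp and its UTC-aligned bucket key
for k in keys:
    print(k * 1000, datetime.fromtimestamp(k, tz=CST).isoformat())
```

Because the 7-day grid starts at the epoch rather than at the first document, a 13-day range that begins two days into a bucket spills over into a third one.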
zhongkouwei
Upvoted by: