When running only the query:
curl -XPOST 'http://localhost:9200/index1/type1/_search' -d '{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "date": {
              "gte": "2016-08-01 00:00:00",
              "lte": "2016-08-30 23:59:59",
              "format": "yyyy-MM-dd HH:mm:ss"
            }
          }
        },
        {
          "term": {
            "sc": "0"
          }
        },
        {
          "terms": {
            "channel": [
              ".4399sj.com" // channel list
            ]
          }
        }
      ]
    }
  }
}'
This took 332 ms and matched "hits":{"total":13775}.
With a grouping aggregation added:
curl -XPOST 'http://localhost:9200/index1/type1/_search' -d '{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "date": {
              "gte": "2016-08-01 00:00:00",
              "lte": "2016-08-30 23:59:59",
              "format": "yyyy-MM-dd HH:mm:ss"
            }
          }
        },
        {
          "term": {
            "sc": "0"
          }
        },
        {
          "terms": {
            "channel": [
              ".4399sj.com" // channel list
            ]
          }
        }
      ]
    }
  },
  "aggs": {
    "new_to_url": {
      "terms": {
        "field": "to_url"
      },
      "aggs": {
        "new_from_url": {
          "terms": {
            "field": "from_url"
          },
          "aggs": {
            "sum_m__visit": {
              "sum": {
                "field": "m_visit"
              }
            }
          }
        }
      }
    }
  }
}'
This query took 37217 ms (about 37 s); the hit count was 13755.
The mappings are:
{"index1":{"mappings":{"type1":{"_ttl":{"enabled":true,"default":63072000000},"properties":{"channel":{"type":"string","index":"not_analyzed"},"date":{"type":"date","format":"yyyy-MM-dd HH:mm:ss"},"from_url":{"type":"string","index":"not_analyzed"},"m_visit":{"type":"long","index":"no","doc_values":true},"sc":{"type":"string","index":"not_analyzed"},"to_url":{"type":"string","index":"not_analyzed"}}}}}}
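Why does a two-level grouping aggregation take so long when the query matches so few documents? The Elasticsearch version is 2.3, with 16 shards, 8 machines, and 1 replica.
Is the query written incorrectly, or is something else going on? What optimizations would improve aggregation performance?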
2 replies
kennywu76 - Wood
Upvoted by: linyongzhi, laoyang360, WisZhou, ezio_o, byx313
"aggs": {
  "new_to_url": {
    "terms": {
      "field": "to_url",
      "execution_hint": "map"
    },
    "aggs": {
      "new_from_url": {
        "terms": {
          "field": "from_url",
          "execution_hint": "map"
        },
        "aggs": {
          "sum_m__visit": {
            "sum": {
              "field": "m_visit"
            }
          }
        }
      }
    }
  }
}
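The reply gives only the aggs fragment. Below is a minimal sketch of how it might slot into the full request from the question. The helper names `build_body` and `run_search` and the `ES_URL` variable are hypothetical, introduced only for illustration; the endpoint, index, type, and field names are copied from the question, and the channel list is abbreviated to the single entry shown there. With "execution_hint": "map", the terms aggregation works directly on the values of the documents that match the query rather than going through global ordinals for the whole field, which tends to help when the query matches relatively few documents, as the roughly 13,000 hits here suggest.

```shell
#!/bin/sh
# Assumed endpoint; override ES_URL for a different cluster.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print the request body: the filters from the question plus both
# terms aggregations with "execution_hint": "map" from the reply.
build_body() {
cat <<'EOF'
{
  "query": {
    "bool": {
      "must": [
        { "range": { "date": {
            "gte": "2016-08-01 00:00:00",
            "lte": "2016-08-30 23:59:59",
            "format": "yyyy-MM-dd HH:mm:ss" } } },
        { "term": { "sc": "0" } },
        { "terms": { "channel": [ ".4399sj.com" ] } }
      ]
    }
  },
  "aggs": {
    "new_to_url": {
      "terms": { "field": "to_url", "execution_hint": "map" },
      "aggs": {
        "new_from_url": {
          "terms": { "field": "from_url", "execution_hint": "map" },
          "aggs": {
            "sum_m__visit": { "sum": { "field": "m_visit" } }
          }
        }
      }
    }
  }
}
EOF
}

# Send the search; defined here but not called automatically.
run_search() {
  curl -s -XPOST "$ES_URL/index1/type1/_search" -d "$(build_body)"
}
```

Calling `run_search` sends the request; comparing the "took" value in the response with and without the hint is one way to check whether it helps on a given cluster.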
ybtsdst - focus on lucene & es
Upvoted by:
By default, fielddata is built at query time, so the first few terms aggregations will be slow; once fielddata has been built on all nodes, performance improves.
fielddata docs: https://www.elastic.co/guide/e ... .html
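One way to check whether this explanation applies is to look at per-node fielddata memory before and after the first slow aggregation, using the `_cat/fielddata` API. The sketch below assumes the cluster address from the question; `fielddata_url`, `show_fielddata`, and `ES_URL` are hypothetical names used only for illustration. Note that in Elasticsearch 2.x, not_analyzed string fields use doc values by default, so the numbers reported for to_url and from_url may be small and may mostly reflect structures built lazily at query time.

```shell
#!/bin/sh
# Assumed endpoint; override ES_URL for a different cluster.
ES_URL="${ES_URL:-http://localhost:9200}"

# Build the _cat/fielddata URL for a comma-separated list of field names.
fielddata_url() {
  printf '%s/_cat/fielddata?v&fields=%s\n' "$ES_URL" "$1"
}

# Print per-node fielddata memory for the given fields; not called automatically.
show_fielddata() {
  curl -s "$(fielddata_url "$1")"
}
```

For example, `show_fielddata to_url,from_url` can be run once before the aggregation and again after it to see whether the reported sizes change.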