
Why does my Elasticsearch fuzzy query find nothing? I'm already using ngram

Elasticsearch | Author: yuechen323 | Posted: October 26, 2016 | Views: 7192

I want to implement fuzzy (substring) matching, equivalent to SQL's like '%keyword%', so I created the following index:
PUT myidx1
{
  "_all": {
    "enabled": false
  },
  "settings": {
    "analysis": {
      "tokenizer": {
        "my_ngram": {
          "type": "nGram",
          "min_gram": "1",
          "max_gram": "20",
          "token_chars": [
            "letter",
            "digit"
          ]
        }
      },
      "analyzer": {
        "mylike": {
          "tokenizer": "my_ngram",
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  },
  "mapping": {
    "mytype": {
      "dynamic": false,
      "properties": {
        "name": {
          "type": "string",
          "analyzer": "mylike"
        }
      }
    }
  }
}
Then I tested the analyzer:
POST myidx1/_analyze
{
  "analyzer": "mylike",
  "text": "文档3-aaa111"
}
The result looks fine:
{
  "tokens": [
    { "token": "文", "start_offset": 0, "end_offset": 1, "type": "word", "position": 0 },
    { "token": "文档", "start_offset": 0, "end_offset": 2, "type": "word", "position": 1 },
    { "token": "文档3", "start_offset": 0, "end_offset": 3, "type": "word", "position": 2 },
    { "token": "档", "start_offset": 1, "end_offset": 2, "type": "word", "position": 3 },
    { "token": "档3", "start_offset": 1, "end_offset": 3, "type": "word", "position": 4 },
    { "token": "3", "start_offset": 2, "end_offset": 3, "type": "word", "position": 5 },
    { "token": "a", "start_offset": 4, "end_offset": 5, "type": "word", "position": 6 },
    { "token": "aa", "start_offset": 4, "end_offset": 6, "type": "word", "position": 7 },
    { "token": "aaa", "start_offset": 4, "end_offset": 7, "type": "word", "position": 8 },
    { "token": "aaa1", "start_offset": 4, "end_offset": 8, "type": "word", "position": 9 },
    { "token": "aaa11", "start_offset": 4, "end_offset": 9, "type": "word", "position": 10 },
    { "token": "aaa111", "start_offset": 4, "end_offset": 10, "type": "word", "position": 11 },
    { "token": "a", "start_offset": 5, "end_offset": 6, "type": "word", "position": 12 },
    { "token": "aa", "start_offset": 5, "end_offset": 7, "type": "word", "position": 13 },
    { "token": "aa1", "start_offset": 5, "end_offset": 8, "type": "word", "position": 14 },
    { "token": "aa11", "start_offset": 5, "end_offset": 9, "type": "word", "position": 15 },
    { "token": "aa111", "start_offset": 5, "end_offset": 10, "type": "word", "position": 16 },
    { "token": "a", "start_offset": 6, "end_offset": 7, "type": "word", "position": 17 },
    { "token": "a1", "start_offset": 6, "end_offset": 8, "type": "word", "position": 18 },
    { "token": "a11", "start_offset": 6, "end_offset": 9, "type": "word", "position": 19 },
    { "token": "a111", "start_offset": 6, "end_offset": 10, "type": "word", "position": 20 },
    { "token": "1", "start_offset": 7, "end_offset": 8, "type": "word", "position": 21 },
    { "token": "11", "start_offset": 7, "end_offset": 9, "type": "word", "position": 22 },
    { "token": "111", "start_offset": 7, "end_offset": 10, "type": "word", "position": 23 },
    { "token": "1", "start_offset": 8, "end_offset": 9, "type": "word", "position": 24 },
    { "token": "11", "start_offset": 8, "end_offset": 10, "type": "word", "position": 25 },
    { "token": "1", "start_offset": 9, "end_offset": 10, "type": "word", "position": 26 }
  ]
}
All the n-grams come out as expected. Then I indexed a few documents:
POST myidx1/mytype/_bulk
{ "index": { "_id": 4 }}
{ "name": "文档3-aaa111" }
{ "index": { "_id": 5 }}
{ "name": "yyy111"}
{ "index": { "_id": 6 }}
{ "name": "yyy111"}
Then I tested a query, but it finds nothing. This is driving me crazy:
GET myidx1/mytype/_search
{
  "query": {
    "match": {
      "name": "1"
    }
  }
}
Do I really have to fall back to wildcard? I'm not ready to accept that.

medcl - Tonight we fight the tiger.

Upvoted by: yuechen323

Run
GET myidx1/_mapping
and check what it returns.
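
A likely explanation, offered as a sketch rather than a confirmed diagnosis: the create request in the question uses the key `mapping` where the create-index API expects `mappings`, and it places `_all` at the top level instead of inside the type. Assuming Elasticsearch 2.x (which the `string` field type suggests), the request was evidently accepted and the `settings` block applied, which is why `_analyze` with `mylike` works, but the `name` field would then be mapped dynamically with the default analyzer. In that case `yyy111` is indexed as a single token and a `match` on `1` finds nothing. If `GET myidx1/_mapping` shows no `"analyzer": "mylike"` on `name`, recreating the index along these lines should help:

```
# Assumption: Elasticsearch 2.x, and the index can be dropped and rebuilt.
DELETE myidx1

PUT myidx1
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "my_ngram": {
          "type": "nGram",
          "min_gram": "1",
          "max_gram": "20",
          "token_chars": ["letter", "digit"]
        }
      },
      "analyzer": {
        "mylike": {
          "tokenizer": "my_ngram",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "mytype": {
      "_all": { "enabled": false },
      "dynamic": false,
      "properties": {
        "name": {
          "type": "string",
          "analyzer": "mylike"
        }
      }
    }
  }
}
```

After recreating the index, re-run the bulk insert and the `match` query, and confirm with `GET myidx1/_mapping` that `name` now lists `mylike`. Note that with `mylike` also applied at search time, the query text is itself split into n-grams, so longer keywords may match more loosely than `like '%keyword%'` would.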
