The synonym entry is: 英文绘本,原版绘本,英语绘本,英文图画书
When I search for 英文绘本, the hit that comes back with highlighting is: 2-8岁英文原版绘本大推荐
The highlighted parts are: 英文原版绘本 推荐
The index configuration is:
PUT _template/template_default
{
  "index_patterns": ["*"],
  "order": 0,
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "ik_syno": {
            "filter": ["my_synonym_filter"],
            "tokenizer": "ik_max_word",
            "type": "custom"
          },
          "ik_syno_smart": {
            "filter": ["my_synonym_filter"],
            "tokenizer": "ik_smart",
            "type": "custom"
          }
        },
        "filter": {
          "my_synonym_filter": {
            "synonyms_path": "synonyms.txt",
            "type": "synonym"
          }
        }
      }
    }
  }
}
The analyzer test is:
POST /viw_experience/_analyze
{
  "analyzer": "ik_syno",
  "text": "2-8岁英文原版绘本大推荐"
}
The tokens returned are:
{
"tokens": [
{
"token": "2-8",
"start_offset": 0,
"end_offset": 3,
"type": "LETTER",
"position": 0
},
{
"token": "8岁",
"start_offset": 2,
"end_offset": 4,
"type": "CN_WORD",
"position": 1
},
{
"token": "英文原版绘本",
"start_offset": 4,
"end_offset": 10,
"type": "CN_WORD",
"position": 2
},
{
"token": "英文原版",
"start_offset": 4,
"end_offset": 8,
"type": "CN_WORD",
"position": 3
},
{
"token": "英文",
"start_offset": 4,
"end_offset": 6,
"type": "CN_WORD",
"position": 4
},
{
"token": "英语",
"start_offset": 4,
"end_offset": 6,
"type": "SYNONYM",
"position": 4
},
{
"token": "原版绘本",
"start_offset": 6,
"end_offset": 10,
"type": "CN_WORD",
"position": 5
},
{
"token": "英文绘本",
"start_offset": 6,
"end_offset": 10,
"type": "SYNONYM",
"position": 5
},
{
"token": "英语绘本",
"start_offset": 6,
"end_offset": 10,
"type": "SYNONYM",
"position": 5
},
{
"token": "英文",
"start_offset": 6,
"end_offset": 10,
"type": "SYNONYM",
"position": 5
},
{
"token": "原版",
"start_offset": 6,
"end_offset": 8,
"type": "CN_WORD",
"position": 6
},
{
"token": "英文绘",
"start_offset": 6,
"end_offset": 8,
"type": "SYNONYM",
"position": 6
},
{
"token": "英语",
"start_offset": 6,
"end_offset": 8,
"type": "SYNONYM",
"position": 6
},
{
"token": "图画书",
"start_offset": 6,
"end_offset": 8,
"type": "SYNONYM",
"position": 6
},
{
"token": "绘本",
"start_offset": 8,
"end_offset": 10,
"type": "CN_WORD",
"position": 7
},
{
"token": "英文",
"start_offset": 8,
"end_offset": 10,
"type": "SYNONYM",
"position": 7
},
{
"token": "绘本",
"start_offset": 8,
"end_offset": 10,
"type": "SYNONYM",
"position": 7
},
{
"token": "图画",
"start_offset": 8,
"end_offset": 10,
"type": "SYNONYM",
"position": 7
},
{
"token": "推荐",
"start_offset": 11,
"end_offset": 13,
"type": "CN_WORD",
"position": 8
},
{
"token": "绘本",
"start_offset": 11,
"end_offset": 13,
"type": "SYNONYM",
"position": 8
}
]
}
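To make the token list easier to read, here is a minimal Python sketch that maps each SYNONYM-typed token back to the slice of source text its offsets point at. The tuples are copied by hand from the output above; the names `TEXT`, `SYNONYM_TOKENS` and `synonym_spans` are made up for this illustration and are not part of Elasticsearch or IK.

```python
# Source text passed to _analyze in the request above.
TEXT = "2-8岁英文原版绘本大推荐"

# (token, start_offset, end_offset) for every token typed SYNONYM,
# copied by hand from the _analyze output above.
SYNONYM_TOKENS = [
    ("英语", 4, 6),
    ("英文绘本", 6, 10),
    ("英语绘本", 6, 10),
    ("英文", 6, 10),
    ("英文绘", 6, 8),
    ("英语", 6, 8),
    ("图画书", 6, 8),
    ("英文", 8, 10),
    ("绘本", 8, 10),
    ("图画", 8, 10),
    ("绘本", 11, 13),
]


def synonym_spans(text, tokens):
    """Pair each synonym token with the source text its offsets cover."""
    return [(tok, start, end, text[start:end]) for tok, start, end in tokens]


for tok, start, end, covered in synonym_spans(TEXT, SYNONYM_TOKENS):
    print(f"{tok} [{start}, {end}) covers source text {covered!r}")
```

Reading the pairs side by side shows which stretch of the original title each injected synonym is tied to, which is what a highlighter has to work with.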
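The last token is wrong:
{
  "token": "绘本",
  "start_offset": 11,
  "end_offset": 13,
  "type": "SYNONYM",
  "position": 8
}
The source text at offsets 11 to 13 is 推荐, not 绘本, so a synonym token for 绘本 is being attached to the span of 推荐. That lines up with the highlight above, where 推荐 is marked even though I only searched for 英文绘本.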
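The effect of that misplaced token on highlighting can be modelled roughly as below. This is a simplified sketch, not how Elasticsearch's highlighters are actually implemented: it collects the offsets of indexed tokens whose term matches a query term, merges touching or overlapping spans, and returns the covered text. `QUERY_TERMS` is an assumption about what the search for 英文绘本 expands to; the token tuples are copied from the `_analyze` output, and all function and variable names are hypothetical.

```python
TEXT = "2-8岁英文原版绘本大推荐"

# (term, start_offset, end_offset), copied from the _analyze output.
TOKENS = [
    ("2-8", 0, 3), ("8岁", 2, 4),
    ("英文原版绘本", 4, 10), ("英文原版", 4, 8),
    ("英文", 4, 6), ("英语", 4, 6),
    ("原版绘本", 6, 10), ("英文绘本", 6, 10),
    ("英语绘本", 6, 10), ("英文", 6, 10),
    ("原版", 6, 8), ("英文绘", 6, 8), ("英语", 6, 8), ("图画书", 6, 8),
    ("绘本", 8, 10), ("英文", 8, 10), ("绘本", 8, 10), ("图画", 8, 10),
    ("推荐", 11, 13), ("绘本", 11, 13),
]

# Assumption: terms the query 英文绘本 is analyzed into at search time.
QUERY_TERMS = {"英文绘本", "英文", "绘本"}


def highlighted_fragments(text, tokens, query_terms):
    """Merge the offset spans of matching tokens and return their text."""
    spans = sorted((s, e) for term, s, e in tokens if term in query_terms)
    merged = []
    for start, end in spans:
        if merged and start <= merged[-1][1]:
            # Touching or overlapping: extend the previous span.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return [text[s:e] for s, e in merged]


print(highlighted_fragments(TEXT, TOKENS, QUERY_TERMS))

# Same run without the misplaced ("绘本", 11, 13) token.
print(highlighted_fragments(TEXT, TOKENS[:-1], QUERY_TERMS))
```

Under these assumptions the first call yields 英文原版绘本 and 推荐, the same two fragments I see highlighted, and the second call yields only 英文原版绘本, which suggests the stray 绘本 token at offsets 11 to 13 is what pulls 推荐 into the highlight.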
1 reply
hezhiqiang