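Three highlighter types

    unified: the default highlighter, built on Lucene's Unified Highlighter
    plain: the standard Lucene highlighter; a good fit for simple matches on a single small field, but it re-analyzes each field of each document in memory, so it gets expensive on many or large fields
    fvh (fast vector highlighter): suited to large fields and more complex queries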
Specifying the highlighter type

    # type is one of: unified, plain, fvh
    # (fvh requires term_vector: with_positions_offsets on the field in the mapping)
    GET /person/_search
    {
      "query": {
        "match": {
          "name": "测试"
        }
      },
      "highlight": {
        "fields": {
          "name": {
            "type": "unified"
          }
        }
      }
    }

    # Create the mapping with term_vector set so the fast vector highlighter can be used
    PUT /person
    {
      "mappings": {
        "properties": {
          "name": {
            "analyzer": "ik_max_word",
            "type": "text",
            "term_vector": "with_positions_offsets"
          },
          "age": {
            "type": "long"
          },
          "des": {
            "analyzer": "ik_max_word",
            "type": "text"
          }
        }
      }
    }
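As a rough illustration of the request shape above, the sketch below builds the same search body in Python and rejects highlighter types other than the three listed. The helper name `build_highlight_query` and the constant `HIGHLIGHTER_TYPES` are assumptions made for this example, not part of any Elasticsearch client.

```python
from typing import Any, Dict

# The three highlighter types named in the text above.
HIGHLIGHTER_TYPES = {"unified", "plain", "fvh"}


def build_highlight_query(field: str, text: str, highlighter: str = "unified") -> Dict[str, Any]:
    """Build a match query on `field` that also highlights `field`.

    Hypothetical helper: it only assembles the JSON body, it does not
    talk to a cluster.
    """
    if highlighter not in HIGHLIGHTER_TYPES:
        raise ValueError(f"unknown highlighter type: {highlighter!r}")
    return {
        "query": {"match": {field: text}},
        "highlight": {"fields": {field: {"type": highlighter}}},
    }


# fvh only works if the mapping sets term_vector: with_positions_offsets.
body = build_highlight_query("name", "测试", "fvh")
```

The resulting dictionary can be serialized with `json.dumps` and sent as the body of `GET /person/_search`.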
Single-field highlighting
The default highlight tag is <em>; an individual field can override it with pre_tags and post_tags.

    GET /person/_search
    {
      "query": {
        "bool": {
          "should": [
            {
              "match": {
                "name": "测试"
              }
            },
            {
              "match": {
                "des": "测试"
              }
            }
          ]
        }
      },
      "highlight": {
        "fields": {
          "name": {
            "type": "fvh",
            "pre_tags": "<b>",
            "post_tags": "</b>"
          },
          "des": {}
        }
      }
    }
Global highlight settings
Settings placed directly under highlight apply to every field listed in fields.

    GET /person/_search
    {
      "query": {
        "bool": {
          "should": [
            {
              "match": {
                "name": "测试"
              }
            },
            {
              "match": {
                "des": "测试"
              }
            }
          ]
        }
      },
      "highlight": {
        "pre_tags": "<b>",
        "post_tags": "</b>",
        "fields": {
          "name": {},
          "des": {}
        }
      }
    }
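Global settings can still be overridden per field. The sketch below models that precedence in plain Python: defaults first, then global settings, then field-level settings. It is a simplified illustration, not Elasticsearch's own resolution code, and the names `DEFAULT_TAGS` and `effective_highlight_settings` are assumptions made for the example.

```python
from typing import Any, Dict

# Default tags, written as plain strings to match the examples in this post.
DEFAULT_TAGS = {"pre_tags": "<em>", "post_tags": "</em>"}


def effective_highlight_settings(highlight: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Return the settings each field ends up with.

    Precedence (lowest to highest): defaults, global settings, field settings.
    """
    global_settings = {k: v for k, v in highlight.items() if k != "fields"}
    resolved = {}
    for field, overrides in highlight.get("fields", {}).items():
        resolved[field] = {**DEFAULT_TAGS, **global_settings, **overrides}
    return resolved


highlight = {
    "pre_tags": "<b>",
    "post_tags": "</b>",
    "fields": {
        "name": {"pre_tags": "<i>", "post_tags": "</i>"},  # field-level override
        "des": {},                                           # inherits the global tags
    },
}
settings = effective_highlight_settings(highlight)
```

Here `name` ends up with the `<i>` tags while `des` falls back to the global `<b>` tags.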
2. Suggest: search suggestions
There are four suggesters: term suggester, phrase suggester, completion suggester, and context suggester.
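2.1 term suggester
Suggests replacements for each token of the input, based on edit distance and on how often candidate terms occur in the index.
Parameters

    text: the text the user searched for
    field: the field to take suggestions from
    analyzer: the analyzer applied to the text
    size: maximum number of suggestions returned per token
    sort: how suggestions are ordered; one of two values:
      - score: score first, then document frequency, then the term itself
      - frequency: document frequency first, then score, then the term itself
    max_edits: the maximum edit distance a candidate may have to count as a suggestion. Must be 1 or 2; any other value produces a bad request error. Defaults to 2
    prefix_length: the minimum number of leading characters that must match for a term to be a candidate
    min_doc_freq: the minimum document frequency a suggestion must have
    suggest_mode: which suggestions are returned; one of:
      - missing: only suggest for terms that are not in the index (a term that exists gets no suggestions)
      - popular: only suggest terms with a higher document frequency than the original term
      - always: suggest any matching term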
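To make the interaction of suggest_mode, max_edits and prefix_length concrete, here is a simplified model in Python. It is a sketch for illustration only and not how Elasticsearch implements the term suggester: it uses plain Levenshtein distance, always sorts by document frequency, and takes a hypothetical `doc_freq` dictionary in place of a real index.

```python
from typing import Dict, List


def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def suggest_terms(token: str, doc_freq: Dict[str, int], suggest_mode: str = "missing",
                  max_edits: int = 2, prefix_length: int = 1, size: int = 5) -> List[str]:
    """Toy term suggester over a {term: document frequency} dictionary."""
    if max_edits not in (1, 2):
        raise ValueError("max_edits must be 1 or 2")
    original_freq = doc_freq.get(token, 0)
    # missing: a token that already exists in the index gets no suggestions.
    if suggest_mode == "missing" and original_freq > 0:
        return []
    candidates = []
    for term, freq in doc_freq.items():
        if term == token:
            continue
        if term[:prefix_length] != token[:prefix_length]:
            continue
        if levenshtein(token, term) > max_edits:
            continue
        # popular: keep only terms that are more frequent than the original.
        if suggest_mode == "popular" and freq <= original_freq:
            continue
        candidates.append((term, freq))
    # Simplified "frequency" sort: document frequency first, then the term.
    candidates.sort(key=lambda c: (-c[1], c[0]))
    return [term for term, _ in candidates[:size]]


# Hypothetical document frequencies, made up for the example.
doc_freq = {"baoqiang": 5, "baoqian": 1, "baoqing": 2}
print(suggest_terms("baoqian", doc_freq, "missing"))
print(suggest_terms("baoqian", doc_freq, "popular"))
print(suggest_terms("baoqiang", doc_freq, "always"))
```

With these made-up frequencies, `baoqian` gets nothing in missing mode because it exists in the index, while popular mode offers only the more frequent neighbours.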
Suggest modes (default: missing)

    GET /news/_search
    {
      "suggest": {
        "missing_suggest": {
          "text": "baoqian baoqiang",
          "term": {
            "suggest_mode": "missing",
            "field": "title"
          }
        },
        "popular_suggest": {
          "text": "baoqian baoqiang",
          "term": {
            "suggest_mode": "popular",
            "field": "title"
          }
        },
        "always_suggest": {
          "text": "baoqian baoqiang",
          "term": {
            "suggest_mode": "always",
            "field": "title"
          }
        }
      }
    }
2.2 phrase suggester
Compared with the term suggester, the phrase suggester takes context into account, that is, the other tokens in the sentence, instead of matching each token purely by edit distance. It can pick better suggestions based on co-occurrence and frequency.

    GET /news/_search
    {
      "suggest": {
        "my-suggestion": {
          "text": "baoqing baoqiang",
          "phrase": {
            "field": "title",
            "size": 3,
            "highlight": {
              "pre_tag": "<h1>",
              "post_tag": "</h1>"
            },
            "direct_generator": [
              {
                "suggest_mode": "always",
                "field": "content"
              },
              {
                "suggest_mode": "popular",
                "field": "content"
              }
            ]
          }
        }
      }
    }
2.3 completion suggester (supports Chinese)
Autocomplete. It is served from in-memory data structures, so it is very fast, and it supports three kinds of queries: prefix, fuzzy, and regex.

    # Create the mapping and declare a sub-field of type completion
    PUT suggest_carinfo
    {
      "mappings": {
        "properties": {
          "title": {
            "type": "text",
            "analyzer": "ik_max_word",
            "fields": {
              "suggest": {
                "type": "completion",
                "analyzer": "ik_max_word"
              }
            }
          },
          "content": {
            "type": "text",
            "analyzer": "ik_max_word"
          }
        }
      }
    }
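prefix query
Prefix-based suggestions are the most commonly used kind of search suggestion. Matching on the prefix alone gives low recall.

    prefix: the text typed by the client
    field: the field holding the suggestions
    size: number of suggestions to return
    skip_duplicates: whether to filter out duplicate suggestions, default false

    GET suggest_carinfo/_search?pretty
    {
      "suggest": {
        "car_suggest": {
          "prefix": "A6",
          "completion": {
            "field": "title.suggest"
          }
        }
      }
    }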
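The suggestions come back under the `suggest` key of the response, one entry per named suggester, each carrying a list of `options`. The sketch below pulls the suggestion texts out of such a response. The function name `completion_texts` is an assumption for this example, and `sample_response` is a hand-written, trimmed-down stand-in rather than real cluster output.

```python
from typing import Any, Dict, List


def completion_texts(response: Dict[str, Any], suggest_name: str) -> List[str]:
    """Collect the `text` of every option returned for one named suggester."""
    texts = []
    for entry in response.get("suggest", {}).get(suggest_name, []):
        for option in entry.get("options", []):
            texts.append(option["text"])
    return texts


# Hand-written sample in the shape of a completion suggester response
# (the option texts are invented for illustration).
sample_response = {
    "suggest": {
        "car_suggest": [
            {
                "text": "A6",
                "offset": 0,
                "length": 2,
                "options": [
                    {"text": "A6L 2.0T", "_score": 1.0},
                    {"text": "A6 Avant", "_score": 1.0},
                ],
            }
        ]
    }
}

print(completion_texts(sample_response, "car_suggest"))
```

A missing suggester name simply yields an empty list, which keeps the calling code free of key checks.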
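fuzzy query

    fuzziness: the allowed edit distance, default AUTO
    transpositions: if true, swapping two adjacent characters counts as one change instead of two, default true
    min_length: minimum input length before fuzzy suggestions are returned, default 3
    prefix_length: minimum length of the input that is not checked for fuzzy alternatives, default 1
    unicode_aware: if true, all measurements (fuzzy edit distance, transpositions, lengths) are counted in Unicode code points instead of bytes. This is slightly slower than raw bytes, so it defaults to false
    skip_duplicates: whether to filter out duplicate suggestions, default false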
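The effect of transpositions and unicode_aware is easier to see with numbers. The following is a small, self-contained sketch: an edit distance that optionally counts an adjacent swap as one edit (the optimal string alignment variant, chosen here for illustration and not taken from Elasticsearch's source), plus a comparison of code-point length and UTF-8 byte length. The function name `edit_distance` is an assumption for the example.

```python
def edit_distance(a: str, b: str, transpositions: bool = True) -> int:
    """Edit distance; with transpositions=True an adjacent swap costs 1."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
            if (transpositions and i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # adjacent swap
    return d[rows - 1][cols - 1]


# "bwm" is "bmw" with two adjacent letters swapped.
with_swap = edit_distance("bmw", "bwm", transpositions=True)
without_swap = edit_distance("bmw", "bwm", transpositions=False)

# Code points versus UTF-8 bytes for a Chinese prefix.
prefix = "宝马5系"
code_points = len(prefix)
utf8_bytes = len(prefix.encode("utf-8"))
```

For Chinese input the gap between the two length measures is large, which is why unicode_aware matters for settings such as min_length and prefix_length.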
    POST suggest_carinfo/_search
    {
      "suggest": {
        "car_suggest": {
          "prefix": "宝马5系",
          "completion": {
            "field": "title.suggest",
            "skip_duplicates": true,
            "fuzzy": {
              "fuzziness": 2
            }
          }
        }
      }
    }
regex query
The prefix can also be expressed as a regular expression. This is not recommended.

    POST suggest_carinfo/_search
    {
      "suggest": {
        "car_suggest": {
          "regex": "[\\s\\S]*",
          "completion": {
            "field": "title.suggest"
          }
        }
      }
    }