TokenCountPayloadFilter
From JDONREF Wiki
Adds to each token's integer payload the number of tokens that carry the same payload within the same field.
Sample
For example, the document:
{ "fullName": "BOULEVARD|1 DE|1 PARIS|1 L|2 HOPITAL|2" }
indexed with a mapping like:
"fullName" : {"type": "string", "term_vector" : "with_positions_offsets_payloads", "index_analyzer":"myAnalyzer"}
and settings like:
{
  "index" : {
    "analysis" : {
      "analyzer" : {
        "myAnalyzer" : {
          "type" : "custom",
          "tokenizer" : "whitespace",
          "filter" : ["delimited_payload_filter", "lowercase", "tokencount_payload_filter"]
        }
      },
      "filter" : {
        "delimited_payload_filter" : { "type" : "delimited_payload_filter", "delimiter" : "|", "encoding" : "int" },
        "tokencount_payload_filter" : { "type" : "tokencountpayloads", "factor" : 1000 }
      }
    }
  }
}
will index the tokens boulevard, de, paris, l, hopital (lowercased by the lowercase filter) with the respective payloads: 3001, 3001, 3001, 2002, 2002.
- 3001 means there are 3 tokens with payload 1 (3 * factor + 1).
- 2002 means there are 2 tokens with payload 2 (2 * factor + 2).
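The arithmetic above can be reproduced outside the index. The following is a minimal Python sketch of the encoding only, not the filter's actual implementation; the function name `add_token_counts` is hypothetical, and it assumes every token in the field carries an integer payload.

```python
from collections import Counter


def add_token_counts(payloads, factor):
    """Return count * factor + payload for each integer payload of one field.

    `count` is the number of tokens in the field sharing that payload.
    Hypothetical helper, for illustration only.
    """
    counts = Counter(payloads)
    return [counts[p] * factor + p for p in payloads]


# Payloads parsed from "BOULEVARD|1 DE|1 PARIS|1 L|2 HOPITAL|2"
tokens = ["boulevard", "de", "paris", "l", "hopital"]
payloads = [1, 1, 1, 2, 2]

encoded = add_token_counts(payloads, factor=1000)
for token, value in zip(tokens, encoded):
    print(token, value)
```

With factor 1000 this yields 3001 for the three tokens with payload 1 and 2002 for the two tokens with payload 2, matching the sample.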
These factored payloads can be used with the All checker of PayloadCheckerSpanQuery.
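A consumer of these payloads needs to split a factored value back into its two parts. The sketch below shows that arithmetic only; it is not taken from PayloadCheckerSpanQuery, the name `split_payload` is hypothetical, and it assumes the original payload is non-negative and strictly smaller than the factor (otherwise the two parts overlap and cannot be separated).

```python
def split_payload(encoded, factor):
    """Split count * factor + payload into (count, payload).

    Assumes 0 <= payload < factor. Hypothetical helper, for illustration only.
    """
    count, payload = divmod(encoded, factor)
    return count, payload


count, payload = split_payload(3001, factor=1000)
print(count, payload)  # 3 1
```

Choosing a factor larger than the largest payload value keeps the count and the original payload recoverable.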
Features
Setting | Description |
factor | (Mandatory) The factor by which the number of tokens sharing a given payload is multiplied. |
ignored_types | (Default: none) Payloads of tokens with these types are left unmodified; payloads of all other tokens are modified. |