TokenCountPayloadFilter

Adds to each integer payload the count of tokens that share the same payload within the same field.

Sample

For example, the document:

 { "fullName": "BOULEVARD|1 DE|1 PARIS|1 L|2 HOPITAL|2" }

indexed with a mapping like:

 "fullName" : {"type": "string", "term_vector" : "with_positions_offsets_payloads", "index_analyzer":"myAnalyzer"}

and settings like:

 {
   "index" : {
       "analysis" : {
           "analyzer" : {
               "myAnalyzer" : {
                   "type" : "custom",
                   "tokenizer" : "whitespace",
                   "filter" : ["delimited_payload_filter", "lowercase", "tokencount_payload_filter"]
               }
           },
           "filter" : {
               "delimited_payload_filter" : {
                   "type" : "delimited_payload_filter",
                   "delimiter" : "|",
                   "encoding" : "int"
               },
               "tokencount_payload_filter" : {
                   "type" : "tokencountpayloads",
                   "factor" : 1000
               }
           }
       }
   }
 }

will index the tokens boulevard, de, paris, l, hopital (lowercased by the lowercase filter) with the respective payloads: 3001, 3001, 3001, 2002, 2002.

  • 3001 means there are 3 tokens with payload 1 (3 * factor + 1 = 3 * 1000 + 1).
  • 2002 means there are 2 tokens with payload 2 (2 * factor + 2 = 2 * 1000 + 2).
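
The arithmetic above can be sketched in a few lines of Python. This only illustrates the count * factor + payload formula from the sample; it is not the filter's actual implementation. The function name token_count_payloads and the way the TOKEN|payload text is parsed are assumptions made for the example.

```python
from collections import Counter


def token_count_payloads(text, factor, delimiter="|"):
    """Return (token, payload) pairs where payload = count * factor + original.

    Hypothetical helper: mimics whitespace tokenizing, the delimited payload
    step and lowercasing, then applies the formula from the sample.
    """
    pairs = []
    for item in text.split():
        token, _, raw = item.partition(delimiter)
        pairs.append((token.lower(), int(raw)))
    # Count how many tokens of the field carry each original payload.
    counts = Counter(payload for _, payload in pairs)
    return [(token, counts[payload] * factor + payload) for token, payload in pairs]


result = token_count_payloads("BOULEVARD|1 DE|1 PARIS|1 L|2 HOPITAL|2", factor=1000)
print(result)
```

With a factor of 1000, the three tokens carrying payload 1 end up with 3001 and the two tokens carrying payload 2 end up with 2002, matching the sample.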

These factored payloads can be used with the All checker of PayloadCheckerSpanQuery.

Features
Setting (default): description

  • factor (mandatory): The factor by which the count of tokens with a given payload is multiplied.
  • ignored_types (default: none): Payloads of tokens with these types are left unmodified; all other payloads are modified.
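
Assuming the payload is built as count * factor + original payload, as in the sample, the count and the original payload can only be separated again if the factor is larger than every original payload. The following is a minimal sketch of that reasoning; split_payload is a hypothetical helper, not part of the filter.

```python
def split_payload(payload, factor):
    """Split a factored payload into (count, original_payload).

    Hypothetical helper: assumes payload = count * factor + original_payload
    and that original_payload < factor.
    """
    return divmod(payload, factor)


print(split_payload(3001, 1000))  # (3, 1)
print(split_payload(2002, 1000))  # (2, 2)

# With a factor of 10 and an original payload of 12, a count of 3 gives
# 3 * 10 + 12 = 42, which splits as (4, 2): the two parts overlap.
print(split_payload(3 * 10 + 12, 10))  # (4, 2)
```

Choosing a factor such as 1000 leaves room for original payloads from 0 to 999.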