PayloadVersusTypeSpanQuery

From JDONREF Wiki

Ensures that all tokens with a given payload are matched when the document matches the specified type.


For example, a document of type "road":

 { "fullName": "BOULEVARD|1003 DE|1003 PARIS|1003 L|2002 HOPITAL|2002" }

where payload 1003 marks a token as one of 3 tokens with payload 1,

and payload 2002 marks a token as one of 2 tokens with payload 2,


indexed with a mapping like:

 "fullName" : {"type": "string", "term_vector" : "with_positions_offsets_payloads", "index_analyzer":"myAnalyzer"}

and settings like:

 {
   "index" : {
       "analysis" : {
           "analyzer" : {
               "myAnalyzer" : {
                   "type" : "custom",
                   "tokenizer" : "whitespace",
                   "filter" : ["delimited_payload_filter", "lowercase"]
               }
           },
           "filter" : {
               "delimited_payload_filter" : {
                   "type" : "delimited_payload_filter",
                   "delimiter" : "|",
                   "encoding" : "int"
               }
           }
       }
   }
 }

will not match the request:

 curl -XPOST 'http://localhost:9200/_search' -d '{
   "query": {
     "span_payloadversustype" : {
       "clauses" : [
          { "span_termvectormultipayloadterm" :{ "fullName" : "BOULEVARD"}} ,
          { "span_termvectormultipayloadterm" :{ "fullName" : "HOPITAL"}} ,
          { "span_termvectormultipayloadterm" :{ "fullName" : "PARIS"}}
       ]
     },
     "requiredpayloads" : [
        { "type": "road",
          "payloads": [ 1 ]
        }
     ],
     "termcountpayloadfactor" : 1000
   }
 }'

because the request misses the token 'DE', which has payload 1003 (meaning it is one of 3 tokens with payload 1).
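The rule illustrated by this example can be sketched in Python. This is only an illustration of the behaviour described on this page, not the plugin's actual implementation; the names `DOC_TOKENS` and `missing_tokens`, and the in-memory data layout, are hypothetical.

```python
# Hypothetical in-memory view of the example "road" document:
# (token, relative payload) pairs as described above.
DOC_TOKENS = [
    ("BOULEVARD", 1),
    ("DE", 1),
    ("PARIS", 1),
    ("L", 2),
    ("HOPITAL", 2),
]

def missing_tokens(doc_tokens, query_terms, required_payloads):
    """Return the document tokens that carry a required payload
    but are not covered by any query term."""
    wanted = set(query_terms)
    return [
        token
        for token, payload in doc_tokens
        if payload in required_payloads and token not in wanted
    ]

# Terms of the example request, with payload 1 required for type "road".
query_terms = ["BOULEVARD", "HOPITAL", "PARIS"]
missing = missing_tokens(DOC_TOKENS, query_terms, required_payloads={1})
print(missing)      # ['DE']
print(not missing)  # False: the document is rejected
```

Adding "DE" to the request terms would leave no payload-1 token uncovered, so under this sketch the document would be accepted.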

The "termcountpayloadfactor" is applied to the token payload to obtain:

  • the total number of tokens that share the same payload (e.g. 1003 % 1000 = 3);
  • the relative payload (e.g. 1003 / 1000 = 1, using integer division).
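The decoding above can be sketched in Python as follows. The helper names `decode_payload` and `payload_fully_matched` are hypothetical, and the completeness check is an assumption about how the decoded values might be compared, not the plugin's actual code.

```python
def decode_payload(raw_payload, factor=1000):
    """Split a raw token payload into (relative payload, token count)."""
    relative_payload, token_count = divmod(raw_payload, factor)
    return relative_payload, token_count

def payload_fully_matched(matched_raw_payloads, required_payload, factor=1000):
    """True if as many tokens with `required_payload` were matched as
    their raw payloads say exist in the document (assumed rule)."""
    counts = [
        count
        for relative, count in (decode_payload(p, factor) for p in matched_raw_payloads)
        if relative == required_payload
    ]
    # Every matched token of this payload announces the same expected count.
    return bool(counts) and len(counts) == counts[0]

print(decode_payload(1003))  # (1, 3)
print(decode_payload(2002))  # (2, 2)

# Raw payloads of the tokens matched by the example request:
# BOULEVARD (1003), HOPITAL (2002), PARIS (1003).
print(payload_fully_matched([1003, 2002, 1003], required_payload=1))  # False
```

Only two of the three expected payload-1 tokens are matched in the example, so the check fails; matching 'DE' as well would bring the count to three.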