Field data types - Percolator - Elasticsearch Guide v8.15

Percolator field type

The percolator field type parses a JSON structure into a native query and stores that query, so that the percolate query can use it to match provided documents.

Any field that contains a JSON object can be configured as a percolator field. The percolator field type has no settings. Configuring the percolator field type is enough to instruct Elasticsearch to treat the field as a query.

The following mapping configures the percolator field type for the query field:

Python

resp = client.indices.create(
   index="my-index-000001",
   mappings={
   "properties": {
   "query": {
   "type": "percolator"
   },
   "field": {
   "type": "text"
   }
   }
   },
)
print(resp)

Ruby

response = client.indices.create(
  index: 'my-index-000001',
  body: {
   mappings: {
   properties: {
   query: {
   type: 'percolator'
   },
   field: {
   type: 'text'
   }
   }
   }
  }
)
puts response

Js

const response = await client.indices.create({
  index: "my-index-000001",
  mappings: {
   properties: {
   query: {
   type: "percolator",
   },
   field: {
   type: "text",
   },
   },
  },
});
console.log(response);

Console

PUT my-index-000001
{
  "mappings": {
   "properties": {
   "query": {
   "type": "percolator"
   },
   "field": {
   "type": "text"
   }
   }
  }
}

You can then index a query:

Python

resp = client.index(
   index="my-index-000001",
   id="match_value",
   document={
   "query": {
   "match": {
   "field": "value"
   }
   }
   },
)
print(resp)

Ruby

response = client.index(
  index: 'my-index-000001',
  id: 'match_value',
  body: {
   query: {
   match: {
   field: 'value'
   }
   }
  }
)
puts response

Js

const response = await client.index({
  index: "my-index-000001",
  id: "match_value",
  document: {
   query: {
   match: {
   field: "value",
   },
   },
  },
});
console.log(response);

Console

PUT my-index-000001/_doc/match_value
{
  "query": {
   "match": {
   "field": "value"
   }
  }
}

Fields referred to in a percolator query must already exist in the mapping associated with the index used for percolation. To make sure these fields exist, add or update the mapping via the create index or update mapping API.
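
One way to catch a missing field before indexing a percolator query is to compare the fields the query references with the mapping properties. The sketch below is a hypothetical helper, not part of any Elasticsearch client. It assumes the mapping is available as a plain properties dictionary, and it only understands a few leaf queries (match, term, prefix, wildcard) plus bool clauses. It does not resolve multi-fields or nested object paths.

```python
# Hypothetical helper: the function names and the supported query subset
# are assumptions made for illustration.
LEAF_QUERIES = {"match", "term", "prefix", "wildcard"}
BOOL_CLAUSES = ("must", "should", "filter", "must_not")


def referenced_fields(query):
    """Collect field names used by a small subset of the query DSL."""
    fields = set()
    for query_type, body in query.items():
        if query_type in LEAF_QUERIES:
            # Leaf queries are keyed by field name: {"match": {"field": ...}}
            fields.update(body.keys())
        elif query_type == "bool":
            for clause in BOOL_CLAUSES:
                nested = body.get(clause, [])
                # A bool clause may hold one query or a list of queries.
                if isinstance(nested, dict):
                    nested = [nested]
                for sub_query in nested:
                    fields |= referenced_fields(sub_query)
    return fields


def unmapped_fields(query, properties):
    """Return referenced fields that are missing from the mapping properties."""
    return sorted(referenced_fields(query) - set(properties))


properties = {"query": {"type": "percolator"}, "field": {"type": "text"}}
print(unmapped_fields({"match": {"field": "value"}}, properties))
print(unmapped_fields({"match": {"other": "value"}}, properties))
```

An empty list means every referenced field is mapped. Any names returned would need to be added to the mapping before the query is indexed.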

Reindexing your percolator queries

Reindexing percolator queries is sometimes required to benefit from improvements made to the percolator field type in new releases.

Percolator queries can be reindexed with the reindex API. Consider the following index, which has a percolator field type:

Python

resp = client.indices.create(
   index="index",
   mappings={
   "properties": {
   "query": {
   "type": "percolator"
   },
   "body": {
   "type": "text"
   }
   }
   },
)
print(resp)
resp1 = client.indices.update_aliases(
   actions=[
   {
   "add": {
   "index": "index",
   "alias": "queries"
   }
   }
   ],
)
print(resp1)
resp2 = client.index(
   index="queries",
   id="1",
   refresh=True,
   document={
   "query": {
   "match": {
   "body": "quick brown fox"
   }
   }
   },
)
print(resp2)

Ruby

response = client.indices.create(
  index: 'index',
  body: {
   mappings: {
   properties: {
   query: {
   type: 'percolator'
   },
   body: {
   type: 'text'
   }
   }
   }
  }
)
puts response
response = client.indices.update_aliases(
  body: {
   actions: [
   {
   add: {
   index: 'index',
   alias: 'queries'
   }
   }
   ]
  }
)
puts response
response = client.index(
  index: 'queries',
  id: 1,
  refresh: true,
  body: {
   query: {
   match: {
   body: 'quick brown fox'
   }
   }
  }
)
puts response

Js

const response = await client.indices.create({
  index: "index",
  mappings: {
   properties: {
   query: {
   type: "percolator",
   },
   body: {
   type: "text",
   },
   },
  },
});
console.log(response);
const response1 = await client.indices.updateAliases({
  actions: [
   {
   add: {
   index: "index",
   alias: "queries",
   },
   },
  ],
});
console.log(response1);
const response2 = await client.index({
  index: "queries",
  id: 1,
  refresh: "true",
  document: {
   query: {
   match: {
   body: "quick brown fox",
   },
   },
  },
});
console.log(response2);

Console

PUT index
{
  "mappings": {
   "properties": {
   "query" : {
   "type" : "percolator"
   },
   "body" : {
   "type": "text"
   }
   }
  }
}
POST _aliases
{
  "actions": [
   {
   "add": {
   "index": "index",
   "alias": "queries"
   }
   }
  ]
}
PUT queries/_doc/1?refresh
{
  "query" : {
   "match" : {
   "body" : "quick brown fox"
   }
  }
}


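	It is always recommended to define an alias for your index, so that after a reindex, systems and applications do not need to be changed to know that the percolator queries now live in a different index.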

When upgrading to a new major version, you need to reindex your queries into a new index on the current Elasticsearch version so that the new Elasticsearch version can still read them:

Python

resp = client.indices.create(
   index="new_index",
   mappings={
   "properties": {
   "query": {
   "type": "percolator"
   },
   "body": {
   "type": "text"
   }
   }
   },
)
print(resp)
resp1 = client.reindex(
   refresh=True,
   source={
   "index": "index"
   },
   dest={
   "index": "new_index"
   },
)
print(resp1)
resp2 = client.indices.update_aliases(
   actions=[
   {
   "remove": {
   "index": "index",
   "alias": "queries"
   }
   },
   {
   "add": {
   "index": "new_index",
   "alias": "queries"
   }
   }
   ],
)
print(resp2)

Ruby

response = client.indices.create(
  index: 'new_index',
  body: {
   mappings: {
   properties: {
   query: {
   type: 'percolator'
   },
   body: {
   type: 'text'
   }
   }
   }
  }
)
puts response
response = client.reindex(
  refresh: true,
  body: {
   source: {
   index: 'index'
   },
   dest: {
   index: 'new_index'
   }
  }
)
puts response
response = client.indices.update_aliases(
  body: {
   actions: [
   {
   remove: {
   index: 'index',
   alias: 'queries'
   }
   },
   {
   add: {
   index: 'new_index',
   alias: 'queries'
   }
   }
   ]
  }
)
puts response

Js

const response = await client.indices.create({
  index: "new_index",
  mappings: {
   properties: {
   query: {
   type: "percolator",
   },
   body: {
   type: "text",
   },
   },
  },
});
console.log(response);
const response1 = await client.reindex({
  refresh: "true",
  source: {
   index: "index",
  },
  dest: {
   index: "new_index",
  },
});
console.log(response1);
const response2 = await client.indices.updateAliases({
  actions: [
   {
   remove: {
   index: "index",
   alias: "queries",
   },
   },
   {
   add: {
   index: "new_index",
   alias: "queries",
   },
   },
  ],
});
console.log(response2);

Console

PUT new_index
{
  "mappings": {
   "properties": {
   "query" : {
   "type" : "percolator"
   },
   "body" : {
   "type": "text"
   }
   }
  }
}
POST /_reindex?refresh
{
  "source": {
   "index": "index"
  },
  "dest": {
   "index": "new_index"
  }
}
POST _aliases
{
  "actions": [
   {
   "remove": {
   "index" : "index",
   "alias": "queries"
   }
   },
   {
   "add": {
   "index": "new_index",
   "alias": "queries"
   }
   }
  ]
}


	If you have an alias, don't forget to point it to the new index.
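
The alias switch shown above can be wrapped in a small builder so the remove and add actions always travel in the same request. This is a minimal sketch; the function name alias_swap_actions is hypothetical, and the commented client call mirrors the update_aliases usage from the Python example above.

```python
def alias_swap_actions(alias, old_index, new_index):
    """Build an _aliases request body that moves an alias in a single request."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


body = alias_swap_actions("queries", "index", "new_index")
print(body)

# With the Python client this could be sent as (not executed here):
# client.indices.update_aliases(actions=body["actions"])
```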

Executing the percolate query via the queries alias:

Python

resp = client.search(
   index="queries",
   query={
   "percolate": {
   "field": "query",
   "document": {
   "body": "fox jumps over the lazy dog"
   }
   }
   },
)
print(resp)

Ruby

response = client.search(
  index: 'queries',
  body: {
   query: {
   percolate: {
   field: 'query',
   document: {
   body: 'fox jumps over the lazy dog'
   }
   }
   }
  }
)
puts response

Js

const response = await client.search({
  index: "queries",
  query: {
   percolate: {
   field: "query",
   document: {
   body: "fox jumps over the lazy dog",
   },
   },
  },
});
console.log(response);

Console

GET /queries/_search
{
  "query": {
   "percolate" : {
   "field" : "query",
   "document" : {
   "body" : "fox jumps over the lazy dog"
   }
   }
  }
}

now returns matches from the new index:

Console-Result

{
  "took": 3,
  "timed_out": false,
  "_shards": {
   "total": 1,
   "successful": 1,
   "skipped" : 0,
   "failed": 0
  },
  "hits": {
   "total" : {
   "value": 1,
   "relation": "eq"
   },
   "max_score": 0.13076457,
   "hits": [
   {
   "_index": "new_index",
   "_id": "1",
   "_score": 0.13076457,
   "_source": {
   "query": {
   "match": {
   "body": "quick brown fox"
   }
   }
   },
   "fields" : {
   "_percolator_document_slot" : [0]
   }
   }
   ]
  }
}


	The percolator query hit is now served from the new index.
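
Applications usually only need to know which stored queries matched. The sketch below pulls the _id and the _percolator_document_slot values out of a response shaped like the one above. The helper name matched_queries is hypothetical, and it assumes the response can be read like a dictionary.

```python
def matched_queries(response):
    """Map each matching percolator query _id to its document slots."""
    matches = {}
    for hit in response["hits"]["hits"]:
        slots = hit.get("fields", {}).get("_percolator_document_slot", [])
        matches[hit["_id"]] = slots
    return matches


# Trimmed copy of the response shown above.
response = {
    "hits": {
        "total": {"value": 1, "relation": "eq"},
        "hits": [
            {
                "_index": "new_index",
                "_id": "1",
                "fields": {"_percolator_document_slot": [0]},
            }
        ],
    }
}
print(matched_queries(response))
```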

Optimizing query time text analysis

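When the percolator verifies a candidate match, it parses the percolator query, performs query time text analysis, and actually runs the query against the document being percolated. This happens for each candidate match, every time the percolate query executes. If query time text analysis is a relatively expensive part of query parsing, it can become the dominant cost of percolating. This parsing overhead becomes noticeable when the percolator ends up verifying many candidate percolator query matches.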

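To avoid the most expensive part of text analysis at percolate time, you can do that part when indexing the percolator query instead. This requires two different analyzers. The first analyzer performs the text analysis that actually needs to happen (the expensive part). The second analyzer (usually whitespace) only splits the tokens that the first analyzer produced. Before indexing a percolator query, use the analyze API to analyze the query text with the more expensive analyzer. The tokens returned by the analyze API then replace the original query text in the percolator query. It is important that the query is configured to override the analyzer from the mapping and use only the second analyzer. Most text-based queries support an analyzer option (match, query_string, simple_query_string). With this approach the expensive text analysis is performed once instead of many times.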

Let us demonstrate this workflow with a simplified example.

Say we want to index the following percolator query:

Js

{
  "query" : {
   "match" : {
   "body" : {
   "query" : "missing bicycles"
   }
   }
  }
}

with these settings and mapping:

Python

resp = client.indices.create(
   index="test_index",
   settings={
   "analysis": {
   "analyzer": {
   "my_analyzer": {
   "tokenizer": "standard",
   "filter": [
   "lowercase",
   "porter_stem"
   ]
   }
   }
   }
   },
   mappings={
   "properties": {
   "query": {
   "type": "percolator"
   },
   "body": {
   "type": "text",
   "analyzer": "my_analyzer"
   }
   }
   },
)
print(resp)

Ruby

response = client.indices.create(
  index: 'test_index',
  body: {
   settings: {
   analysis: {
   analyzer: {
   my_analyzer: {
   tokenizer: 'standard',
   filter: [
   'lowercase',
   'porter_stem'
   ]
   }
   }
   }
   },
   mappings: {
   properties: {
   query: {
   type: 'percolator'
   },
   body: {
   type: 'text',
   analyzer: 'my_analyzer'
   }
   }
   }
  }
)
puts response

Js

const response = await client.indices.create({
  index: "test_index",
  settings: {
   analysis: {
   analyzer: {
   my_analyzer: {
   tokenizer: "standard",
   filter: ["lowercase", "porter_stem"],
   },
   },
   },
  },
  mappings: {
   properties: {
   query: {
   type: "percolator",
   },
   body: {
   type: "text",
   analyzer: "my_analyzer",
   },
   },
  },
});
console.log(response);

Console

PUT /test_index
{
  "settings": {
   "analysis": {
   "analyzer": {
   "my_analyzer" : {
   "tokenizer": "standard",
   "filter" : ["lowercase", "porter_stem"]
   }
   }
   }
  },
  "mappings": {
   "properties": {
   "query" : {
   "type": "percolator"
   },
   "body" : {
   "type": "text",
   "analyzer": "my_analyzer"
   }
   }
  }
}


	For the purpose of this example, this analyzer is considered expensive.

First, use the analyze API to perform the text analysis before indexing:

Python

resp = client.indices.analyze(
   index="test_index",
   analyzer="my_analyzer",
   text="missing bicycles",
)
print(resp)

Ruby

response = client.indices.analyze(
  index: 'test_index',
  body: {
   analyzer: 'my_analyzer',
   text: 'missing bicycles'
  }
)
puts response

Js

const response = await client.indices.analyze({
  index: "test_index",
  analyzer: "my_analyzer",
  text: "missing bicycles",
});
console.log(response);

Console

POST /test_index/_analyze
{
  "analyzer" : "my_analyzer",
  "text" : "missing bicycles"
}

This results in the following response:

Console-Result

{
  "tokens": [
   {
   "token": "miss",
   "start_offset": 0,
   "end_offset": 7,
   "type": "<ALPHANUM>",
   "position": 0
   },
   {
   "token": "bicycl",
   "start_offset": 8,
   "end_offset": 16,
   "type": "<ALPHANUM>",
   "position": 1
   }
  ]
}

All the tokens, in the order returned, need to replace the query text in the percolator query:

Python

resp = client.index(
   index="test_index",
   id="1",
   refresh=True,
   document={
   "query": {
   "match": {
   "body": {
   "query": "miss bicycl",
   "analyzer": "whitespace"
   }
   }
   }
   },
)
print(resp)

Ruby

response = client.index(
  index: 'test_index',
  id: 1,
  refresh: true,
  body: {
   query: {
   match: {
   body: {
   query: 'miss bicycl',
   analyzer: 'whitespace'
   }
   }
   }
  }
)
puts response

Js

const response = await client.index({
  index: "test_index",
  id: 1,
  refresh: "true",
  document: {
   query: {
   match: {
   body: {
   query: "miss bicycl",
   analyzer: "whitespace",
   },
   },
   },
  },
});
console.log(response);

Console

PUT /test_index/_doc/1?refresh
{
  "query" : {
   "match" : {
   "body" : {
   "query" : "miss bicycl",
   "analyzer" : "whitespace"
   }
   }
  }
}

It is important to select the whitespace analyzer here. Otherwise the analyzer defined in the mapping is used, which defeats the point of this workflow. Note that whitespace is a built-in analyzer. If a different analyzer is needed, it must first be configured in the index settings.

The analyze API step must be run for each percolator query before that query is indexed.
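
Because this step repeats for every percolator query, it can help to turn the analyze API response into the rewritten query programmatically. The sketch below does only that transformation. The helper name pre_analyzed_match is hypothetical, and it assumes the response is a dictionary shaped like the one shown above.

```python
def pre_analyzed_match(field, analyze_response, analyzer="whitespace"):
    """Build a percolator document whose match query holds pre-analyzed tokens."""
    # Join the tokens in the order the analyze API returned them.
    text = " ".join(token["token"] for token in analyze_response["tokens"])
    return {"query": {"match": {field: {"query": text, "analyzer": analyzer}}}}


# Copy of the analyze API response shown above.
analyze_response = {
    "tokens": [
        {"token": "miss", "start_offset": 0, "end_offset": 7,
         "type": "<ALPHANUM>", "position": 0},
        {"token": "bicycl", "start_offset": 8, "end_offset": 16,
         "type": "<ALPHANUM>", "position": 1},
    ]
}
document = pre_analyzed_match("body", analyze_response)
print(document)
```

The resulting dictionary has the same shape as the document indexed in the example above, so it could be passed as the document argument of client.index.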

At percolate time nothing changes, and the percolate query can be defined as usual:

Python

resp = client.search(
   index="test_index",
   query={
   "percolate": {
   "field": "query",
   "document": {
   "body": "Bycicles are missing"
   }
   }
   },
)
print(resp)

Ruby

response = client.search(
  index: 'test_index',
  body: {
   query: {
   percolate: {
   field: 'query',
   document: {
   body: 'Bycicles are missing'
   }
   }
   }
  }
)
puts response

Js

const response = await client.search({
  index: "test_index",
  query: {
   percolate: {
   field: "query",
   document: {
   body: "Bycicles are missing",
   },
   },
  },
});
console.log(response);

Console

GET /test_index/_search
{
  "query": {
   "percolate" : {
   "field" : "query",
   "document" : {
   "body" : "Bycicles are missing"
   }
   }
  }
}

This results in a response like this:

Console-Result

{
  "took": 6,
  "timed_out": false,
  "_shards": {
   "total": 1,
   "successful": 1,
   "skipped" : 0,
   "failed": 0
  },
  "hits": {
   "total" : {
   "value": 1,
   "relation": "eq"
   },
   "max_score": 0.13076457,
   "hits": [
   {
   "_index": "test_index",
   "_id": "1",
   "_score": 0.13076457,
   "_source": {
   "query": {
   "match": {
   "body": {
   "query": "miss bicycl",
   "analyzer": "whitespace"
   }
   }
   }
   },
   "fields" : {
   "_percolator_document_slot" : [0]
   }
   }
   ]
  }
}

Optimizing wildcard queries.

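Wildcard queries are more expensive than other queries for the percolator, especially when the wildcard expressions are large.

For wildcard queries with prefix wildcard expressions, or for plain prefix queries, the edge_ngram token filter can be used to replace these queries with a regular term query on a field that has the edge_ngram token filter configured.

Create an index with custom analysis settings: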

Python

resp = client.indices.create(
   index="my_queries1",
   settings={
   "analysis": {
   "analyzer": {
   "wildcard_prefix": {
   "type": "custom",
   "tokenizer": "standard",
   "filter": [
   "lowercase",
   "wildcard_edge_ngram"
   ]
   }
   },
   "filter": {
   "wildcard_edge_ngram": {
   "type": "edge_ngram",
   "min_gram": 1,
   "max_gram": 32
   }
   }
   }
   },
   mappings={
   "properties": {
   "query": {
   "type": "percolator"
   },
   "my_field": {
   "type": "text",
   "fields": {
   "prefix": {
   "type": "text",
   "analyzer": "wildcard_prefix",
   "search_analyzer": "standard"
   }
   }
   }
   }
   },
)
print(resp)

Ruby

response = client.indices.create(
  index: 'my_queries1',
  body: {
   settings: {
   analysis: {
   analyzer: {
   wildcard_prefix: {
   type: 'custom',
   tokenizer: 'standard',
   filter: [
   'lowercase',
   'wildcard_edge_ngram'
   ]
   }
   },
   filter: {
   wildcard_edge_ngram: {
   type: 'edge_ngram',
   min_gram: 1,
   max_gram: 32
   }
   }
   }
   },
   mappings: {
   properties: {
   query: {
   type: 'percolator'
   },
   my_field: {
   type: 'text',
   fields: {
   prefix: {
   type: 'text',
   analyzer: 'wildcard_prefix',
   search_analyzer: 'standard'
   }
   }
   }
   }
   }
  }
)
puts response

Js

const response = await client.indices.create({
  index: "my_queries1",
  settings: {
   analysis: {
   analyzer: {
   wildcard_prefix: {
   type: "custom",
   tokenizer: "standard",
   filter: ["lowercase", "wildcard_edge_ngram"],
   },
   },
   filter: {
   wildcard_edge_ngram: {
   type: "edge_ngram",
   min_gram: 1,
   max_gram: 32,
   },
   },
   },
  },
  mappings: {
   properties: {
   query: {
   type: "percolator",
   },
   my_field: {
   type: "text",
   fields: {
   prefix: {
   type: "text",
   analyzer: "wildcard_prefix",
   search_analyzer: "standard",
   },
   },
   },
   },
  },
});
console.log(response);

Console

PUT my_queries1
{
  "settings": {
   "analysis": {
   "analyzer": {
   "wildcard_prefix": {
   "type": "custom",
   "tokenizer": "standard",
   "filter": [
   "lowercase",
   "wildcard_edge_ngram"
   ]
   }
   },
   "filter": {
   "wildcard_edge_ngram": {
   "type": "edge_ngram",
   "min_gram": 1,
   "max_gram": 32
   }
   }
   }
  },
  "mappings": {
   "properties": {
   "query": {
   "type": "percolator"
   },
   "my_field": {
   "type": "text",
   "fields": {
   "prefix": {
   "type": "text",
   "analyzer": "wildcard_prefix",
   "search_analyzer": "standard"
   }
   }
   }
   }
  }
}


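	The analyzer that generates the prefix tokens, used at index time only.
	Increase `min_gram` and decrease `max_gram` based on your prefix search needs.
	This multi-field should be used for prefix searches with a `term` or `match` query instead of a `prefix` or `wildcard` query.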

Then, instead of indexing the following query:

Js

{
  "query": {
   "wildcard": {
   "my_field": "abc*"
   }
  }
}

the query below should be indexed:

Python

resp = client.index(
   index="my_queries1",
   id="1",
   refresh=True,
   document={
   "query": {
   "term": {
   "my_field.prefix": "abc"
   }
   }
   },
)
print(resp)

Ruby

response = client.index(
  index: 'my_queries1',
  id: 1,
  refresh: true,
  body: {
   query: {
   term: {
   'my_field.prefix' => 'abc'
   }
   }
  }
)
puts response

Js

const response = await client.index({
  index: "my_queries1",
  id: 1,
  refresh: "true",
  document: {
   query: {
   term: {
   "my_field.prefix": "abc",
   },
   },
  },
});
console.log(response);

Console

PUT /my_queries1/_doc/1?refresh
{
  "query": {
   "term": {
   "my_field.prefix": "abc"
   }
  }
}

This way the percolator can handle the second query more efficiently than the first.
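
To see why the term query can stand in for the prefix wildcard, it helps to look at the tokens an edge n-gram filter produces. The sketch below is a plain Python approximation of that filter for a single, already lowercased token. It is not the Elasticsearch implementation, and the function name edge_ngrams is hypothetical.

```python
def edge_ngrams(token, min_gram=1, max_gram=32):
    """Return the leading substrings an edge n-gram filter would emit for one token."""
    longest = min(len(token), max_gram)
    return [token[:size] for size in range(min_gram, longest + 1)]


# Index-time tokens for the percolated document value "abcd".
indexed_tokens = edge_ngrams("abcd")
print(indexed_tokens)

# The term query looks up the exact token "abc"; no wildcard expansion is needed.
print("abc" in indexed_tokens)
```

Every prefix of the value is indexed as its own token, so the prefix check becomes an exact token lookup.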

The following search request matches the previously indexed percolator query:

Python

resp = client.search(
   index="my_queries1",
   query={
   "percolate": {
   "field": "query",
   "document": {
   "my_field": "abcd"
   }
   }
   },
)
print(resp)

Ruby

response = client.search(
  index: 'my_queries1',
  body: {
   query: {
   percolate: {
   field: 'query',
   document: {
   my_field: 'abcd'
   }
   }
   }
  }
)
puts response

Js

const response = await client.search({
  index: "my_queries1",
  query: {
   percolate: {
   field: "query",
   document: {
   my_field: "abcd",
   },
   },
  },
});
console.log(response);

Console

GET /my_queries1/_search
{
  "query": {
   "percolate": {
   "field": "query",
   "document": {
   "my_field": "abcd"
   }
   }
  }
}

Console-Result

{
  "took": 6,
  "timed_out": false,
  "_shards": {
   "total": 1,
   "successful": 1,
   "skipped": 0,
   "failed": 0
  },
  "hits": {
   "total" : {
   "value": 1,
   "relation": "eq"
   },
   "max_score": 0.18864399,
   "hits": [
   {
   "_index": "my_queries1",
   "_id": "1",
   "_score": 0.18864399,
   "_source": {
   "query": {
   "term": {
   "my_field.prefix": "abc"
   }
   }
   },
   "fields": {
   "_percolator_document_slot": [
   0
   ]
   }
   }
   ]
  }
}

The same technique can also be used to speed up suffix wildcard searches, by using the reverse token filter before the edge_ngram token filter.

Python

resp = client.indices.create(
   index="my_queries2",
   settings={
   "analysis": {
   "analyzer": {
   "wildcard_suffix": {
   "type": "custom",
   "tokenizer": "standard",
   "filter": [
   "lowercase",
   "reverse",
   "wildcard_edge_ngram"
   ]
   },
   "wildcard_suffix_search_time": {
   "type": "custom",
   "tokenizer": "standard",
   "filter": [
   "lowercase",
   "reverse"
   ]
   }
   },
   "filter": {
   "wildcard_edge_ngram": {
   "type": "edge_ngram",
   "min_gram": 1,
   "max_gram": 32
   }
   }
   }
   },
   mappings={
   "properties": {
   "query": {
   "type": "percolator"
   },
   "my_field": {
   "type": "text",
   "fields": {
   "suffix": {
   "type": "text",
   "analyzer": "wildcard_suffix",
   "search_analyzer": "wildcard_suffix_search_time"
   }
   }
   }
   }
   },
)
print(resp)

Ruby

response = client.indices.create(
  index: 'my_queries2',
  body: {
   settings: {
   analysis: {
   analyzer: {
   wildcard_suffix: {
   type: 'custom',
   tokenizer: 'standard',
   filter: [
   'lowercase',
   'reverse',
   'wildcard_edge_ngram'
   ]
   },
   wildcard_suffix_search_time: {
   type: 'custom',
   tokenizer: 'standard',
   filter: [
   'lowercase',
   'reverse'
   ]
   }
   },
   filter: {
   wildcard_edge_ngram: {
   type: 'edge_ngram',
   min_gram: 1,
   max_gram: 32
   }
   }
   }
   },
   mappings: {
   properties: {
   query: {
   type: 'percolator'
   },
   my_field: {
   type: 'text',
   fields: {
   suffix: {
   type: 'text',
   analyzer: 'wildcard_suffix',
   search_analyzer: 'wildcard_suffix_search_time'
   }
   }
   }
   }
   }
  }
)
puts response

Js

const response = await client.indices.create({
  index: "my_queries2",
  settings: {
   analysis: {
   analyzer: {
   wildcard_suffix: {
   type: "custom",
   tokenizer: "standard",
   filter: ["lowercase", "reverse", "wildcard_edge_ngram"],
   },
   wildcard_suffix_search_time: {
   type: "custom",
   tokenizer: "standard",
   filter: ["lowercase", "reverse"],
   },
   },
   filter: {
   wildcard_edge_ngram: {
   type: "edge_ngram",
   min_gram: 1,
   max_gram: 32,
   },
   },
   },
  },
  mappings: {
   properties: {
   query: {
   type: "percolator",
   },
   my_field: {
   type: "text",
   fields: {
   suffix: {
   type: "text",
   analyzer: "wildcard_suffix",
   search_analyzer: "wildcard_suffix_search_time",
   },
   },
   },
   },
  },
});
console.log(response);

Console

PUT my_queries2
{
  "settings": {
   "analysis": {
   "analyzer": {
   "wildcard_suffix": {
   "type": "custom",
   "tokenizer": "standard",
   "filter": [
   "lowercase",
   "reverse",
   "wildcard_edge_ngram"
   ]
   },
   "wildcard_suffix_search_time": {
   "type": "custom",
   "tokenizer": "standard",
   "filter": [
   "lowercase",
   "reverse"
   ]
   }
   },
   "filter": {
   "wildcard_edge_ngram": {
   "type": "edge_ngram",
   "min_gram": 1,
   "max_gram": 32
   }
   }
   }
  },
  "mappings": {
   "properties": {
   "query": {
   "type": "percolator"
   },
   "my_field": {
   "type": "text",
   "fields": {
   "suffix": {
   "type": "text",
   "analyzer": "wildcard_suffix",
   "search_analyzer": "wildcard_suffix_search_time"
   }
   }
   }
   }
  }
}


	A custom analyzer is needed at search time too. Otherwise the query terms are not reversed and would not match the reversed suffix tokens.

Then, instead of indexing the following query:

Js

{
  "query": {
   "wildcard": {
   "my_field": "*xyz"
   }
  }
}

the following query below should be indexed:

Python

resp = client.index(
   index="my_queries2",
   id="2",
   refresh=True,
   document={
   "query": {
   "match": {
   "my_field.suffix": "xyz"
   }
   }
   },
)
print(resp)

Ruby

response = client.index(
  index: 'my_queries2',
  id: 2,
  refresh: true,
  body: {
   query: {
   match: {
   'my_field.suffix' => 'xyz'
   }
   }
  }
)
puts response

Js

const response = await client.index({
  index: "my_queries2",
  id: 2,
  refresh: "true",
  document: {
   query: {
   match: {
   "my_field.suffix": "xyz",
   },
   },
  },
});
console.log(response);

Console

PUT /my_queries2/_doc/2?refresh
{
  "query": {
   "match": {
   "my_field.suffix": "xyz"
   }
  }
}


	The `match` query should be used instead of the `term` query, because text analysis needs to reverse the query terms.
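
The role of the two analyzers can be illustrated with a plain Python approximation for a single token. This is only a sketch of the lowercase, reverse and edge n-gram steps configured above, not the Elasticsearch implementation, and both function names are hypothetical.

```python
def suffix_index_tokens(token, min_gram=1, max_gram=32):
    """Lowercase, reverse, then take edge n-grams (index-time analysis)."""
    reversed_token = token.lower()[::-1]
    longest = min(len(reversed_token), max_gram)
    return [reversed_token[:size] for size in range(min_gram, longest + 1)]


def suffix_search_token(term):
    """Lowercase and reverse only (search-time analysis)."""
    return term.lower()[::-1]


indexed_tokens = suffix_index_tokens("wxyz")
print(indexed_tokens)

# A match query analyzes "xyz" into "zyx", which is one of the indexed tokens.
print(suffix_search_token("xyz") in indexed_tokens)

# A term query would look up "xyz" unchanged, which was never indexed.
print("xyz" in indexed_tokens)
```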

The following search request matches the previously indexed percolator query:

Python

resp = client.search(
   index="my_queries2",
   query={
   "percolate": {
   "field": "query",
   "document": {
   "my_field": "wxyz"
   }
   }
   },
)
print(resp)

Ruby

response = client.search(
  index: 'my_queries2',
  body: {
   query: {
   percolate: {
   field: 'query',
   document: {
   my_field: 'wxyz'
   }
   }
   }
  }
)
puts response

Js

const response = await client.search({
  index: "my_queries2",
  query: {
   percolate: {
   field: "query",
   document: {
   my_field: "wxyz",
   },
   },
  },
});
console.log(response);

Console

GET /my_queries2/_search
{
  "query": {
   "percolate": {
   "field": "query",
   "document": {
   "my_field": "wxyz"
   }
   }
  }
}

Dedicated Percolator Index

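Percolator queries can be added to any index. Instead of adding them to the index that holds the data, you can also add them to a dedicated index. The advantage is that the dedicated percolator index can have its own index settings (for example, the number of primary and replica shards). If you choose a dedicated percolator index, make sure that the mappings from the normal index are also available on the percolator index. Otherwise percolator queries may be parsed incorrectly.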
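
A dedicated percolator index can be derived from the data index mapping so the two do not drift apart. The sketch below builds a create-index body that copies the data properties and adds a percolator field. The helper name, the default field name query, and the shard and replica defaults are assumptions chosen for illustration.

```python
import copy


def percolator_index_body(data_properties, query_field="query", shards=1, replicas=0):
    """Build a create-index body for a dedicated percolator index."""
    if query_field in data_properties:
        raise ValueError(f"{query_field!r} already exists in the data mapping")
    properties = copy.deepcopy(data_properties)  # keep the data mapping untouched
    properties[query_field] = {"type": "percolator"}
    return {
        # The dedicated index gets its own settings, independent of the data index.
        "settings": {"number_of_shards": shards, "number_of_replicas": replicas},
        "mappings": {"properties": properties},
    }


data_properties = {"body": {"type": "text"}}
body = percolator_index_body(data_properties)
print(body["mappings"]["properties"])
```

The settings and mappings entries could then be passed to client.indices.create in the same way as the earlier examples.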

Forcing Unmapped Fields to be Handled as Strings

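In some cases it is not known in advance what kind of percolator queries will be registered, and if no mapping exists for a field referenced by a percolator query, adding that query fails. The mapping then has to be updated to include the field with appropriate settings before the percolator query can be added. Sometimes, however, it is sufficient to handle all unmapped fields as default text fields. In that case, set index.percolator.map_unmapped_fields_as_text to true (the default is false). A field referenced in a percolator query that does not exist is then handled as a default text field, so adding the percolator query does not fail.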
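
As a configuration sketch, the setting can be supplied when the index is created. The helper name below is hypothetical, and the flat dotted settings key is assumed to be accepted in the same way as other index settings.

```python
def lenient_percolator_index_body(properties):
    """Create-index body that treats unmapped query fields as default text fields."""
    return {
        "settings": {"index.percolator.map_unmapped_fields_as_text": True},
        "mappings": {"properties": properties},
    }


body = lenient_percolator_index_body({"query": {"type": "percolator"}})
print(body["settings"])
```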

Limitations

Parent/child

Because the percolate query processes one document at a time, it does not support queries and filters that run against child documents, such as has_child and has_parent.
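
An application could screen queries for these types before registering them. The sketch below walks a query body recursively and reports any has_child or has_parent keys. It is a hypothetical client-side check, and it would also flag a field that happens to carry one of those names.

```python
UNSUPPORTED = {"has_child", "has_parent"}


def unsupported_query_types(query):
    """Recursively find parent/child query types anywhere in a query body."""
    found = set()
    if isinstance(query, dict):
        for key, value in query.items():
            if key in UNSUPPORTED:
                found.add(key)
            found |= unsupported_query_types(value)
    elif isinstance(query, list):
        for item in query:
            found |= unsupported_query_types(item)
    return found


query = {
    "bool": {
        "must": [
            {"match": {"body": "fox"}},
            {"has_child": {"type": "comment", "query": {"match_all": {}}}},
        ]
    }
}
print(sorted(unsupported_query_types(query)))
```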

Fetching queries

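Some queries fetch data via a get call during query parsing. Examples are the terms query when using terms lookup, the template query when using indexed scripts, and geo_shape when using pre-indexed shapes. When these queries are indexed by the percolator field type, the get call is executed once. Each time the percolate query evaluates these queries, it therefore uses the terms, shapes, and so on as they were at index time. Note that this fetching happens each time the percolator query is indexed, on both primary and replica shards, so the terms that are actually indexed can differ between shard copies if the source index changed while indexing.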

Script query

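The script inside a script query can only access doc values fields. The percolate query indexes the provided document into an in-memory index. This in-memory index does not support stored fields, so the _source field and other stored fields are not stored. This is why _source and other stored fields are not available in the script query.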

Field aliases

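Percolator queries that contain field aliases may not always behave as expected. In particular, if a percolator query containing a field alias is registered and the alias is later updated in the mappings to refer to a different field, the stored query still refers to the original target field. To pick up the change to the field alias, the percolator query must be explicitly reindexed.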