Vector queries - Sparse vector - Elasticsearch Guide v8.15

Sparse vector query

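The sparse vector query executes a query made up of sparse vectors, such as those built by a learned sparse retrieval model. There are two ways to supply the query vector:

Use a natural language processing model to convert the query text into a list of token-weight pairs
Send pre-computed token-weight pairs as the query vector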

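These token-weight pairs are then used to query a sparse vector field. At query time, the query vector is computed with the same inference model that was used to create the stored tokens. When the query runs, the query tokens are ORed together with their respective weights, so scoring is effectively a dot product between the stored dimensions and the query dimensions.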

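For example, a stored vector {"feature_0": 0.12, "feature_1": 1.2, "feature_2": 3.0} and a query vector {"feature_0": 2.5, "feature_2": 0.2} give the document _score = 0.12*2.5 + 3.0*0.2 = 0.9. feature_1 contributes nothing because it does not appear in the query vector.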
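The arithmetic above can be reproduced with a few lines of Python. This is only an illustrative sketch of a sparse dot product, not Elasticsearch's internal scoring code; the helper name `sparse_dot` is made up for this example.

```python
def sparse_dot(stored: dict[str, float], query: dict[str, float]) -> float:
    """Dot product over the tokens that appear in both sparse vectors."""
    # Tokens missing from the stored vector contribute nothing to the score.
    return sum(weight * stored[token] for token, weight in query.items() if token in stored)


stored_vector = {"feature_0": 0.12, "feature_1": 1.2, "feature_2": 3.0}
query_vector = {"feature_0": 2.5, "feature_2": 0.2}

score = sparse_dot(stored_vector, query_vector)
print(round(score, 6))  # 0.9
```

Rounding is used when printing because floating-point multiplication may leave a tiny error in the last digits.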

Example request using a natural language processing model

Python

resp = client.search(
   query={
   "sparse_vector": {
   "field": "ml.tokens",
   "inference_id": "the inference ID to produce the token weights",
   "query": "the query string"
   }
   },
)
print(resp)

Js

const response = await client.search({
  query: {
   sparse_vector: {
   field: "ml.tokens",
   inference_id: "the inference ID to produce the token weights",
   query: "the query string",
   },
  },
});
console.log(response);

Console

GET _search
{
   "query":{
   "sparse_vector": {
   "field": "ml.tokens",
   "inference_id": "the inference ID to produce the token weights",
   "query": "the query string"
   }
   }
}

Example request using pre-computed vectors

Python

resp = client.search(
   query={
   "sparse_vector": {
   "field": "ml.tokens",
   "query_vector": {
   "token1": 0.5,
   "token2": 0.3,
   "token3": 0.2
   }
   }
   },
)
print(resp)

Js

const response = await client.search({
  query: {
   sparse_vector: {
   field: "ml.tokens",
   query_vector: {
   token1: 0.5,
   token2: 0.3,
   token3: 0.2,
   },
   },
  },
});
console.log(response);

Console

GET _search
{
   "query":{
   "sparse_vector": {
   "field": "ml.tokens",
   "query_vector": { "token1": 0.5, "token2": 0.3, "token3": 0.2 }
   }
   }
}

Top-level parameters for sparse_vector

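field
(Required, string) The name of the field that contains the token-weight pairs to be searched against.
inference_id
(Optional, string) The inference ID to use to convert the query text into token-weight pairs. It must be the same inference ID that was used to create the tokens from the input text. Only one of inference_id and query_vector is allowed. If inference_id is specified, query must also be specified.
query
(Optional, string) The query text you want to use for search. If inference_id is specified, query must also be specified. If query_vector is specified, query must not be specified.
query_vector
(Optional, dictionary) A dictionary of token-weight pairs representing the pre-computed query vector to search with. Searching with this query vector bypasses additional inference. Only one of inference_id and query_vector is allowed.
prune
(Optional, boolean) [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features. Whether to perform pruning, omitting the non-significant tokens from the query to improve query performance. If prune is true but pruning_config is not specified, pruning occurs with the default values. Default: false.
pruning_config
(Optional, object) [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features. Optional pruning configuration. If enabled, non-significant tokens are omitted from the query to improve query performance. It is only used if prune is set to true. If prune is set to true but pruning_config is not specified, default values are used. The parameters for pruning_config are: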
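- tokens_freq_ratio_threshold
- (Optional, integer) [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features. Tokens whose frequency is more than tokens_freq_ratio_threshold times the average frequency of all tokens in the specified field are considered outliers and pruned. This value must be between 1 and 100. Default: 5.
- tokens_weight_threshold
- (Optional, float) [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features. Tokens whose weight is less than tokens_weight_threshold are considered insignificant and pruned. This value must be between 0 and 1. Default: 0.4.
- only_score_pruned_tokens
- (Optional, boolean) [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features. If true, only the pruned tokens are passed to scoring and the non-pruned tokens are discarded. It is strongly recommended to set this to false for the main query, but it can be set to true for a rescore query to get more relevant results. Default: false.
  The default values for tokens_freq_ratio_threshold and tokens_weight_threshold were chosen based on tests with ELSERv2 that provided the best results.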
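The parameter rules above can be checked on the client side before a request is sent. The following is a minimal sketch, not part of any Elasticsearch client: the function name `validate_sparse_vector` is hypothetical, and it assumes the documented ranges (1 to 100, 0 to 1) are inclusive, which the parameter descriptions do not state explicitly.

```python
def validate_sparse_vector(params: dict) -> list[str]:
    """Return a list of rule violations; an empty list means no violation was found."""
    errors = []
    if "field" not in params:
        errors.append("field is required")
    has_inference = "inference_id" in params
    has_vector = "query_vector" in params
    if has_inference and has_vector:
        errors.append("only one of inference_id and query_vector is allowed")
    if has_inference and "query" not in params:
        errors.append("query is required when inference_id is specified")
    if has_vector and "query" in params:
        errors.append("query must not be specified together with query_vector")
    # Documented defaults are used when a threshold is omitted.
    config = params.get("pruning_config", {})
    ratio = config.get("tokens_freq_ratio_threshold", 5)
    if not 1 <= ratio <= 100:  # assumption: bounds are inclusive
        errors.append("tokens_freq_ratio_threshold must be between 1 and 100")
    weight = config.get("tokens_weight_threshold", 0.4)
    if not 0 <= weight <= 1:  # assumption: bounds are inclusive
        errors.append("tokens_weight_threshold must be between 0 and 1")
    return errors


ok = {"field": "ml.tokens", "inference_id": "my-elser-model", "query": "the query string"}
bad = {"field": "ml.tokens", "query_vector": {"token1": 0.5}, "query": "the query string"}
print(validate_sparse_vector(ok))
print(validate_sparse_vector(bad))
```

The server remains the authority on what is valid; a check like this only catches the documented conflicts early.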

Example ELSER query

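The following is an example of a sparse_vector query that references the ELSER model to perform semantic search. For a more detailed description of how to perform semantic search with ELSER, see the tutorial on semantic search with ELSER.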

Python

resp = client.search(
   index="my-index",
   query={
   "sparse_vector": {
   "field": "ml.tokens",
   "inference_id": "my-elser-model",
   "query": "How is the weather in Jamaica?"
   }
   },
)
print(resp)

Js

const response = await client.search({
  index: "my-index",
  query: {
   sparse_vector: {
   field: "ml.tokens",
   inference_id: "my-elser-model",
   query: "How is the weather in Jamaica?",
   },
  },
});
console.log(response);

Console

GET my-index/_search
{
   "query":{
   "sparse_vector": {
   "field": "ml.tokens",
   "inference_id": "my-elser-model",
   "query": "How is the weather in Jamaica?"
   }
   }
}

Multiple sparse_vector queries can be combined with each other or with other query types. This can be done by wrapping them in boolean query clauses and using linear boosting:

Python

resp = client.search(
   index="my-index",
   query={
   "bool": {
   "should": [
   {
   "sparse_vector": {
   "field": "ml.inference.title_expanded.predicted_value",
   "inference_id": "my-elser-model",
   "query": "How is the weather in Jamaica?",
   "boost": 1
   }
   },
   {
   "sparse_vector": {
   "field": "ml.inference.description_expanded.predicted_value",
   "inference_id": "my-elser-model",
   "query": "How is the weather in Jamaica?",
   "boost": 1
   }
   },
   {
   "multi_match": {
   "query": "How is the weather in Jamaica?",
   "fields": [
   "title",
   "description"
   ],
   "boost": 4
   }
   }
   ]
   }
   },
)
print(resp)

Js

const response = await client.search({
  index: "my-index",
  query: {
   bool: {
   should: [
   {
   sparse_vector: {
   field: "ml.inference.title_expanded.predicted_value",
   inference_id: "my-elser-model",
   query: "How is the weather in Jamaica?",
   boost: 1,
   },
   },
   {
   sparse_vector: {
   field: "ml.inference.description_expanded.predicted_value",
   inference_id: "my-elser-model",
   query: "How is the weather in Jamaica?",
   boost: 1,
   },
   },
   {
   multi_match: {
   query: "How is the weather in Jamaica?",
   fields: ["title", "description"],
   boost: 4,
   },
   },
   ],
   },
  },
});
console.log(response);

Console

GET my-index/_search
{
  "query": {
   "bool": {
   "should": [
   {
   "sparse_vector": {
   "field": "ml.inference.title_expanded.predicted_value",
   "inference_id": "my-elser-model",
   "query": "How is the weather in Jamaica?",
   "boost": 1
   }
   },
   {
   "sparse_vector": {
   "field": "ml.inference.description_expanded.predicted_value",
   "inference_id": "my-elser-model",
   "query": "How is the weather in Jamaica?",
   "boost": 1
   }
   },
   {
   "multi_match": {
   "query": "How is the weather in Jamaica?",
   "fields": [
   "title",
   "description"
   ],
   "boost": 4
   }
   }
   ]
   }
  }
}
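When the same query text is sent to several fields, the boolean query shown above can be assembled programmatically rather than written out by hand. This is a sketch only: the helper `hybrid_bool_query` and its parameter names are hypothetical, and the default boosts simply mirror the values used in the example.

```python
def hybrid_bool_query(text, sparse_fields, lexical_fields, inference_id,
                      sparse_boost=1, lexical_boost=4):
    """Build a bool/should query: one sparse_vector clause per field plus one multi_match."""
    should = [
        {
            "sparse_vector": {
                "field": field,
                "inference_id": inference_id,
                "query": text,
                "boost": sparse_boost,
            }
        }
        for field in sparse_fields
    ]
    should.append(
        {
            "multi_match": {
                "query": text,
                "fields": list(lexical_fields),
                "boost": lexical_boost,
            }
        }
    )
    return {"bool": {"should": should}}


query = hybrid_bool_query(
    "How is the weather in Jamaica?",
    sparse_fields=[
        "ml.inference.title_expanded.predicted_value",
        "ml.inference.description_expanded.predicted_value",
    ],
    lexical_fields=["title", "description"],
    inference_id="my-elser-model",
)
# The result can be passed as the query argument, e.g. client.search(index="my-index", query=query)
print(len(query["bool"]["should"]))  # 3
```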

This can also be achieved with reciprocal rank fusion (RRF), through an rrf retriever with multiple standard retrievers.

Python

resp = client.search(
   index="my-index",
   retriever={
   "rrf": {
   "retrievers": [
   {
   "standard": {
   "query": {
   "multi_match": {
   "query": "How is the weather in Jamaica?",
   "fields": [
   "title",
   "description"
   ]
   }
   }
   }
   },
   {
   "standard": {
   "query": {
   "sparse_vector": {
   "field": "ml.inference.title_expanded.predicted_value",
   "inference_id": "my-elser-model",
   "query": "How is the weather in Jamaica?",
   "boost": 1
   }
   }
   }
   },
   {
   "standard": {
   "query": {
   "sparse_vector": {
   "field": "ml.inference.description_expanded.predicted_value",
   "inference_id": "my-elser-model",
   "query": "How is the weather in Jamaica?",
   "boost": 1
   }
   }
   }
   }
   ],
   "window_size": 10,
   "rank_constant": 20
   }
   },
)
print(resp)

Js

const response = await client.search({
  index: "my-index",
  retriever: {
   rrf: {
   retrievers: [
   {
   standard: {
   query: {
   multi_match: {
   query: "How is the weather in Jamaica?",
   fields: ["title", "description"],
   },
   },
   },
   },
   {
   standard: {
   query: {
   sparse_vector: {
   field: "ml.inference.title_expanded.predicted_value",
   inference_id: "my-elser-model",
   query: "How is the weather in Jamaica?",
   boost: 1,
   },
   },
   },
   },
   {
   standard: {
   query: {
   sparse_vector: {
   field: "ml.inference.description_expanded.predicted_value",
   inference_id: "my-elser-model",
   query: "How is the weather in Jamaica?",
   boost: 1,
   },
   },
   },
   },
   ],
   window_size: 10,
   rank_constant: 20,
   },
  },
});
console.log(response);

Console

GET my-index/_search
{
  "retriever": {
   "rrf": {
   "retrievers": [
   {
   "standard": {
   "query": {
   "multi_match": {
   "query": "How is the weather in Jamaica?",
   "fields": [
   "title",
   "description"
   ]
   }
   }
   }
   },
   {
   "standard": {
   "query": {
   "sparse_vector": {
   "field": "ml.inference.title_expanded.predicted_value",
   "inference_id": "my-elser-model",
   "query": "How is the weather in Jamaica?",
   "boost": 1
   }
   }
   }
   },
   {
   "standard": {
   "query": {
   "sparse_vector": {
   "field": "ml.inference.description_expanded.predicted_value",
   "inference_id": "my-elser-model",
   "query": "How is the weather in Jamaica?",
   "boost": 1
   }
   }
   }
   }
   ],
   "window_size": 10,
   "rank_constant": 20
   }
  }
}
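To give an intuition for how the rrf retriever merges the three result lists, here is a minimal sketch of the usual reciprocal rank fusion formula, in which each document receives 1 / (rank_constant + rank) from every list it appears in, with ranks starting at 1. It illustrates the technique under that assumption and is not Elasticsearch's implementation; the function name `rrf_scores` and the document IDs are made up.

```python
def rrf_scores(rankings, rank_constant=20, window_size=10):
    """Fuse ranked lists of document IDs with reciprocal rank fusion."""
    scores = {}
    for ranking in rankings:
        # Only the top window_size hits of each list take part in the fusion.
        for rank, doc_id in enumerate(ranking[:window_size], start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (rank_constant + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


rankings = [
    ["a", "b", "c"],  # hypothetical multi_match results
    ["b", "a"],       # hypothetical sparse_vector results on the title field
    ["c", "b"],       # hypothetical sparse_vector results on the description field
]
for doc_id, score in rrf_scores(rankings):
    print(doc_id, round(score, 4))
```

Because only ranks enter the formula, the raw scores of the lexical and sparse queries do not need to be on a comparable scale.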

Example ELSER query with pruning configuration and rescore

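The following extends the example above by adding a pruning configuration to the sparse_vector query. The pruning configuration identifies non-significant tokens to prune from the query in order to improve query performance.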

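Token pruning happens at the shard level. This should result in the same tokens being labeled as insignificant across shards, but that is not guaranteed because it depends on the composition of each shard. Therefore, if you run sparse_vector with a pruning_config on a multi-shard index, we strongly recommend adding a rescore step over the filtered search results that uses the tokens originally pruned from the query. This helps mitigate any shard-level inconsistency in the pruned tokens and provides better relevance overall.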

Python

resp = client.search(
   index="my-index",
   query={
   "sparse_vector": {
   "field": "ml.tokens",
   "inference_id": "my-elser-model",
   "query": "How is the weather in Jamaica?",
   "prune": True,
   "pruning_config": {
   "tokens_freq_ratio_threshold": 5,
   "tokens_weight_threshold": 0.4,
   "only_score_pruned_tokens": False
   }
   }
   },
   rescore={
   "window_size": 100,
   "query": {
   "rescore_query": {
   "sparse_vector": {
   "field": "ml.tokens",
   "inference_id": "my-elser-model",
   "query": "How is the weather in Jamaica?",
   "prune": True,
   "pruning_config": {
   "tokens_freq_ratio_threshold": 5,
   "tokens_weight_threshold": 0.4,
   "only_score_pruned_tokens": True
   }
   }
   }
   }
   },
)
print(resp)

Js

const response = await client.search({
  index: "my-index",
  query: {
   sparse_vector: {
   field: "ml.tokens",
   inference_id: "my-elser-model",
   query: "How is the weather in Jamaica?",
   prune: true,
   pruning_config: {
   tokens_freq_ratio_threshold: 5,
   tokens_weight_threshold: 0.4,
   only_score_pruned_tokens: false,
   },
   },
  },
  rescore: {
   window_size: 100,
   query: {
   rescore_query: {
   sparse_vector: {
   field: "ml.tokens",
   inference_id: "my-elser-model",
   query: "How is the weather in Jamaica?",
   prune: true,
   pruning_config: {
   tokens_freq_ratio_threshold: 5,
   tokens_weight_threshold: 0.4,
   only_score_pruned_tokens: true,
   },
   },
   },
   },
  },
});
console.log(response);

Console

GET my-index/_search
{
   "query":{
   "sparse_vector":{
   "field": "ml.tokens",
   "inference_id": "my-elser-model",
   "query":"How is the weather in Jamaica?",
   "prune": true,
   "pruning_config": {
   "tokens_freq_ratio_threshold": 5,
   "tokens_weight_threshold": 0.4,
   "only_score_pruned_tokens": false
   }
   }
   },
   "rescore": {
   "window_size": 100,
   "query": {
   "rescore_query": {
   "sparse_vector": {
   "field": "ml.tokens",
   "inference_id": "my-elser-model",
   "query": "How is the weather in Jamaica?",
   "prune": true,
   "pruning_config": {
   "tokens_freq_ratio_threshold": 5,
   "tokens_weight_threshold": 0.4,
   "only_score_pruned_tokens": true
   }
   }
   }
   }
   }
}
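In the request above, the main query and the rescore query are identical except for only_score_pruned_tokens. One way to keep the two in sync is to generate both from a single set of settings. This is a sketch under that assumption; the helper name `pruned_search_body` and its parameters are hypothetical.

```python
def pruned_search_body(field, inference_id, text,
                       freq_ratio=5, weight=0.4, window_size=100):
    """Build a main query that drops pruned tokens and a rescore that scores only them."""

    def sparse_vector(only_pruned):
        return {
            "sparse_vector": {
                "field": field,
                "inference_id": inference_id,
                "query": text,
                "prune": True,
                "pruning_config": {
                    "tokens_freq_ratio_threshold": freq_ratio,
                    "tokens_weight_threshold": weight,
                    "only_score_pruned_tokens": only_pruned,
                },
            }
        }

    return {
        "query": sparse_vector(False),  # main query: score the non-pruned tokens
        "rescore": {
            "window_size": window_size,
            "query": {"rescore_query": sparse_vector(True)},  # rescore: pruned tokens only
        },
    }


body = pruned_search_body("ml.tokens", "my-elser-model", "How is the weather in Jamaica?")
# Could then be sent as client.search(index="my-index", query=body["query"], rescore=body["rescore"])
print(body["rescore"]["window_size"])  # 100
```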

When performing cross-cluster search, inference is performed on the local cluster.