Semantic search - Semantic search with the inference API

Tutorial: semantic search with the inference API

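The instructions in this tutorial show you how to use the inference API workflow with various services to perform semantic search on your data.

For the easiest way to perform semantic search in the Elastic Stack, refer to the semantic_text end-to-end tutorial.

The following examples use these models:

embed-english-v3.0 model for Cohere
all-mpnet-base-v2 model from HuggingFace
text-embedding-ada-002, a second generation embedding model from OpenAI
models available through Azure AI Studio or Azure OpenAI
text-embedding-004 model for Google Vertex AI
mistral-embed model for Mistral
amazon.titan-embed-text-v1 model for Amazon Bedrock

You can use any Cohere and OpenAI models; they are all supported by the inference API. For a list of recommended models available on HuggingFace, refer to the supported model list.

Click the name of the service you want to use on any of the widgets below to review the corresponding instructions.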

Requirements

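A Cohere account is required to use the inference API with the Cohere service.

ELSER is a model trained by Elastic. If you have an Elasticsearch deployment, there are no further requirements for using the inference API with the elser service.

A HuggingFace account is required to use the inference API with the HuggingFace service.

An OpenAI account is required to use the inference API with the OpenAI service.

For the Azure OpenAI service:
An Azure subscription
Access granted to Azure OpenAI in the desired Azure subscription. To apply for access to Azure OpenAI, complete the form at https://aka.ms/oai/access.
An embedding model deployed in Azure OpenAI Studio.

For the Azure AI Studio service:
An Azure subscription
Access to Azure AI Studio
A deployed embeddings or chat completion model.

For the Google Vertex AI service:
A Google Cloud account
A project in Google Cloud
The Vertex AI API enabled in your project
A valid service account for the Google Vertex AI API
The service account must have the Vertex AI User role and the aiplatform.endpoints.predict permission.

For the Mistral service:
A Mistral account on La Plateforme
An API key generated for your account

For the Amazon Bedrock service:
An AWS account with access to Amazon Bedrock
A pair of access and secret keys used to access Amazon Bedrock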

Create an inference endpoint

Create an inference endpoint by using the Create inference API:

Python

resp = client.inference.put(
   task_type="text_embedding",
   inference_id="cohere_embeddings",
   inference_config={
   "service": "cohere",
   "service_settings": {
   "api_key": "<api_key>",
   "model_id": "embed-english-v3.0",
   "embedding_type": "byte"
   }
   },
)
print(resp)

Js

const response = await client.inference.put({
  task_type: "text_embedding",
  inference_id: "cohere_embeddings",
  inference_config: {
   service: "cohere",
   service_settings: {
   api_key: "<api_key>",
   model_id: "embed-english-v3.0",
   embedding_type: "byte",
   },
  },
});
console.log(response);

Console

PUT _inference/text_embedding/cohere_embeddings
{
   "service": "cohere",
   "service_settings": {
   "api_key": "<api_key>",
   "model_id": "embed-english-v3.0",
   "embedding_type": "byte"
   }
}


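	The task type is `text_embedding` in the path, and `cohere_embeddings` is the `inference_id`, the unique identifier of the inference endpoint.
	The API key of your Cohere account. You can find your API keys in the API keys section of your Cohere dashboard. You need to provide the API key only once. The Get inference API does not return your API key.
	The name of the embedding model to use. You can find the list of Cohere embedding models in the Cohere documentation.

When using this model, the recommended similarity measure to use in the dense_vector field mapping is dot_product. In the case of Cohere models, the embeddings are normalized to unit length, in which case the dot_product and the cosine measures are equivalent.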
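
The equivalence of the two measures for unit-length vectors can be checked in a few lines of plain Python. This is a minimal sketch that uses made-up three-dimensional vectors rather than real model output; the helper names `dot`, `cosine`, and `normalize` are hypothetical and not part of any Elasticsearch client.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity: dot product divided by the product of the norms."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


# Illustrative vectors only; real embeddings have hundreds of dimensions.
u = normalize([3.0, 4.0, 0.0])
v = normalize([1.0, 2.0, 2.0])

# For unit-length vectors both norms are 1, so the two measures agree.
print(math.isclose(dot(u, v), cosine(u, v)))
```

Because the division by the norms becomes a division by 1, dot_product gives the same ranking as cosine while skipping the normalization work at query time.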

Python

resp = client.inference.put(
   task_type="sparse_embedding",
   inference_id="elser_embeddings",
   inference_config={
   "service": "elser",
   "service_settings": {
   "num_allocations": 1,
   "num_threads": 1
   }
   },
)
print(resp)

Js

const response = await client.inference.put({
  task_type: "sparse_embedding",
  inference_id: "elser_embeddings",
  inference_config: {
   service: "elser",
   service_settings: {
   num_allocations: 1,
   num_threads: 1,
   },
  },
});
console.log(response);

Console

PUT _inference/sparse_embedding/elser_embeddings
{
  "service": "elser",
  "service_settings": {
   "num_allocations": 1,
   "num_threads": 1
  }
}


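	The task type is `sparse_embedding` in the path, and `elser_embeddings` is the `inference_id`, the unique identifier of the inference endpoint.

You don't need to download and deploy the ELSER model in advance. The API request above downloads the model if it is not already downloaded, and then deploys it.

You might see a 502 Bad Gateway error in the response when using the Kibana Console. This error usually just reflects a timeout while the model downloads in the background. You can check the download progress in the Machine Learning UI. If you are using the Python client, you can set the timeout parameter to a higher value.

First, you need to create a new inference endpoint on the Hugging Face endpoint page to get an endpoint URL. Select the model all-mpnet-base-v2 on the new endpoint creation page, then select the Sentence Embeddings task under the Advanced configuration section. Create the endpoint. Copy the URL after the endpoint initialization has finished; you need the URL in the following inference API call.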

Python

resp = client.inference.put(
   task_type="text_embedding",
   inference_id="hugging_face_embeddings",
   inference_config={
   "service": "hugging_face",
   "service_settings": {
   "api_key": "<access_token>",
   "url": "<url_endpoint>"
   }
   },
)
print(resp)

Js

const response = await client.inference.put({
  task_type: "text_embedding",
  inference_id: "hugging_face_embeddings",
  inference_config: {
   service: "hugging_face",
   service_settings: {
   api_key: "<access_token>",
   url: "<url_endpoint>",
   },
  },
});
console.log(response);

Console

PUT _inference/text_embedding/hugging_face_embeddings
{
  "service": "hugging_face",
  "service_settings": {
   "api_key": "<access_token>",
   "url": "<url_endpoint>"
  }
}


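	The task type is `text_embedding` in the path, and `hugging_face_embeddings` is the `inference_id`, the unique identifier of the inference endpoint.
	A valid HuggingFace access token. You can find it on the settings page of your account.
	The URL of the inference endpoint you created on Hugging Face.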

Python

resp = client.inference.put(
   task_type="text_embedding",
   inference_id="openai_embeddings",
   inference_config={
   "service": "openai",
   "service_settings": {
   "api_key": "<api_key>",
   "model_id": "text-embedding-ada-002"
   }
   },
)
print(resp)

Js

const response = await client.inference.put({
  task_type: "text_embedding",
  inference_id: "openai_embeddings",
  inference_config: {
   service: "openai",
   service_settings: {
   api_key: "<api_key>",
   model_id: "text-embedding-ada-002",
   },
  },
});
console.log(response);

Console

PUT _inference/text_embedding/openai_embeddings
{
   "service": "openai",
   "service_settings": {
   "api_key": "<api_key>",
   "model_id": "text-embedding-ada-002"
   }
}


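	The task type is `text_embedding` in the path, and `openai_embeddings` is the `inference_id`, the unique identifier of the inference endpoint.
	The API key of your OpenAI account. You can find your OpenAI API keys in the API keys section of your OpenAI account. You need to provide the API key only once. The Get inference API does not return your API key.
	The name of the embedding model to use. You can find the list of OpenAI embedding models in the OpenAI documentation.

When using this model, the recommended similarity measure to use in the dense_vector field mapping is dot_product. In the case of OpenAI models, the embeddings are normalized to unit length, in which case the dot_product and the cosine measures are equivalent.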

Python

resp = client.inference.put(
   task_type="text_embedding",
   inference_id="azure_openai_embeddings",
   inference_config={
   "service": "azureopenai",
   "service_settings": {
   "api_key": "<api_key>",
   "resource_name": "<resource_name>",
   "deployment_id": "<deployment_id>",
   "api_version": "2024-02-01"
   }
   },
)
print(resp)

Js

const response = await client.inference.put({
  task_type: "text_embedding",
  inference_id: "azure_openai_embeddings",
  inference_config: {
   service: "azureopenai",
   service_settings: {
   api_key: "<api_key>",
   resource_name: "<resource_name>",
   deployment_id: "<deployment_id>",
   api_version: "2024-02-01",
   },
  },
});
console.log(response);

Console

PUT _inference/text_embedding/azure_openai_embeddings
{
   "service": "azureopenai",
   "service_settings": {
   "api_key": "<api_key>",
   "resource_name": "<resource_name>",
   "deployment_id": "<deployment_id>",
   "api_version": "2024-02-01"
   }
}


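	The task type is `text_embedding` in the path, and `azure_openai_embeddings` is the `inference_id`, the unique identifier of the inference endpoint.
	The API key for accessing your Azure OpenAI services. Alternatively, you can provide an `entra_id` here instead of an `api_key`. The Get inference API does not return this information.
	The name of your Azure resource.
	The ID of your deployed model.

It may take a few minutes for your model's deployment to become available after it is created. If you create the model as shown above and receive a 404 error message, wait a few minutes and try again. Also, when using this model, the recommended similarity measure to use in the dense_vector field mapping is dot_product. In the case of Azure OpenAI models, the embeddings are normalized to unit length, in which case the dot_product and the cosine measures are equivalent.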
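
The wait-and-retry advice for the 404 case can be sketched as a small helper. This is an illustration under stated assumptions: `create_endpoint` stands for any callable that issues the request shown above, and `NotReadyError` is a hypothetical exception standing in for whatever error your client raises on a 404. Neither name is part of the Elasticsearch client.

```python
import time


class NotReadyError(Exception):
    """Hypothetical stand-in for the error a client raises on a 404."""


def retry_until_ready(create_endpoint, attempts=5, wait_seconds=60.0, sleep=time.sleep):
    """Call create_endpoint, waiting and retrying while it reports 'not ready'.

    Re-raises the last NotReadyError if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return create_endpoint()
        except NotReadyError:
            if attempt == attempts:
                raise
            sleep(wait_seconds)


# Demo with a fake request that succeeds on the third call.
calls = {"count": 0}


def fake_create_endpoint():
    calls["count"] += 1
    if calls["count"] < 3:
        raise NotReadyError("deployment not available yet")
    return {"inference_id": "azure_openai_embeddings"}


# The sleep function is replaced with a no-op so the demo finishes instantly.
result = retry_until_ready(fake_create_endpoint, sleep=lambda seconds: None)
print(result, calls["count"])
```

In real use you would wrap the `client.inference.put(...)` call in a function, translate the client's 404 error into the retry condition, and keep the default `sleep` so the helper actually waits between attempts.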

Python

resp = client.inference.put(
   task_type="text_embedding",
   inference_id="azure_ai_studio_embeddings",
   inference_config={
   "service": "azureaistudio",
   "service_settings": {
   "api_key": "<api_key>",
   "target": "<target_uri>",
   "provider": "<provider>",
   "endpoint_type": "<endpoint_type>"
   }
   },
)
print(resp)

Js

const response = await client.inference.put({
  task_type: "text_embedding",
  inference_id: "azure_ai_studio_embeddings",
  inference_config: {
   service: "azureaistudio",
   service_settings: {
   api_key: "<api_key>",
   target: "<target_uri>",
   provider: "<provider>",
   endpoint_type: "<endpoint_type>",
   },
  },
});
console.log(response);

Console

PUT _inference/text_embedding/azure_ai_studio_embeddings
{
   "service": "azureaistudio",
   "service_settings": {
   "api_key": "<api_key>",
   "target": "<target_uri>",
   "provider": "<provider>",
   "endpoint_type": "<endpoint_type>"
   }
}


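	The task type is `text_embedding` in the path, and `azure_ai_studio_embeddings` is the `inference_id`, the unique identifier of the inference endpoint.
	The API key for accessing your Azure AI Studio deployed model. You can find this on your model deployment's overview page.
	The target URI for accessing your Azure AI Studio deployed model. You can find this on your model deployment's overview page.
	The model provider, such as `cohere` or `openai`.
	The deployed endpoint type. This is either `token` (for "pay as you go" deployments) or `realtime` for real-time deployment endpoints.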

Python

resp = client.inference.put(
   task_type="text_embedding",
   inference_id="google_vertex_ai_embeddings",
   inference_config={
   "service": "googlevertexai",
   "service_settings": {
   "service_account_json": "<service_account_json>",
   "model_id": "text-embedding-004",
   "location": "<location>",
   "project_id": "<project_id>"
   }
   },
)
print(resp)

Js

const response = await client.inference.put({
  task_type: "text_embedding",
  inference_id: "google_vertex_ai_embeddings",
  inference_config: {
   service: "googlevertexai",
   service_settings: {
   service_account_json: "<service_account_json>",
   model_id: "text-embedding-004",
   location: "<location>",
   project_id: "<project_id>",
   },
  },
});
console.log(response);

Console

PUT _inference/text_embedding/google_vertex_ai_embeddings
{
   "service": "googlevertexai",
   "service_settings": {
   "service_account_json": "<service_account_json>",
   "model_id": "text-embedding-004",
   "location": "<location>",
   "project_id": "<project_id>"
   }
}


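	The task type is `text_embedding` per the path. `google_vertex_ai_embeddings` is the unique identifier of the inference endpoint (its `inference_id`).
	A valid service account in JSON format for the Google Vertex AI API.
	For the list of available models, refer to the Text embeddings API page.
	The name of the location to use for the inference task. Refer to Generative AI on Vertex AI locations for available locations.
	The name of the project to use for the inference task.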

Python

resp = client.inference.put(
   task_type="text_embedding",
   inference_id="mistral_embeddings",
   inference_config={
   "service": "mistral",
   "service_settings": {
   "api_key": "<api_key>",
   "model": "<model_id>"
   }
   },
)
print(resp)

Js

const response = await client.inference.put({
  task_type: "text_embedding",
  inference_id: "mistral_embeddings",
  inference_config: {
   service: "mistral",
   service_settings: {
   api_key: "<api_key>",
   model: "<model_id>",
   },
  },
});
console.log(response);

Console

PUT _inference/text_embedding/mistral_embeddings
{
   "service": "mistral",
   "service_settings": {
   "api_key": "<api_key>",
   "model": "<model_id>"
   }
}


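	The task type is `text_embedding` in the path, and `mistral_embeddings` is the `inference_id`, the unique identifier of the inference endpoint.
	The API key for accessing the Mistral API. You can find this on the API Keys page of your Mistral account.
	The name of the Mistral embeddings model, for example `mistral-embed`.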

Python

resp = client.inference.put(
   task_type="text_embedding",
   inference_id="amazon_bedrock_embeddings",
   inference_config={
   "service": "amazonbedrock",
   "service_settings": {
   "access_key": "<aws_access_key>",
   "secret_key": "<aws_secret_key>",
   "region": "<region>",
   "provider": "<provider>",
   "model": "<model_id>"
   }
   },
)
print(resp)

Js

const response = await client.inference.put({
  task_type: "text_embedding",
  inference_id: "amazon_bedrock_embeddings",
  inference_config: {
   service: "amazonbedrock",
   service_settings: {
   access_key: "<aws_access_key>",
   secret_key: "<aws_secret_key>",
   region: "<region>",
   provider: "<provider>",
   model: "<model_id>",
   },
  },
});
console.log(response);

Console

PUT _inference/text_embedding/amazon_bedrock_embeddings
{
   "service": "amazonbedrock",
   "service_settings": {
   "access_key": "<aws_access_key>",
   "secret_key": "<aws_secret_key>",
   "region": "<region>",
   "provider": "<provider>",
   "model": "<model_id>"
   }
}


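	The task type is `text_embedding` in the path, and `amazon_bedrock_embeddings` is the `inference_id`, the unique identifier of the inference endpoint.
	The access key for Amazon Bedrock. You can find it on the AWS IAM management page of the user account that accesses Amazon Bedrock.
	The secret key, which must be the paired key for the specified access key.
	Specify the region that your model is hosted in.
	Specify the model provider.
	The model ID or ARN of the model to use.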

Create the index mapping

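You must create the mapping of the destination index, that is, the index that contains the embeddings that the model creates from your input text. To index the output of the model, the destination index needs a field with the dense_vector field type for most models, or the sparse_vector field type for sparse vector models such as the one behind the elser service.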
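
The choice between the two field types can be sketched as a small helper that builds the mappings body. This is a minimal sketch, not part of any client library: `build_mapping` is a hypothetical name, and the field names `content` and `content_embedding` are assumed to match the examples that follow.

```python
def build_mapping(vector_type, dims=None, element_type=None, similarity=None):
    """Return index mappings with a text field and a vector field.

    vector_type must be "dense_vector" or "sparse_vector". The optional
    settings are only added to dense_vector fields, and only when given.
    """
    if vector_type == "sparse_vector":
        embedding_field = {"type": "sparse_vector"}
    elif vector_type == "dense_vector":
        embedding_field = {"type": "dense_vector"}
        if dims is not None:
            embedding_field["dims"] = dims
        if element_type is not None:
            embedding_field["element_type"] = element_type
        if similarity is not None:
            embedding_field["similarity"] = similarity
    else:
        raise ValueError(f"unsupported vector type: {vector_type}")
    return {
        "properties": {
            "content_embedding": embedding_field,
            "content": {"type": "text"},
        }
    }


# Sparse vector model (for example the elser service).
print(build_mapping("sparse_vector"))

# Dense vector model with explicit dimensions and element type.
print(build_mapping("dense_vector", dims=1024, element_type="byte"))
```

The returned dictionary has the same shape as the `mappings` argument passed to `client.indices.create` in the examples below.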

Python

resp = client.indices.create(
   index="cohere-embeddings",
   mappings={
   "properties": {
   "content_embedding": {
   "type": "dense_vector",
   "dims": 1024,
   "element_type": "byte"
   },
   "content": {
   "type": "text"
   }
   }
   },
)
print(resp)

Ruby

response = client.indices.create(
  index: 'cohere-embeddings',
  body: {
   mappings: {
   properties: {
   content_embedding: {
   type: 'dense_vector',
   dims: 1024,
   element_type: 'byte'
   },
   content: {
   type: 'text'
   }
   }
   }
  }
)
puts response

Js

const response = await client.indices.create({
  index: "cohere-embeddings",
  mappings: {
   properties: {
   content_embedding: {
   type: "dense_vector",
   dims: 1024,
   element_type: "byte",
   },
   content: {
   type: "text",
   },
   },
  },
});
console.log(response);

Console

PUT cohere-embeddings
{
  "mappings": {
   "properties": {
   "content_embedding": {
   "type": "dense_vector",
   "dims": 1024,
   "element_type": "byte"
   },
   "content": {
   "type": "text"
   }
   }
  }
}


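	The name of the field to contain the generated embeddings. It must be referenced in the inference pipeline configuration in the next step.
	The field to contain the embeddings is a `dense_vector` field.
	The output dimensions of the model. Find this value in the Cohere documentation of the model you use.
	The name of the field from which to create the dense vector representation. In this example, the name of the field is `content`. It must be referenced in the inference pipeline configuration in the next step.
	The field type, which is `text` in this example.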

Python

resp = client.indices.create(
   index="elser-embeddings",
   mappings={
   "properties": {
   "content_embedding": {
   "type": "sparse_vector"
   },
   "content": {
   "type": "text"
   }
   }
   },
)
print(resp)

Js

const response = await client.indices.create({
  index: "elser-embeddings",
  mappings: {
   properties: {
   content_embedding: {
   type: "sparse_vector",
   },
   content: {
   type: "text",
   },
   },
  },
});
console.log(response);

Console

PUT elser-embeddings
{
  "mappings": {
   "properties": {
   "content_embedding": {
   "type": "sparse_vector"
   },
   "content": {
   "type": "text"
   }
   }
  }
}


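	The name of the field to contain the generated tokens. It must be referenced in the inference pipeline configuration in the next step.
	For ELSER, the field to contain the tokens is a `sparse_vector` field.
	The name of the field from which to create the sparse vector representation. In this example, the name of the field is `content`. It must be referenced in the inference pipeline configuration in the next step.
	The field type, which is `text` in this example.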

Python

resp = client.indices.create(
   index="hugging-face-embeddings",
   mappings={
   "properties": {
   "content_embedding": {
   "type": "dense_vector",
   "dims": 768,
   "element_type": "float"
   },
   "content": {
   "type": "text"
   }
   }
   },
)
print(resp)

Ruby

response = client.indices.create(
  index: 'hugging-face-embeddings',
  body: {
   mappings: {
   properties: {
   content_embedding: {
   type: 'dense_vector',
   dims: 768,
   element_type: 'float'
   },
   content: {
   type: 'text'
   }
   }
   }
  }
)
puts response

Js

const response = await client.indices.create({
  index: "hugging-face-embeddings",
  mappings: {
   properties: {
   content_embedding: {
   type: "dense_vector",
   dims: 768,
   element_type: "float",
   },
   content: {
   type: "text",
   },
   },
  },
});
console.log(response);

Console

PUT hugging-face-embeddings
{
  "mappings": {
   "properties": {
   "content_embedding": {
   "type": "dense_vector",
   "dims": 768,
   "element_type": "float"
   },
   "content": {
   "type": "text"
   }
   }
  }
}


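	The name of the field to contain the generated embeddings. It must be referenced in the inference pipeline configuration in the next step.
	The field to contain the embeddings is a `dense_vector` field.
	The output dimensions of the model. Find this value in the HuggingFace model documentation of the model you use.
	The name of the field from which to create the dense vector representation. In this example, the name of the field is `content`. It must be referenced in the inference pipeline configuration in the next step.
	The field type, which is `text` in this example.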

Python

resp = client.indices.create(
   index="openai-embeddings",
   mappings={
   "properties": {
   "content_embedding": {
   "type": "dense_vector",
   "dims": 1536,
   "element_type": "float",
   "similarity": "dot_product"
   },
   "content": {
   "type": "text"
   }
   }
   },
)
print(resp)

Ruby

response = client.indices.create(
  index: 'openai-embeddings',
  body: {
   mappings: {
   properties: {
   content_embedding: {
   type: 'dense_vector',
   dims: 1536,
   element_type: 'float',
   similarity: 'dot_product'
   },
   content: {
   type: 'text'
   }
   }
   }
  }
)
puts response

Js

const response = await client.indices.create({
  index: "openai-embeddings",
  mappings: {
   properties: {
   content_embedding: {
   type: "dense_vector",
   dims: 1536,
   element_type: "float",
   similarity: "dot_product",
   },
   content: {
   type: "text",
   },
   },
  },
});
console.log(response);

Python

resp = client.indices.create(
   index="azure-openai-embeddings",
   mappings={
   "properties": {
   "content_embedding": {
   "type": "dense_vector",
   "dims": 1536,
   "element_type": "float",
   "similarity": "dot_product"
   },
   "content": {
   "type": "text"
   }
   }
   },
)
print(resp)

Ruby

response = client.indices.create(
  index: 'azure-openai-embeddings',
  body: {
   mappings: {
   properties: {
   content_embedding: {
   type: 'dense_vector',
   dims: 1536,
   element_type: 'float',
   similarity: 'dot_product'
   },
   content: {
   type: 'text'
   }
   }
   }
  }
)
puts response

Js

const response = await client.indices.create({
  index: "azure-openai-embeddings",
  mappings: {
   properties: {
   content_embedding: {
   type: "dense_vector",
   dims: 1536,
   element_type: "float",
   similarity: "dot_product",
   },
   content: {
   type: "text",
   },
   },
  },
});
console.log(response);

Console

PUT azure-openai-embeddings
{
  "mappings": {
   "properties": {
   "content_embedding": {
   "type": "dense_vector",
   "dims": 1536,
   "element_type": "float",
   "similarity": "dot_product"
   },
   "content": {
   "type": "text"
   }
   }
  }
}


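	The name of the field to contain the generated embeddings. It must be referenced in the inference pipeline configuration in the next step.
	The field to contain the embeddings is a `dense_vector` field.
	The output dimensions of the model. Find this value in the Azure OpenAI documentation of the model you use.
	For Azure OpenAI embeddings, the `dot_product` function should be used to calculate similarity. See the Azure OpenAI embeddings documentation for more information on the model specifications.
	The name of the field from which to create the dense vector representation. In this example, the name of the field is `content`. It must be referenced in the inference pipeline configuration in the next step.
	The field type, which is `text` in this example.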

Python

resp = client.indices.create(
   index="azure-ai-studio-embeddings",
   mappings={
   "properties": {
   "content_embedding": {
   "type": "dense_vector",
   "dims": 1536,
   "element_type": "float",
   "similarity": "dot_product"
   },
   "content": {
   "type": "text"
   }
   }
   },
)
print(resp)

Js

const response = await client.indices.create({
  index: "azure-ai-studio-embeddings",
  mappings: {
   properties: {
   content_embedding: {
   type: "dense_vector",
   dims: 1536,
   element_type: "float",
   similarity: "dot_product",
   },
   content: {
   type: "text",
   },
   },
  },
});
console.log(response);

Console

PUT azure-ai-studio-embeddings
{
  "mappings": {
   "properties": {
   "content_embedding": {
   "type": "dense_vector",
   "dims": 1536,
   "element_type": "float",
   "similarity": "dot_product"
   },
   "content": {
   "type": "text"
   }
   }
  }
}


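	The name of the field to contain the generated embeddings. It must be referenced in the inference pipeline configuration in the next step.
	The field to contain the embeddings is a `dense_vector` field.
	The output dimensions of the model. Find this value in the model card of your Azure AI Studio deployment.
	For Azure AI Studio embeddings, the `dot_product` function should be used to calculate similarity.
	The name of the field from which to create the dense vector representation. In this example, the name of the field is `content`. It must be referenced in the inference pipeline configuration in the next step.
	The field type, which is `text` in this example.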

Python

resp = client.indices.create(
   index="google-vertex-ai-embeddings",
   mappings={
   "properties": {
   "content_embedding": {
   "type": "dense_vector",
   "dims": 768,
   "element_type": "float",
   "similarity": "dot_product"
   },
   "content": {
   "type": "text"
   }
   }
   },
)
print(resp)

Js

const response = await client.indices.create({
  index: "google-vertex-ai-embeddings",
  mappings: {
   properties: {
   content_embedding: {
   type: "dense_vector",
   dims: 768,
   element_type: "float",
   similarity: "dot_product",
   },
   content: {
   type: "text",
   },
   },
  },
});
console.log(response);

Console

PUT google-vertex-ai-embeddings
{
  "mappings": {
   "properties": {
   "content_embedding": {
   "type": "dense_vector",
   "dims": 768,
   "element_type": "float",
   "similarity": "dot_product"
   },
   "content": {
   "type": "text"
   }
   }
  }
}


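	The name of the field to contain the generated embeddings. It must be referenced in the inference pipeline configuration in the next step.
	The field to contain the embeddings is a `dense_vector` field.
	The output dimensions of the model. You can find this value in the Google Vertex AI model reference. The inference API attempts to calculate the output dimensions automatically if `dims` is not specified.
	For Google Vertex AI embeddings, the `dot_product` function should be used to calculate similarity.
	The name of the field from which to create the dense vector representation. In this example, the name of the field is `content`. It must be referenced in the inference pipeline configuration in the next step.
	The field type, which is `text` in this example.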

Python

resp = client.indices.create(
   index="mistral-embeddings",
   mappings={
   "properties": {
   "content_embedding": {
   "type": "dense_vector",
   "dims": 1024,
   "element_type": "float",
   "similarity": "dot_product"
   },
   "content": {
   "type": "text"
   }
   }
   },
)
print(resp)

Js

const response = await client.indices.create({
  index: "mistral-embeddings",
  mappings: {
   properties: {
   content_embedding: {
   type: "dense_vector",
   dims: 1024,
   element_type: "float",
   similarity: "dot_product",
   },
   content: {
   type: "text",
   },
   },
  },
});
console.log(response);

Console

PUT mistral-embeddings
{
  "mappings": {
   "properties": {
   "content_embedding": {
   "type": "dense_vector",
   "dims": 1024,
   "element_type": "float",
   "similarity": "dot_product"
   },
   "content": {
   "type": "text"
   }
   }
  }
}


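	The name of the field to contain the generated embeddings. It must be referenced in the inference pipeline configuration in the next step.
	The field to contain the embeddings is a `dense_vector` field.
	The output dimensions of the model. You can find this value in the Mistral model reference.
	For Mistral embeddings, the `dot_product` function should be used to calculate similarity.
	The name of the field from which to create the dense vector representation. In this example, the name of the field is `content`. It must be referenced in the inference pipeline configuration in the next step.
	The field type, which is `text` in this example.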

Python

resp = client.indices.create(
   index="amazon-bedrock-embeddings",
   mappings={
   "properties": {
   "content_embedding": {
   "type": "dense_vector",
   "dims": 1024,
   "element_type": "float",
   "similarity": "dot_product"
   },
   "content": {
   "type": "text"
   }
   }
   },
)
print(resp)

Js

const response = await client.indices.create({
  index: "amazon-bedrock-embeddings",
  mappings: {
   properties: {
   content_embedding: {
   type: "dense_vector",
   dims: 1024,
   element_type: "float",
   similarity: "dot_product",
   },
   content: {
   type: "text",
   },
   },
  },
});
console.log(response);

Console

PUT amazon-bedrock-embeddings
{
  "mappings": {
   "properties": {
   "content_embedding": {
   "type": "dense_vector",
   "dims": 1024,
   "element_type": "float",
   "similarity": "dot_product"
   },
   "content": {
   "type": "text"
   }
   }
  }
}


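	The name of the field to contain the generated embeddings. It must be referenced in the inference pipeline configuration in the next step.
	The field to contain the embeddings is a `dense_vector` field.
	The output dimensions of the model. This value may differ depending on the model you use. Refer to the documentation for Amazon Titan models or Cohere Embeddings models.
	For Amazon Bedrock embeddings, use the `dot_product` function to calculate similarity for Amazon Titan models, or the `cosine` function for Cohere models.
	The name of the field from which to create the dense vector representation. In this example, the name of the field is `content`. It must be referenced in the inference pipeline configuration in the next step.
	The field type, which is `text` in this example.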

Create an ingest pipeline with an inference processor

Create an ingest pipeline with an inference processor and use the model you created above to infer against the data that is being ingested in the pipeline.

Python

resp = client.ingest.put_pipeline(
   id="cohere_embeddings_pipeline",
   processors=[
   {
   "inference": {
   "model_id": "cohere_embeddings",
   "input_output": {
   "input_field": "content",
   "output_field": "content_embedding"
   }
   }
   }
   ],
)
print(resp)

Js

const response = await client.ingest.putPipeline({
  id: "cohere_embeddings_pipeline",
  processors: [
   {
   inference: {
   model_id: "cohere_embeddings",
   input_output: {
   input_field: "content",
   output_field: "content_embedding",
   },
   },
   },
  ],
});
console.log(response);

Console

PUT _ingest/pipeline/cohere_embeddings_pipeline
{
  "processors": [
   {
   "inference": {
   "model_id": "cohere_embeddings",
   "input_output": {
   "input_field": "content",
   "output_field": "content_embedding"
   }
   }
   }
  ]
}


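	The name of the inference endpoint you created by using the Create inference API. It is referred to as `inference_id` in that step.
	Configuration object that defines the `input_field` for the inference process and the `output_field` that will contain the inference results.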

Python

resp = client.ingest.put_pipeline(
   id="elser_embeddings_pipeline",
   processors=[
   {
   "inference": {
   "model_id": "elser_embeddings",
   "input_output": {
   "input_field": "content",
   "output_field": "content_embedding"
   }
   }
   }
   ],
)
print(resp)

Js

const response = await client.ingest.putPipeline({
  id: "elser_embeddings_pipeline",
  processors: [
   {
   inference: {
   model_id: "elser_embeddings",
   input_output: {
   input_field: "content",
   output_field: "content_embedding",
   },
   },
   },
  ],
});
console.log(response);

Console

PUT _ingest/pipeline/elser_embeddings_pipeline
{
  "processors": [
   {
   "inference": {
   "model_id": "elser_embeddings",
   "input_output": {
   "input_field": "content",
   "output_field": "content_embedding"
   }
   }
   }
  ]
}


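	The name of the inference endpoint you created by using the Create inference API. It is referred to as `inference_id` in that step.
	Configuration object that defines the `input_field` for the inference process and the `output_field` that will contain the inference results.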

Python

resp = client.ingest.put_pipeline(
   id="hugging_face_embeddings_pipeline",
   processors=[
   {
   "inference": {
   "model_id": "hugging_face_embeddings",
   "input_output": {
   "input_field": "content",
   "output_field": "content_embedding"
   }
   }
   }
   ],
)
print(resp)

Js

const response = await client.ingest.putPipeline({
  id: "hugging_face_embeddings_pipeline",
  processors: [
   {
   inference: {
   model_id: "hugging_face_embeddings",
   input_output: {
   input_field: "content",
   output_field: "content_embedding",
   },
   },
   },
  ],
});
console.log(response);

Console

PUT _ingest/pipeline/hugging_face_embeddings_pipeline
{
  "processors": [
   {
   "inference": {
   "model_id": "hugging_face_embeddings",
   "input_output": {
   "input_field": "content",
   "output_field": "content_embedding"
   }
   }
   }
  ]
}


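	The name of the inference endpoint you created by using the Create inference API. It is referred to as `inference_id` in that step.
	Configuration object that defines the `input_field` for the inference process and the `output_field` that will contain the inference results.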

Python

resp = client.ingest.put_pipeline(
   id="openai_embeddings_pipeline",
   processors=[
   {
   "inference": {
   "model_id": "openai_embeddings",
   "input_output": {
   "input_field": "content",
   "output_field": "content_embedding"
   }
   }
   }
   ],
)
print(resp)

Js

const response = await client.ingest.putPipeline({
  id: "openai_embeddings_pipeline",
  processors: [
   {
   inference: {
   model_id: "openai_embeddings",
   input_output: {
   input_field: "content",
   output_field: "content_embedding",
   },
   },
   },
  ],
});
console.log(response);

Console

PUT _ingest/pipeline/openai_embeddings_pipeline
{
  "processors": [
   {
   "inference": {
   "model_id": "openai_embeddings",
   "input_output": {
   "input_field": "content",
   "output_field": "content_embedding"
   }
   }
   }
  ]
}


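	The name of the inference endpoint you created by using the Create inference API. It is referred to as `inference_id` in that step.
	Configuration object that defines the `input_field` for the inference process and the `output_field` that will contain the inference results.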

Python

resp = client.ingest.put_pipeline(
   id="azure_openai_embeddings_pipeline",
   processors=[
   {
   "inference": {
   "model_id": "azure_openai_embeddings",
   "input_output": {
   "input_field": "content",
   "output_field": "content_embedding"
   }
   }
   }
   ],
)
print(resp)

Js

const response = await client.ingest.putPipeline({
  id: "azure_openai_embeddings_pipeline",
  processors: [
   {
   inference: {
   model_id: "azure_openai_embeddings",
   input_output: {
   input_field: "content",
   output_field: "content_embedding",
   },
   },
   },
  ],
});
console.log(response);

Console

PUT _ingest/pipeline/azure_openai_embeddings_pipeline
{
  "processors": [
   {
   "inference": {
   "model_id": "azure_openai_embeddings",
   "input_output": {
   "input_field": "content",
   "output_field": "content_embedding"
   }
   }
   }
  ]
}


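	The name of the inference endpoint you created by using the Create inference API. It is referred to as `inference_id` in that step.
	Configuration object that defines the `input_field` for the inference process and the `output_field` that will contain the inference results.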

Python

resp = client.ingest.put_pipeline(
   id="azure_ai_studio_embeddings_pipeline",
   processors=[
   {
   "inference": {
   "model_id": "azure_ai_studio_embeddings",
   "input_output": {
   "input_field": "content",
   "output_field": "content_embedding"
   }
   }
   }
   ],
)
print(resp)

Js

const response = await client.ingest.putPipeline({
  id: "azure_ai_studio_embeddings_pipeline",
  processors: [
   {
   inference: {
   model_id: "azure_ai_studio_embeddings",
   input_output: {
   input_field: "content",
   output_field: "content_embedding",
   },
   },
   },
  ],
});
console.log(response);

Console

PUT _ingest/pipeline/azure_ai_studio_embeddings_pipeline
{
  "processors": [
   {
   "inference": {
   "model_id": "azure_ai_studio_embeddings",
   "input_output": {
   "input_field": "content",
   "output_field": "content_embedding"
   }
   }
   }
  ]
}


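	The name of the inference endpoint you created by using the Create inference API. It is referred to as `inference_id` in that step.
	Configuration object that defines the `input_field` for the inference process and the `output_field` that will contain the inference results.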

Python

resp = client.ingest.put_pipeline(
   id="google_vertex_ai_embeddings_pipeline",
   processors=[
   {
   "inference": {
   "model_id": "google_vertex_ai_embeddings",
   "input_output": {
   "input_field": "content",
   "output_field": "content_embedding"
   }
   }
   }
   ],
)
print(resp)

Js

const response = await client.ingest.putPipeline({
  id: "google_vertex_ai_embeddings_pipeline",
  processors: [
   {
   inference: {
   model_id: "google_vertex_ai_embeddings",
   input_output: {
   input_field: "content",
   output_field: "content_embedding",
   },
   },
   },
  ],
});
console.log(response);

Console

PUT _ingest/pipeline/google_vertex_ai_embeddings_pipeline
{
  "processors": [
   {
   "inference": {
   "model_id": "google_vertex_ai_embeddings",
   "input_output": {
   "input_field": "content",
   "output_field": "content_embedding"
   }
   }
   }
  ]
}


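	The name of the inference endpoint you created by using the Create inference API. It is referred to as `inference_id` in that step.
	Configuration object that defines the `input_field` for the inference process and the `output_field` that will contain the inference results.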

Python

resp = client.ingest.put_pipeline(
   id="mistral_embeddings_pipeline",
   processors=[
   {
   "inference": {
   "model_id": "mistral_embeddings",
   "input_output": {
   "input_field": "content",
   "output_field": "content_embedding"
   }
   }
   }
   ],
)
print(resp)

Js

const response = await client.ingest.putPipeline({
  id: "mistral_embeddings_pipeline",
  processors: [
   {
   inference: {
   model_id: "mistral_embeddings",
   input_output: {
   input_field: "content",
   output_field: "content_embedding",
   },
   },
   },
  ],
});
console.log(response);

Console

PUT _ingest/pipeline/mistral_embeddings_pipeline
{
  "processors": [
   {
   "inference": {
   "model_id": "mistral_embeddings",
   "input_output": {
   "input_field": "content",
   "output_field": "content_embedding"
   }
   }
   }
  ]
}


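	The name of the inference endpoint you created by using the Create inference API. It is referred to as `inference_id` in that step.
	Configuration object that defines the `input_field` for the inference process and the `output_field` that will contain the inference results.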

Python

resp = client.ingest.put_pipeline(
   id="amazon_bedrock_embeddings_pipeline",
   processors=[
   {
   "inference": {
   "model_id": "amazon_bedrock_embeddings",
   "input_output": {
   "input_field": "content",
   "output_field": "content_embedding"
   }
   }
   }
   ],
)
print(resp)

Js

const response = await client.ingest.putPipeline({
  id: "amazon_bedrock_embeddings_pipeline",
  processors: [
   {
   inference: {
   model_id: "amazon_bedrock_embeddings",
   input_output: {
   input_field: "content",
   output_field: "content_embedding",
   },
   },
   },
  ],
});
console.log(response);

Console

PUT _ingest/pipeline/amazon_bedrock_embeddings_pipeline
{
  "processors": [
   {
   "inference": {
   "model_id": "amazon_bedrock_embeddings",
   "input_output": {
   "input_field": "content",
   "output_field": "content_embedding"
   }
   }
   }
  ]
}


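	The name of the inference endpoint you created by using the Create inference API. It is referred to as `inference_id` in that step.
	Configuration object that defines the `input_field` for the inference process and the `output_field` that will contain the inference results.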
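
The pipeline definitions above differ only in the pipeline ID and the inference endpoint they reference. As a minimal sketch, the helper below assembles the `processors` list for any endpoint. `build_inference_processors` is a hypothetical name used for illustration and is not part of the Elasticsearch client; it only builds a dictionary and does not contact a cluster.

```python
def build_inference_processors(inference_id, input_field="content",
                               output_field="content_embedding"):
    """Return the processors list for an inference ingest pipeline.

    Hypothetical helper: it mirrors the structure of the examples above
    and performs no network calls.
    """
    return [
        {
            "inference": {
                # ID of the inference endpoint created in the earlier step.
                "model_id": inference_id,
                "input_output": {
                    # Field whose text is sent to the model.
                    "input_field": input_field,
                    # Field that receives the generated embedding.
                    "output_field": output_field,
                },
            }
        }
    ]


processors = build_inference_processors("mistral_embeddings")
```

The result could then be passed as `processors=processors` to `client.ingest.put_pipeline`, in the same way as the Python examples above.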

Load data

In this step, you load the data that you later send through the inference ingest pipeline to create embeddings from it.

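This tutorial uses the msmarco-passagetest2019-top1000 data set, which is a subset of the MS MARCO Passage Ranking data set. It consists of 200 queries, each accompanied by a list of relevant text passages. All unique passages, along with their IDs, have been extracted from that data set and compiled into a tsv file.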

Download the file and upload it to your cluster using the Data Visualizer in the Machine Learning UI. After your data is analyzed, click Override settings. Under Edit field names, assign id to the first column and content to the second. Click Apply, then Import. Name the index test-data, and click Import. After the upload is complete, you will see an index named test-data with 182,469 documents.
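
If you prefer a scripted upload over the Data Visualizer, the rows of the tsv file can be turned into bulk actions. The following is a minimal sketch under stated assumptions: the file has no header row, the first column is a numeric ID, and the second column is the passage text. `tsv_to_actions` and the sample rows are hypothetical and only illustrate the shape of the documents.

```python
import csv


def tsv_to_actions(lines, index="test-data"):
    """Yield one bulk action per TSV row (first column id, second content).

    Hypothetical helper; assumes no header row and a numeric first column.
    """
    # QUOTE_NONE keeps quotation marks inside passages as literal text.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed rows
        yield {
            "_index": index,
            "_source": {"id": int(row[0]), "content": row[1]},
        }


# Made-up sample rows, not taken from the real data set.
sample = ["101\tFirst passage text.", "102\tSecond passage text."]
actions = list(tsv_to_actions(sample))
```

With the Python client installed, an iterable of such actions is the input that `elasticsearch.helpers.bulk(client, actions)` expects.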

Ingest the data through the inference ingest pipeline

Create embeddings from the text by reindexing the data through the inference pipeline that uses your chosen model. This step uses the reindex API to simulate data ingestion through a pipeline.

Python

resp = client.reindex(
   wait_for_completion=False,
   source={
   "index": "test-data",
   "size": 50
   },
   dest={
   "index": "cohere-embeddings",
   "pipeline": "cohere_embeddings_pipeline"
   },
)
print(resp)

Js

const response = await client.reindex({
  wait_for_completion: false,
  source: {
   index: "test-data",
   size: 50,
  },
  dest: {
   index: "cohere-embeddings",
   pipeline: "cohere_embeddings_pipeline",
  },
});
console.log(response);

Console

POST _reindex?wait_for_completion=false
{
  "source": {
   "index": "test-data",
   "size": 50
  },
  "dest": {
   "index": "cohere-embeddings",
   "pipeline": "cohere_embeddings_pipeline"
  }
}


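	The default batch size for reindexing is 1000. Reducing `size` to a smaller number makes the updates of the reindexing process quicker, which enables you to follow the progress closely and detect errors early.

The rate limit of your Cohere account may affect the throughput of the reindexing process.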

Python

resp = client.reindex(
   wait_for_completion=False,
   source={
   "index": "test-data",
   "size": 50
   },
   dest={
   "index": "elser-embeddings",
   "pipeline": "elser_embeddings_pipeline"
   },
)
print(resp)

Js

const response = await client.reindex({
  wait_for_completion: false,
  source: {
   index: "test-data",
   size: 50,
  },
  dest: {
   index: "elser-embeddings",
   pipeline: "elser_embeddings_pipeline",
  },
});
console.log(response);

Console

POST _reindex?wait_for_completion=false
{
  "source": {
   "index": "test-data",
   "size": 50
  },
  "dest": {
   "index": "elser-embeddings",
   "pipeline": "elser_embeddings_pipeline"
  }
}


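	The default batch size for reindexing is 1000. Reducing `size` to a smaller number makes the updates of the reindexing process quicker, which enables you to follow the progress closely and detect errors early.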

Python

resp = client.reindex(
   wait_for_completion=False,
   source={
   "index": "test-data",
   "size": 50
   },
   dest={
   "index": "hugging-face-embeddings",
   "pipeline": "hugging_face_embeddings_pipeline"
   },
)
print(resp)

Js

const response = await client.reindex({
  wait_for_completion: false,
  source: {
   index: "test-data",
   size: 50,
  },
  dest: {
   index: "hugging-face-embeddings",
   pipeline: "hugging_face_embeddings_pipeline",
  },
});
console.log(response);

Console

POST _reindex?wait_for_completion=false
{
  "source": {
   "index": "test-data",
   "size": 50
  },
  "dest": {
   "index": "hugging-face-embeddings",
   "pipeline": "hugging_face_embeddings_pipeline"
  }
}


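	The default batch size for reindexing is 1000. Reducing `size` to a smaller number makes the updates of the reindexing process quicker, which enables you to follow the progress closely and detect errors early.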

Python

resp = client.reindex(
   wait_for_completion=False,
   source={
   "index": "test-data",
   "size": 50
   },
   dest={
   "index": "openai-embeddings",
   "pipeline": "openai_embeddings_pipeline"
   },
)
print(resp)

Js

const response = await client.reindex({
  wait_for_completion: false,
  source: {
   index: "test-data",
   size: 50,
  },
  dest: {
   index: "openai-embeddings",
   pipeline: "openai_embeddings_pipeline",
  },
});
console.log(response);

Console

POST _reindex?wait_for_completion=false
{
  "source": {
   "index": "test-data",
   "size": 50
  },
  "dest": {
   "index": "openai-embeddings",
   "pipeline": "openai_embeddings_pipeline"
  }
}


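	The default batch size for reindexing is 1000. Reducing `size` to a smaller number makes the updates of the reindexing process quicker, which enables you to follow the progress closely and detect errors early.

The rate limit of your OpenAI account may affect the throughput of the reindexing process. If this happens, change `size` to 3 or a value of similar magnitude.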

Python

resp = client.reindex(
   wait_for_completion=False,
   source={
   "index": "test-data",
   "size": 50
   },
   dest={
   "index": "azure-openai-embeddings",
   "pipeline": "azure_openai_embeddings_pipeline"
   },
)
print(resp)

Js

const response = await client.reindex({
  wait_for_completion: false,
  source: {
   index: "test-data",
   size: 50,
  },
  dest: {
   index: "azure-openai-embeddings",
   pipeline: "azure_openai_embeddings_pipeline",
  },
});
console.log(response);

Console

POST _reindex?wait_for_completion=false
{
  "source": {
   "index": "test-data",
   "size": 50
  },
  "dest": {
   "index": "azure-openai-embeddings",
   "pipeline": "azure_openai_embeddings_pipeline"
  }
}


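	The default batch size for reindexing is 1000. Reducing `size` to a smaller number makes the updates of the reindexing process quicker, which enables you to follow the progress closely and detect errors early.

The rate limit of your Azure OpenAI account may affect the throughput of the reindexing process. If this happens, change `size` to 3 or a value of similar magnitude.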

Python

resp = client.reindex(
   wait_for_completion=False,
   source={
   "index": "test-data",
   "size": 50
   },
   dest={
   "index": "azure-ai-studio-embeddings",
   "pipeline": "azure_ai_studio_embeddings_pipeline"
   },
)
print(resp)

Js

const response = await client.reindex({
  wait_for_completion: false,
  source: {
   index: "test-data",
   size: 50,
  },
  dest: {
   index: "azure-ai-studio-embeddings",
   pipeline: "azure_ai_studio_embeddings_pipeline",
  },
});
console.log(response);

Console

POST _reindex?wait_for_completion=false
{
  "source": {
   "index": "test-data",
   "size": 50
  },
  "dest": {
   "index": "azure-ai-studio-embeddings",
   "pipeline": "azure_ai_studio_embeddings_pipeline"
  }
}


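	The default batch size for reindexing is 1000. Reducing `size` to a smaller number makes the updates of the reindexing process quicker, which enables you to follow the progress closely and detect errors early.

Your Azure AI Studio model deployment may have rate limits in place that affect the throughput of the reindexing process. If this happens, change `size` to 3 or a value of similar magnitude.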

Python

resp = client.reindex(
   wait_for_completion=False,
   source={
   "index": "test-data",
   "size": 50
   },
   dest={
   "index": "google-vertex-ai-embeddings",
   "pipeline": "google_vertex_ai_embeddings_pipeline"
   },
)
print(resp)

Js

const response = await client.reindex({
  wait_for_completion: false,
  source: {
   index: "test-data",
   size: 50,
  },
  dest: {
   index: "google-vertex-ai-embeddings",
   pipeline: "google_vertex_ai_embeddings_pipeline",
  },
});
console.log(response);

Console

POST _reindex?wait_for_completion=false
{
  "source": {
   "index": "test-data",
   "size": 50
  },
  "dest": {
   "index": "google-vertex-ai-embeddings",
   "pipeline": "google_vertex_ai_embeddings_pipeline"
  }
}


	The default batch size for reindexing is 1000. Reducing `size` makes the updates of the reindexing process quicker. This enables you to follow the progress closely and detect errors early.

Python

resp = client.reindex(
   wait_for_completion=False,
   source={
   "index": "test-data",
   "size": 50
   },
   dest={
   "index": "mistral-embeddings",
   "pipeline": "mistral_embeddings_pipeline"
   },
)
print(resp)

Js

const response = await client.reindex({
  wait_for_completion: false,
  source: {
   index: "test-data",
   size: 50,
  },
  dest: {
   index: "mistral-embeddings",
   pipeline: "mistral_embeddings_pipeline",
  },
});
console.log(response);

Console

POST _reindex?wait_for_completion=false
{
  "source": {
   "index": "test-data",
   "size": 50
  },
  "dest": {
   "index": "mistral-embeddings",
   "pipeline": "mistral_embeddings_pipeline"
  }
}


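	The default batch size for reindexing is 1000. Reducing `size` to a smaller number makes the updates of the reindexing process quicker, which enables you to follow the progress closely and detect errors early.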

Python

resp = client.reindex(
   wait_for_completion=False,
   source={
   "index": "test-data",
   "size": 50
   },
   dest={
   "index": "amazon-bedrock-embeddings",
   "pipeline": "amazon_bedrock_embeddings_pipeline"
   },
)
print(resp)

Js

const response = await client.reindex({
  wait_for_completion: false,
  source: {
   index: "test-data",
   size: 50,
  },
  dest: {
   index: "amazon-bedrock-embeddings",
   pipeline: "amazon_bedrock_embeddings_pipeline",
  },
});
console.log(response);

Console

POST _reindex?wait_for_completion=false
{
  "source": {
   "index": "test-data",
   "size": 50
  },
  "dest": {
   "index": "amazon-bedrock-embeddings",
   "pipeline": "amazon_bedrock_embeddings_pipeline"
  }
}


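	The default batch size for reindexing is 1000. Reducing `size` to a smaller number makes the updates of the reindexing process quicker, which enables you to follow the progress closely and detect errors early.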
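
The reindex requests above follow one naming pattern: the destination index is `<service>-embeddings` and the pipeline is `<service>_embeddings_pipeline`. The sketch below builds the request from a service slug and a batch size. `build_reindex_request` is a hypothetical helper for illustration, and the naming pattern is inferred from the examples on this page rather than enforced by Elasticsearch.

```python
def build_reindex_request(service, batch_size=50, source_index="test-data"):
    """Build the source/dest parts of a reindex request for one service.

    Hypothetical helper; the index and pipeline names follow the naming
    convention used in this tutorial.
    """
    # Normalize the service name to a hyphenated slug, e.g. "azure-ai-studio".
    slug = service.strip().lower().replace(" ", "-").replace("_", "-")
    # Pipeline IDs in this tutorial use underscores instead of hyphens.
    pipeline_prefix = slug.replace("-", "_")
    return {
        "source": {"index": source_index, "size": batch_size},
        "dest": {
            "index": slug + "-embeddings",
            "pipeline": pipeline_prefix + "_embeddings_pipeline",
        },
    }


# A small batch size, as suggested when the service applies rate limits.
request = build_reindex_request("azure-ai-studio", batch_size=3)
```

The dictionary can be unpacked into the client call, for example `client.reindex(wait_for_completion=False, **request)`.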

The reindex call returns a task ID that you can use to monitor the progress:

Python

resp = client.tasks.get(
   task_id="<task_id>",
)
print(resp)

Js

const response = await client.tasks.get({
  task_id: "<task_id>",
});
console.log(response);

Console

GET _tasks/<task_id>
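
To turn the task response into a progress figure, you can compare the number of processed documents with the total. The sketch below assumes the response contains a `completed` flag and a `task.status` object with `total`, `created`, `updated`, and `deleted` counters, which is the shape typically reported for reindex tasks. `reindex_progress` is a hypothetical helper and the sample response is made up for illustration.

```python
def reindex_progress(task_response):
    """Summarize a reindex task response as processed/total counts.

    Hypothetical helper; assumes the status counters described above.
    """
    status = task_response["task"]["status"]
    processed = (
        status.get("created", 0)
        + status.get("updated", 0)
        + status.get("deleted", 0)
    )
    total = status.get("total", 0)
    return {
        "completed": task_response.get("completed", False),
        "processed": processed,
        "total": total,
        # Guard against division by zero before the total is known.
        "fraction": processed / total if total else 0.0,
    }


# Made-up response: 90 batches of 50 documents have been written so far.
sample_response = {
    "completed": False,
    "task": {
        "status": {
            "total": 182469,
            "created": 4500,
            "updated": 0,
            "deleted": 0,
            "batches": 90,
        }
    },
}
progress = reindex_progress(sample_response)
```

Calling `reindex_progress(client.tasks.get(task_id="<task_id>"))` in a loop would give a simple way to poll the task until `completed` is true.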

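Reindexing large data sets can take a long time. You can test this workflow using only a subset of the data set. To do this, cancel the reindexing process so that embeddings are generated only for the subset that was already reindexed. The following API request cancels the reindexing task: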

Python

resp = client.tasks.cancel(
   task_id="<task_id>",
)
print(resp)

Js

const response = await client.tasks.cancel({
  task_id: "<task_id>",
});
console.log(response);

Console

POST _tasks/<task_id>/_cancel

Semantic search

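After the data set has been enriched with the embeddings, you can query the data using semantic search. For dense vector models, pass a `query_vector_builder` to the k-nearest neighbor (kNN) vector search API, and provide the query text and the model you used to create the embeddings. For a sparse vector model like ELSER, use a `sparse_vector` query, and provide the query text and the model you used to create the embeddings.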

If you cancelled the reindexing process, you run the query only on a part of the data, which affects the quality of your results.
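
The two query shapes described above can be sketched side by side. The helper below returns a `knn` clause for dense models and a `sparse_vector` query for sparse models, using the same field names as the examples on this page. `build_semantic_search` is a hypothetical name and not part of the Elasticsearch client.

```python
def build_semantic_search(inference_id, query_text, sparse=False,
                          field="content_embedding", k=10,
                          num_candidates=100):
    """Build the search arguments for a dense (kNN) or sparse query.

    Hypothetical helper; it only assembles dictionaries.
    """
    if sparse:
        # Sparse models such as ELSER use the sparse_vector query.
        return {
            "query": {
                "sparse_vector": {
                    "field": field,
                    "inference_id": inference_id,
                    "query": query_text,
                }
            }
        }
    # Dense models embed the query text through query_vector_builder.
    return {
        "knn": {
            "field": field,
            "query_vector_builder": {
                "text_embedding": {
                    "model_id": inference_id,
                    "model_text": query_text,
                }
            },
            "k": k,
            "num_candidates": num_candidates,
        }
    }


dense_request = build_semantic_search("cohere_embeddings", "Muscles in human body")
sparse_request = build_semantic_search(
    "elser_embeddings", "How to avoid muscle soreness after running?", sparse=True
)
```

Either dictionary can be unpacked into the client call, for example `client.search(index="cohere-embeddings", source=["id", "content"], **dense_request)`.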

Python

resp = client.search(
   index="cohere-embeddings",
   knn={
   "field": "content_embedding",
   "query_vector_builder": {
   "text_embedding": {
   "model_id": "cohere_embeddings",
   "model_text": "Muscles in human body"
   }
   },
   "k": 10,
   "num_candidates": 100
   },
   source=[
   "id",
   "content"
   ],
)
print(resp)

Ruby

response = client.search(
  index: 'cohere-embeddings',
  body: {
   knn: {
   field: 'content_embedding',
   query_vector_builder: {
   text_embedding: {
   model_id: 'cohere_embeddings',
   model_text: 'Muscles in human body'
   }
   },
   k: 10,
   num_candidates: 100
   },
   _source: [
   'id',
   'content'
   ]
  }
)
puts response

Js

const response = await client.search({
  index: "cohere-embeddings",
  knn: {
   field: "content_embedding",
   query_vector_builder: {
   text_embedding: {
   model_id: "cohere_embeddings",
   model_text: "Muscles in human body",
   },
   },
   k: 10,
   num_candidates: 100,
  },
  _source: ["id", "content"],
});
console.log(response);

Console

GET cohere-embeddings/_search
{
  "knn": {
   "field": "content_embedding",
   "query_vector_builder": {
   "text_embedding": {
   "model_id": "cohere_embeddings",
   "model_text": "Muscles in human body"
   }
   },
   "k": 10,
   "num_candidates": 100
  },
  "_source": [
   "id",
   "content"
  ]
}

As a result, you receive the top 10 documents that are closest in meaning to the query from the cohere-embeddings index, sorted by their proximity to the query:

Console-Result

"hits": [
   {
   "_index": "cohere-embeddings",
   "_id": "-eFWCY4BECzWLnMZuI78",
   "_score": 0.737484,
   "_source": {
   "id": 1690948,
   "content": "Oxygen is supplied to the muscles via red blood cells. Red blood cells carry hemoglobin which oxygen bonds with as the hemoglobin rich blood cells pass through the blood vessels of the lungs.The now oxygen rich blood cells carry that oxygen to the cells that are demanding it, in this case skeletal muscle cells.ther ways in which muscles are supplied with oxygen include: 1  Blood flow from the heart is increased. 2  Blood flow to your muscles in increased. 3  Blood flow from nonessential organs is transported to working muscles."
   }
   },
   {
   "_index": "cohere-embeddings",
   "_id": "HuFWCY4BECzWLnMZuI_8",
   "_score": 0.7176013,
   "_source": {
   "id": 1692482,
   "content": "The thoracic cavity is separated from the abdominal cavity by the  diaphragm. This is a broad flat muscle.    (muscular) diaphragm The diaphragm is a muscle that separat…e the thoracic from the abdominal cavity. The pelvis is the lowest part of the abdominal cavity and it has no physical separation from it    Diaphragm."
   }
   },
   {
   "_index": "cohere-embeddings",
   "_id": "IOFWCY4BECzWLnMZuI_8",
   "_score": 0.7154432,
   "_source": {
   "id": 1692489,
   "content": "Muscular Wall Separating the Abdominal and Thoracic Cavities; Thoracic Cavity of a Fetal Pig; In Mammals the Diaphragm Separates the Abdominal Cavity from the"
   }
   },
   {
   "_index": "cohere-embeddings",
   "_id": "C-FWCY4BECzWLnMZuI_8",
   "_score": 0.695313,
   "_source": {
   "id": 1691493,
   "content": "Burning, aching, tenderness and stiffness are just some descriptors of the discomfort you may feel in the muscles you exercised one to two days ago.For the most part, these sensations you experience after exercise are collectively known as delayed onset muscle soreness.urning, aching, tenderness and stiffness are just some descriptors of the discomfort you may feel in the muscles you exercised one to two days ago."
   }
   },
   (...)
   ]
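
To inspect results like these in code, you can reduce each hit to its score, document ID, and a short excerpt. The sketch below assumes the usual search response layout with the hits under `hits.hits` and the requested fields under `_source`. `summarize_hits` is a hypothetical helper, and the sample response is an abbreviated copy of the first hit shown above.

```python
def summarize_hits(response, max_chars=40):
    """Return (score, id, excerpt) tuples for each hit in a search response.

    Hypothetical helper; assumes hits are under response["hits"]["hits"].
    """
    rows = []
    for hit in response["hits"]["hits"]:
        source = hit.get("_source", {})
        # Truncate the passage so the summary stays readable.
        excerpt = source.get("content", "")[:max_chars]
        rows.append((hit["_score"], source.get("id"), excerpt))
    return rows


# Abbreviated sample based on the first hit in the output above.
sample_response = {
    "hits": {
        "hits": [
            {
                "_index": "cohere-embeddings",
                "_id": "-eFWCY4BECzWLnMZuI78",
                "_score": 0.737484,
                "_source": {
                    "id": 1690948,
                    "content": "Oxygen is supplied to the muscles via red blood cells.",
                },
            }
        ]
    }
}
rows = summarize_hits(sample_response)
```

Because the hits are already sorted by proximity to the query, the first tuple in `rows` corresponds to the closest match.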

Python

resp = client.search(
   index="elser-embeddings",
   query={
   "sparse_vector": {
   "field": "content_embedding",
   "inference_id": "elser_embeddings",
   "query": "How to avoid muscle soreness after running?"
   }
   },
   source=[
   "id",
   "content"
   ],
)
print(resp)

Js

const response = await client.search({
  index: "elser-embeddings",
  query: {
   sparse_vector: {
   field: "content_embedding",
   inference_id: "elser_embeddings",
   query: "How to avoid muscle soreness after running?",
   },
  },
  _source: ["id", "content"],
});
console.log(response);

Console

GET elser-embeddings/_search
{
  "query":{
   "sparse_vector":{
   "field": "content_embedding",
   "inference_id": "elser_embeddings",
   "query": "How to avoid muscle soreness after running?"
   }
  },
  "_source": [
   "id",
   "content"
  ]
}

As a result, you receive the top 10 documents that are closest in meaning to the query from the elser-embeddings index, sorted by their proximity to the query:

Console-Result

"hits": [
   {
   "_index": "elser-embeddings",
   "_id": "ZLGc_pABZbBmsu5_eCoH",
   "_score": 21.472063,
   "_source": {
   "id": 2258240,
   "content": "You may notice some muscle aches while you are exercising. This is called acute soreness. More often, you may begin to feel sore about 12 hours after exercising, and the discomfort usually peaks at 48 to 72 hours after exercise. This is called delayed-onset muscle soreness.It is thought that, during this time, your body is repairing the muscle, making it stronger and bigger.You may also notice the muscles feel better if you exercise lightly. This is normal.his is called delayed-onset muscle soreness. It is thought that, during this time, your body is repairing the muscle, making it stronger and bigger. You may also notice the muscles feel better if you exercise lightly. This is normal."
   }
   },
   {
   "_index": "elser-embeddings",
   "_id": "ZbGc_pABZbBmsu5_eCoH",
   "_score": 21.421381,
   "_source": {
   "id": 2258242,
   "content": "Photo Credit Jupiterimages/Stockbyte/Getty Images. That stiff, achy feeling you get in the days after exercise is a normal physiological response known as delayed onset muscle soreness. You can take it as a positive sign that your muscles have felt the workout, but the pain may also turn you off to further exercise.ou are more likely to develop delayed onset muscle soreness if you are new to working out, if you’ve gone a long time without exercising and start up again, if you have picked up a new type of physical activity or if you have recently boosted the intensity, length or frequency of your exercise sessions."
   }
   },
   {
   "_index": "elser-embeddings",
   "_id": "ZrGc_pABZbBmsu5_eCoH",
   "_score": 20.542095,
   "_source": {
   "id": 2258248,
   "content": "They found that stretching before and after exercise has no effect on muscle soreness. Exercise might cause inflammation, which leads to an increase in the production of immune cells (comprised mostly of macrophages and neutrophils). Levels of these immune cells reach a peak 24-48 hours after exercise.These cells, in turn, produce bradykinins and prostaglandins, which make the pain receptors in your body more sensitive. Whenever you move, these pain receptors are stimulated.hey found that stretching before and after exercise has no effect on muscle soreness. Exercise might cause inflammation, which leads to an increase in the production of immune cells (comprised mostly of macrophages and neutrophils). Levels of these immune cells reach a peak 24-48 hours after exercise."
   }
   },
   (...)
  ]

Python

resp = client.search(
   index="hugging-face-embeddings",
   knn={
   "field": "content_embedding",
   "query_vector_builder": {
   "text_embedding": {
   "model_id": "hugging_face_embeddings",
   "model_text": "What's margin of error?"
   }
   },
   "k": 10,
   "num_candidates": 100
   },
   source=[
   "id",
   "content"
   ],
)
print(resp)

Ruby

response = client.search(
  index: 'hugging-face-embeddings',
  body: {
   knn: {
   field: 'content_embedding',
   query_vector_builder: {
   text_embedding: {
   model_id: 'hugging_face_embeddings',
   model_text: "What's margin of error?"
   }
   },
   k: 10,
   num_candidates: 100
   },
   _source: [
   'id',
   'content'
   ]
  }
)
puts response

Js

const response = await client.search({
  index: "hugging-face-embeddings",
  knn: {
   field: "content_embedding",
   query_vector_builder: {
   text_embedding: {
   model_id: "hugging_face_embeddings",
   model_text: "What's margin of error?",
   },
   },
   k: 10,
   num_candidates: 100,
  },
  _source: ["id", "content"],
});
console.log(response);

Console

GET hugging-face-embeddings/_search
{
  "knn": {
   "field": "content_embedding",
   "query_vector_builder": {
   "text_embedding": {
   "model_id": "hugging_face_embeddings",
   "model_text": "What's margin of error?"
   }
   },
   "k": 10,
   "num_candidates": 100
  },
  "_source": [
   "id",
   "content"
  ]
}

As a result, you receive the top 10 documents that are closest in meaning to the query from the hugging-face-embeddings index, sorted by their proximity to the query:

Console-Result

"hits": [
   {
   "_index": "hugging-face-embeddings",
   "_id": "ljEfo44BiUQvMpPgT20E",
   "_score": 0.8522128,
   "_source": {
   "id": 7960255,
   "content": "The margin of error can be defined by either of the following equations. Margin of error = Critical value x Standard deviation of the statistic. Margin of error = Critical value x Standard error of the statistic. If you know the standard deviation of the statistic, use the first equation to compute the margin of error. Otherwise, use the second equation. Previously, we described how to compute the standard deviation and standard error."
   }
   },
   {
   "_index": "hugging-face-embeddings",
   "_id": "lzEfo44BiUQvMpPgT20E",
   "_score": 0.7865497,
   "_source": {
   "id": 7960259,
   "content": "1 y ou are told only the size of the sample and are asked to provide the margin of error for percentages which are not (yet) known. 2  This is typically the case when you are computing the margin of error for a survey which is going to be conducted in the future."
   }
   },
   {
   "_index": "hugging-face-embeddings1",
   "_id": "DjEfo44BiUQvMpPgT20E",
   "_score": 0.6229427,
   "_source": {
   "id": 2166183,
   "content": "1. In general, the point at which gains equal losses. 2. In options, the market price that a stock must reach for option buyers to avoid a loss if they exercise. For a call, it is the strike price plus the premium paid. For a put, it is the strike price minus the premium paid."
   }
   },
   {
   "_index": "hugging-face-embeddings1",
   "_id": "VzEfo44BiUQvMpPgT20E",
   "_score": 0.6034223,
   "_source": {
   "id": 2173417,
   "content": "How do you find the area of a circle? Can you measure the area of a circle and use that to find a value for Pi?"
   }
   },
   (...)
   ]

Python

resp = client.search(
   index="openai-embeddings",
   knn={
   "field": "content_embedding",
   "query_vector_builder": {
   "text_embedding": {
   "model_id": "openai_embeddings",
   "model_text": "Calculate fuel cost"
   }
   },
   "k": 10,
   "num_candidates": 100
   },
   source=[
   "id",
   "content"
   ],
)
print(resp)

Ruby

response = client.search(
  index: 'openai-embeddings',
  body: {
   knn: {
   field: 'content_embedding',
   query_vector_builder: {
   text_embedding: {
   model_id: 'openai_embeddings',
   model_text: 'Calculate fuel cost'
   }
   },
   k: 10,
   num_candidates: 100
   },
   _source: [
   'id',
   'content'
   ]
  }
)
puts response

Js

const response = await client.search({
  index: "openai-embeddings",
  knn: {
   field: "content_embedding",
   query_vector_builder: {
   text_embedding: {
   model_id: "openai_embeddings",
   model_text: "Calculate fuel cost",
   },
   },
   k: 10,
   num_candidates: 100,
  },
  _source: ["id", "content"],
});
console.log(response);

Console

GET openai-embeddings/_search
{
  "knn": {
   "field": "content_embedding",
   "query_vector_builder": {
   "text_embedding": {
   "model_id": "openai_embeddings",
   "model_text": "Calculate fuel cost"
   }
   },
   "k": 10,
   "num_candidates": 100
  },
  "_source": [
   "id",
   "content"
  ]
}

As a result, you receive the top 10 documents that are closest in meaning to the query from the openai-embeddings index, sorted by their proximity to the query:

Console-Result

"hits": [
   {
   "_index": "openai-embeddings",
   "_id": "DDd5OowBHxQKHyc3TDSC",
   "_score": 0.83704096,
   "_source": {
   "id": 862114,
   "body": "How to calculate fuel cost for a road trip. By Tara Baukus Mello • Bankrate.com. Dear Driving for Dollars, My family is considering taking a long road trip to finish off the end of the summer, but I'm a little worried about gas prices and our overall fuel cost.It doesn't seem easy to calculate since we'll be traveling through many states and we are considering several routes.y family is considering taking a long road trip to finish off the end of the summer, but I'm a little worried about gas prices and our overall fuel cost. It doesn't seem easy to calculate since we'll be traveling through many states and we are considering several routes."
   }
   },
   {
   "_index": "openai-embeddings",
   "_id": "ajd5OowBHxQKHyc3TDSC",
   "_score": 0.8345704,
   "_source": {
   "id": 820622,
   "body": "Home Heating Calculator. Typically, approximately 50% of the energy consumed in a home annually is for space heating. When deciding on a heating system, many factors will come into play: cost of fuel, installation cost, convenience and life style are all important.This calculator can help you estimate the cost of fuel for different heating appliances.hen deciding on a heating system, many factors will come into play: cost of fuel, installation cost, convenience and life style are all important. This calculator can help you estimate the cost of fuel for different heating appliances."
   }
   },
   {
   "_index": "openai-embeddings",
   "_id": "Djd5OowBHxQKHyc3TDSC",
   "_score": 0.8327426,
   "_source": {
   "id": 8202683,
   "body": "Fuel is another important cost. This cost will depend on your boat, how far you travel, and how fast you travel. A 33-foot sailboat traveling at 7 knots should be able to travel 300 miles on 50 gallons of diesel fuel.If you are paying $4 per gallon, the trip would cost you $200.Most boats have much larger gas tanks than cars.uel is another important cost. This cost will depend on your boat, how far you travel, and how fast you travel. A 33-foot sailboat traveling at 7 knots should be able to travel 300 miles on 50 gallons of diesel fuel."
   }
   },
   (...)
   ]

Python

resp = client.search(
   index="azure-openai-embeddings",
   knn={
   "field": "content_embedding",
   "query_vector_builder": {
   "text_embedding": {
   "model_id": "azure_openai_embeddings",
   "model_text": "Calculate fuel cost"
   }
   },
   "k": 10,
   "num_candidates": 100
   },
   source=[
   "id",
   "content"
   ],
)
print(resp)

Ruby

response = client.search(
  index: 'azure-openai-embeddings',
  body: {
   knn: {
   field: 'content_embedding',
   query_vector_builder: {
   text_embedding: {
   model_id: 'azure_openai_embeddings',
   model_text: 'Calculate fuel cost'
   }
   },
   k: 10,
   num_candidates: 100
   },
   _source: [
   'id',
   'content'
   ]
  }
)
puts response

Js

const response = await client.search({
  index: "azure-openai-embeddings",
  knn: {
   field: "content_embedding",
   query_vector_builder: {
   text_embedding: {
   model_id: "azure_openai_embeddings",
   model_text: "Calculate fuel cost",
   },
   },
   k: 10,
   num_candidates: 100,
  },
  _source: ["id", "content"],
});
console.log(response);

Console

GET azure-openai-embeddings/_search
{
  "knn": {
   "field": "content_embedding",
   "query_vector_builder": {
   "text_embedding": {
   "model_id": "azure_openai_embeddings",
   "model_text": "Calculate fuel cost"
   }
   },
   "k": 10,
   "num_candidates": 100
  },
  "_source": [
   "id",
   "content"
  ]
}

As a result, you receive the top 10 documents that are closest in meaning to the query from the azure-openai-embeddings index, sorted by their proximity to the query:

Console-Result

"hits": [
   {
   "_index": "azure-openai-embeddings",
   "_id": "DDd5OowBHxQKHyc3TDSC",
   "_score": 0.83704096,
   "_source": {
   "id": 862114,
   "body": "How to calculate fuel cost for a road trip. By Tara Baukus Mello • Bankrate.com. Dear Driving for Dollars, My family is considering taking a long road trip to finish off the end of the summer, but I'm a little worried about gas prices and our overall fuel cost.It doesn't seem easy to calculate since we'll be traveling through many states and we are considering several routes.y family is considering taking a long road trip to finish off the end of the summer, but I'm a little worried about gas prices and our overall fuel cost. It doesn't seem easy to calculate since we'll be traveling through many states and we are considering several routes."
   }
   },
   {
   "_index": "azure-openai-embeddings",
   "_id": "ajd5OowBHxQKHyc3TDSC",
   "_score": 0.8345704,
   "_source": {
   "id": 820622,
   "body": "Home Heating Calculator. Typically, approximately 50% of the energy consumed in a home annually is for space heating. When deciding on a heating system, many factors will come into play: cost of fuel, installation cost, convenience and life style are all important.This calculator can help you estimate the cost of fuel for different heating appliances.hen deciding on a heating system, many factors will come into play: cost of fuel, installation cost, convenience and life style are all important. This calculator can help you estimate the cost of fuel for different heating appliances."
   }
   },
   {
   "_index": "azure-openai-embeddings",
   "_id": "Djd5OowBHxQKHyc3TDSC",
   "_score": 0.8327426,
   "_source": {
   "id": 8202683,
   "body": "Fuel is another important cost. This cost will depend on your boat, how far you travel, and how fast you travel. A 33-foot sailboat traveling at 7 knots should be able to travel 300 miles on 50 gallons of diesel fuel.If you are paying $4 per gallon, the trip would cost you $200.Most boats have much larger gas tanks than cars.uel is another important cost. This cost will depend on your boat, how far you travel, and how fast you travel. A 33-foot sailboat traveling at 7 knots should be able to travel 300 miles on 50 gallons of diesel fuel."
   }
   },
   (...)
   ]

Python

resp = client.search(
   index="azure-ai-studio-embeddings",
   knn={
   "field": "content_embedding",
   "query_vector_builder": {
   "text_embedding": {
   "model_id": "azure_ai_studio_embeddings",
   "model_text": "Calculate fuel cost"
   }
   },
   "k": 10,
   "num_candidates": 100
   },
   source=[
   "id",
   "content"
   ],
)
print(resp)

Js

const response = await client.search({
  index: "azure-ai-studio-embeddings",
  knn: {
   field: "content_embedding",
   query_vector_builder: {
   text_embedding: {
   model_id: "azure_ai_studio_embeddings",
   model_text: "Calculate fuel cost",
   },
   },
   k: 10,
   num_candidates: 100,
  },
  _source: ["id", "content"],
});
console.log(response);

Console

GET azure-ai-studio-embeddings/_search
{
  "knn": {
   "field": "content_embedding",
   "query_vector_builder": {
   "text_embedding": {
   "model_id": "azure_ai_studio_embeddings",
   "model_text": "Calculate fuel cost"
   }
   },
   "k": 10,
   "num_candidates": 100
  },
  "_source": [
   "id",
   "content"
  ]
}

As a result, you receive the top 10 documents that are closest in meaning to the query from the azure-ai-studio-embeddings index, sorted by their proximity to the query:

Console-Result

"hits": [
   {
   "_index": "azure-ai-studio-embeddings",
   "_id": "DDd5OowBHxQKHyc3TDSC",
   "_score": 0.83704096,
   "_source": {
   "id": 862114,
   "body": "How to calculate fuel cost for a road trip. By Tara Baukus Mello • Bankrate.com. Dear Driving for Dollars, My family is considering taking a long road trip to finish off the end of the summer, but I'm a little worried about gas prices and our overall fuel cost.It doesn't seem easy to calculate since we'll be traveling through many states and we are considering several routes.y family is considering taking a long road trip to finish off the end of the summer, but I'm a little worried about gas prices and our overall fuel cost. It doesn't seem easy to calculate since we'll be traveling through many states and we are considering several routes."
   }
   },
   {
   "_index": "azure-ai-studio-embeddings",
   "_id": "ajd5OowBHxQKHyc3TDSC",
   "_score": 0.8345704,
   "_source": {
   "id": 820622,
   "body": "Home Heating Calculator. Typically, approximately 50% of the energy consumed in a home annually is for space heating. When deciding on a heating system, many factors will come into play: cost of fuel, installation cost, convenience and life style are all important.This calculator can help you estimate the cost of fuel for different heating appliances.hen deciding on a heating system, many factors will come into play: cost of fuel, installation cost, convenience and life style are all important. This calculator can help you estimate the cost of fuel for different heating appliances."
   }
   },
   {
   "_index": "azure-ai-studio-embeddings",
   "_id": "Djd5OowBHxQKHyc3TDSC",
   "_score": 0.8327426,
   "_source": {
   "id": 8202683,
   "body": "Fuel is another important cost. This cost will depend on your boat, how far you travel, and how fast you travel. A 33-foot sailboat traveling at 7 knots should be able to travel 300 miles on 50 gallons of diesel fuel.If you are paying $4 per gallon, the trip would cost you $200.Most boats have much larger gas tanks than cars.uel is another important cost. This cost will depend on your boat, how far you travel, and how fast you travel. A 33-foot sailboat traveling at 7 knots should be able to travel 300 miles on 50 gallons of diesel fuel."
   }
   },
   (...)
   ]

Python

resp = client.search(
    index="google-vertex-ai-embeddings",
    knn={
        "field": "content_embedding",
        "query_vector_builder": {
            "text_embedding": {
                "model_id": "google_vertex_ai_embeddings",
                "model_text": "Calculate fuel cost"
            }
        },
        "k": 10,
        "num_candidates": 100
    },
    source=[
        "id",
        "content"
    ],
)
print(resp)

Js

const response = await client.search({
  index: "google-vertex-ai-embeddings",
  knn: {
    field: "content_embedding",
    query_vector_builder: {
      text_embedding: {
        model_id: "google_vertex_ai_embeddings",
        model_text: "Calculate fuel cost",
      },
    },
    k: 10,
    num_candidates: 100,
  },
  _source: ["id", "content"],
});
console.log(response);

Console

GET google-vertex-ai-embeddings/_search
{
  "knn": {
    "field": "content_embedding",
    "query_vector_builder": {
      "text_embedding": {
        "model_id": "google_vertex_ai_embeddings",
        "model_text": "Calculate fuel cost"
      }
    },
    "k": 10,
    "num_candidates": 100
  },
  "_source": [
    "id",
    "content"
  ]
}

As a result, you receive the top 10 documents closest in meaning to the query from the google-vertex-ai-embeddings index, sorted by their proximity to the query:

Console-Result

"hits": [
   {
   "_index": "google-vertex-ai-embeddings",
   "_id": "Ryv0nZEBBFPLbFsdCbGn",
   "_score": 0.86815524,
   "_source": {
   "id": 3041038,
   "content": "For example, the cost of the fuel could be 96.9, the amount could be 10 pounds, and the distance covered could be 80 miles. To convert between Litres per 100KM and Miles Per Gallon, please provide a value and click on the required button.o calculate how much fuel you'll need for a given journey, please provide the distance in miles you will be covering on your journey, and the estimated MPG of your vehicle. To work out what MPG you are really getting, please provide the cost of the fuel, how much you spent on the fuel, and how far it took you."
   }
   },
   {
   "_index": "google-vertex-ai-embeddings",
   "_id": "w4j0nZEBZ1nFq1oiHQvK",
   "_score": 0.8676357,
   "_source": {
   "id": 1541469,
   "content": "This driving cost calculator takes into consideration the fuel economy of the vehicle that you are travelling in as well as the fuel cost. This road trip gas calculator will give you an idea of how much would it cost to drive before you actually travel.his driving cost calculator takes into consideration the fuel economy of the vehicle that you are travelling in as well as the fuel cost. This road trip gas calculator will give you an idea of how much would it cost to drive before you actually travel."
   }
   },
   {
   "_index": "google-vertex-ai-embeddings",
   "_id": "Hoj0nZEBZ1nFq1oiHQjJ",
   "_score": 0.80510974,
   "_source": {
   "id": 7982559,
   "content": "What's that light cost you? 1  Select your electric rate (or click to enter your own). 2  You can calculate results for up to four types of lights. 3  Select the type of lamp (i.e. 4  Select the lamp wattage (lamp lumens). 5  Enter the number of lights in use. 6  Select how long the lamps are in use (or click to enter your own; enter hours on per year). 7  Finally, ..."
   }
   },
   (...)
   ]
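
The hits come back already sorted by `_score`, so reading them is just a matter of walking the array in order. The following is a minimal sketch of that step, not part of the original tutorial: `summarize_hits` is a hypothetical helper, and `sample_hits` is a hand-written stand-in shaped like the response above, with the `content` values shortened.

```python
from typing import Any


def summarize_hits(
    hits: list[dict[str, Any]],
    text_field: str = "content",
    width: int = 60,
) -> list[tuple[float, Any, str]]:
    """Return (score, id, snippet) tuples in the order the hits were returned."""
    rows = []
    for hit in hits:
        source = hit.get("_source", {})
        # Truncate the text so each result fits on one line.
        snippet = str(source.get(text_field, ""))[:width]
        rows.append((hit["_score"], source.get("id"), snippet))
    return rows


# Hypothetical sample shaped like the "hits" array above (content shortened).
sample_hits = [
    {
        "_index": "google-vertex-ai-embeddings",
        "_id": "Ryv0nZEBBFPLbFsdCbGn",
        "_score": 0.86815524,
        "_source": {"id": 3041038, "content": "For example, the cost of the fuel could be 96.9, ..."},
    },
    {
        "_index": "google-vertex-ai-embeddings",
        "_id": "w4j0nZEBZ1nFq1oiHQvK",
        "_score": 0.8676357,
        "_source": {"id": 1541469, "content": "This driving cost calculator takes into consideration ..."},
    },
]

for score, doc_id, snippet in summarize_hits(sample_hits):
    print(f"{score:.4f}  {doc_id}  {snippet}")
```

With a live cluster, the same helper can be given the real hits by passing `resp["hits"]["hits"]` from the Python search call above.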

Python

resp = client.search(
    index="mistral-embeddings",
    knn={
        "field": "content_embedding",
        "query_vector_builder": {
            "text_embedding": {
                "model_id": "mistral_embeddings",
                "model_text": "Calculate fuel cost"
            }
        },
        "k": 10,
        "num_candidates": 100
    },
    source=[
        "id",
        "content"
    ],
)
print(resp)

Js

const response = await client.search({
  index: "mistral-embeddings",
  knn: {
    field: "content_embedding",
    query_vector_builder: {
      text_embedding: {
        model_id: "mistral_embeddings",
        model_text: "Calculate fuel cost",
      },
    },
    k: 10,
    num_candidates: 100,
  },
  _source: ["id", "content"],
});
console.log(response);

Console

GET mistral-embeddings/_search
{
  "knn": {
    "field": "content_embedding",
    "query_vector_builder": {
      "text_embedding": {
        "model_id": "mistral_embeddings",
        "model_text": "Calculate fuel cost"
      }
    },
    "k": 10,
    "num_candidates": 100
  },
  "_source": [
    "id",
    "content"
  ]
}

As a result, you receive the top 10 documents closest in meaning to the query from the mistral-embeddings index, sorted by their proximity to the query:

Console-Result

"hits": [
   {
   "_index": "mistral-embeddings",
   "_id": "DDd5OowBHxQKHyc3TDSC",
   "_score": 0.83704096,
   "_source": {
   "id": 862114,
   "body": "How to calculate fuel cost for a road trip. By Tara Baukus Mello • Bankrate.com. Dear Driving for Dollars, My family is considering taking a long road trip to finish off the end of the summer, but I'm a little worried about gas prices and our overall fuel cost.It doesn't seem easy to calculate since we'll be traveling through many states and we are considering several routes.y family is considering taking a long road trip to finish off the end of the summer, but I'm a little worried about gas prices and our overall fuel cost. It doesn't seem easy to calculate since we'll be traveling through many states and we are considering several routes."
   }
   },
   {
   "_index": "mistral-embeddings",
   "_id": "ajd5OowBHxQKHyc3TDSC",
   "_score": 0.8345704,
   "_source": {
   "id": 820622,
   "body": "Home Heating Calculator. Typically, approximately 50% of the energy consumed in a home annually is for space heating. When deciding on a heating system, many factors will come into play: cost of fuel, installation cost, convenience and life style are all important.This calculator can help you estimate the cost of fuel for different heating appliances.hen deciding on a heating system, many factors will come into play: cost of fuel, installation cost, convenience and life style are all important. This calculator can help you estimate the cost of fuel for different heating appliances."
   }
   },
   {
   "_index": "mistral-embeddings",
   "_id": "Djd5OowBHxQKHyc3TDSC",
   "_score": 0.8327426,
   "_source": {
   "id": 8202683,
   "body": "Fuel is another important cost. This cost will depend on your boat, how far you travel, and how fast you travel. A 33-foot sailboat traveling at 7 knots should be able to travel 300 miles on 50 gallons of diesel fuel.If you are paying $4 per gallon, the trip would cost you $200.Most boats have much larger gas tanks than cars.uel is another important cost. This cost will depend on your boat, how far you travel, and how fast you travel. A 33-foot sailboat traveling at 7 knots should be able to travel 300 miles on 50 gallons of diesel fuel."
   }
   },
   (...)
   ]

Python

resp = client.search(
    index="amazon-bedrock-embeddings",
    knn={
        "field": "content_embedding",
        "query_vector_builder": {
            "text_embedding": {
                "model_id": "amazon_bedrock_embeddings",
                "model_text": "Calculate fuel cost"
            }
        },
        "k": 10,
        "num_candidates": 100
    },
    source=[
        "id",
        "content"
    ],
)
print(resp)

Js

const response = await client.search({
  index: "amazon-bedrock-embeddings",
  knn: {
    field: "content_embedding",
    query_vector_builder: {
      text_embedding: {
        model_id: "amazon_bedrock_embeddings",
        model_text: "Calculate fuel cost",
      },
    },
    k: 10,
    num_candidates: 100,
  },
  _source: ["id", "content"],
});
console.log(response);

Console

GET amazon-bedrock-embeddings/_search
{
  "knn": {
    "field": "content_embedding",
    "query_vector_builder": {
      "text_embedding": {
        "model_id": "amazon_bedrock_embeddings",
        "model_text": "Calculate fuel cost"
      }
    },
    "k": 10,
    "num_candidates": 100
  },
  "_source": [
    "id",
    "content"
  ]
}

As a result, you receive the top 10 documents closest in meaning to the query from the amazon-bedrock-embeddings index, sorted by their proximity to the query:

Console-Result

"hits": [
   {
   "_index": "amazon-bedrock-embeddings",
   "_id": "DDd5OowBHxQKHyc3TDSC",
   "_score": 0.83704096,
   "_source": {
   "id": 862114,
   "body": "How to calculate fuel cost for a road trip. By Tara Baukus Mello • Bankrate.com. Dear Driving for Dollars, My family is considering taking a long road trip to finish off the end of the summer, but I'm a little worried about gas prices and our overall fuel cost.It doesn't seem easy to calculate since we'll be traveling through many states and we are considering several routes.y family is considering taking a long road trip to finish off the end of the summer, but I'm a little worried about gas prices and our overall fuel cost. It doesn't seem easy to calculate since we'll be traveling through many states and we are considering several routes."
   }
   },
   {
   "_index": "amazon-bedrock-embeddings",
   "_id": "ajd5OowBHxQKHyc3TDSC",
   "_score": 0.8345704,
   "_source": {
   "id": 820622,
   "body": "Home Heating Calculator. Typically, approximately 50% of the energy consumed in a home annually is for space heating. When deciding on a heating system, many factors will come into play: cost of fuel, installation cost, convenience and life style are all important.This calculator can help you estimate the cost of fuel for different heating appliances.hen deciding on a heating system, many factors will come into play: cost of fuel, installation cost, convenience and life style are all important. This calculator can help you estimate the cost of fuel for different heating appliances."
   }
   },
   {
   "_index": "amazon-bedrock-embeddings",
   "_id": "Djd5OowBHxQKHyc3TDSC",
   "_score": 0.8327426,
   "_source": {
   "id": 8202683,
   "body": "Fuel is another important cost. This cost will depend on your boat, how far you travel, and how fast you travel. A 33-foot sailboat traveling at 7 knots should be able to travel 300 miles on 50 gallons of diesel fuel.If you are paying $4 per gallon, the trip would cost you $200.Most boats have much larger gas tanks than cars.uel is another important cost. This cost will depend on your boat, how far you travel, and how fast you travel. A 33-foot sailboat traveling at 7 knots should be able to travel 300 miles on 50 gallons of diesel fuel."
   }
   },
   (...)
   ]
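
The search requests for the different services differ only in the index name and the inference endpoint ID; the `knn` clause is otherwise identical. As a rough sketch under that observation, the request can be assembled by one small function. `build_knn_search` is a hypothetical helper, not part of the Elasticsearch client, and the field names `content_embedding`, `id`, and `content` are taken from the examples above.

```python
from typing import Any


def build_knn_search(
    index: str,
    inference_id: str,
    query_text: str,
    k: int = 10,
    num_candidates: int = 100,
) -> dict[str, Any]:
    """Build keyword arguments for client.search() matching the examples above."""
    return {
        "index": index,
        "knn": {
            "field": "content_embedding",
            # The query text is embedded at search time by the inference endpoint.
            "query_vector_builder": {
                "text_embedding": {
                    "model_id": inference_id,
                    "model_text": query_text,
                }
            },
            "k": k,
            "num_candidates": num_candidates,
        },
        "source": ["id", "content"],
    }


params = build_knn_search(
    index="mistral-embeddings",
    inference_id="mistral_embeddings",
    query_text="Calculate fuel cost",
)
print(params["knn"]["query_vector_builder"])
```

Given a configured Python client, the resulting dictionary can be passed on as `client.search(**params)`. Switching services then means changing only the first two arguments.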

Interactive tutorials

You can also find tutorials in an interactive Colab notebook format using the Elasticsearch Python client: