Semantic search - Using Cohere with Elasticsearch

Tutorial: Using Cohere with Elasticsearch

This tutorial shows how to compute embeddings with Cohere through the inference API and store them in Elasticsearch for efficient vector or hybrid search. All operations are performed with the Python Elasticsearch client.

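You'll learn how to:

- create an inference endpoint for text embedding using the Cohere service,
- create the necessary index mapping for the Elasticsearch index,
- build an inference pipeline that ingests documents into the index together with their embeddings,
- perform hybrid search on the data,
- rerank search results with Cohere's rerank model,
- design a RAG system with Cohere's Chat API.

The tutorial uses the SciFact data set.

Refer to Cohere's tutorial for an example that uses a different data set.

You can also review the Colab notebook version of this tutorial.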

Requirements

- A paid Cohere account is required to use the inference API with the Cohere service, because usage of the Cohere free trial API is limited.
- An Elastic Cloud account.
- Python 3.7 or higher.

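Install required packages

Install the Elasticsearch and Cohere Python packages. The leading ! runs the command from a notebook cell; omit it when installing from a terminal.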

Py

!pip install elasticsearch
!pip install cohere

Import the required packages:

Py

from elasticsearch import Elasticsearch, helpers
import cohere
import json
import requests

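Create the Elasticsearch client

To create the Elasticsearch client, you need the two values used in the code below: the address of your Elasticsearch deployment and an Elastic API key.

Py

ELASTICSEARCH_ENDPOINT = "elastic_endpoint"
ELASTIC_API_KEY = "elastic_api_key"

# cloud_id expects an Elastic Cloud ID; to connect with an endpoint URL
# instead, pass it as hosts=ELASTICSEARCH_ENDPOINT.
client = Elasticsearch(
    cloud_id=ELASTICSEARCH_ENDPOINT,
    api_key=ELASTIC_API_KEY,
)

# Confirm that the client is connected
print(client.info())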
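
The placeholders above are hard-coded strings. One way to keep real credentials out of a notebook is to read them from environment variables. The following is a minimal sketch under that assumption; the variable names ELASTIC_CLOUD_ID and ELASTIC_API_KEY and the helper load_credentials are hypothetical and not part of the tutorial.

```python
import os


def load_credentials(env=None):
    """Return (cloud_id, api_key) read from an environment-style mapping.

    ELASTIC_CLOUD_ID and ELASTIC_API_KEY are hypothetical names chosen
    for this sketch; use whatever names your own setup defines.
    """
    env = os.environ if env is None else env
    required = ("ELASTIC_CLOUD_ID", "ELASTIC_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return env["ELASTIC_CLOUD_ID"], env["ELASTIC_API_KEY"]


# Demonstration with an explicit mapping instead of the real environment.
cloud_id, api_key = load_credentials(
    {"ELASTIC_CLOUD_ID": "my-deployment:abc123", "ELASTIC_API_KEY": "secret"}
)
print(cloud_id)
```

The returned values could then be passed to Elasticsearch(cloud_id=..., api_key=...) in place of the placeholders.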

Create the inference endpoint

Start by creating the inference endpoint. In this example, the inference endpoint uses Cohere's embed-english-v3.0 model and the embedding_type is set to byte.

Py

COHERE_API_KEY = "cohere_api_key"

client.inference.put_model(
    task_type="text_embedding",
    inference_id="cohere_embeddings",
    body={
        "service": "cohere",
        "service_settings": {
            "api_key": COHERE_API_KEY,
            "model_id": "embed-english-v3.0",
            "embedding_type": "byte",
        },
    },
)

You can find your API keys in the API keys section of your Cohere dashboard.
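
To give a rough sense of why the byte embedding type is attractive, the sketch below compares the raw payload size of one vector stored as 32-bit floats with the same vector stored as single bytes. It assumes 1024 dimensions, the value declared in this tutorial's index mapping, and it counts only the vector values themselves; any index structures Elasticsearch builds on top are not included.

```python
def bytes_per_vector(dims, bytes_per_element):
    """Raw size in bytes of one vector, ignoring any index overhead."""
    return dims * bytes_per_element


DIMS = 1024  # dimension count declared in this tutorial's mapping

float32_size = bytes_per_vector(DIMS, 4)  # 4 bytes per 32-bit float
byte_size = bytes_per_vector(DIMS, 1)     # 1 byte per element

print(float32_size, byte_size, float32_size // byte_size)
```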

Create the index mapping

Create the index mapping for the index that will contain the embeddings. The default_pipeline setting routes every document indexed here through the ingest pipeline named cohere_embeddings.

Py

client.indices.create(
    index="cohere-embeddings",
    settings={"index": {"default_pipeline": "cohere_embeddings"}},
    mappings={
        "properties": {
            "text_embedding": {
                "type": "dense_vector",
                "dims": 1024,
                "element_type": "byte",
            },
            "text": {"type": "text"},
            "id": {"type": "integer"},
            "title": {"type": "text"},
        }
    },
)
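
The mapping above constrains what a stored vector may look like: 1024 elements, each a signed 8-bit integer. As an illustration only, a local check along these lines could catch malformed vectors before they reach the index. The helper validate_byte_vector is hypothetical and not part of the Elasticsearch client.

```python
def validate_byte_vector(vector, dims=1024):
    """Raise ValueError if `vector` does not fit a byte dense_vector of `dims` dimensions."""
    if len(vector) != dims:
        raise ValueError(f"expected {dims} dimensions, got {len(vector)}")
    for position, value in enumerate(vector):
        # bool is a subclass of int in Python, so reject it explicitly.
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if not is_int or not -128 <= value <= 127:
            raise ValueError(
                f"element {position} is not an integer in [-128, 127]: {value!r}"
            )
    return True


ok = validate_byte_vector([0] * 1024)
print(ok)
```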

Create the inference pipeline

You now have an inference endpoint and an index ready to store embeddings. The next step is to create an ingest pipeline that uses the inference endpoint to create the embeddings and stores them in the index.

Py

client.ingest.put_pipeline(
    id="cohere_embeddings",
    description="Ingest pipeline for Cohere inference.",
    processors=[
        {
            "inference": {
                "model_id": "cohere_embeddings",
                "input_output": {
                    "input_field": "text",
                    "output_field": "text_embedding",
                },
            }
        }
    ],
)
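
Conceptually, the input_output setting reads one field of each incoming document, sends it to the inference endpoint, and writes the result to another field. The sketch below imitates that data flow locally. It is only an illustration: fake_embed is a made-up stand-in that returns word lengths, not a real embedding, and apply_input_output is a hypothetical helper, not how Elasticsearch implements the processor.

```python
def fake_embed(text):
    """Stand-in for the Cohere call: NOT a real embedding, just word lengths."""
    return [len(word) for word in text.split()]


def apply_input_output(doc, input_field, output_field, embed=fake_embed):
    """Return a copy of `doc` with embed(doc[input_field]) stored under `output_field`."""
    enriched = dict(doc)
    enriched[output_field] = embed(doc[input_field])
    return enriched


doc = {"id": 1, "title": "Example", "text": "hybrid search demo"}
enriched = apply_input_output(doc, "text", "text_embedding")
print(enriched["text_embedding"])
```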

Prepare data and insert documents

This example uses the SciFact data set, which you can find on HuggingFace.

Py

url = 'https://huggingface.co/datasets/mteb/scifact/raw/main/corpus.jsonl'

# Fetch the JSONL data from the URL
response = requests.get(url)
response.raise_for_status()  # Raise an error for bad responses

# Split the content by new lines and parse each line as JSON;
# data is now a list of dictionaries
data = [json.loads(line) for line in response.text.strip().split('\n') if line]

# Rename the _id key to id, because _id is a reserved key in Elasticsearch
for item in data:
    if '_id' in item:
        item['id'] = item.pop('_id')

# Prepare the documents to be indexed
documents = []
for item in data:
    documents.append({
        "_index": "cohere-embeddings",
        "_source": item,
    })

# Index the documents with the bulk endpoint
helpers.bulk(client, documents)

print("Data ingestion completed, text embeddings generated!")

Your index is now populated with the SciFact data and with text embeddings for the text field.
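
The preparation steps above can be tried without network access or a cluster. The sketch below runs the same parsing and renaming logic on two made-up JSONL lines; SAMPLE_JSONL and the helper names are illustrative assumptions, not part of the data set or the client library.

```python
import json

# Two invented records that mimic the shape of the corpus file.
SAMPLE_JSONL = (
    '{"_id": "1", "title": "A", "text": "first"}\n'
    '{"_id": "2", "title": "B", "text": "second"}\n'
)


def parse_jsonl(raw):
    """Parse newline-delimited JSON into a list of dictionaries."""
    return [json.loads(line) for line in raw.strip().split("\n") if line]


def to_bulk_actions(records, index_name):
    """Yield one bulk action per record, renaming the reserved _id key to id."""
    for record in records:
        source = dict(record)  # copy so the input records stay untouched
        if "_id" in source:
            source["id"] = source.pop("_id")
        yield {"_index": index_name, "_source": source}


actions = list(to_bulk_actions(parse_jsonl(SAMPLE_JSONL), "cohere-embeddings"))
print(actions[0])
```

Because helpers.bulk accepts any iterable of actions, a generator like to_bulk_actions could be passed to it directly instead of building the full list first.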

Hybrid search

Let's start querying the index!

The code below performs a hybrid search. The kNN query computes the relevance of search results from vector similarity on the text_embedding field, and the lexical search query uses BM25 retrieval to compute keyword similarity on the title and text fields.

Py

query = "What is biosimilarity?"

response = client.search(
    index="cohere-embeddings",
    size=100,
    knn={
        "field": "text_embedding",
        "query_vector_builder": {
            "text_embedding": {
                "model_id": "cohere_embeddings",
                "model_text": query,
            }
        },
        "k": 10,
        "num_candidates": 50,
    },
    query={
        "multi_match": {
            "query": query,
            "fields": ["text", "title"]
        }
    }
)

raw_documents = response["hits"]["hits"]

# Display the first 10 results
for document in raw_documents[0:10]:
    print(f'Title: {document["_source"]["title"]}\nText: {document["_source"]["text"]}\n')

# Format the documents for ranking
documents = []
for hit in response["hits"]["hits"]:
    documents.append(hit["_source"]["text"])
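
To build intuition for how two result lists can become one ranking, the sketch below merges a set of vector-similarity scores and a set of keyword scores by adding them per document, with optional weights. This is a simplified illustration under the assumption that a document matched by both sides receives the sum of its scores; the scores, document ids, and the helper combine_scores are invented, and the exact scoring Elasticsearch applies should be checked in its documentation.

```python
def combine_scores(knn_scores, keyword_scores, knn_boost=1.0, keyword_boost=1.0):
    """Sum weighted scores per document and return (doc_id, score) pairs, best first."""
    combined = {}
    for doc_id, score in knn_scores.items():
        combined[doc_id] = combined.get(doc_id, 0.0) + knn_boost * score
    for doc_id, score in keyword_scores.items():
        combined[doc_id] = combined.get(doc_id, 0.0) + keyword_boost * score
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Invented scores: "b" is found by both searches, "a" and "c" by one each.
knn_scores = {"a": 0.9, "b": 0.8}
keyword_scores = {"b": 2.0, "c": 1.5}

ranking = combine_scores(knn_scores, keyword_scores)
print([doc_id for doc_id, _ in ranking])
```

In this toy example the document found by both searches ends up first, which is the behavior hybrid search aims for.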

Rerank search results

To combine the results more effectively, use Cohere's Rerank v3 model through the inference API to provide a more precise semantic reranking of the results.

Create an inference endpoint with your Cohere API key and the name of the model to use as the model_id (rerank-english-v3.0 in this example).

Py

client.inference.put_model(
    task_type="rerank",
    inference_id="cohere_rerank",
    body={
        "service": "cohere",
        "service_settings": {
            "api_key": COHERE_API_KEY,
            "model_id": "rerank-english-v3.0",
        },
        "task_settings": {
            "top_n": 10,
        },
    },
)

Rerank the results using the new inference endpoint.

Py

# Pass the query and the search results to the service
response = client.inference.inference(
    inference_id="cohere_rerank",
    body={
        "query": query,
        "input": documents,
        "task_settings": {
            "return_documents": False
        }
    }
)

# Reconstruct the input documents based on the index provided in the rerank response
ranked_documents = []
for document in response.body["rerank"]:
    ranked_documents.append({
        "title": raw_documents[int(document["index"])]["_source"]["title"],
        "text": raw_documents[int(document["index"])]["_source"]["text"]
    })

# Print the top 10 results
for document in ranked_documents[0:10]:
    print(f"Title: {document['title']}\nText: {document['text']}\n")

The response is a list of documents in descending order of relevance. Each document has a corresponding index that reflects its position in the list that was sent to the inference endpoint.
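
The index-based reconstruction can be illustrated with a small mock. In the sketch below, mock_rerank imitates the list found under the rerank key; the index field matches what the tutorial code reads, while the relevance_score field name and all values are assumptions made for illustration.

```python
def reorder_by_rerank(hits, rerank_items):
    """Return the original hits in the order given by the rerank items."""
    return [hits[int(item["index"])] for item in rerank_items]


# Original search order.
hits = [{"title": "Doc 0"}, {"title": "Doc 1"}, {"title": "Doc 2"}]

# Mock rerank output, most relevant first. Only "index" is used below;
# "relevance_score" is an assumed field name shown for context.
mock_rerank = [
    {"index": 2, "relevance_score": 0.91},
    {"index": 0, "relevance_score": 0.40},
]

reordered = reorder_by_rerank(hits, mock_rerank)
print([hit["title"] for hit in reordered])
```

Documents that the reranker does not return, such as Doc 1 here, simply drop out of the reordered list, which mirrors the effect of the top_n setting.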

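Retrieval Augmented Generation (RAG) with Cohere and Elasticsearch

RAG is a method for generating text using additional information fetched from an external data source. With the ranked results, you can build a RAG system on top of what you created previously by using Cohere's Chat API.

Pass in the retrieved documents and the query to receive a grounded response from Cohere's newest generative model, Command R+.

Then pass the query and the documents to the Chat API and print the response.

Py

# Create the Cohere client used for the chat request
co = cohere.Client(COHERE_API_KEY)

response = co.chat(message=query, documents=ranked_documents, model='command-r-plus')

# Collect the ids of the documents cited in the response, without duplicates
source_documents = []
for citation in response.citations:
    for document_id in citation.document_ids:
        if document_id not in source_documents:
            source_documents.append(document_id)

print(f"Query: {query}")
print(f"Response: {response.text}")
print("Sources:")
for document in response.documents:
    if document['id'] in source_documents:
        print(f"{document['title']}: {document['text']}")

The response will look similar to this:

Console result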

Query: What is biosimilarity?
Response: Biosimilarity is based on the comparability concept, which has been used successfully for several decades to ensure close similarity of a biological product before and after a manufacturing change. Over the last 10 years, experience with biosimilars has shown that even complex biotechnology-derived proteins can be copied successfully.
Sources:
Interchangeability of Biosimilars: A European Perspective: (...)
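
The source-collection loop from the RAG step can be exercised without calling the Chat API. The sketch below runs the same de-duplication logic on mock citation objects built with SimpleNamespace; the document ids are invented, and the `or []` guard is an added assumption to cover a response that carries no citations.

```python
from types import SimpleNamespace


def cited_document_ids(citations):
    """Return cited document ids in first-seen order, without duplicates."""
    seen = []
    for citation in citations or []:  # tolerate None or an empty list
        for document_id in citation.document_ids:
            if document_id not in seen:
                seen.append(document_id)
    return seen


# Mock citations standing in for response.citations; ids are invented.
mock_citations = [
    SimpleNamespace(document_ids=["doc_0", "doc_2"]),
    SimpleNamespace(document_ids=["doc_2", "doc_1"]),
]

ids = cited_document_ids(mock_citations)
print(ids)
```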