Token filter reference - KStem - Elasticsearch Guide v8.15 (Japanese edition)


# KStem token filter

Provides KStem-based stemming for English. The `kstem` filter combines algorithmic stemming with a built-in dictionary.

The `kstem` filter tends to stem less aggressively than other English stemmer filters, such as the `porter_stem` filter.
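
One way to see the difference in aggressiveness is to run the same text through both filters and compare the tokens. The following is a minimal sketch, not part of the original reference: the helper names `analyze_request` and `compare_stemmers` are hypothetical, and `client` is assumed to be an already configured Elasticsearch Python client.

```python
from typing import Any, Dict, List, Sequence


def analyze_request(text: str, stem_filter: str) -> Dict[str, Any]:
    """Build analyze API arguments that apply a single stemmer filter."""
    return {"tokenizer": "standard", "filter": [stem_filter], "text": text}


def compare_stemmers(
    client: Any,
    text: str,
    filters: Sequence[str] = ("kstem", "porter_stem"),
) -> Dict[str, List[str]]:
    """Run the same text through each filter and collect the token strings.

    `client` is assumed to be an already configured Elasticsearch client
    (hypothetical parameter, not defined in this reference).
    """
    results: Dict[str, List[str]] = {}
    for name in filters:
        resp = client.indices.analyze(**analyze_request(text, name))
        # Each entry in resp["tokens"] carries the emitted term under "token".
        results[name] = [t["token"] for t in resp["tokens"]]
    return results
```

Calling `compare_stemmers(client, "the foxes jumping quickly")` against a running cluster returns one token list per filter, so the outputs can be compared side by side.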


This filter uses Lucene's [KStemFilter](https://lucene.apache.org/core/9_11_1/analysis/common/org/apache/lucene/analysis/en/KStemFilter.html).

## Example

The following analyze API request uses the `kstem` filter to stem `the foxes jumping quickly` to `the fox jump quick`:

#### Python
```python
# Assumes `client` is an already configured Elasticsearch client.
resp = client.indices.analyze(
    tokenizer="standard",
    filter=[
        "kstem"
    ],
    text="the foxes jumping quickly",
)
print(resp)
```

#### Ruby
```ruby
# Assumes `client` is an already configured Elasticsearch client.
response = client.indices.analyze(
  body: {
    tokenizer: 'standard',
    filter: [
      'kstem'
    ],
    text: 'the foxes jumping quickly'
  }
)
puts response
```

#### JavaScript
```js
// Assumes `client` is an already configured Elasticsearch client.
const response = await client.indices.analyze({
  tokenizer: "standard",
  filter: ["kstem"],
  text: "the foxes jumping quickly",
});
console.log(response);
```

#### Console
```console
GET /_analyze
{
  "tokenizer": "standard",
  "filter": [ "kstem" ],
  "text": "the foxes jumping quickly"
}
```

The filter produces the following tokens:

```text
[ the, fox, jump, quick ]
```
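
The analyze API response lists each token as an object with a `token` string plus `start_offset` and `end_offset` fields. The sketch below pairs each stem with the original word it came from. It assumes that the offsets still point into the submitted text, since token filters such as `kstem` rewrite the term but keep the offsets, and that the text is plain ASCII so the offsets line up with Python string indices. The helper name `stems_with_surface` is hypothetical.

```python
from typing import Any, List, Tuple


def stems_with_surface(text: str, resp: Any) -> List[Tuple[str, str]]:
    """Pair each stemmed token with the source substring it was derived from.

    `resp` is the analyze API response (or any mapping with a "tokens" list).
    """
    pairs = []
    for tok in resp["tokens"]:
        # Offsets refer to positions in the original text, not the stem.
        surface = text[tok["start_offset"]:tok["end_offset"]]
        pairs.append((tok["token"], surface))
    return pairs
```

For the request above, this would be called as `stems_with_surface("the foxes jumping quickly", resp)`.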

## Add to an analyzer

The following create index API request uses the `kstem` filter to configure a new custom analyzer.


#### Python
```python
# Assumes `client` is an already configured Elasticsearch client.
resp = client.indices.create(
    index="my-index-000001",
    settings={
        "analysis": {
            "analyzer": {
                "my_analyzer": {
                    "tokenizer": "whitespace",
                    "filter": [
                        "lowercase",
                        "kstem"
                    ]
                }
            }
        }
    },
)
print(resp)
```

#### Ruby
```ruby
# Assumes `client` is an already configured Elasticsearch client.
response = client.indices.create(
  index: 'my-index-000001',
  body: {
    settings: {
      analysis: {
        analyzer: {
          my_analyzer: {
            tokenizer: 'whitespace',
            filter: [
              'lowercase',
              'kstem'
            ]
          }
        }
      }
    }
  }
)
puts response
```

#### JavaScript
```js
// Assumes `client` is an already configured Elasticsearch client.
const response = await client.indices.create({
  index: "my-index-000001",
  settings: {
    analysis: {
      analyzer: {
        my_analyzer: {
          tokenizer: "whitespace",
          filter: ["lowercase", "kstem"],
        },
      },
    },
  },
});
console.log(response);
```

#### Console
```console
PUT /my-index-000001
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "whitespace",
          "filter": [
            "lowercase",
            "kstem"
          ]
        }
      }
    }
  }
}
```
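
A custom analyzer only takes effect on fields that reference it. As a rough sketch of one possible next step, the helper below builds the same analysis settings together with a mapping that assigns the analyzer to a `text` field. The helper name `kstem_index_definition` and the field name `title` are hypothetical and not part of the original example.

```python
from typing import Any, Dict


def kstem_index_definition(
    analyzer_name: str = "my_analyzer",
    field: str = "title",  # hypothetical field name, for illustration only
) -> Dict[str, Any]:
    """Build `settings` and `mappings` for an index that stems a field with kstem."""
    settings = {
        "analysis": {
            "analyzer": {
                analyzer_name: {
                    "tokenizer": "whitespace",
                    # Lowercase first, then stem, as in the request above.
                    "filter": ["lowercase", "kstem"],
                }
            }
        }
    }
    mappings = {
        "properties": {
            # Point the text field at the custom analyzer defined in settings.
            field: {"type": "text", "analyzer": analyzer_name}
        }
    }
    return {"settings": settings, "mappings": mappings}


# Possible usage, assuming `client` is an already configured Elasticsearch client:
# client.indices.create(index="my-index-000001", **kstem_index_definition())
```

Keeping the definition in a plain function makes it easy to inspect or unit test the request body without a running cluster.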