Token filter reference - Trim - Elasticsearch Guide v8.15

# Trim token filter

Removes leading and trailing whitespace from each token in a stream. Although this can change a token's length, the `trim` filter does not change the token's offsets.
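
The behavior described above can be modeled in a few lines of plain Python. This is only an illustrative sketch, not how Elasticsearch or Lucene implements the filter: the `Token` class and `trim_filter` function are hypothetical names, and Python's `str.strip()` may treat some whitespace characters differently from the real filter.

```python
from dataclasses import dataclass, replace
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Token:
    """Hypothetical stand-in for an analysis token."""
    text: str
    start_offset: int
    end_offset: int


def trim_filter(tokens: Iterable[Token]) -> Iterator[Token]:
    """Strip leading and trailing whitespace from each token's text.

    Offsets are copied through untouched, mirroring the documented behavior.
    """
    for token in tokens:
        yield replace(token, text=token.text.strip())


# A keyword-style tokenizer would emit the whole input as a single token.
source = " fox "
untrimmed = [Token(text=source, start_offset=0, end_offset=len(source))]
trimmed = list(trim_filter(untrimmed))

print(trimmed[0])
```

In this model the trimmed token's text is three characters long while its offsets still cover all five characters of the input.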


Many commonly used tokenizers, such as the [`standard`](/read/elasticsearch-8-15/d096dc2d4f06be16.md) or [`whitespace`](/read/elasticsearch-8-15/37c528a22eb78d57.md) tokenizer, remove whitespace by default. When using these tokenizers, you don't need to add a separate `trim` filter.
## Example
To see how the `trim` filter works, you first need to produce a token containing whitespace. The following [analyze API](/read/elasticsearch-8-15/1a51b9d359d8a54c.md) request uses the [`keyword`](/read/elasticsearch-8-15/ec2a5f33a39f9570.md) tokenizer to produce a token for `" fox "`.
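#### Python
```python
# `client` is assumed to be an already-configured Elasticsearch client.
resp = client.indices.analyze(
    tokenizer="keyword",
    text=" fox ",
)
print(resp)
```

#### Ruby
```ruby
# `client` is assumed to be an already-configured Elasticsearch client.
response = client.indices.analyze(
  body: {
    tokenizer: 'keyword',
    text: ' fox '
  }
)
puts response
```

#### JavaScript
```js
// `client` is assumed to be an already-configured Elasticsearch client.
const response = await client.indices.analyze({
  tokenizer: "keyword",
  text: " fox ",
});
console.log(response);
```

#### Console
```console
GET _analyze
{
  "tokenizer" : "keyword",
  "text" : " fox "
}
```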

The API returns the following response. The `" fox "` token retains the whitespace of the original text, and its `start_offset` and `end_offset` span the full five-character input.

#### Console result
```json
{
  "tokens": [
    {
      "token": " fox ",
      "start_offset": 0,
      "end_offset": 5,
      "type": "word",
      "position": 0
    }
  ]
}
```

To remove the whitespace, add the `trim` filter to the previous analyze API request.

#### Python
```python
resp = client.indices.analyze(
    tokenizer="keyword",
    filter=[
        "trim"
    ],
    text=" fox ",
)
print(resp)
```

#### Ruby
```ruby
response = client.indices.analyze(
  body: {
    tokenizer: 'keyword',
    filter: [
      'trim'
    ],
    text: ' fox '
  }
)
puts response
```

#### JavaScript
```js
const response = await client.indices.analyze({
  tokenizer: "keyword",
  filter: ["trim"],
  text: " fox ",
});
console.log(response);
```

#### Console
```console
GET _analyze
{
  "tokenizer" : "keyword",
  "filter" : ["trim"],
  "text" : " fox "
}
```

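The API returns the following response. The returned `fox` token has no leading or trailing whitespace. Although the token is now shorter, its `start_offset` and `end_offset` are the same as before.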

#### Console result
```json
{
  "tokens": [
    {
      "token": "fox",
      "start_offset": 0,
      "end_offset": 5,
      "type": "word",
      "position": 0
    }
  ]
}
```
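
Because the offsets still describe the original text, a trimmed token's length no longer matches `end_offset - start_offset`. The following sketch makes that visible by comparing the two responses shown above. It assumes each response has already been converted to a plain `dict`; the helper name `offset_span_report` is hypothetical.

```python
from typing import List, Tuple


def offset_span_report(response: dict) -> List[Tuple[str, int, int]]:
    """Return (token, token_length, offset_span) for each token in an
    analyze API response given as a plain dict."""
    report = []
    for tok in response["tokens"]:
        span = tok["end_offset"] - tok["start_offset"]
        report.append((tok["token"], len(tok["token"]), span))
    return report


# Responses copied from the two examples above.
without_trim = {"tokens": [{"token": " fox ", "start_offset": 0,
                            "end_offset": 5, "type": "word", "position": 0}]}
with_trim = {"tokens": [{"token": "fox", "start_offset": 0,
                         "end_offset": 5, "type": "word", "position": 0}]}

print(offset_span_report(without_trim))  # token length equals the offset span
print(offset_span_report(with_trim))     # token is shorter than the offset span
```

A check like this can be useful when downstream code, such as highlighting logic, relies on offsets rather than on the token text.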

## Add to an analyzer

The following create index API request uses the `trim` filter to configure a new custom analyzer.

#### Python
```python
resp = client.indices.create(
    index="trim_example",
    settings={
        "analysis": {
            "analyzer": {
                "keyword_trim": {
                    "tokenizer": "keyword",
                    "filter": [
                        "trim"
                    ]
                }
            }
        }
    },
)
print(resp)
```

#### Ruby
```ruby
response = client.indices.create(
  index: 'trim_example',
  body: {
    settings: {
      analysis: {
        analyzer: {
          keyword_trim: {
            tokenizer: 'keyword',
            filter: [
              'trim'
            ]
          }
        }
      }
    }
  }
)
puts response
```

#### JavaScript
```js
const response = await client.indices.create({
  index: "trim_example",
  settings: {
    analysis: {
      analyzer: {
        keyword_trim: {
          tokenizer: "keyword",
          filter: ["trim"],
        },
      },
    },
  },
});
console.log(response);
```

#### Console
```console
PUT trim_example
{
  "settings": {
    "analysis": {
      "analyzer": {
        "keyword_trim": {
          "tokenizer": "keyword",
          "filter": [ "trim" ]
        }
      }
    }
  }
}
```
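
Once the index exists, the custom analyzer can be exercised through the same analyze API by naming the index and the analyzer. The snippet below is a minimal sketch under assumptions: `client` is taken to be an `Elasticsearch` instance from the official Python client that is already connected to a cluster, the `trim_example` index has been created as shown above, and the helper name `analyze_with_keyword_trim` is hypothetical.

```python
def analyze_with_keyword_trim(client, text: str) -> list:
    """Run `text` through the `keyword_trim` analyzer defined on the
    `trim_example` index and return the token strings.

    `client` is assumed to be a connected Elasticsearch Python client.
    """
    resp = client.indices.analyze(
        index="trim_example",
        analyzer="keyword_trim",
        text=text,
    )
    return [tok["token"] for tok in resp["tokens"]]
```

Against a live cluster, calling `analyze_with_keyword_trim(client, " fox ")` would be expected to yield a single `fox` token, consistent with the earlier response.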