Field data types - Flattened - Elasticsearch Guide v8.15

Flattened field type

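By default, each subfield in an object is mapped and indexed separately. If the names or types of the subfields are not known in advance, they are mapped dynamically.

The `flattened` data type is useful for indexing objects with a large or unknown number of unique keys. Only one field mapping is created for the whole JSON object, which can help prevent a [mappings explosion](e690e01ae99d7d0f.md#mapping-limit-settings) from having too many distinct field mappings.

On the other hand, flattened object fields present a trade-off in terms of search functionality. Only basic queries are allowed, with no support for numeric range queries or highlighting. Further information on the limitations can be found in the [Supported operations](c59ef41400664546.md#supported-operations) section.

The `flattened` mapping type should **not** be used for indexing all document content, as it treats all values as keywords and does not provide full search functionality. The default approach, where each subfield has its own entry in the mappings, works well in the majority of cases.

A flattened object field can be created as follows: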
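#### Python
```python
# `client` is assumed to be an already configured Elasticsearch client instance.

# Create an index whose "labels" field is mapped as a single flattened field.
resp = client.indices.create(
    index="bug_reports",
    mappings={
        "properties": {
            "title": {
                "type": "text"
            },
            "labels": {
                "type": "flattened"
            }
        }
    },
)
print(resp)

# Index a document; every leaf under "labels" is indexed as a keyword.
resp1 = client.index(
    index="bug_reports",
    id="1",
    document={
        "title": "Results are not sorted correctly.",
        "labels": {
            "priority": "urgent",
            "release": [
                "v1.2.5",
                "v1.3.0"
            ],
            "timestamp": {
                "created": 1541458026,
                "closed": 1541457010
            }
        }
    },
)
print(resp1)
```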

Ruby

response = client.indices.create(
  index: 'bug_reports',
  body: {
   mappings: {
   properties: {
   title: {
   type: 'text'
   },
   labels: {
   type: 'flattened'
   }
   }
   }
  }
)
puts response
response = client.index(
  index: 'bug_reports',
  id: 1,
  body: {
   title: 'Results are not sorted correctly.',
   labels: {
   priority: 'urgent',
   release: [
   'v1.2.5',
   'v1.3.0'
   ],
   timestamp: {
   created: 1_541_458_026,
   closed: 1_541_457_010
   }
   }
  }
)
puts response

Js

const response = await client.indices.create({
  index: "bug_reports",
  mappings: {
   properties: {
   title: {
   type: "text",
   },
   labels: {
   type: "flattened",
   },
   },
  },
});
console.log(response);
const response1 = await client.index({
  index: "bug_reports",
  id: 1,
  document: {
   title: "Results are not sorted correctly.",
   labels: {
   priority: "urgent",
   release: ["v1.2.5", "v1.3.0"],
   timestamp: {
   created: 1541458026,
   closed: 1541457010,
   },
   },
  },
});
console.log(response1);

Console

PUT bug_reports
{
  "mappings": {
   "properties": {
   "title": {
   "type": "text"
   },
   "labels": {
   "type": "flattened"
   }
   }
  }
}
POST bug_reports/_doc/1
{
  "title": "Results are not sorted correctly.",
  "labels": {
   "priority": "urgent",
   "release": ["v1.2.5", "v1.3.0"],
   "timestamp": {
   "created": 1541458026,
   "closed": 1541457010
   }
  }
}

During indexing, tokens are created for each leaf value in the JSON object. The values are indexed as string keywords, without analysis or special handling for numbers or dates.
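
As a rough mental model of this step, the sketch below walks a JSON-like object and emits one string keyword per leaf value. It is only an illustration: the helper name `leaf_tokens` is hypothetical and this is not how Elasticsearch implements indexing internally.

```python
def leaf_tokens(obj, prefix=""):
    """Yield (key_path, keyword) pairs for every leaf value of a JSON-like object."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else key
            yield from leaf_tokens(value, path)
    elif isinstance(obj, list):
        # Array elements share the key path of the array itself.
        for item in obj:
            yield from leaf_tokens(item, prefix)
    else:
        # No special handling for numbers or dates: the leaf becomes a string.
        yield prefix, str(obj)


labels = {
    "priority": "urgent",
    "release": ["v1.2.5", "v1.3.0"],
    "timestamp": {"created": 1541458026, "closed": 1541457010},
}

tokens = list(leaf_tokens(labels))
for path, keyword in tokens:
    print(path, "->", keyword)
```

Note that the two timestamps end up as the strings `"1541458026"` and `"1541457010"`, which is why numeric semantics are not available on these values.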

Querying the top-level `flattened` field searches all leaf values in the object:

Python

resp = client.search(
   index="bug_reports",
   query={
   "term": {
   "labels": "urgent"
   }
   },
)
print(resp)

Ruby

response = client.search(
  index: 'bug_reports',
  body: {
   query: {
   term: {
   labels: 'urgent'
   }
   }
  }
)
puts response

Js

const response = await client.search({
  index: "bug_reports",
  query: {
   term: {
   labels: "urgent",
   },
  },
});
console.log(response);

Console

POST bug_reports/_search
{
  "query": {
   "term": {"labels": "urgent"}
  }
}

To query a specific key in the flattened object, use object dot notation:

Python

resp = client.search(
   index="bug_reports",
   query={
   "term": {
   "labels.release": "v1.3.0"
   }
   },
)
print(resp)

Ruby

response = client.search(
  index: 'bug_reports',
  body: {
   query: {
   term: {
   'labels.release' => 'v1.3.0'
   }
   }
  }
)
puts response

Js

const response = await client.search({
  index: "bug_reports",
  query: {
   term: {
   "labels.release": "v1.3.0",
   },
  },
});
console.log(response);

Console

POST bug_reports/_search
{
  "query": {
   "term": {"labels.release": "v1.3.0"}
  }
}
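
The difference between the two query forms can be sketched in plain Python. This is a simplified model under stated assumptions, not the Elasticsearch implementation: `leaves` and `term_matches` are hypothetical helpers, and matching is reduced to exact string equality on leaf values.

```python
def leaves(obj, path=""):
    """Yield (key_path, keyword) for each leaf of a JSON-like object."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            yield from leaves(value, f"{path}.{key}" if path else key)
    elif isinstance(obj, list):
        for item in obj:
            yield from leaves(item, path)
    else:
        yield path, str(obj)


def term_matches(root, flattened, field, value):
    """Model a term query against a flattened field named `root`."""
    for path, keyword in leaves(flattened):
        searches_all = field == root               # e.g. "labels": any leaf value
        searches_key = field == f"{root}.{path}"   # e.g. "labels.release": one key
        if (searches_all or searches_key) and keyword == value:
            return True
    return False


labels = {
    "priority": "urgent",
    "release": ["v1.2.5", "v1.3.0"],
    "timestamp": {"created": 1541458026, "closed": 1541457010},
}

print(term_matches("labels", labels, "labels", "urgent"))            # top-level field
print(term_matches("labels", labels, "labels.release", "v1.3.0"))    # specific key
print(term_matches("labels", labels, "labels.priority", "v1.3.0"))   # wrong key
```

In this model the first two calls match and the third does not, because `v1.3.0` only appears under the `release` key.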

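Supported operations

Because of the similarities in the way values are indexed, `flattened` fields share much of the same mapping and search functionality as `keyword` fields.

Currently, flattened object fields can be used with the following query types: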

- `term`, `terms`, and `terms_set`
- `prefix`
- `range`
- `match` and `multi_match`
- `query_string` and `simple_query_string`
- `exists`

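When querying, it is not possible to refer to field keys using wildcards, as in `{ "term": {"labels.time*": 1541457010}}`. Note that all queries, including `range`, treat the values as string keywords. Highlighting is not supported on `flattened` fields.

It is possible to sort on a flattened object field, as well as perform simple keyword-style aggregations such as `terms`. As with queries, there is no special support for numerics: all values in the JSON object are treated as keywords. When sorting, this means that values are compared lexicographically.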
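
The practical effect of keyword-only comparison can be seen with plain Python strings, which also compare lexicographically. The sketch below is an analogy rather than Elasticsearch code; `in_keyword_range` is a hypothetical helper that models a `range` query with inclusive bounds.

```python
values = ["9", "10", "200", "1541458026"]

# Keyword-style ordering compares character by character.
lexicographic = sorted(values)
# Numeric ordering, shown only for contrast.
numeric = sorted(values, key=int)

print(lexicographic)  # ['10', '1541458026', '200', '9']
print(numeric)        # ['9', '10', '200', '1541458026']


def in_keyword_range(value, gte, lte):
    """Model a range query whose bounds and values are all strings."""
    return gte <= value <= lte


# "1541458026" falls between "10" and "200" as a string, although not as a number.
print(in_keyword_range("1541458026", "10", "200"))  # True
# "9" sorts after "200" as a string, so it is outside the range.
print(in_keyword_range("9", "10", "200"))           # False
```

If numeric ordering or numeric ranges matter for a key, mapping that key as its own numeric field is likely a better fit than `flattened`.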

Flattened object fields currently cannot be stored. It is not possible to specify the `store` parameter in the mapping.

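Retrieving flattened fields

Field values and concrete subfields can be retrieved using the `fields` parameter. Because the `flattened` field maps an entire object with potentially many subfields as a single field, the response contains the unaltered structure from `_source`.

Single subfields, however, can be fetched by specifying them explicitly in the request. This only works for concrete paths, not for wildcards: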

Python

resp = client.indices.create(
   index="my-index-000001",
   mappings={
   "properties": {
   "flattened_field": {
   "type": "flattened"
   }
   }
   },
)
print(resp)
resp1 = client.index(
   index="my-index-000001",
   id="1",
   refresh=True,
   document={
   "flattened_field": {
   "subfield": "value"
   }
   },
)
print(resp1)
resp2 = client.search(
   index="my-index-000001",
   fields=[
   "flattened_field.subfield"
   ],
   source=False,
)
print(resp2)

Ruby

response = client.indices.create(
  index: 'my-index-000001',
  body: {
   mappings: {
   properties: {
   flattened_field: {
   type: 'flattened'
   }
   }
   }
  }
)
puts response
response = client.index(
  index: 'my-index-000001',
  id: 1,
  refresh: true,
  body: {
   flattened_field: {
   subfield: 'value'
   }
  }
)
puts response
response = client.search(
  index: 'my-index-000001',
  body: {
   fields: [
   'flattened_field.subfield'
   ],
   _source: false
  }
)
puts response

Js

const response = await client.indices.create({
  index: "my-index-000001",
  mappings: {
   properties: {
   flattened_field: {
   type: "flattened",
   },
   },
  },
});
console.log(response);
const response1 = await client.index({
  index: "my-index-000001",
  id: 1,
  refresh: "true",
  document: {
   flattened_field: {
   subfield: "value",
   },
  },
});
console.log(response1);
const response2 = await client.search({
  index: "my-index-000001",
  fields: ["flattened_field.subfield"],
  _source: false,
});
console.log(response2);

Console

PUT my-index-000001
{
  "mappings": {
   "properties": {
   "flattened_field": {
   "type": "flattened"
   }
   }
  }
}
PUT my-index-000001/_doc/1?refresh=true
{
  "flattened_field" : {
   "subfield" : "value"
  }
}
POST my-index-000001/_search
{
  "fields": ["flattened_field.subfield"],
  "_source": false
}

Console result

{
  "took": 2,
  "timed_out": false,
  "_shards": {
   "total": 1,
   "successful": 1,
   "skipped": 0,
   "failed": 0
  },
  "hits": {
   "total": {
   "value": 1,
   "relation": "eq"
   },
   "max_score": 1.0,
   "hits": [{
   "_index": "my-index-000001",
   "_id": "1",
   "_score": 1.0,
   "fields": {
   "flattened_field.subfield" : [ "value" ]
   }
   }]
  }
}
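
To read the sub-field out of such a response on the client side, something like the following sketch could be used. It works on a plain dictionary trimmed from the result above; the helper name `subfield_values` is hypothetical, and a real client response object is assumed to allow the same dictionary-style access.

```python
# Trimmed copy of the search result shown above.
response = {
    "hits": {
        "hits": [
            {
                "_index": "my-index-000001",
                "_id": "1",
                "fields": {"flattened_field.subfield": ["value"]},
            }
        ]
    }
}


def subfield_values(response, path):
    """Collect the values returned for one concrete sub-field path across all hits."""
    values = []
    for hit in response["hits"]["hits"]:
        # Each entry under "fields" is a list, even for a single value.
        values.extend(hit.get("fields", {}).get(path, []))
    return values


print(subfield_values(response, "flattened_field.subfield"))  # ['value']
```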

You can also use a Painless script to retrieve values from sub-fields of flattened fields. Instead of including `doc['<field_name>'].value` in your Painless script, use `doc['<field_name>.<sub-field_name>'].value`. For example, if you have a flattened field called `labels` with a `release` sub-field, your Painless script would be `doc['labels.release'].value`.

Suppose your mapping contains one field of the `flattened` type:

Python

resp = client.indices.create(
   index="my-index-000001",
   mappings={
   "properties": {
   "title": {
   "type": "text"
   },
   "labels": {
   "type": "flattened"
   }
   }
   },
)
print(resp)

Ruby

response = client.indices.create(
  index: 'my-index-000001',
  body: {
   mappings: {
   properties: {
   title: {
   type: 'text'
   },
   labels: {
   type: 'flattened'
   }
   }
   }
  }
)
puts response

Js

const response = await client.indices.create({
  index: "my-index-000001",
  mappings: {
   properties: {
   title: {
   type: "text",
   },
   labels: {
   type: "flattened",
   },
   },
  },
});
console.log(response);

Console

PUT my-index-000001
{
  "mappings": {
   "properties": {
   "title": {
   "type": "text"
   },
   "labels": {
   "type": "flattened"
   }
   }
  }
}

Index a few documents containing your mapped fields. The `labels` field has three sub-fields:

Python

resp = client.bulk(
   index="my-index-000001",
   refresh=True,
   operations=[
   {
   "index": {}
   },
   {
   "title": "Something really urgent",
   "labels": {
   "priority": "urgent",
   "release": [
   "v1.2.5",
   "v1.3.0"
   ],
   "timestamp": {
   "created": 1541458026,
   "closed": 1541457010
   }
   }
   },
   {
   "index": {}
   },
   {
   "title": "Somewhat less urgent",
   "labels": {
   "priority": "high",
   "release": [
   "v1.3.0"
   ],
   "timestamp": {
   "created": 1541458026,
   "closed": 1541457010
   }
   }
   },
   {
   "index": {}
   },
   {
   "title": "Not urgent",
   "labels": {
   "priority": "low",
   "release": [
   "v1.2.0"
   ],
   "timestamp": {
   "created": 1541458026,
   "closed": 1541457010
   }
   }
   }
   ],
)
print(resp)

Ruby

response = client.bulk(
  index: 'my-index-000001',
  refresh: true,
  body: [
   {
   index: {}
   },
   {
   title: 'Something really urgent',
   labels: {
   priority: 'urgent',
   release: [
   'v1.2.5',
   'v1.3.0'
   ],
   timestamp: {
   created: 1_541_458_026,
   closed: 1_541_457_010
   }
   }
   },
   {
   index: {}
   },
   {
   title: 'Somewhat less urgent',
   labels: {
   priority: 'high',
   release: [
   'v1.3.0'
   ],
   timestamp: {
   created: 1_541_458_026,
   closed: 1_541_457_010
   }
   }
   },
   {
   index: {}
   },
   {
   title: 'Not urgent',
   labels: {
   priority: 'low',
   release: [
   'v1.2.0'
   ],
   timestamp: {
   created: 1_541_458_026,
   closed: 1_541_457_010
   }
   }
   }
  ]
)
puts response

Js

const response = await client.bulk({
  index: "my-index-000001",
  refresh: "true",
  operations: [
   {
   index: {},
   },
   {
   title: "Something really urgent",
   labels: {
   priority: "urgent",
   release: ["v1.2.5", "v1.3.0"],
   timestamp: {
   created: 1541458026,
   closed: 1541457010,
   },
   },
   },
   {
   index: {},
   },
   {
   title: "Somewhat less urgent",
   labels: {
   priority: "high",
   release: ["v1.3.0"],
   timestamp: {
   created: 1541458026,
   closed: 1541457010,
   },
   },
   },
   {
   index: {},
   },
   {
   title: "Not urgent",
   labels: {
   priority: "low",
   release: ["v1.2.0"],
   timestamp: {
   created: 1541458026,
   closed: 1541457010,
   },
   },
   },
  ],
});
console.log(response);

Console

POST /my-index-000001/_bulk?refresh
{"index":{}}
{"title":"Something really urgent","labels":{"priority":"urgent","release":["v1.2.5","v1.3.0"],"timestamp":{"created":1541458026,"closed":1541457010}}}
{"index":{}}
{"title":"Somewhat less urgent","labels":{"priority":"high","release":["v1.3.0"],"timestamp":{"created":1541458026,"closed":1541457010}}}
{"index":{}}
{"title":"Not urgent","labels":{"priority":"low","release":["v1.2.0"],"timestamp":{"created":1541458026,"closed":1541457010}}}


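With these documents indexed, a Painless script can read the `release` sub-field through `doc['labels.release'].value`:

#### Painless
```painless
"script": {
  "source": """
    if (doc['labels.release'].value.equals('v1.3.0'))
    {emit(doc['labels.release'].value)}
    else{emit('Version mismatch')}
  """
}
```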
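
The text does not say where this script fragment is meant to be used. Since it calls `emit`, one plausible place is a runtime field defined in a search request. The sketch below only builds such a request body as a Python dictionary; the runtime field name `release_check` is hypothetical, and nothing is sent to a cluster.

```python
import json

# The Painless source from the fragment above.
PAINLESS_SOURCE = """
if (doc['labels.release'].value.equals('v1.3.0'))
{emit(doc['labels.release'].value)}
else{emit('Version mismatch')}
"""

# Assumption: the script backs a keyword runtime field named "release_check".
body = {
    "runtime_mappings": {
        "release_check": {
            "type": "keyword",
            "script": {"source": PAINLESS_SOURCE},
        }
    },
    # Ask for the computed value of the runtime field in each hit.
    "fields": ["release_check"],
}

print(json.dumps(body, indent=2))
```

With an 8.x Python client, a body like this could presumably be passed as keyword arguments to `client.search` together with the index name, but that call is not shown here because it needs a running cluster.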

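Parameters for flattened object fields

The following mapping parameters are accepted:

| Parameter | Description |
| --- | --- |
| `depth_limit` | The maximum allowed depth of the flattened object field, in terms of nested inner objects. If a flattened object field exceeds this limit, an error is thrown. Defaults to `20`. Note that `depth_limit` can be updated dynamically through the update mapping API. |
| `doc_values` | Should the field be stored on disk in a column-stride fashion, so that it can later be used for sorting, aggregations, or scripting? Accepts `true` (default) or `false`. |
| `eager_global_ordinals` | Should global ordinals be loaded eagerly on refresh? Accepts `true` or `false` (default). Enabling this is a good idea on fields that are frequently used for terms aggregations. |
| `ignore_above` | Leaf values longer than this limit will not be indexed. By default, there is no limit and all values are indexed. Note that this limit applies to the leaf values within the flattened object field, not to the length of the entire field. |
| `index` | Determines whether the field should be searchable. Accepts `true` (default) or `false`. |
| `index_options` | What information should be stored in the index for scoring purposes. Defaults to `docs`, but can also be set to `freqs` to take term frequency into account when computing scores. |
| `null_value` | A string value that is substituted for any explicit `null` values within the flattened object field. Defaults to `null`, which means null fields are treated as if they were missing. |
| `similarity` | Which scoring algorithm or similarity should be used. Defaults to `BM25`. |
| `split_queries_on_whitespace` | Whether full-text queries should split the input on whitespace when building a query for this field. Accepts `true` or `false` (default). |
| `time_series_dimensions` | (Optional, array of strings) A list of fields inside the flattened object, where each field is a dimension of the time series. Each field is specified using the relative path from the root field and does not include the root field name. |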
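
As an illustration of how a few of these parameters might be combined, the sketch below builds a mapping dictionary and models the effect of `ignore_above` on leaf values. The values are arbitrary examples, `indexed_leaves` is a hypothetical helper, and measuring the limit in characters is an assumption of this sketch.

```python
import json

# Illustrative values only; the parameter names are the ones listed above.
labels_mapping = {
    "type": "flattened",
    "depth_limit": 5,       # reject objects nested deeper than 5 levels
    "ignore_above": 8,      # skip leaf values longer than 8 characters
    "null_value": "unset",  # index this string in place of explicit nulls
}
mappings = {"properties": {"labels": labels_mapping}}
print(json.dumps(mappings, indent=2))


def indexed_leaves(leaves, ignore_above):
    """Model ignore_above: keep only leaf values within the length limit."""
    return [leaf for leaf in leaves if len(leaf) <= ignore_above]


kept = indexed_leaves(
    ["urgent", "v1.3.0", "a-very-long-label-value"],
    labels_mapping["ignore_above"],
)
print(kept)  # ['urgent', 'v1.3.0']
```

The limit is applied per leaf value, so one overly long value is skipped without affecting the other leaves of the same object.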

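Synthetic `_source`

Synthetic `_source` is generally available (GA) only for TSDB indices (indices that have `index.mode` set to `time_series`). For other indices, synthetic `_source` is in technical preview. Features in technical preview may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

Flattened fields support synthetic `_source` in their default configuration. Synthetic `_source` cannot be used when `doc_values` is disabled.

Synthetic source always sorts flattened fields alphabetically and de-duplicates them. For example: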

Python

resp = client.indices.create(
   index="idx",
   mappings={
   "_source": {
   "mode": "synthetic"
   },
   "properties": {
   "flattened": {
   "type": "flattened"
   }
   }
   },
)
print(resp)
resp1 = client.index(
   index="idx",
   id="1",
   document={
   "flattened": {
   "field": [
   "apple",
   "apple",
   "banana",
   "avocado",
   "10",
   "200",
   "AVOCADO",
   "Banana",
   "Tangerine"
   ]
   }
   },
)
print(resp1)

Ruby

response = client.indices.create(
  index: 'idx',
  body: {
   mappings: {
   _source: {
   mode: 'synthetic'
   },
   properties: {
   flattened: {
   type: 'flattened'
   }
   }
   }
  }
)
puts response
response = client.index(
  index: 'idx',
  id: 1,
  body: {
   flattened: {
   field: [
   'apple',
   'apple',
   'banana',
   'avocado',
   '10',
   '200',
   'AVOCADO',
   'Banana',
   'Tangerine'
   ]
   }
  }
)
puts response

Js

const response = await client.indices.create({
  index: "idx",
  mappings: {
   _source: {
   mode: "synthetic",
   },
   properties: {
   flattened: {
   type: "flattened",
   },
   },
  },
});
console.log(response);
const response1 = await client.index({
  index: "idx",
  id: 1,
  document: {
   flattened: {
   field: [
   "apple",
   "apple",
   "banana",
   "avocado",
   "10",
   "200",
   "AVOCADO",
   "Banana",
   "Tangerine",
   ],
   },
  },
});
console.log(response1);

Console

PUT idx
{
  "mappings": {
   "_source": { "mode": "synthetic" },
   "properties": {
   "flattened": { "type": "flattened" }
   }
  }
}
PUT idx/_doc/1
{
  "flattened": {
   "field": [ "apple", "apple", "banana", "avocado", "10", "200", "AVOCADO", "Banana", "Tangerine" ]
  }
}

Will become:

Console result

{
  "flattened": {
   "field": [ "10", "200", "AVOCADO", "Banana", "Tangerine", "apple", "avocado", "banana" ]
  }
}
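
The ordering in this result is consistent with a plain code-point sort after removing duplicates, which can be reproduced in Python. This is only an analogy for the observed output, not a description of the internal implementation.

```python
original = [
    "apple", "apple", "banana", "avocado",
    "10", "200", "AVOCADO", "Banana", "Tangerine",
]

# De-duplicate, then sort by code point: digits, then uppercase, then lowercase.
synthetic = sorted(set(original))
print(synthetic)
# ['10', '200', 'AVOCADO', 'Banana', 'Tangerine', 'apple', 'avocado', 'banana']
```

Digits sort before uppercase letters and uppercase before lowercase, so `"Banana"` and `"banana"` end up far apart, and the duplicate `"apple"` appears only once.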

Synthetic source always uses nested objects instead of arrays of objects. For example:

Python

resp = client.indices.create(
   index="idx",
   mappings={
   "_source": {
   "mode": "synthetic"
   },
   "properties": {
   "flattened": {
   "type": "flattened"
   }
   }
   },
)
print(resp)
resp1 = client.index(
   index="idx",
   id="1",
   document={
   "flattened": {
   "field": [
   {
   "id": 1,
   "name": "foo"
   },
   {
   "id": 2,
   "name": "bar"
   },
   {
   "id": 3,
   "name": "baz"
   }
   ]
   }
   },
)
print(resp1)

Ruby

response = client.indices.create(
  index: 'idx',
  body: {
   mappings: {
   _source: {
   mode: 'synthetic'
   },
   properties: {
   flattened: {
   type: 'flattened'
   }
   }
   }
  }
)
puts response
response = client.index(
  index: 'idx',
  id: 1,
  body: {
   flattened: {
   field: [
   {
   id: 1,
   name: 'foo'
   },
   {
   id: 2,
   name: 'bar'
   },
   {
   id: 3,
   name: 'baz'
   }
   ]
   }
  }
)
puts response

Js

const response = await client.indices.create({
  index: "idx",
  mappings: {
   _source: {
   mode: "synthetic",
   },
   properties: {
   flattened: {
   type: "flattened",
   },
   },
  },
});
console.log(response);
const response1 = await client.index({
  index: "idx",
  id: 1,
  document: {
   flattened: {
   field: [
   {
   id: 1,
   name: "foo",
   },
   {
   id: 2,
   name: "bar",
   },
   {
   id: 3,
   name: "baz",
   },
   ],
   },
  },
});
console.log(response1);

Console

PUT idx
{
  "mappings": {
   "_source": { "mode": "synthetic" },
   "properties": {
   "flattened": { "type": "flattened" }
   }
  }
}
PUT idx/_doc/1
{
  "flattened": {
   "field": [
   { "id": 1, "name": "foo" },
   { "id": 2, "name": "bar" },
   { "id": 3, "name": "baz" }
   ]
  }
}

Will become (note the nested objects instead of the "flattened" array):

Console result

{
   "flattened": {
   "field": {
   "id": [ "1", "2", "3" ],
   "name": [ "bar", "baz", "foo" ]
   }
   }
}
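
The shape of this result can be modelled by merging the objects key by key and sorting each key's values. The sketch below is an approximation of the observed output under that assumption; `merge_objects` is a hypothetical helper.

```python
from collections import defaultdict


def merge_objects(objects):
    """Merge a list of flat objects into one object with sorted value lists."""
    merged = defaultdict(set)
    for obj in objects:
        for key, value in obj.items():
            # Values are keywords, so numbers are rendered as strings.
            merged[key].add(str(value))
    return {key: sorted(values) for key, values in sorted(merged.items())}


field = [
    {"id": 1, "name": "foo"},
    {"id": 2, "name": "bar"},
    {"id": 3, "name": "baz"},
]

print(merge_objects(field))
# {'id': ['1', '2', '3'], 'name': ['bar', 'baz', 'foo']}
```

Because each key's values are sorted independently, the pairing between an `id` and its `name` can no longer be read back from the synthetic source: `"1"` was originally paired with `"foo"`, but `"foo"` is now last in its list.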

Synthetic source always uses single-valued fields for one-element arrays. For example:

Python

resp = client.indices.create(
   index="idx",
   mappings={
   "_source": {
   "mode": "synthetic"
   },
   "properties": {
   "flattened": {
   "type": "flattened"
   }
   }
   },
)
print(resp)
resp1 = client.index(
   index="idx",
   id="1",
   document={
   "flattened": {
   "field": [
   "foo"
   ]
   }
   },
)
print(resp1)

Ruby

response = client.indices.create(
  index: 'idx',
  body: {
   mappings: {
   _source: {
   mode: 'synthetic'
   },
   properties: {
   flattened: {
   type: 'flattened'
   }
   }
   }
  }
)
puts response
response = client.index(
  index: 'idx',
  id: 1,
  body: {
   flattened: {
   field: [
   'foo'
   ]
   }
  }
)
puts response

Js

const response = await client.indices.create({
  index: "idx",
  mappings: {
   _source: {
   mode: "synthetic",
   },
   properties: {
   flattened: {
   type: "flattened",
   },
   },
  },
});
console.log(response);
const response1 = await client.index({
  index: "idx",
  id: 1,
  document: {
   flattened: {
   field: ["foo"],
   },
  },
});
console.log(response1);

Console

PUT idx
{
  "mappings": {
   "_source": { "mode": "synthetic" },
   "properties": {
   "flattened": { "type": "flattened" }
   }
  }
}
PUT idx/_doc/1
{
  "flattened": {
   "field": [ "foo" ]
  }
}

Will become (note the single value instead of the one-element array):

Console result

{
  "flattened": {
   "field": "foo"
  }
}
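
Client code that reads synthetic source therefore has to accept both a single value and a list for the same key. A minimal sketch of the collapsing rule and of a defensive reader follows; `collapse` and `as_list` are hypothetical helpers.

```python
def collapse(values):
    """Model the rule above: a one-element array is rendered as a single value."""
    return values[0] if len(values) == 1 else values


def as_list(value):
    """Normalise a field read from synthetic source back to a list."""
    return value if isinstance(value, list) else [value]


print(collapse(["foo"]))         # foo
print(collapse(["bar", "foo"]))  # ['bar', 'foo']
print(as_list("foo"))            # ['foo']
```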