Pipeline aggregations - Normalize - Elasticsearch Guide v8.15

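Normalize aggregation

A parent pipeline aggregation that calculates a normalized or rescaled value for each bucket value. Values that cannot be normalized are skipped, using the skip gap policy.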

Syntax

In isolation, a normalize aggregation looks like this:

Js

{
  "normalize": {
   "buckets_path": "normalized",
   "method": "percent_of_sum"
  }
}

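Table 75. normalize_pipeline parameters

| Parameter name | Description | Required | Default value |
| --- | --- | --- | --- |
| `buckets_path` | The path to the buckets to normalize (see [`buckets_path` syntax](045ebb369a1af050.md#buckets-path-syntax "buckets_path Syntax") for details) | Required | |
| `method` | The specific method to apply | Required | |
| `format` | A DecimalFormat pattern for the output value. If specified, the formatted value is returned in the aggregation's `value_as_string` property | Optional | `null` |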

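Methods

The normalize aggregation supports multiple methods for transforming bucket values. Each method definition below uses the following original set of bucket values as an example: [5, 5, 10, 50, 10, 20].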

rescale_0_1
This method rescales the data so that the minimum value is zero and the maximum value is 1, with the remaining values normalized linearly in between.
```
x' = (x - min_x) / (max_x - min_x)
```

[0, 0, .1111, 1, .1111, .3333]
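The formula can be checked against the sample values with a few lines of Python. This is only an illustrative sketch: the function name `rescale_0_1` is chosen here for readability and is not part of any Elasticsearch client API.

```python
def rescale_0_1(values):
    """Linearly rescale values so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    # This sketch does not handle the degenerate case where hi == lo.
    return [(x - lo) / (hi - lo) for x in values]


buckets = [5, 5, 10, 50, 10, 20]
print([round(v, 4) for v in rescale_0_1(buckets)])
# prints [0.0, 0.0, 0.1111, 1.0, 0.1111, 0.3333]
```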

rescale_0_100
This method rescales the data so that the minimum value is zero and the maximum value is 100, with the remaining values normalized linearly in between.
```
x' = 100 * (x - min_x) / (max_x - min_x)
```

[0, 0, 11.11, 100, 11.11, 33.33]
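As a quick sanity check of the formula, the following sketch applies it to the sample values. The helper name `rescale_0_100` is illustrative only.

```python
def rescale_0_100(values):
    """Linearly rescale values so the minimum maps to 0 and the maximum to 100."""
    lo, hi = min(values), max(values)
    # This sketch does not handle the degenerate case where hi == lo.
    return [100 * (x - lo) / (hi - lo) for x in values]


buckets = [5, 5, 10, 50, 10, 20]
print([round(v, 2) for v in rescale_0_100(buckets)])
# prints [0.0, 0.0, 11.11, 100.0, 11.11, 33.33]
```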

percent_of_sum
This method normalizes each value so that it represents a percentage of the total sum of all values.
```
x' = x / sum_x
```

[5%, 5%, 10%, 50%, 10%, 20%]
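Note that the formula itself yields a fraction between 0 and 1; the percent sign only appears once the value is formatted. The sketch below illustrates this with a hypothetical helper named `percent_of_sum` and Python's own percent formatting, which stands in for the `format` parameter here.

```python
def percent_of_sum(values):
    """Express each value as a fraction of the sum of all values."""
    total = sum(values)
    return [x / total for x in values]


buckets = [5, 5, 10, 50, 10, 20]
ratios = percent_of_sum(buckets)
print(ratios)
# prints [0.05, 0.05, 0.1, 0.5, 0.1, 0.2]
print([f"{r:.0%}" for r in ratios])
# prints ['5%', '5%', '10%', '50%', '10%', '20%']
```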

mean
This method normalizes each value by how far it lies from the mean, relative to the range of the data.
```
x' = (x - mean_x) / (max_x - min_x)
```

[-0.26, -0.26, -0.15, 0.74, -0.15, 0.07]
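Evaluating the formula on the six sample values (mean 16.67, range 45) gives results between -1 and 1. The sketch below does that arithmetic directly; `mean_normalize` is a name made up for this illustration.

```python
def mean_normalize(values):
    """Subtract the mean from each value and divide by the range (max - min)."""
    lo, hi = min(values), max(values)
    mean = sum(values) / len(values)
    # This sketch does not handle the degenerate case where hi == lo.
    return [(x - mean) / (hi - lo) for x in values]


buckets = [5, 5, 10, 50, 10, 20]
print([round(v, 2) for v in mean_normalize(buckets)])
# prints [-0.26, -0.26, -0.15, 0.74, -0.15, 0.07]
```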

z-score
This method normalizes each value so that it represents its distance from the mean, measured in standard deviations.
```
x' = (x - mean_x) / stdev_x
```

[-0.68, -0.68, -0.39, 1.94, -0.39, 0.19]
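The sample output above is reproduced when `stdev_x` is taken as the sample standard deviation (denominator n - 1). That is an inference from the listed numbers rather than a statement about how Elasticsearch computes it. A minimal sketch under that assumption, with an illustrative helper name `z_score`:

```python
import statistics


def z_score(values):
    """Express each value as its distance from the mean in standard deviations."""
    mean = sum(values) / len(values)
    # Assumption: sample standard deviation (n - 1), which matches the listed output.
    stdev = statistics.stdev(values)
    return [(x - mean) / stdev for x in values]


buckets = [5, 5, 10, 50, 10, 20]
print([round(v, 2) for v in z_score(buckets)])
# prints [-0.68, -0.68, -0.39, 1.94, -0.39, 0.19]
```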

softmax
This method exponentiates each value and normalizes it relative to the sum of the exponentials of all the original values.
```
x' = e^x / sum_e_x
```

[2.862E-20, 2.862E-20, 4.248E-18, 0.999, 4.248E-18, 9.357E-14]
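Because the exponential grows so quickly, the largest bucket (50) takes almost all of the weight, and the results for the input values 5, 10 and 20 are roughly 2.86e-20, 4.25e-18 and 9.36e-14 respectively. The following sketch evaluates the formula as written; the function name `softmax` is used for illustration only.

```python
import math


def softmax(values):
    """Exponentiate each value and divide by the sum of all exponentials."""
    # math.exp can overflow for large inputs; subtracting max(values) from each
    # value first is a common safeguard that leaves the result unchanged.
    exps = [math.exp(x) for x in values]
    total = sum(exps)
    return [e / total for e in exps]


buckets = [5, 5, 10, 50, 10, 20]
for original, normalized in zip(buckets, softmax(buckets)):
    print(original, f"{normalized:.3e}")
```

The outputs sum to 1, and equal inputs always produce equal outputs.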

Example

The following snippet calculates each month's sales as a percentage of total sales:

Python

resp = client.search(
    index="sales",
    size=0,
    aggs={
        "sales_per_month": {
            "date_histogram": {
                "field": "date",
                "calendar_interval": "month"
            },
            "aggs": {
                "sales": {
                    "sum": {
                        "field": "price"
                    }
                },
                "percent_of_total_sales": {
                    "normalize": {
                        "buckets_path": "sales",
                        "method": "percent_of_sum",
                        "format": "00.00%"
                    }
                }
            }
        }
    },
)
print(resp)

Ruby

response = client.search(
  index: 'sales',
  body: {
    size: 0,
    aggregations: {
      sales_per_month: {
        date_histogram: {
          field: 'date',
          calendar_interval: 'month'
        },
        aggregations: {
          sales: {
            sum: {
              field: 'price'
            }
          },
          percent_of_total_sales: {
            normalize: {
              buckets_path: 'sales',
              method: 'percent_of_sum',
              format: '00.00%'
            }
          }
        }
      }
    }
  }
)
puts response

Js

const response = await client.search({
  index: "sales",
  size: 0,
  aggs: {
    sales_per_month: {
      date_histogram: {
        field: "date",
        calendar_interval: "month",
      },
      aggs: {
        sales: {
          sum: {
            field: "price",
          },
        },
        percent_of_total_sales: {
          normalize: {
            buckets_path: "sales",
            method: "percent_of_sum",
            format: "00.00%",
          },
        },
      },
    },
  },
});
console.log(response);

Console

POST /sales/_search
{
  "size": 0,
  "aggs": {
    "sales_per_month": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"
      },
      "aggs": {
        "sales": {
          "sum": {
            "field": "price"
          }
        },
        "percent_of_total_sales": {
          "normalize": {
            "buckets_path": "sales",
            "method": "percent_of_sum",
            "format": "00.00%"
          }
        }
      }
    }
  }
}


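- `buckets_path` instructs this normalize aggregation to use the output of the `sales` aggregation for rescaling.
- `method` sets which rescaling to apply. In this case, `percent_of_sum` calculates each sales value as a percentage of all sales in the parent bucket.
- `format` controls how the metric is formatted as a string, using Java's `DecimalFormat` pattern. In this case, it multiplies the value by 100 and appends a `%`.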
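To make the effect of the `00.00%` pattern concrete, here is a rough Python analogue. It is not Java's `DecimalFormat`, and `format_percent` is a hypothetical helper written for this illustration; rounding behaviour may differ from Java in edge cases.

```python
def format_percent(value: float) -> str:
    """Hypothetical helper approximating the DecimalFormat pattern "00.00%"."""
    # Multiply by 100, keep two decimals, zero-pad to two integer digits, append "%".
    return f"{value * 100:05.2f}%"


for raw in (0.5583756345177665, 0.06091370558375635, 0.38071065989847713):
    print(format_percent(raw))
# prints 55.84%, 06.09% and 38.07% on separate lines
```

The raw fraction stays available in `value`; only `value_as_string` carries the formatted form.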

The response may look like this:

Console-Result

{
  "took": 11,
  "timed_out": false,
  "_shards": ...,
  "hits": ...,
  "aggregations": {
    "sales_per_month": {
      "buckets": [
        {
          "key_as_string": "2015/01/01 00:00:00",
          "key": 1420070400000,
          "doc_count": 3,
          "sales": {
            "value": 550.0
          },
          "percent_of_total_sales": {
            "value": 0.5583756345177665,
            "value_as_string": "55.84%"
          }
        },
        {
          "key_as_string": "2015/02/01 00:00:00",
          "key": 1422748800000,
          "doc_count": 2,
          "sales": {
            "value": 60.0
          },
          "percent_of_total_sales": {
            "value": 0.06091370558375635,
            "value_as_string": "06.09%"
          }
        },
        {
          "key_as_string": "2015/03/01 00:00:00",
          "key": 1425168000000,
          "doc_count": 2,
          "sales": {
            "value": 375.0
          },
          "percent_of_total_sales": {
            "value": 0.38071065989847713,
            "value_as_string": "38.07%"
          }
        }
      ]
    }
  }
}
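The `percent_of_total_sales` values in this response can be reproduced by hand: the three monthly sums add up to 985, and each value is the month's sum divided by that total. A short sketch of the arithmetic, using the figures from the response (the dictionary keys are shortened labels chosen for this example):

```python
import math

# Monthly "sales" values taken from the response above.
monthly_sales = {"2015/01": 550.0, "2015/02": 60.0, "2015/03": 375.0}

total = sum(monthly_sales.values())  # 985.0
shares = {month: sales / total for month, sales in monthly_sales.items()}

for month, share in shares.items():
    print(month, round(share, 4))
# prints 0.5584, 0.0609 and 0.3807 for the three months

# The shares of all buckets add up to 1 (that is, 100%).
assert math.isclose(sum(shares.values()), 1.0)
```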