集計関数

説明

集計関数は、行全体の値に対して演算を行い、合計、平均、カウント、最小/最大値、標準偏差、推定などの数学的計算、および一部の非数学的演算を実行します。

構文

aggregate_function(input1 [, input2, ...]) FILTER (WHERE boolean_expression)

パラメータ

Spark の集計関数のすべての例については、組み込み集計関数ドキュメントを参照してください。

順序付きセット集計関数

これらの集計関数は、他の集計関数とは異なる構文を使用し、値を順序付けるための式(通常は列名)を指定します。

構文

{ PERCENTILE_CONT | PERCENTILE_DISC }(percentile) WITHIN GROUP (ORDER BY { order_by_expression [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [ , ... ] }) FILTER (WHERE boolean_expression)

パラメータ

CREATE OR REPLACE TEMPORARY VIEW basic_pays AS SELECT * FROM VALUES
('Diane Murphy','Accounting',8435),
('Mary Patterson','Accounting',9998),
('Jeff Firrelli','Accounting',8992),
('William Patterson','Accounting',8870),
('Gerard Bondur','Accounting',11472),
('Anthony Bow','Accounting',6627),
('Leslie Jennings','IT',8113),
('Leslie Thompson','IT',5186),
('Julie Firrelli','Sales',9181),
('Steve Patterson','Sales',9441),
('Foon Yue Tseng','Sales',6660),
('George Vanauf','Sales',10563),
('Loui Bondur','SCM',10449),
('Gerard Hernandez','SCM',6949),
('Pamela Castillo','SCM',11303),
('Larry Bott','SCM',11798),
('Barry Jones','SCM',10586)
AS basic_pays(employee_name, department, salary);

SELECT * FROM basic_pays;
+-----------------+----------+------+
|    employee_name|department|salary|
+-----------------+----------+------+
|      Anthony Bow|Accounting|  6627|
|      Barry Jones|       SCM| 10586|
|     Diane Murphy|Accounting|  8435|
|   Foon Yue Tseng|     Sales|  6660|
|    George Vanauf|     Sales| 10563|
|    Gerard Bondur|Accounting| 11472|
| Gerard Hernandez|       SCM|  6949|
|    Jeff Firrelli|Accounting|  8992|
|   Julie Firrelli|     Sales|  9181|
|       Larry Bott|       SCM| 11798|
|  Leslie Jennings|        IT|  8113|
|  Leslie Thompson|        IT|  5186|
|      Loui Bondur|       SCM| 10449|
|   Mary Patterson|Accounting|  9998|
|  Pamela Castillo|       SCM| 11303|
|  Steve Patterson|     Sales|  9441|
|William Patterson|Accounting|  8870|
+-----------------+----------+------+

SELECT
    department,
    percentile_cont(0.25) WITHIN GROUP (ORDER BY salary) AS pc1,
    percentile_cont(0.25) WITHIN GROUP (ORDER BY salary) FILTER (WHERE employee_name LIKE '%Bo%') AS pc2,
    percentile_cont(0.25) WITHIN GROUP (ORDER BY salary DESC) AS pc3,
    percentile_cont(0.25) WITHIN GROUP (ORDER BY salary DESC) FILTER (WHERE employee_name LIKE '%Bo%') AS pc4,
    percentile_disc(0.25) WITHIN GROUP (ORDER BY salary) AS pd1,
    percentile_disc(0.25) WITHIN GROUP (ORDER BY salary) FILTER (WHERE employee_name LIKE '%Bo%') AS pd2,
    percentile_disc(0.25) WITHIN GROUP (ORDER BY salary DESC) AS pd3,
    percentile_disc(0.25) WITHIN GROUP (ORDER BY salary DESC) FILTER (WHERE employee_name LIKE '%Bo%') AS pd4
FROM basic_pays
GROUP BY department
ORDER BY department;
+----------+-------+--------+-------+--------+-----+-----+-----+-----+
|department|    pc1|     pc2|    pc3|     pc4|  pd1|  pd2|  pd3|  pd4|
+----------+-------+--------+-------+--------+-----+-----+-----+-----+
|Accounting|8543.75| 7838.25| 9746.5|10260.75| 8435| 6627| 9998|11472|
|        IT|5917.75|    NULL|7381.25|    NULL| 5186| NULL| 8113| NULL|
|     Sales|8550.75|    NULL| 9721.5|    NULL| 6660| NULL|10563| NULL|
|       SCM|10449.0|10786.25|11303.0|11460.75|10449|10449|11303|11798|
+----------+-------+--------+-------+--------+-----+-----+-----+-----+