Aggregate Functions

Aggregate functions operate on multiple sets of values and return a single value.

AVG( <expr> )

Returns the arithmetic mean of all input values.

select avg(colA) from table1;

COUNT( <expr> )

Returns the number of rows in an input value.

select count(colA) from table1;

select count(*), colA from table1 group by colA;

DISTINCT( <expr> )

Returns the number of distinct values in a given column.

select distinct(colA) from table1;

MAX( <expr> )

Returns the maximum value of all input values.

select max(colA) from table1;

MIN( <expr> )

Returns the minimum value of all input values.

select min(colA) from table1;

SUM( <expr> )

Returns the arithmetic sum of all input values.

select sum(colA) from table1;

ARBITRARY / ANY_VALUE( <expr> )

Returns random value from set of values

select arbitrary(colA) from table1;

LISTAGG( [distinct] <column> [, <delimiter>]) [within group (order by <column> ) )

Concatenates the input values together, with each value separated by the specified delimiter string.

Support datatype: String or data type that is cast to string

select c1, listagg(distinct c2,'++') within group(order by c2)
from(
SELECT c1, c2
from (values (1, 'one'), (1, 'two'),(1, 'three'),(1, 'two'),(1, null),(1, null),(2,null),(2,'five')) AS v1 (c1, c2))
group by 1
order by 1,2;  
 > |1    |one++three++two
   |2    |five

Usages:

  1. This function returns a single string comprising all non-NULL input values

  2. If we didn’t provide WITHIN GROUP (ORDER BY) then the result will be unpredictable

  3. We support exact column name in the order by of within group(order by <column>)we dont support ordinals in the order by

  4. We don’t support over() clause with listagg.

  5. If delimiter is not provided then default value is ,.

STRING_AGG( [distinct] <column> [, <delimiter>]) order by <column> limit <value> )

Concatenates the input values together, with each value separated by the specified delimiter string.

Support datatype: String or data type that is cast to string

select c1, string_agg(distinct c2,',' order by c2 limit 2) as c1
from(
select c1, c2
from (values (1, 'one'), (1, 'two'),(1, 'three'),(1, 'two'),(1, null),(1, null),(2,null),(2,'five')) AS v1 (c1, c2))
group by 1
order by 1,2;
> |1    |one ,three
  |2    |five

Usages:

  1. This function returns a single string comprising all non-NULL input values

  2. If we didn’t provide ORDER BY <column> then the result will be unpredictable

  3. We support exact column name in the order by of ORDER BY <column>we don't support ordinals in the order by

  4. We don’t support over() clause with string.

  5. If delimiter is not provided then default value is -.

COUNT(DISTINCT( <expr> ))

Returns distinct count of first projection used in the query

select count(distinct colA, colB) from table1;

Last updated

#930: Cross account hive GCP

Change request updated