Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -55,28 +55,7 @@ ClickHouse Cloud controls user access in two places, via the [cloud console](/cl

## Data types {#data-types}

ClickHouse offers more granular precision with respect to numerics. For example, BigQuery offers the numeric types [`INT64`, `NUMERIC`, `BIGNUMERIC` and `FLOAT64`](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#numeric_types). Contrast these with ClickHouse, which offers multiple precision types for decimals, floats, and integers. With these data types, you can optimize storage and memory overhead, resulting in faster queries and lower resource consumption. Below we map the equivalent ClickHouse type for each BigQuery type:

| BigQuery | ClickHouse |
|----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| [ARRAY](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#array_type) | [Array(t)](/sql-reference/data-types/array) |
| [NUMERIC](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#decimal_types) | [Decimal(P, S), Decimal32(S), Decimal64(S), Decimal128(S)](/sql-reference/data-types/decimal) |
| [BIG NUMERIC](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#decimal_types) | [Decimal256(S)](/sql-reference/data-types/decimal) |
| [BOOL](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#boolean_type) | [Bool](/sql-reference/data-types/boolean) |
| [BYTES](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#bytes_type) | [FixedString](/sql-reference/data-types/fixedstring) |
| [DATE](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#date_type) | [Date32](/sql-reference/data-types/date32) (with narrower range) |
| [DATETIME](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#datetime_type) | [DateTime](/sql-reference/data-types/datetime), [DateTime64](/sql-reference/data-types/datetime64) (narrow range, higher precision) |
| [FLOAT64](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#floating_point_types) | [Float64](/sql-reference/data-types/float) |
| [GEOGRAPHY](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#geography_type) | [Geo Data Types](/sql-reference/data-types/float) |
| [INT64](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#integer_types) | [UInt8, UInt16, UInt32, UInt64, UInt128, UInt256, Int8, Int16, Int32, Int64, Int128, Int256](/sql-reference/data-types/int-uint) |
| [INTERVAL](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#integer_types) | NA - [supported as expression](/sql-reference/data-types/special-data-types/interval#usage-remarks) or [through functions](/sql-reference/functions/date-time-functions#addYears) |
| [JSON](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#json_type) | [JSON](/integrations/data-formats/json/inference) |
| [STRING](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#string_type) | [String (bytes)](/sql-reference/data-types/string) |
| [STRUCT](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#constructing_a_struct) | [Tuple](/sql-reference/data-types/tuple), [Nested](/sql-reference/data-types/nested-data-structures/nested) |
| [TIME](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#time_type) | [DateTime64](/sql-reference/data-types/datetime64) |
| [TIMESTAMP](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#timestamp_type) | [DateTime64](/sql-reference/data-types/datetime64) |

When presented with multiple options for ClickHouse types, consider the actual range of the data and pick the lowest required. Also, consider utilizing [appropriate codecs](https://clickhouse.com/blog/optimize-clickhouse-codecs-compression-schema) for further compression.
For the full BigQuery-to-ClickHouse type mapping, see the [Data types section](/migrations/bigquery/sql-translation-reference#data-types) of the SQL translation reference.

## Query acceleration techniques {#query-acceleration-techniques}

Expand Down Expand Up @@ -157,324 +136,4 @@ ClickHouse provides standard SQL with many extensions and improvements that make

## Arrays {#arrays}

Compared to BigQuery's 8 array functions, ClickHouse has over 80 [built-in array functions](/sql-reference/functions/array-functions) for modeling and solving a wide range of problems elegantly and simply.

A typical design pattern in ClickHouse is to use the [`groupArray`](/sql-reference/aggregate-functions/reference/grouparray) aggregate function to (temporarily) transform specific row values of a table into an array. This then can be conveniently processed via array functions, and the result can be converted back into individual table rows via [`arrayJoin`](/sql-reference/functions/array-join) aggregate function.

Because ClickHouse SQL supports [higher order lambda functions](/sql-reference/functions/overview#arrow-operator-and-lambda), many advanced array operations can be achieved by simply calling one of the higher order built-in array functions, instead of temporarily converting arrays back to tables, as it is often [required](https://cloud.google.com/bigquery/docs/arrays) in BigQuery, e.g. for [filtering](https://cloud.google.com/bigquery/docs/arrays#filtering_arrays) or [zipping](https://cloud.google.com/bigquery/docs/arrays#zipping_arrays) arrays. In ClickHouse these operations are just a simple function call of the higher order functions [`arrayFilter`](/sql-reference/functions/array-functions#arrayFilter), and [`arrayZip`](/sql-reference/functions/array-functions#arrayZip), respectively.

In the following, we provide a mapping of array operations from BigQuery to ClickHouse:

| BigQuery | ClickHouse |
|----------|------------|
| [ARRAY_CONCAT](https://cloud.google.com/bigquery/docs/reference/standard-sql/array_functions#array_concat) | [arrayConcat](/sql-reference/functions/array-functions#arrayConcat) |
| [ARRAY_LENGTH](https://cloud.google.com/bigquery/docs/reference/standard-sql/array_functions#array_length) | [length](/sql-reference/functions/array-functions#length) |
| [ARRAY_REVERSE](https://cloud.google.com/bigquery/docs/reference/standard-sql/array_functions#array_reverse) | [arrayReverse](/sql-reference/functions/array-functions#arrayReverse) |
| [ARRAY_TO_STRING](https://cloud.google.com/bigquery/docs/reference/standard-sql/array_functions#array_to_string) | [arrayStringConcat](/sql-reference/functions/splitting-merging-functions#arrayStringConcat) |
| [GENERATE_ARRAY](https://cloud.google.com/bigquery/docs/reference/standard-sql/array_functions#generate_array) | [range](/sql-reference/functions/array-functions#range) |

**Create an array with one element for each row in a subquery**

_BigQuery_

[ARRAY function](https://cloud.google.com/bigquery/docs/reference/standard-sql/array_functions#array)

```sql
SELECT ARRAY
(SELECT 1 UNION ALL
SELECT 2 UNION ALL
SELECT 3) AS new_array;

/*-----------*
| new_array |
+-----------+
| [1, 2, 3] |
*-----------*/
```

_ClickHouse_

[groupArray](/sql-reference/aggregate-functions/reference/grouparray) aggregate function

```sql
SELECT groupArray(*) AS new_array
FROM
(
SELECT 1
UNION ALL
SELECT 2
UNION ALL
SELECT 3
)
┌─new_array─┐
1. │ [1,2,3] │
└───────────┘
```

**Convert an array into a set of rows**

_BigQuery_

[`UNNEST`](https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#unnest_operator) operator

```sql
SELECT *
FROM UNNEST(['foo', 'bar', 'baz', 'qux', 'corge', 'garply', 'waldo', 'fred'])
AS element
WITH OFFSET AS offset
ORDER BY offset;

/*----------+--------*
| element | offset |
+----------+--------+
| foo | 0 |
| bar | 1 |
| baz | 2 |
| qux | 3 |
| corge | 4 |
| garply | 5 |
| waldo | 6 |
| fred | 7 |
*----------+--------*/
```

_ClickHouse_

[ARRAY JOIN](/sql-reference/statements/select/array-join) clause

```sql
WITH ['foo', 'bar', 'baz', 'qux', 'corge', 'garply', 'waldo', 'fred'] AS values
SELECT element, num-1 AS offset
FROM (SELECT values AS element) AS subquery
ARRAY JOIN element, arrayEnumerate(element) AS num;

/*----------+--------*
| element | offset |
+----------+--------+
| foo | 0 |
| bar | 1 |
| baz | 2 |
| qux | 3 |
| corge | 4 |
| garply | 5 |
| waldo | 6 |
| fred | 7 |
*----------+--------*/
```

**Return an array of dates**

_BigQuery_

[GENERATE_DATE_ARRAY](https://cloud.google.com/bigquery/docs/reference/standard-sql/array_functions#generate_date_array) function

```sql
SELECT GENERATE_DATE_ARRAY('2016-10-05', '2016-10-08') AS example;

/*--------------------------------------------------*
| example |
+--------------------------------------------------+
| [2016-10-05, 2016-10-06, 2016-10-07, 2016-10-08] |
*--------------------------------------------------*/
```

[range](/sql-reference/functions/array-functions#range) + [arrayMap](/sql-reference/functions/array-functions#arrayMap) functions

_ClickHouse_

```sql
SELECT arrayMap(x -> (toDate('2016-10-05') + x), range(toUInt32((toDate('2016-10-08') - toDate('2016-10-05')) + 1))) AS example

┌─example───────────────────────────────────────────────┐
1. │ ['2016-10-05','2016-10-06','2016-10-07','2016-10-08'] │
└───────────────────────────────────────────────────────┘
```

**Return an array of timestamps**

_BigQuery_

[GENERATE_TIMESTAMP_ARRAY](https://cloud.google.com/bigquery/docs/reference/standard-sql/array_functions#generate_timestamp_array) function

```sql
SELECT GENERATE_TIMESTAMP_ARRAY('2016-10-05 00:00:00', '2016-10-07 00:00:00',
INTERVAL 1 DAY) AS timestamp_array;

/*--------------------------------------------------------------------------*
| timestamp_array |
+--------------------------------------------------------------------------+
| [2016-10-05 00:00:00+00, 2016-10-06 00:00:00+00, 2016-10-07 00:00:00+00] |
*--------------------------------------------------------------------------*/
```

_ClickHouse_

[range](/sql-reference/functions/array-functions#range) + [arrayMap](/sql-reference/functions/array-functions#arrayMap) functions

```sql
SELECT arrayMap(x -> (toDateTime('2016-10-05 00:00:00') + toIntervalDay(x)), range(dateDiff('day', toDateTime('2016-10-05 00:00:00'), toDateTime('2016-10-07 00:00:00')) + 1)) AS timestamp_array

Query id: b324c11f-655b-479f-9337-f4d34fd02190

┌─timestamp_array─────────────────────────────────────────────────────┐
1. │ ['2016-10-05 00:00:00','2016-10-06 00:00:00','2016-10-07 00:00:00'] │
└─────────────────────────────────────────────────────────────────────┘
```

**Filtering arrays**

_BigQuery_

Requires temporarily converting arrays back to tables via [`UNNEST`](https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#unnest_operator) operator

```sql
WITH Sequences AS
(SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
UNION ALL SELECT [5, 10] AS some_numbers)
SELECT
ARRAY(SELECT x * 2
FROM UNNEST(some_numbers) AS x
WHERE x < 5) AS doubled_less_than_five
FROM Sequences;

/*------------------------*
| doubled_less_than_five |
+------------------------+
| [0, 2, 2, 4, 6] |
| [4, 8] |
| [] |
*------------------------*/
```

_ClickHouse_

[arrayFilter](/sql-reference/functions/array-functions#arrayFilter) function

```sql
WITH Sequences AS
(
SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
UNION ALL
SELECT [2, 4, 8, 16, 32] AS some_numbers
UNION ALL
SELECT [5, 10] AS some_numbers
)
SELECT arrayMap(x -> (x * 2), arrayFilter(x -> (x < 5), some_numbers)) AS doubled_less_than_five
FROM Sequences;
┌─doubled_less_than_five─┐
1. │ [0,2,2,4,6] │
└────────────────────────┘
┌─doubled_less_than_five─┐
2. │ [] │
└────────────────────────┘
┌─doubled_less_than_five─┐
3. │ [4,8] │
└────────────────────────┘
```

**Zipping arrays**

_BigQuery_

Requires temporarily converting arrays back to tables via [`UNNEST`](https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#unnest_operator) operator

```sql
WITH
Combinations AS (
SELECT
['a', 'b'] AS letters,
[1, 2, 3] AS numbers
)
SELECT
ARRAY(
SELECT AS STRUCT
letters[SAFE_OFFSET(index)] AS letter,
numbers[SAFE_OFFSET(index)] AS number
FROM Combinations
CROSS JOIN
UNNEST(
GENERATE_ARRAY(
0,
LEAST(ARRAY_LENGTH(letters), ARRAY_LENGTH(numbers)) - 1)) AS index
ORDER BY index
);

/*------------------------------*
| pairs |
+------------------------------+
| [{ letter: "a", number: 1 }, |
| { letter: "b", number: 2 }] |
*------------------------------*/
```

_ClickHouse_

[arrayZip](/sql-reference/functions/array-functions#arrayZip) function

```sql
WITH Combinations AS
(
SELECT
['a', 'b'] AS letters,
[1, 2, 3] AS numbers
)
SELECT arrayZip(letters, arrayResize(numbers, length(letters))) AS pairs
FROM Combinations;
┌─pairs─────────────┐
1. │ [('a',1),('b',2)] │
└───────────────────┘
```

**Aggregating arrays**

_BigQuery_

Requires converting arrays back to tables via [`UNNEST`](https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#unnest_operator) operator

```sql
WITH Sequences AS
(SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
UNION ALL SELECT [5, 10] AS some_numbers)
SELECT some_numbers,
(SELECT SUM(x)
FROM UNNEST(s.some_numbers) AS x) AS sums
FROM Sequences AS s;

/*--------------------+------*
| some_numbers | sums |
+--------------------+------+
| [0, 1, 1, 2, 3, 5] | 12 |
| [2, 4, 8, 16, 32] | 62 |
| [5, 10] | 15 |
*--------------------+------*/
```

_ClickHouse_

[arraySum](/sql-reference/functions/array-functions#arraySum), [arrayAvg](/sql-reference/functions/array-functions#arrayAvg), ... function, or any of the over 90 existing aggregate function names as argument for the [arrayReduce](/sql-reference/functions/array-functions#arrayReduce) function

```sql
WITH Sequences AS
(
SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
UNION ALL
SELECT [2, 4, 8, 16, 32] AS some_numbers
UNION ALL
SELECT [5, 10] AS some_numbers
)
SELECT
some_numbers,
arraySum(some_numbers) AS sums
FROM Sequences;
┌─some_numbers──┬─sums─┐
1. │ [0,1,1,2,3,5] │ 12 │
└───────────────┴──────┘
┌─some_numbers──┬─sums─┐
2. │ [2,4,8,16,32] │ 62 │
└───────────────┴──────┘
┌─some_numbers─┬─sums─┐
3. │ [5,10] │ 15 │
└──────────────┴──────┘
```
For the ClickHouse equivalents of BigQuery array functions and `UNNEST`-based patterns, see the [Array functions section](/migrations/bigquery/sql-translation-reference#array-functions) of the SQL translation reference.
Loading
Loading