I am using Impala for querying parquet
-tables and cannot find a solution to increment an integer
-column ranging from 1..n
. The column is supposed to be used as ID-reference. Currently I am aware of the uuid()
function, which
Returns a universal unique identifier, a 128-bit value encoded as a string with groups of hexadecimal digits separated by dashes.
Anyhow, this is not suitable for me since I have to pass the ID to another system which requests an ID in style of 1..n
. I also already know that Impala has no auto-increment
-implementation.
The desired result should look like:
-- UUID() provided as example - I want to achieve the `my_id`-column.
| my_id | example_uuid | some_content |
|-------|--------------|--------------|
| 1 | 50d53ca4-b...| "a" |
| 2 | 6ba8dd54-1...| "b" |
| 3 | 515362df-f...| "c" |
| 4 | a52db5e9-e...| "d" |
|-------|--------------|--------------|
How can I achieve the desired result (integer
-ID ranging from 1..n
)?
Note: This question differs from this one which specifically handles Kudu-tables. However, answers should be applicable for this question as well.
Since other Q&A's like this one only came up with uuid()
-alike answers, I put some thought in it and finally came up with this solution:
SELECT
row_number() OVER (PARTITION BY "dummy" ORDER BY "dummy") as my_id
, some_content
FROM some_table
row_number()
generates a continuous integer
-number over a provided partition. Unlike rank()
, row_number()
always provides an incremented number on its partition (even if duplicates occur)PARTITION BY "dummy"
partitions the entire table into one partition. This works since "dummy"
is interpreted in the execution graph as temporary column yielding only the String
-value "dummy"
. Thus, also something analog to "dummy"
works.ORDER BY
is required in order to generate the increment. Since we don't care about the order in this example (otherwise just set your respective column), also use the "dummy"
-workaround.The command creates the desired incremental ID without any nested SQL-statements or other tricks.
| my_id | some_content |
|-------|--------------|
| 1 | "a" |
| 2 | "b" |
| 3 | "c" |
| 4 | "d" |
|-------|--------------|