hadoop impala

Is there a way to set query parameters (variables) in separate files?


I have a number of SQL queries that use a common set of parameters (variables). Currently, the parameters are set at the top of each file, so when any parameter changes, it has to be changed in every file. I would like to keep the parameters in a separate file so that they only need to be changed in one place.

How can this be accomplished?

I realize that I can use the --var option of impala-shell, but then the same values have to be entered again for every invocation.

I can see several ways that this might be done:

  1. impala-shell might support multiple -f arguments. This would be very elegant, but it doesn't.

  2. The queries can be cat'd together and piped into impala-shell. This is serviceable but inelegant.

  3. An alias can be set for impala-shell that specifies the --var arguments. This is potentially hard to get right.

Clearly #1 would be the best solution, but are there any other options or advice?
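For option 3, a shell function may be easier to get right than an alias. The following is only a minimal sketch: the function name impala_with_params, the params.conf file (one name=value pair per line) and the IMPALA_SHELL override used for dry runs are all hypothetical. Only the --var=name=value option itself comes from impala-shell.

```shell
# Run impala-shell with one --var=name=value option per line of a shared
# parameter file. Blank lines and lines starting with '#' are skipped.
# IMPALA_SHELL is a hypothetical override, e.g. IMPALA_SHELL=echo for a dry run.
impala_with_params() {
    params_file=$1
    shift
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            ''|'#'*) continue ;;
        esac
        # Append the generated option after the caller's own arguments.
        set -- "$@" "--var=$line"
    done < "$params_file"
    "${IMPALA_SHELL:-impala-shell}" "$@"
}

# Usage (requires impala-shell on the PATH):
#   impala_with_params params.conf -f report.sql
```

Each query file would then refer to a parameter as ${var:name}, and the values live only in params.conf.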

Not quite related: Multiple query execution in cloudera impala


Solution

  • Please check the Impala documentation: https://www.cloudera.com/documentation/enterprise/5-15-x/topics/impala_shell_options.html#shell_options

    Here is the relevant part:

    -f query_file or --query_file=query_file

    Passes a SQL query from a file. Multiple statements must be semicolon (;) delimited. In CDH 5.5 / Impala 2.3 and higher, you can specify a filename of - to represent standard input. This feature makes it convenient to use impala-shell as part of a Unix pipeline where SQL statements are generated dynamically by other tools.

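    As you can see, a query file can contain multiple semicolon-delimited statements, and a filename of - makes impala-shell read them from standard input. Combined with the --var argument, this gets you close to your 1st case: concatenate the files and pipe them to impala-shell -f -, so no second -f argument is needed.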
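The standard-input form can be sketched as follows. This is an illustration under assumptions, not a tested recipe: params.sql and report.sql are hypothetical file names, combine_sql is a hypothetical helper, and it assumes an impala-shell version (Impala 2.5 or higher) that accepts SET VAR:name=value statements inside a script.

```shell
# params.sql (hypothetical shared file) might contain:
#   SET VAR:start_date=2019-01-01;
# report.sql (hypothetical query file) would reference it as ${var:start_date}.

# Print the shared parameter file followed by one query file.
# The extra newline keeps the two files apart if the first lacks a final newline.
combine_sql() {
    cat "$1"
    printf '\n'
    cat "$2"
}

# Usage (requires impala-shell on the PATH):
#   combine_sql params.sql report.sql | impala-shell -f -
```

With this layout the parameters are edited in one file only, and every query file is run through the same pipeline.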