I tried to replicate the oracle Instr
function, but it seems to me that there are not all the arguments that exist in Oracle. I receive this error and I would like to include this transformation in a "plataforma" field in the table but I can't:
SELECT
SUBSTR(a.SOURCE, 0, INSTR(a.SOURCE, '-', 1, 2) - 1) AS plataforma,
COUNT(*) AS qtd
FROM db1.table AS as a
LEFT JOIN db1.table2 AS b ON a.ID=b.id
GROUP BY SUBSTR(a.SOURCE, 0, INSTR(a.SOURCE, '-', 1, 2) - 1)
ORDER BY qtd
The Apache Spark 2.0 database encountered an error while running this query. Error running query: org.apache.spark.sql.AnalysisException: Invalid number of arguments for function instr. Expected: 2; Found: 4; line 8 pos 45
I made the transformation of the field that way but I don't know if it is the correct one:
How can I replicate the same Oracle function in Spark? I need to do just this:
Source:
apache-spark-sql
sql-server-dw
Result:
apache-spark
sql-server
What you're looking for is substring_index
function :
substring_index('apache-spark-sql', '-', 2)
It returns the substring before 2 occurrences of -
.
I suppose you want to get the substring before the last occurrence of -
. So you can count the number of -
in the input string and combine it with substring_index
function like this:
substring_index(col, '-', size(split(col, '-')) - 1)
Where size(split(col, '-')) - 1
gives the number of occurences of -
.