
Find column index by searching column header of a Dataset in Apache Spark Java


I have a Spark Dataset similar to the example below:

       0         1                  2          3
    +------+------------+--------------------+---+
    |ItemID|Manufacturer|       Category     |UPC|
    +------+------------+--------------------+---+
    |   804|         ael|Brush & Broom Han...|123|
    |   805|         ael|Wheel Brush Parts...|124|
    +------+------------+--------------------+---+

I need to find the position of a column by searching the column header.

For example:

    int position = getColumnPosition("Category");

This should return 2.

Is there a Spark function on Dataset<Row> that returns a column's index, or a Java function that can do this on a Spark Dataset?


Solution

  • Access the Dataset's schema and look up the field index by name (df is your Dataset<Row>; fieldIndex throws IllegalArgumentException if no column has that name):

    int position = df.schema().fieldIndex("Category");
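
If a missing column should produce a sentinel value rather than the IllegalArgumentException that fieldIndex throws, one option is to search the column-name array instead. The following is a minimal sketch, assuming the names are obtained from df.columns(), which returns a String[]. The ColumnLookup class and its getColumnPosition helper are hypothetical names chosen to match the question and are not part of Spark.

```java
import java.util.Arrays;

final class ColumnLookup {
    private ColumnLookup() {}

    /**
     * Returns the zero-based position of columnName within columns,
     * or -1 if no element equals columnName exactly (case-sensitive).
     *
     * Hypothetical helper: pass in the array returned by df.columns().
     */
    static int getColumnPosition(String[] columns, String columnName) {
        // Arrays.asList wraps the array without copying; indexOf compares with equals().
        return Arrays.asList(columns).indexOf(columnName);
    }
}
```

With the Dataset from the question, this would be called as ColumnLookup.getColumnPosition(df.columns(), "Category"). The match is an exact string comparison, so "category" would not match "Category" here.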