Create an interaction between two categorical columns in PySpark

I have two multi-level categorical columns stored in df:

dow represents the day of week (seven categories mapped to integers: 1, 2, ..., 7).
type represents four types of observation (four categories mapped to integers: 1, 2, 3, 4).

How can I create an interaction (i.e., the multiplication) of these two columns in PySpark?

I know how to encode them using OneHotEncoder. However, I'm not sure how to go about the feature engineering process to account for all 28 combinations (7 x 4 possible cases), especially because OneHotEncoder returns sparse vectors.

For the purpose of this question, assume my PySpark DataFrame df looks as follows:

dow	type	target
1	1	200
1	2	222
1	7	229

Here, dow can take on seven different values and type can take on four. Is there a built-in way to create interactions between these two columns in order to account for all possible combinations?

Solution

You could use an integer encoding: multiply dow by 10 and add type. This yields a distinct integer for each (dow, type) combination, as long as type stays below 10:

from pyspark.sql import functions as F

(
    df
    .select(
        # dow goes in the tens place, type in the ones place
        (F.col('dow') * F.lit(10) + F.col('type')).alias('result'),
        'dow',
        'type'
    )
    .show()
)

+------+---+----+
|result|dow|type|
+------+---+----+
|    11|  1|   1|
|    12|  1|   2|
|    17|  1|   7|
+------+---+----+
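
To check that this arithmetic really separates all 28 combinations, here is a small plain-Python sketch (no Spark needed). It assumes dow is in 1..7 and type is in 1..4, as stated in the question. The helper names tens_code and dense_index are hypothetical, and dense_index is an alternative I am adding for illustration, not part of the answer above:

```python
from itertools import product

DOWS = range(1, 8)    # day of week: 1..7, as stated in the question
TYPES = range(1, 5)   # observation type: 1..4, as stated in the question


def tens_code(dow: int, obs_type: int) -> int:
    """Encoding from the answer: dow in the tens place, type in the ones place."""
    return dow * 10 + obs_type


def dense_index(dow: int, obs_type: int, n_types: int = 4) -> int:
    """Hypothetical alternative: map each pair to a contiguous index 0..27."""
    return (dow - 1) * n_types + (obs_type - 1)


pairs = list(product(DOWS, TYPES))
tens_codes = [tens_code(d, t) for d, t in pairs]
dense_codes = [dense_index(d, t) for d, t in pairs]

# Both encodings give 28 distinct values, one per (dow, type) pair.
print(len(set(tens_codes)), min(tens_codes), max(tens_codes))     # 28 11 74
print(len(set(dense_codes)), min(dense_codes), max(dense_codes))  # 28 0 27
```

In PySpark, the dense variant would be the column expression (F.col('dow') - 1) * 4 + (F.col('type') - 1). A contiguous index may be preferable if the result is later passed to OneHotEncoder, since the tens-place codes leave gaps between 11 and 74 that would likely become unused vector slots.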