How to join two different datasets with different conditions with different columns?

I will join these two datasets with different conditions on different columns to obtain one dataset in Pyspark.

The first dataset df1:

rc1	rc2	rc3	resp
AB2	AB1	AB6	jean
AB4	AB3	AB7	shein
AB9	AB5	AB8	patrick

The second dataset df2:

Key	description
AB1	Normal
AB4	Expand
AB3	small
AB6	Big
AB8	First
AB2	Dock
AB7	Missing
AB9	Package
AB5	Wrong

I will obtain in final dataset df join with df1 & df2:

rc1	rc2	rc3	resp
Dock	Normal	Big	jean
Expand	Small	Missing	shein
Package	Wrong	First	Patrick

Solution

See the below implementation -

df = (
       df1.join(df2, df1.rc1 == df2.Key, 'inner').drop("Key","rc1") 
          .withColumnRenamed('description', 'rc1') 
          .join(df2, df1.rc2 == df2.Key, 'inner').drop("Key","rc2") 
          .withColumnRenamed('description', 'rc2') 
          .join(df2, df1.rc3 == df2.Key, 'inner').drop("Key","rc3") 
          .withColumnRenamed('description', 'rc3') 
          .select("rc1","rc2","rc3","resp")
     )

df.show()

+-------+------+-------+-------+
|    rc1|   rc2|    rc3|   resp|
+-------+------+-------+-------+
|   Dock|Normal|    Big|   jean|
|Package| Wrong|  First|patrick|
| Expand| Small|Missing|  shein|
+-------+------+-------+-------+