In AWS Glue, I read the data from the Data Catalog into a Glue DynamicFrame, then convert the DynamicFrame to a Spark DataFrame to apply schema transformations. To write the data back to S3, I have seen developers convert the DataFrame back to a DynamicFrame. Is there any advantage to writing a Glue DynamicFrame over writing a Spark DataFrame?
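Roughly, the flow looks like this (a minimal sketch; the database, table, and S3 path are placeholders):

```python
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame

sc = SparkContext.getOrCreate()
glueContext = GlueContext(sc)

# Read from the Data Catalog into a DynamicFrame
dyf = glueContext.create_dynamic_frame.from_catalog(
    database="my_database",   # placeholder
    table_name="my_table",    # placeholder
)

# Convert to a Spark DataFrame to apply schema transformations
df = dyf.toDF()
df = df.withColumnRenamed("old_col", "new_col")  # example transformation

# Convert back to a DynamicFrame before writing to S3
out_dyf = DynamicFrame.fromDF(df, glueContext, "out_dyf")
glueContext.write_dynamic_frame.from_options(
    frame=out_dyf,
    connection_type="s3",
    connection_options={"path": "s3://my-bucket/output/"},  # placeholder
    format="parquet",
)
```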
You will find that there is functionality available only through the DynamicFrame writer class that cannot be accessed when using DataFrames:
- `from_jdbc_conf`
- `glueparquet` as a format

These are some of the use cases I can think of; both are sketched below.
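Here is a sketch of both, reusing the `glueContext` and `out_dyf` from your snippet; the JDBC connection name and target table are hypothetical:

```python
# glueparquet: a performance-optimized Parquet writer that computes
# the schema dynamically while writing
glueContext.write_dynamic_frame.from_options(
    frame=out_dyf,
    connection_type="s3",
    connection_options={"path": "s3://my-bucket/output/"},  # placeholder
    format="glueparquet",
)

# from_jdbc_conf: write through a Glue catalog connection
glueContext.write_dynamic_frame.from_jdbc_conf(
    frame=out_dyf,
    catalog_connection="my-jdbc-connection",  # hypothetical connection name
    connection_options={"dbtable": "target_table", "database": "target_db"},
)
```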
If, however, you have a use case that requires save modes, for example `mode('overwrite')`, you could use DataFrames. A similar approach exists for DynamicFrames, but it is implemented slightly differently: you can call `purge_s3_path` on the target path and then write.
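A sketch of the two equivalents, assuming the same `df` and `out_dyf` as above (note that `purge_s3_path` deletes objects immediately when `retentionPeriod` is 0 hours, so the path here is deliberately a placeholder):

```python
# DataFrame route: Spark's save mode handles the overwrite
df.write.mode("overwrite").parquet("s3://my-bucket/output/")

# DynamicFrame route: purge the target path first, then write
glueContext.purge_s3_path(
    "s3://my-bucket/output/",
    options={"retentionPeriod": 0},  # delete objects regardless of age
)
glueContext.write_dynamic_frame.from_options(
    frame=out_dyf,
    connection_type="s3",
    connection_options={"path": "s3://my-bucket/output/"},
    format="glueparquet",
)
```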