I am writing to MongoDB from PySpark using the spark-mongo connector. I want to update some documents with this command:
```python
df.write.format("com.mongodb.spark.sql.DefaultSource").options(uri=uri, collection="test").mode("append").save()
```
df has a column '_id', but when I run that command I end up with two documents in MongoDB sharing the same _id value: one where it is stored as a String and another where it is an ObjectId. Is there a way to change the type of the _id column in my DataFrame? I found that the type should be StructType: { oid: String }, but I don't know how to perform that conversion.
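What I have in mind is something like the sketch below, wrapping the existing string _id into a struct with a single oid field (the name df_fixed is just for illustration, and I am not sure this mapping is what the connector expects):

```python
from pyspark.sql import functions as F

# Replace the plain string _id with a struct { oid: <original string> },
# which is, as far as I understand, how the connector represents an ObjectId.
df_fixed = df.withColumn("_id", F.struct(F.col("_id").alias("oid")))

df_fixed.write.format("com.mongodb.spark.sql.DefaultSource") \
    .options(uri=uri, collection="test") \
    .mode("append") \
    .save()
```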
Thanks
My problem is that my collection already contained some documents whose _id was a string and others whose _id was an ObjectId, so when I loaded it with Spark the type of this field was inferred as string.
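One thing I considered was passing an explicit schema at load time so that _id is read as a struct { oid: String } instead of being inferred as a plain string, roughly like this (other fields omitted; I am not sure how the documents whose _id really is a string would be read under this schema):

```python
from pyspark.sql.types import StructType, StructField, StringType

# Explicit schema: _id as a struct with a single string field "oid".
schema = StructType([
    StructField("_id", StructType([StructField("oid", StringType())]))
])

df = spark.read.format("com.mongodb.spark.sql.DefaultSource") \
    .options(uri=uri, collection="test") \
    .schema(schema) \
    .load()
```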