I mounted my Google Drive in my Colab notebook, and I have a fairly big pandas dataframe that I'm trying to save with mydf.to_feather(path), where path points to my Google Drive. The file should be around 100 MB, and the write is taking forever.
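Roughly what I'm doing (a minimal sketch; the dataframe contents and the Drive path are just placeholders, my real data is bigger):

```python
import pandas as pd
from google.colab import drive

# Mount Google Drive at the standard Colab mount point.
drive.mount('/content/drive')

# Placeholder dataframe standing in for my real ~100 MB one.
mydf = pd.DataFrame({"a": range(1_000_000), "b": range(1_000_000)})

# Assumed destination inside the mounted Drive folder.
path = "/content/drive/MyDrive/mydf.feather"

# This call is the one that takes forever.
mydf.to_feather(path)
```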
Is this to be expected? It seems the network link between Colab and Google Drive is not great. Does anyone know if the servers are in the same region/zone?
I may need to change my workflow to avoid this. If you have any best practices or suggestions, please let me know; anything short of going all-in on GCP (which I expect wouldn't have this kind of latency) would be welcome.
If you call df.to_feather("somewhere on your gdrive") from Google Colab on a dataframe on the order of a few hundred MB, you may see sporadic performance: the same file can take anywhere from a few minutes to a whole hour to save. I can't explain this behavior.
Workaround: first save to /content/, the Colab host machine's local directory, then copy the file from /content to your gdrive mount directory (see the sketch below). This works much more consistently and faster for me. I just can't explain why .to_feather directly to gdrive suffers so much.
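Something like this (the file names and the MyDrive subfolder are just examples, adjust to your own paths):

```python
import shutil

local_path = "/content/mydf.feather"                 # Colab host's local disk
drive_path = "/content/drive/MyDrive/mydf.feather"   # your mounted Drive folder

# 1. Write the feather file to the local disk first (fast, no Drive round-trips).
mydf.to_feather(local_path)

# 2. Copy the finished file to the Drive mount in one streaming operation.
shutil.copy(local_path, drive_path)
```

My guess is that a single bulk copy of a finished file behaves better over the Drive mount than the many smaller writes a serializer makes, but that's only a guess.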