Tags: python-3.x, pandas, file-upload, stream, google-fusion-tables

Uploading CSV files to Fusion Tables through Python


I am trying to grab data from Looker and insert it directly into Google Fusion Tables using MediaFileUpload, so that the upload happens from memory and no file is written to disk. My current code below raises a TypeError. Any help would be appreciated. Thanks!

Error returned to me:

Traceback (most recent call last):
  File "csvpython.py", line 96, in <module>
    main()
  File "csvpython.py", line 88, in main
    media = MediaFileUpload(dataq, mimetype='application/octet-stream', resumable=True)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/oauth2client/_helpers.py", line 133, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/googleapiclient/http.py", line 548, in __init__
    fd = open(self._filename, 'rb')
  TypeError: expected str, bytes or os.PathLike object, not NoneType

Code in question:

for x, y, z in zip(look, destination, fusion):    

    look_data = lc.run_look(x)
    df = pd.DataFrame(look_data)
    stream = io.StringIO()
    dataq = df.to_csv(path_or_buf=stream, sep=";", index=False)
    media = MediaFileUpload(dataq, mimetype='application/octet-stream', resumable=True)
    replace = ftserv.table().replaceRows(tableId=z, media_body=media, startLine=None, isStrict=False, encoding='UTF-8', media_mime_type='application/octet-stream', delimiter=';', endLine=None).execute()

After passing stream instead of dataq to MediaFileUpload, I get the following:

 Traceback (most recent call last):
   File "quicktestbackup.py", line 96, in <module>
     main()
   File "quicktestbackup.py", line 88, in main
     media = MediaFileUpload(stream, mimetype='application/octet-stream', resumable=True)
   File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/oauth2client/_helpers.py", line 133, in positional_wrapper
     return wrapped(*args, **kwargs)
   File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/googleapiclient/http.py", line 548, in __init__
     fd = open(self._filename, 'rb')
   TypeError: expected str, bytes or os.PathLike object, not _io.StringIO

Solution

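  • When path_or_buf is given, DataFrame.to_csv writes the CSV into that buffer and returns None. That is why dataq is None and holds no data: your CSV text is in stream.
    MediaFileUpload expects a filename and calls open() on it (see the last frame of both tracebacks), so it can take neither the stream itself nor the string returned by stream.getvalue(). For data held in memory, use MediaIoBaseUpload from the same module, which accepts a file-like object. It reads bytes, so encode the text from getvalue() and wrap it in io.BytesIO.

    df.to_csv(path_or_buf=stream, ...)
    media = MediaIoBaseUpload(io.BytesIO(stream.getvalue().encode('utf-8')), ...)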
    

    The call to FusionTables looks to be perfectly valid.
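
    Putting the pieces together, here is a minimal sketch of the loop body as two helpers. The function names csv_bytes_from_frame and replace_table_rows are hypothetical, chosen only for illustration. ftserv is assumed to be the already-built Fusion Tables service from the question, and the replaceRows arguments are copied from the question rather than checked against the API. I have not run this against a live table.

```python
import io


def csv_bytes_from_frame(df, sep=";"):
    """Serialize a DataFrame to CSV in memory and return it as a bytes buffer."""
    text_buffer = io.StringIO()
    # to_csv returns None when given a buffer; the CSV text lands in text_buffer.
    df.to_csv(path_or_buf=text_buffer, sep=sep, index=False)
    # The upload classes read bytes, so encode the text and wrap it again.
    return io.BytesIO(text_buffer.getvalue().encode("utf-8"))


def replace_table_rows(ftserv, table_id, df, sep=";"):
    """Replace the rows of one Fusion Table with the contents of df.

    ftserv is assumed to be the service object from the question.
    """
    # Imported here so the CSV helper above works without the Google client installed.
    from googleapiclient.http import MediaIoBaseUpload

    media = MediaIoBaseUpload(
        csv_bytes_from_frame(df, sep=sep),
        mimetype="application/octet-stream",
        resumable=True,
    )
    # Arguments taken from the call in the question.
    return ftserv.table().replaceRows(
        tableId=table_id,
        media_body=media,
        isStrict=False,
        encoding="UTF-8",
        delimiter=sep,
    ).execute()
```

    Inside the original loop this would be called as replace_table_rows(ftserv, z, pd.DataFrame(lc.run_look(x))). Keeping the CSV step separate makes it easy to inspect the bytes before anything is sent.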