InfluxDB and pandas errors in Python


I'm following the instructions for reading data from InfluxDB into pandas, and I'm getting the following error:

ValueError                                Traceback (most recent call last)
<ipython-input-13-1e63a2e6d3db> in <module>()
----> 1 df = pd.DataFrame(AandCStation)
      2 
      3 #AandCStation['time'] # gets the name
      4 
      5 #AandCStation.values

C:\Users\user\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in __init__(self, data, index, columns, dtype, copy)
    328                                  dtype=dtype, copy=copy)
    329         elif isinstance(data, dict):
--> 330             mgr = self._init_dict(data, index, columns, dtype=dtype)
    331         elif isinstance(data, ma.MaskedArray):
    332             import numpy.ma.mrecords as mrecords

C:\Users\user\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in _init_dict(self, data, index, columns, dtype)
    459             arrays = [data[k] for k in keys]
    460 
--> 461         return _arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
    462 
    463     def _init_ndarray(self, values, index, columns, dtype=None, copy=False):

C:\Users\user\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in _arrays_to_mgr(arrays, arr_names, index, columns, dtype)
   6161     # figure out the index, if necessary
   6162     if index is None:
-> 6163         index = extract_index(arrays)
   6164     else:
   6165         index = _ensure_index(index)

C:\Users\user\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in extract_index(data)
   6200 
   6201         if not indexes and not raw_lengths:
-> 6202             raise ValueError('If using all scalar values, you must pass'
   6203                              ' an index')
   6204 

ValueError: If using all scalar values, you must pass an index

Read DataFrame defaultdict(<class 'list'>, {'NoT/machinename':         MachineName  MachineType SensorWorking  \

This is the code I'm running:

import pandas as pd
from influxdb import DataFrameClient

client = DataFrameClient(host, port, user, password, dbname)

print("Read DataFrame")
AandCStation = client.query("""SELECT * FROM "NoT/machinename" WHERE time >= now() - 12h""")
print(AandCStation)

print(type(AandCStation))

# This is the line that raises the ValueError shown above
df = pd.DataFrame(AandCStation)

This is the data:

Read DataFrame
defaultdict(<class 'list'>, {'NoT/sensor':                                       MachineName  MachineType SensorWorking  \
2018-07-16 04:11:19.912895848+00:00  Quench tank          Yes   
2018-07-16 04:11:22.961838564+00:00  Quench tank          Yes   
2018-07-16 04:11:25.872680626+00:00  Quench tank          Yes   
2018-07-16 04:11:28.850205591+00:00  Quench tank          Yes   
...                                           ...          ...           ...   
2018-07-16 16:08:05.188868516+00:00  Quench tank          Yes   
2018-07-16 16:08:08.169862344+00:00  Quench tank          Yes   
2018-07-16 16:08:11.144413930+00:00  Quench tank          Yes   
2018-07-16 16:08:14.126290232+00:00  Quench tank          Yes   
2018-07-16 16:08:17.107127232+00:00  Quench tank          Yes   
2018-07-16 16:08:20.079248843+00:00  Quench tank          Yes   

                                     TempValue  
2018-07-16 04:09:50.467145647+00:00      32.69  
2018-07-16 04:09:53.888973858+00:00      32.69  
2018-07-16 04:09:55.879811649+00:00      32.69  
2018-07-16 04:09:58.818001127+00:00      32.69  
...                                        ...  
2018-07-16 16:08:05.188868516+00:00      34.19  
2018-07-16 16:08:08.169862344+00:00      34.19  
2018-07-16 16:09:43.209347998+00:00      34.19  
2018-07-16 16:09:46.187872612+00:00      34.19  

[12233 rows x 4 columns]})
<class 'collections.defaultdict'>

Any ideas why I'm getting the error?


Solution

  • I came across this same issue today.

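    It turns out that the query returns a dictionary of DataFrames keyed by measurement name, not a single DataFrame, which is why passing it straight to pd.DataFrame fails. You can concat the dictionary along the columns and then droplevel the measurement-name level to get the desired columns.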

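    import pandas as pd
    from influxdb import DataFrameClient

    client = DataFrameClient(host, port, user, password, dbname)

    print("Read DataFrame")
    # query() returns a dict of DataFrames, not a single DataFrame
    AandCStation = client.query("""SELECT * FROM "NoT/machinename" WHERE time >= now() - 12h""")
    # Join the DataFrames side by side; the dict keys become the outer column level
    AandCStation = pd.concat(AandCStation, axis=1)
    # Drop that outer level so only the field names remain as columns
    AandCStation.columns = AandCStation.columns.droplevel()

    print(AandCStation.head())

    print(type(AandCStation))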
    

    Hope this helps!
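
For anyone who wants to try this without a running InfluxDB server, here is a minimal sketch that fakes the query result. The defaultdict, the two sample rows, and the names result, by_key and combined are all made up for illustration; only the shape (measurement name mapped to a DataFrame) is taken from the printed output in the question. It also shows what looks like a simpler route when the query touches a single measurement: pull the DataFrame out of the dict by its key.

```python
from collections import defaultdict

import pandas as pd

# Hypothetical stand-in for what client.query() returned in the question:
# a defaultdict mapping the measurement name to a DataFrame.
index = pd.to_datetime(["2018-07-16 04:11:19", "2018-07-16 04:11:22"], utc=True)
result = defaultdict(list)
result["NoT/machinename"] = pd.DataFrame(
    {"MachineName": ["Quench tank", "Quench tank"], "TempValue": [32.69, 32.69]},
    index=index,
)

# Passing the dict straight to the constructor is what fails in the question.
try:
    pd.DataFrame(result)
except ValueError as exc:
    print("constructor failed:", exc)

# Option 1: a single measurement can be looked up by its name.
by_key = result["NoT/machinename"]

# Option 2: the approach from the answer above. concat turns the dict keys
# into an outer column level, and droplevel removes that level again.
combined = pd.concat(result, axis=1)
combined.columns = combined.columns.droplevel()

print(list(combined.columns))
print(by_key.equals(combined))
```

The concat route seems more useful when the query returns several measurements that share a time index. With just one measurement, the key lookup avoids the extra column level altogether.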
