python pandas dataframe tensorflow tfrecord

AttributeError: 'NoneType' object has no attribute 'SerializeToString'


I want to convert a dataframe to a TensorFlow dataset in TFRecord format. This is what I have written:

import pandas as pd
import tensorflow as tf
from pandas_tfrecords import pd2tf, tf2pd
import pandas_tfrecords

df = pd.read_csv('data.csv')
print(df)
output_file = 'data.tfrecord'
writer = tf.compat.v1.python_io.TFRecordWriter(output_file)
rec = df.to_records(index=False)
print(repr(rec))

s = rec.tobytes()
def _bytes_feature(value):
    if isinstance(value, type(tf.constant(0))):
        value = value.numpy() # BytesList won't unpack a string from an EagerTensor.
        return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
print(_bytes_feature(s))
a=_bytes_feature(s)
a=a.SerializeToString()
writer.write(a)

writer.close()

Most of the rows contain strings, but there are also some empty rows in between, so not all rows are of type string.

This is what I get as the output:

[65422771 rows x 1 columns]
rec.array([('method apparatus facilitating attention task disclosed',),
           ('method include detecting sensor movement estimating task attention state based movement determining workload based estimated attention state determining based workload optimal format relay operational information best facilitates attention task increased ease task performance',),
           ('\n',), ..., ('\n',),
           ('system method apparatus provided implementation motorized tricycle includes lean mechanism active system operably coupled lean mechanism receives signal indicative interaction human operable detect lean body human operable receive sensed movement seat multisensory device generate send signal lean mechanism signal lean sensed movement',),
           ('\n',)],
          dtype=[('text', 'O')])

None
Traceback (most recent call last):
  File "/dfgd.py", line 20, in <module>
    a=a.SerializeToString()
AttributeError: 'NoneType' object has no attribute 'SerializeToString'

How can I fix this problem and write my dataframe to a TensorFlow dataset?


Solution

  • The return statement in _bytes_feature is indented one level too deep. Use the following:

    def _bytes_feature(value):
        if isinstance(value, type(tf.constant(0))):
            value = value.numpy() # BytesList won't unpack a string from an EagerTensor.
        return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
    

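    In your original code, the return statement was inside the if block, so whenever the condition was false the function fell through to the end and returned None. That is what happens here: s comes from rec.tobytes() and is a plain bytes object, not an EagerTensor, so the if is skipped. This is also why print(_bytes_feature(s)) shows None just before the traceback.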
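
The same behavior can be reproduced without TensorFlow. The sketch below is only an illustration: FakeTensor, buggy_wrap and fixed_wrap are hypothetical names that do not appear in the question, and a plain list stands in for the tf.train.Feature object.

```python
class FakeTensor:
    """Hypothetical stand-in for an EagerTensor; it only mimics .numpy()."""

    def __init__(self, data):
        self._data = data

    def numpy(self):
        return self._data


def buggy_wrap(value):
    # Same shape as the function in the question: the return sits inside the if.
    if isinstance(value, FakeTensor):
        value = value.numpy()
        return [value]
    # No return is reached for other input types, so Python returns None.


def fixed_wrap(value):
    # Same shape as the corrected function: the return runs for every input.
    if isinstance(value, FakeTensor):
        value = value.numpy()
    return [value]


raw = b"some bytes"  # plain bytes, like the result of rec.tobytes()

buggy_result = buggy_wrap(raw)
fixed_result = fixed_wrap(raw)
tensor_result = buggy_wrap(FakeTensor(raw))

print(buggy_result)   # None
print(fixed_result)   # [b'some bytes']
print(tensor_result)  # [b'some bytes']
```

The buggy version only works when it is handed a tensor-like object, which is why the problem can go unnoticed until plain bytes are passed in.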