python-3.x · tensorflow · tensorflow-datasets · tensorflow-estimator

Tensorflow Error "UnimplementedError: Cast string to float is not supported" - Linear Classifier Model using Estimator


Below are the steps that have been followed:

  1. Created a CSV input file for TensorFlow.
  2. Defined the input columns and their default data types for reading with the tf.decode_csv function.
  3. Defined the serving input function with appropriate placeholders (same data types as in step 2).
  4. Verified that the order of the columns in the CSV file exactly matches step 2.
  5. Defined the Linear Classifier model with an Estimator.
  6. Defined the train spec and eval spec for the train_and_evaluate function.

The error occurs when the Estimator runs to read the input data.
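For context on what the message means: in tf.decode_csv, the Python type of each record_defaults entry fixes the dtype of the parsed column (and of the label that the head later casts to float, as the traceback below shows in head.py's create_loss). A toy, TensorFlow-free illustration of that mapping:

```python
# Each record_defaults entry's element type determines the parsed dtype:
# str -> tf.string, int -> tf.int32, float -> tf.float32
DEFAULTS = [['none'], ['0'], [0], [0.0]]
dtypes = ['string' if isinstance(d[0], str)
          else 'float32' if isinstance(d[0], float)
          else 'int32'
          for d in DEFAULTS]
print(dtypes)  # ['string', 'string', 'int32', 'float32']
```

If the label's default is a string entry such as ['0'], the label tensor is tf.string, and the head's internal cast to float fails with exactly this UnimplementedError.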

Error Log:

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': 'sample_dir', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x000001E370166828>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
Created Estimator
Defining Train Spec
Train Spec Defination Completed
Defining Exporter
Defining Eval Spec
Eval Spec Defination Completed
Running Estimator
INFO:tensorflow:Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after 10 secs (eval_spec.throttle_secs) or training is finished.
Created Dataset
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into sample_dir\model.ckpt.
---------------------------------------------------------------------------
UnimplementedError                        Traceback (most recent call last)
C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _do_call(self, fn, *args)
   1321     try:
-> 1322       return fn(*args)
   1323     except errors.OpError as e:

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _run_fn(feed_dict, fetch_list, target_list, options, run_metadata)
   1306       return self._call_tf_sessionrun(
-> 1307           options, feed_dict, fetch_list, target_list, run_metadata)
   1308 

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _call_tf_sessionrun(self, options, feed_dict, fetch_list, target_list, run_metadata)
   1408           self._session, options, feed_dict, fetch_list, target_list,
-> 1409           run_metadata)
   1410     else:

UnimplementedError: Cast string to float is not supported
     [[Node: linear/head/ToFloat = Cast[DstT=DT_FLOAT, SrcT=DT_STRING, _class=["loc:@linea...t/Switch_1"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](linear/head/labels/ExpandDims, ^linear/head/labels/assert_equal/Assert/Assert)]]

During handling of the above exception, another exception occurred:

UnimplementedError                        Traceback (most recent call last)
<ipython-input-229-7ea5d3d759fb> in <module>()
----> 1 train_and_evaluate(OUTDIR, num_train_steps=5)

<ipython-input-227-891dd877d57e> in train_and_evaluate(output_dir, num_train_steps)
     26 
     27     print('Running Estimator')
---> 28     tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\estimator\training.py in train_and_evaluate(estimator, train_spec, eval_spec)
    445         '(with task id 0).  Given task id {}'.format(config.task_id))
    446 
--> 447   return executor.run()
    448 
    449 

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\estimator\training.py in run(self)
    529         config.task_type != run_config_lib.TaskType.EVALUATOR):
    530       logging.info('Running training and evaluation locally (non-distributed).')
--> 531       return self.run_local()
    532 
    533     # Distributed case.

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\estimator\training.py in run_local(self)
    667           input_fn=self._train_spec.input_fn,
    668           max_steps=self._train_spec.max_steps,
--> 669           hooks=train_hooks)
    670 
    671       if not self._continuous_eval_listener.before_eval():

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\estimator\estimator.py in train(self, input_fn, hooks, steps, max_steps, saving_listeners)
    364 
    365       saving_listeners = _check_listeners_type(saving_listeners)
--> 366       loss = self._train_model(input_fn, hooks, saving_listeners)
    367       logging.info('Loss for final step: %s.', loss)
    368       return self

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\estimator\estimator.py in _train_model(self, input_fn, hooks, saving_listeners)
   1117       return self._train_model_distributed(input_fn, hooks, saving_listeners)
   1118     else:
-> 1119       return self._train_model_default(input_fn, hooks, saving_listeners)
   1120 
   1121   def _train_model_default(self, input_fn, hooks, saving_listeners):

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\estimator\estimator.py in _train_model_default(self, input_fn, hooks, saving_listeners)
   1133       return self._train_with_estimator_spec(estimator_spec, worker_hooks,
   1134                                              hooks, global_step_tensor,
-> 1135                                              saving_listeners)
   1136 
   1137   def _train_model_distributed(self, input_fn, hooks, saving_listeners):

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\estimator\estimator.py in _train_with_estimator_spec(self, estimator_spec, worker_hooks, hooks, global_step_tensor, saving_listeners)
   1334       loss = None
   1335       while not mon_sess.should_stop():
-> 1336         _, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
   1337     return loss
   1338 

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\training\monitored_session.py in run(self, fetches, feed_dict, options, run_metadata)
    575                           feed_dict=feed_dict,
    576                           options=options,
--> 577                           run_metadata=run_metadata)
    578 
    579   def run_step_fn(self, step_fn):

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\training\monitored_session.py in run(self, fetches, feed_dict, options, run_metadata)
   1051                               feed_dict=feed_dict,
   1052                               options=options,
-> 1053                               run_metadata=run_metadata)
   1054       except _PREEMPTION_ERRORS as e:
   1055         logging.info('An error was raised. This may be due to a preemption in '

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\training\monitored_session.py in run(self, *args, **kwargs)
   1142         raise six.reraise(*original_exc_info)
   1143       else:
-> 1144         raise six.reraise(*original_exc_info)
   1145 
   1146 

C:\ProgramData\Anaconda3\lib\site-packages\six.py in reraise(tp, value, tb)
    691             if value.__traceback__ is not tb:
    692                 raise value.with_traceback(tb)
--> 693             raise value
    694         finally:
    695             value = None

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\training\monitored_session.py in run(self, *args, **kwargs)
   1127   def run(self, *args, **kwargs):
   1128     try:
-> 1129       return self._sess.run(*args, **kwargs)
   1130     except _PREEMPTION_ERRORS:
   1131       raise

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\training\monitored_session.py in run(self, fetches, feed_dict, options, run_metadata)
   1199                                   feed_dict=feed_dict,
   1200                                   options=options,
-> 1201                                   run_metadata=run_metadata)
   1202 
   1203     for hook in self._hooks:

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\training\monitored_session.py in run(self, *args, **kwargs)
    979 
    980   def run(self, *args, **kwargs):
--> 981     return self._sess.run(*args, **kwargs)
    982 
    983   def run_step_fn(self, step_fn, raw_session, run_with_hooks):

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in run(self, fetches, feed_dict, options, run_metadata)
    898     try:
    899       result = self._run(None, fetches, feed_dict, options_ptr,
--> 900                          run_metadata_ptr)
    901       if run_metadata:
    902         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1133     if final_fetches or final_targets or (handle and feed_dict_tensor):
   1134       results = self._do_run(handle, final_targets, final_fetches,
-> 1135                              feed_dict_tensor, options, run_metadata)
   1136     else:
   1137       results = []

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
   1314     if handle is None:
   1315       return self._do_call(_run_fn, feeds, fetches, targets, options,
-> 1316                            run_metadata)
   1317     else:
   1318       return self._do_call(_prun_fn, handle, feeds, fetches)

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _do_call(self, fn, *args)
   1333         except KeyError:
   1334           pass
-> 1335       raise type(e)(node_def, op, message)
   1336 
   1337   def _extend_graph(self):

UnimplementedError: Cast string to float is not supported
     [[Node: linear/head/ToFloat = Cast[DstT=DT_FLOAT, SrcT=DT_STRING, _class=["loc:@linea...t/Switch_1"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](linear/head/labels/ExpandDims, ^linear/head/labels/assert_equal/Assert/Assert)]]

Caused by op 'linear/head/ToFloat', defined at:
  File "C:\ProgramData\Anaconda3\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\ProgramData\Anaconda3\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel_launcher.py", line 16, in <module>
    app.launch_new_instance()
  File "C:\ProgramData\Anaconda3\lib\site-packages\traitlets\config\application.py", line 658, in launch_instance
    app.start()
  File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\kernelapp.py", line 486, in start
    self.io_loop.start()
  File "C:\ProgramData\Anaconda3\lib\site-packages\tornado\platform\asyncio.py", line 127, in start
    self.asyncio_loop.run_forever()
  File "C:\ProgramData\Anaconda3\lib\asyncio\base_events.py", line 422, in run_forever
    self._run_once()
  File "C:\ProgramData\Anaconda3\lib\asyncio\base_events.py", line 1432, in _run_once
    handle._run()
  File "C:\ProgramData\Anaconda3\lib\asyncio\events.py", line 145, in _run
    self._callback(*self._args)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tornado\platform\asyncio.py", line 117, in _handle_events
    handler_func(fileobj, events)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tornado\stack_context.py", line 276, in null_wrapper
    return fn(*args, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\zmq\eventloop\zmqstream.py", line 450, in _handle_events
    self._handle_recv()
  File "C:\ProgramData\Anaconda3\lib\site-packages\zmq\eventloop\zmqstream.py", line 480, in _handle_recv
    self._run_callback(callback, msg)
  File "C:\ProgramData\Anaconda3\lib\site-packages\zmq\eventloop\zmqstream.py", line 432, in _run_callback
    callback(*args, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tornado\stack_context.py", line 276, in null_wrapper
    return fn(*args, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 283, in dispatcher
    return self.dispatch_shell(stream, msg)
  File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 233, in dispatch_shell
    handler(stream, idents, msg)
  File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 399, in execute_request
    user_expressions, allow_stdin)
  File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\ipkernel.py", line 208, in do_execute
    res = shell.run_cell(code, store_history=store_history, silent=silent)
  File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\zmqshell.py", line 537, in run_cell
    return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2662, in run_cell
    raw_cell, store_history, silent, shell_futures)
  File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2785, in _run_cell
    interactivity=interactivity, compiler=compiler, result=result)
  File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2909, in run_ast_nodes
    if self.run_code(code, result):
  File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2963, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-229-7ea5d3d759fb>", line 1, in <module>
    train_and_evaluate(OUTDIR, num_train_steps=5)
  File "<ipython-input-227-891dd877d57e>", line 28, in train_and_evaluate
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\estimator\training.py", line 447, in train_and_evaluate
    return executor.run()
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\estimator\training.py", line 531, in run
    return self.run_local()
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\estimator\training.py", line 669, in run_local
    hooks=train_hooks)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\estimator\estimator.py", line 366, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\estimator\estimator.py", line 1119, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\estimator\estimator.py", line 1132, in _train_model_default
    features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\estimator\estimator.py", line 1107, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\estimator\canned\linear.py", line 311, in _model_fn
    config=config)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\estimator\canned\linear.py", line 164, in _linear_model_fn
    logits=logits)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\estimator\canned\head.py", line 239, in create_estimator_spec
    regularization_losses))
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\estimator\canned\head.py", line 1208, in _create_tpu_estimator_spec
    features=features, mode=mode, logits=logits, labels=labels))
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\estimator\canned\head.py", line 1114, in create_loss
    labels = math_ops.to_float(labels)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\ops\math_ops.py", line 719, in to_float
    return cast(x, dtypes.float32, name=name)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\ops\math_ops.py", line 665, in cast
    x = gen_math_ops.cast(x, base_type, name=name)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 1613, in cast
    "Cast", x=x, DstT=DstT, name=name)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 3414, in create_op
    op_def=op_def)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1740, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

UnimplementedError (see above for traceback): Cast string to float is not supported
     [[Node: linear/head/ToFloat = Cast[DstT=DT_FLOAT, SrcT=DT_STRING, _class=["loc:@linea...t/Switch_1"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](linear/head/labels/ExpandDims, ^linear/head/labels/assert_equal/Assert/Assert)]]

Tensorflow Code:

# Import libraries
import pandas as pd
from sklearn.model_selection import train_test_split
import tensorflow as tf
import shutil

# Read data
df = pd.read_csv('sample.csv')

# Separate label from dataset
X = df.drop(['label'], axis=1).values
y = df[['label']].values

# Split into train and test dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Convert to dataframe
X_train = pd.DataFrame(X_train)
X_test = pd.DataFrame(X_test)
y_train = pd.DataFrame(y_train)
y_test = pd.DataFrame(y_test)

# Concatenate for writing into csv
train = pd.concat([X_train, y_train], axis=1)
valid = pd.concat([X_test, y_test], axis=1)

# Write into csv file
train.to_csv('train.csv', header=False, index=False)
valid.to_csv('valid.csv', header=False, index=False)

# Specify structure for tensorflow input
CSV_COLUMNS = ['col1', 'col2', 'col3', 'col4', 'col5', 'col6', 'col7', 'col8', 'label']
LABEL_COLUMN = 'label'
DEFAULTS = [['none'], ['none'], ['none'], ['none'], ['none'], ['0'], [0], [0], ['0']]

# Function for reading input file and creating dataset
def read_dataset(filename, mode, batch_size = 512):
    def _input_fn():
        def decode_csv(value_column):
            columns = tf.decode_csv(value_column, record_defaults=DEFAULTS)
            features = dict(zip(CSV_COLUMNS, columns))
            label = features.pop(LABEL_COLUMN)
            return features, label

        # Create list of files that match pattern
        file_list = tf.gfile.Glob(filename)

        # Create dataset from file list
        dataset = tf.data.TextLineDataset(file_list).map(decode_csv)

        if mode==tf.estimator.ModeKeys.TRAIN:
            num_epochs = None # indefinitely
            dataset = dataset.shuffle(buffer_size = 10 * batch_size)
        else:
            num_epochs = 1 # end-of-input after this

        dataset = dataset.repeat(num_epochs).batch(batch_size)

        return dataset.make_one_shot_iterator().get_next()
    return _input_fn

# Input feature columns
INPUT_COLUMNS = [
    tf.feature_column.categorical_column_with_vocabulary_list('col1', vocabulary_list=['1', '2', '3', '4']),
    tf.feature_column.categorical_column_with_vocabulary_list('col2', vocabulary_list=['1', '2', '3', '4', '5', '6']),
    tf.feature_column.categorical_column_with_vocabulary_list('col3', vocabulary_list=['1', '2', '3', '4', '5', '6', '7', '8', '9']),
    tf.feature_column.categorical_column_with_vocabulary_list('col4', vocabulary_list=['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']),
    tf.feature_column.categorical_column_with_vocabulary_list('col5', vocabulary_list=['0', '1', '2', '3', '4', '5']),
    tf.feature_column.categorical_column_with_vocabulary_list('col6', vocabulary_list=['0', '1']),
    tf.feature_column.numeric_column('col7'),
    tf.feature_column.numeric_column('col8')
]

def add_more_features(feats):
    # placeholder for future feature engineering
    return feats

feature_cols = add_more_features(INPUT_COLUMNS)

# Serving function
def serving_input_fn():
    feature_placeholders = {
        'col1': tf.placeholder(tf.string, [None]),
        'col2': tf.placeholder(tf.string, [None]),
        'col3': tf.placeholder(tf.string, [None]),
        'col4': tf.placeholder(tf.string, [None]),
        'col5': tf.placeholder(tf.string, [None]),
        'col6': tf.placeholder(tf.string, [None]),
        'col7': tf.placeholder(tf.int64, [None]),
        'col8': tf.placeholder(tf.int64, [None])
    }

    features = {
        key: tf.expand_dims(tensor, -1)
        for key, tensor in feature_placeholders.items()
    }

    return tf.estimator.export.ServingInputReceiver(features, feature_placeholders)

# Train and evaluate function
def train_and_evaluate(output_dir, num_train_steps):

    estimator = tf.estimator.LinearClassifier(
        model_dir=output_dir,
        feature_columns=feature_cols)

    train_spec = tf.estimator.TrainSpec(
        input_fn = read_dataset('train.csv', mode = tf.estimator.ModeKeys.TRAIN),
        max_steps=num_train_steps)

    exporter = tf.estimator.LatestExporter('exporter', serving_input_fn)

    eval_spec = tf.estimator.EvalSpec(
        input_fn = read_dataset('valid.csv', mode = tf.estimator.ModeKeys.EVAL),
        steps = None,
        start_delay_secs = 1,
        throttle_secs = 10,
        exporters = exporter)

    tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)

# Log level and cleanup
tf.logging.set_verbosity(tf.logging.INFO)
OUTDIR = 'sample_dir'
shutil.rmtree(OUTDIR, ignore_errors=True)

# Run training and evaluation
train_and_evaluate(OUTDIR, num_train_steps=1)

I have been struggling with this error. Help would be much appreciated.
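One generic way to narrow down errors like this (a debugging sketch, not part of the original code) is to compare the dtypes pandas infers against the tf.decode_csv defaults before writing the file. The column names here are stand-ins for the real data:

```python
import io
import pandas as pd

# A stand-in for train.csv: the middle column contains 0.0 and 1.0,
# so pandas infers float64 even though the values look integral
sample = io.StringIO("col6,col7,col8\n0,0.0,5\n1,1.0,6\n")
df = pd.read_csv(sample)

# tf.decode_csv defaults must agree with these inferred dtypes:
# int64 -> [0], float64 -> [0.0], object (string) -> ['']
print(df.dtypes)
```

Any column where the pandas dtype and the decode_csv default disagree is a candidate for the cast failure.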


Solution

  • While debugging, the issue got resolved, though I am not sure which step actually fixed it.

    I tried the following things while debugging:

    1. Following the Stack Overflow thread float64 with pandas to_csv, changed the floating-point format written to the CSV file:

    Prior Code:

    train.to_csv('train.csv', header=False, index=False)
    valid.to_csv('valid.csv', header=False, index=False)
    

    Modified Code:

    train.to_csv('train.csv', header=False, index=False, float_format='%.4f')
    valid.to_csv('valid.csv', header=False, index=False, float_format='%.4f')
    
    2. Added columns one by one to the input CSV file and checked the corresponding default data types. One column contained 0.0 in the pandas-written CSV (even though its values were being treated as integers), while the TensorFlow default was reading it as int64. Changing that default to float64 resolved the mismatched-datatype issue.
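The effect of float_format can be seen with an in-memory frame (a hypothetical two-column example, not the original data; to_csv returns a string when no path is given):

```python
import pandas as pd

# Hypothetical two-column frame standing in for the real data
df = pd.DataFrame({"col7": [1.0, 2.0], "col8": [0.123456, 7.0]})

# Default formatting writes the full float repr (e.g. "0.123456")
default_csv = df.to_csv(None, header=False, index=False)

# float_format renders every float with four decimal places
fixed_csv = df.to_csv(None, header=False, index=False, float_format='%.4f')

print(default_csv)
print(fixed_csv)
```

With float_format, every floating value comes out in a uniform, predictable shape, which makes it easier to keep the CSV contents consistent with the decode_csv defaults.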

    Now the model is up and running.
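For reference, the corresponding fix on the TensorFlow side is to give the affected column a floating-point default, so tf.decode_csv emits a float tensor instead of an int tensor. Assuming, purely for illustration, that col7 (index 6) was the offending column:

```python
# Before (from the question): col7 parsed with an integer default
OLD_DEFAULTS = [['none'], ['none'], ['none'], ['none'], ['none'], ['0'], [0], [0]]

# After: a 0.0 default makes tf.decode_csv parse that column as float,
# matching the "0.0" values pandas wrote to the CSV
DEFAULTS = [['none'], ['none'], ['none'], ['none'], ['none'], ['0'], [0.0], [0]]
```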