tf.contrib.learn load_csv_with_header not working in TensorFlow 1.1


I installed the latest TensorFlow (v1.1.0) and tried to run the tf.contrib.learn Quickstart tutorial, where you are supposed to build a classifier for the Iris data set. However, when I ran:

training_set = tf.contrib.learn.datasets.base.load_csv_with_header(
    filename=IRIS_TRAINING,
    target_dtype=np.int,
    features_dtype=np.float32)

I got a StopIteration error.

When I checked the API docs, I didn't find anything about load_csv_with_header(). Have they changed it in the latest version without updating the tutorial? How can I fix this?

EDIT: I use Python 3.6, if that makes any difference.


Solution

  • This is because of differences between Python 2 and Python 3. Here is my code, which works on Python 3.5:

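    import os
    import urllib.request

    # IRIS_TRAINING, IRIS_TEST and the *_URL constants are defined as in the tutorial.
    # Download each CSV only if it is not already on disk.
    if not os.path.exists(IRIS_TRAINING):
        # read() returns bytes in Python 3, so decode to str before writing in text mode
        raw = urllib.request.urlopen(IRIS_TRAINING_URL).read().decode()
        with open(IRIS_TRAINING, 'w') as f:
            f.write(raw)

    if not os.path.exists(IRIS_TEST):
        raw = urllib.request.urlopen(IRIS_TEST_URL).read().decode()
        with open(IRIS_TEST, 'w') as f:
            f.write(raw)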

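    What probably happened is that your code created a file named after IRIS_TRAINING, but the file is empty, so StopIteration is raised. Delete that empty file before re-running, because the os.path.exists check will otherwise skip the download. You can see where the error comes from in the implementation of load_csv_with_header: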

    with gfile.Open(filename) as csv_file:
        data_file = csv.reader(csv_file)
        header = next(data_file)
    

    StopIteration is raised when next() finds no more items to read, as documented at https://docs.python.org/3.5/library/exceptions.html#StopIteration
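
To see this behaviour in isolation, here is a minimal sketch that uses only the standard library and no TensorFlow. The helper name read_header and the sample CSV text are made up for illustration; the point is only that next() on a csv.reader over empty input raises StopIteration, which is the same call that fails inside load_csv_with_header.

```python
import csv
import io


def read_header(text):
    """Return the first CSV row of `text`, or None if there is no row at all.

    `read_header` is a hypothetical helper for illustration, not a TensorFlow API.
    """
    reader = csv.reader(io.StringIO(text))
    try:
        # Same call as in load_csv_with_header: fetch the header row.
        return next(reader)
    except StopIteration:
        # Raised when the input contains no rows, e.g. an empty file.
        return None


# Empty input, like an empty IRIS_TRAINING file left on disk.
empty_result = read_header("")

# Illustrative sample text in the shape of an Iris CSV (header row, then data).
header_result = read_header("120,4,setosa,versicolor,virginica\n6.4,2.8,5.6,2.2,2\n")
```

If read_header returns None for your file's contents, the file is empty and needs to be downloaded again.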

    Note the changes in my code compared to the Python 2 version shown in the TensorFlow tutorial:

    1. urllib.request.urlopen instead of urllib.urlopen
    2. decode() is performed after read()
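
The two changes above can also be wrapped in a small helper that treats an empty leftover file as missing, so a failed earlier run does not block the download. This is only a sketch under stated assumptions: download_if_missing is a hypothetical name, not part of TensorFlow or the tutorial, and it assumes the file is text that decodes with the default UTF-8 codec.

```python
import os
import urllib.request


def download_if_missing(url, path):
    """Download `url` to `path` unless a non-empty file is already there.

    Hypothetical helper for illustration; not part of TensorFlow.
    """
    # An empty file left behind by a failed run counts as missing.
    if os.path.exists(path) and os.path.getsize(path) > 0:
        return path

    # Python 3: urlopen lives in urllib.request, and read() returns bytes,
    # so decode to str before writing in text mode.
    with urllib.request.urlopen(url) as response:
        raw = response.read().decode()

    # Fail early instead of writing a file that would later raise StopIteration.
    if not raw.strip():
        raise ValueError("downloaded content is empty: " + url)

    with open(path, "w") as f:
        f.write(raw)
    return path
```

With the tutorial's constants in scope, it would be called as download_if_missing(IRIS_TRAINING_URL, IRIS_TRAINING) and download_if_missing(IRIS_TEST_URL, IRIS_TEST) before load_csv_with_header.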