Tags: python, tensorflow, dataset, prediction, tensorflow-datasets

TensorFlow Dataset AssertionError


I'm trying to build a program that predicts new passwords based on the "rock_you" dataset, but I'm getting this error:

    assert isinstance(train_dataset, tf.data.Dataset)
AssertionError

Code:

import tensorflow as tf
import tensorflow_datasets as tfds
import keras

try:
    model = keras.models.load_model("passrockmodel.h5")
except:
    print('\nDownloading Train Dataset...\n')
    train_dataset = tfds.as_numpy(tfds.load(name="rock_you", split="train[:75%]"))
    assert isinstance(train_dataset, tf.data.Dataset)

    print('\nDownloading Test Dataset...\n')
    test_dataset = tfds.as_numpy(tfds.load("rock_you", split='train[-25%:]'))
    assert isinstance(test_dataset, tf.data.Dataset)

    model = tf.keras.Sequential([
      tf.keras.layers.Dense(128, activation='relu'),
      tf.keras.layers.Dense(128, activation='relu'),
      tf.keras.layers.Dense(1, activation='sigmoid'),
    ])

    model.compile(
        loss='binary_crossentropy',
        optimizer='adam',
        metrics=['accuracy'])


    model.fit(train_dataset, epochs=20)

    model.save("passrockmodel.h5")


test_loss, test_accuracy = model.evaluate(test_dataset)

I've followed the examples on the TensorFlow tutorials website but still couldn't find a fix. I also suspect there is a second problem with the input shapes of the model layers.


Solution

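  • tfds.as_numpy converts the tf.data.Dataset returned by tfds.load into a plain Python iterable that yields NumPy arrays, so the isinstance(..., tf.data.Dataset) assertion fails. If you want a tf.data.Dataset, simply remove the tfds.as_numpy call and keep the result of tfds.load as it is.

    See the tfds.as_numpy documentation.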