Tags: python, tensorflow, keras, utf-8, neural-network

Can't use model.save with TensorFlow/Keras. Probably because TensorFlow doesn't support non-English letters


I'm a beginner and I'm following a tutorial by NeuralNine: https://www.youtube.com/watch?v=bte8Er0QhDg

I'm trying to make a very basic neural network to recognize handwritten digits in Python. Training the model and everything else works just fine until I try to save the model using model.save. Then I get the following in the terminal:

Traceback (most recent call last):
  File "c:\Users\Käyttäjä\Desktop\Sampo\Neural\handwritten.py", line 33, in <module>
    model.save('handwritten.model.test1')
  File "C:\Users\Käyttäjä\AppData\Local\Programs\Python\Python311\Lib\site-packages\keras\src\utils\traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "C:\Users\Käyttäjä\AppData\Local\Programs\Python\Python311\Lib\site-packages\tensorflow\python\lib\io\file_io.py", line 513, in recursive_create_dir_v2
    _pywrap_file_io.RecursivelyCreateDir(compat.path_to_bytes(path))
tensorflow.python.framework.errors_impl.FailedPreconditionError: handwritten.model.test1\variables is not a directory
PS C:\Users\Käyttäjä>

I think the letter Ä in Käyttäjä (which means "user") might be preventing the model from being saved, as there might be a problem with encoding the directory name or something along those lines. Or it might be something else entirely, as I'm just a beginner.

I can't just change the directory name as it would mess with other users. Is there a workaround that would fit the situation? Or am I just wrong and there's a problem somewhere else? I couldn't find a solution to my problem anywhere so I'm making this post. Any and all help is very much appreciated! Thank you!

Here's the code I'm using:

import os
import cv2
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf


# Load the dataset
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Use tf to normalize datasets between 0-1
x_train = tf.keras.utils.normalize(x_train, axis=1)
x_test = tf.keras.utils.normalize(x_test, axis=1)

# Create a basic sequential model. The input shape is 28 x 28 (flattened by the first layer) because each sample in the dataset is a 28 x 28 greyscale image.
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Flatten(input_shape=(28, 28)))
# The second layer is a Dense layer, where every neuron is connected to every neuron of the previous layer. relu (rectified linear unit) is the activation function (negative values turn to 0)
model.add(tf.keras.layers.Dense(128, activation='relu'))
# third layer, exactly the same
model.add(tf.keras.layers.Dense(128, activation='relu'))
# final layer. 10 outputs for numbers 0-9. Softmax activation function is often used for the output. Gives a probability distribution as the output
model.add(tf.keras.layers.Dense(10, activation='softmax'))

# compile model using optimizer algorithm and loss function
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# finally train the model. Model is going to see each datapoint 3 times
model.fit(x_train, y_train, epochs=3)

# after the model has been trained, save it as handwritten.model.test1
model.save('handwritten.model.test1')

Solution

  • The problem is most likely that, unless the save path ends in '.keras', model.save treats the provided path as a directory and writes the model into it. In your case, it seems that TensorFlow is trying to save the model in a directory './handwritten.model.test1' that doesn't exist.

    Either use the 'path.keras' convention to save the model in a single file, or create the directory first and then pass it as the argument to model.save.
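
    The first option could be sketched roughly as follows. This is only an illustration, assuming a Keras version that accepts the '.keras' single-file format. The helper names keras_file_path and save_model are hypothetical (they are not part of TensorFlow or Keras); the only Keras call used is model.save.

```python
from pathlib import Path


def keras_file_path(name, base_dir="."):
    """Hypothetical helper: return a path that ends in '.keras'.

    Also creates the parent directory so the save target location exists.
    """
    path = Path(base_dir) / name
    if path.suffix != ".keras":
        # Append the extension instead of replacing the existing suffix,
        # so 'handwritten.model.test1' becomes 'handwritten.model.test1.keras'.
        path = path.with_name(path.name + ".keras")
    path.parent.mkdir(parents=True, exist_ok=True)
    return path


def save_model(model, name, base_dir="."):
    """Hypothetical helper: save `model` (any object with .save) to a .keras file."""
    path = keras_file_path(name, base_dir)
    model.save(str(path))
    return path
```

    With the model from the question, save_model(model, 'handwritten.model.test1') would then write a single file named handwritten.model.test1.keras, which should be loadable again with tf.keras.models.load_model.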