I'm trying to run the following Colab project, but when I try to split the training data into training and validation sets, I get this error:
KeyError: "Invalid split train[:70%]. Available splits are: ['train']"
I use the following code:
(training_set, validation_set), dataset_info = tfds.load(
    'tf_flowers',
    split=['train[:70%]', 'train[70%:]'],
    with_info=True,
    as_supervised=True,
)
How can I fix this error?
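According to the TensorFlow Datasets docs, the approach you presented is now supported. You can split a dataset by passing a slice string as the split parameter to tfds.load, for example split="train[:70%]". If you still get the KeyError, you are probably running an older tensorflow-datasets release that predates this syntax, and upgrading the package should resolve it.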
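import tensorflow_datasets as tfds

# First 70% of 'train' for training, the remaining 30% for validation.
(training_set, validation_set), dataset_info = tfds.load(
    'tf_flowers',
    split=['train[:70%]', 'train[70%:]'],
    with_info=True,
    as_supervised=True,
)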
With the above code, training_set has 2569 entries, while validation_set has 1101.
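As a quick sanity check of those numbers, here is a minimal sketch of the arithmetic behind the percent slices. It assumes the train split holds 3670 examples (the sum of the two sizes above) and that the percent boundary is rounded to the nearest example. split_sizes is a hypothetical helper written for illustration, not part of tfds.

```python
from typing import Tuple


def split_sizes(total: int, percent: int) -> Tuple[int, int]:
    """Return the sizes of the '[:percent%]' and '[percent%:]' slices."""
    # Assumption: the boundary is rounded to the nearest example index.
    boundary = round(total * percent / 100)
    return boundary, total - boundary


# 3670 is assumed: it is the sum of the two split sizes reported above.
train_size, validation_size = split_sizes(3670, 70)
print(train_size, validation_size)  # prints "2569 1101"
```

With tfds itself, the total can be read from dataset_info.splits['train'].num_examples instead of being hard-coded.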
Thank you, Saman, for the comment on the API deprecation.
In previous versions of TensorFlow Datasets it was possible to use the tfds.Split API, which is now deprecated:
(training_set, validation_set), dataset_info = tfds.load(
    'tf_flowers',
    split=[
        tfds.Split.TRAIN.subsplit(tfds.percent[:70]),
        tfds.Split.TRAIN.subsplit(tfds.percent[70:])
    ],
    with_info=True,
    as_supervised=True,
)