
TensorFlow tf.data.Dataset API: is there a dataset unzip function?


TensorFlow 1.12 provides the Dataset.zip function, whose documentation gives the examples below.

However, I was wondering whether there is a dataset unzip function that returns the original two datasets.

# NOTE: The following examples use `{ ... }` to represent the
# contents of a dataset.
a = { 1, 2, 3 }
b = { 4, 5, 6 }
c = { (7, 8), (9, 10), (11, 12) }
d = { 13, 14 }

# The nested structure of the `datasets` argument determines the
# structure of elements in the resulting dataset.
Dataset.zip((a, b)) == { (1, 4), (2, 5), (3, 6) }
Dataset.zip((b, a)) == { (4, 1), (5, 2), (6, 3) }

# The `datasets` argument may contain an arbitrary number of
# datasets.
Dataset.zip((a, b, c)) == { (1, 4, (7, 8)),
                            (2, 5, (9, 10)),
                            (3, 6, (11, 12)) }

# The number of elements in the resulting dataset is the same as
# the size of the smallest dataset in `datasets`.
Dataset.zip((a, d)) == { (1, 13), (2, 14) }

I would like to be able to write the following, where unzip is the function I am looking for:

dataset = Dataset.zip((a, d))  # == { (1, 13), (2, 14) }
a, d = dataset.unzip()

Solution

  • My workaround was to use map to select each component of the zipped dataset. I'm not sure whether an unzip function might be added later as syntactic sugar.

    a = dataset.map(lambda a, b: a)
    b = dataset.map(lambda a, b: b)
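
    The same idea can be generalized to any number of components. The following is only a sketch, not part of the tf.data API: the helper names unzip and _select are made up for illustration, and it assumes TensorFlow 2.x with eager execution, where Dataset.element_spec and as_numpy_iterator are available (they are not in 1.12). It only handles datasets whose elements are tuples, and splits on the top-level tuple only.

```python
import tensorflow as tf


def _select(index):
    """Return a function that picks one component of a tuple element."""
    def pick(*components):
        return components[index]
    return pick


def unzip(dataset):
    """Hypothetical helper: split a dataset of tuples into a tuple of datasets.

    Each returned dataset is just `dataset.map(...)` selecting one component,
    so it is the same workaround as above, repeated per component.
    """
    spec = dataset.element_spec
    if not isinstance(spec, tuple):
        raise TypeError("unzip expects a dataset whose elements are tuples")
    return tuple(dataset.map(_select(i)) for i in range(len(spec)))


# Usage, mirroring the example from the question.
a = tf.data.Dataset.from_tensor_slices([1, 2, 3])
d = tf.data.Dataset.from_tensor_slices([13, 14])
zipped = tf.data.Dataset.zip((a, d))

a_out, d_out = unzip(zipped)
# Convert to plain ints so the printout does not depend on the NumPy version.
print([int(x) for x in a_out.as_numpy_iterator()])
print([int(x) for x in d_out.as_numpy_iterator()])
```

    Note that a_out contains only the first two elements of a, because zip had already truncated to the shorter dataset. Each returned dataset also iterates the zipped pipeline independently, so any expensive upstream work is repeated, and an unseeded shuffle upstream could leave the outputs out of step with each other.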