CNTK reader for base64 encoded images in Python

I'm migrating training and evaluation configurations for CNTK from Brainscript to Python. Because our training data is created in a map/reduce framework, I'm storing images and labels in a huge text file that contains the base64 encoded image as one of its columns. That worked fine with Brainscript, but I have not found a way of doing the equivalent in Python.

My CNTK.exe configuration is similar to this example configuration:

deserializers = ({
        type = "Base64ImageDeserializer" ; module = "ImageReader"
        file = "myFile.tsv"
...

All Python examples (for example this one) use the ImageDeserializer, which reads images that are stored as individual files. I have not found anything that sounds like a base64 image deserializer in the Python code of cntk.io.

How can I use base64 encoded images in CNTK via Python?

A related ask: The Brainscript Base64ImageDeserializer accepts files that contain a sequence ID in the first column, which is critical for us to identify individual examples at test time. How can I use that in Python?

Solution

Base64 deserializer support for Python has been merged to master. For sample usage, please see the test test_base64_image_deserializer in /bindings/python/cntk/io/tests/io_tests.py.
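
For orientation, here is a minimal sketch of what the Python side could look like, loosely modeled on that test. It assumes a CNTK build that exposes cntk.io.Base64ImageDeserializer, and it assumes each line of the file has the layout sequence id, tab, 0-based class label, tab, base64 image. The helper names write_base64_tsv and make_minibatch_source are made up for illustration and are not part of CNTK.

```python
import base64


def write_base64_tsv(path, examples):
    """Write (sequence_id, label, image_bytes) triples as tab-separated lines.

    Assumed line layout: <sequence id> TAB <0-based label> TAB <base64 image>.
    """
    with open(path, "wb") as f:
        for seq_id, label, image_bytes in examples:
            prefix = ("%d\t%d\t" % (seq_id, label)).encode("ascii")
            f.write(prefix + base64.b64encode(image_bytes) + b"\n")


def make_minibatch_source(path, num_classes):
    """Build a MinibatchSource over the TSV file (needs CNTK installed)."""
    # Imported lazily so the writer above can be used without CNTK.
    import cntk as C

    streams = C.io.StreamDefs(
        # No transforms here; add e.g. scaling transforms as needed.
        image=C.io.StreamDef(field="image", transforms=[]),
        label=C.io.StreamDef(field="label", shape=num_classes),
    )
    deserializer = C.io.Base64ImageDeserializer(path, streams)
    # randomize=False keeps the file order, which is handy at test time.
    return C.io.MinibatchSource([deserializer], randomize=False)
```

In real use, image_bytes would be the raw contents of an encoded image file (PNG, JPEG, ...), and the resulting source would be read with next_minibatch as with any other MinibatchSource.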

Regarding getting the sequence IDs: currently there is no easy way. You can compose the image deserializer with a CNTKTextFormat deserializer whose file contains the IDs, but this is cumbersome. We are discussing how to make this easier.
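
A rough sketch of that workaround, under several assumptions: the side file uses the CNTK text format with the sequence ID as the first token of each line, the ID is repeated as a one-element dense stream named "id", and the two deserializers are matched up by that leading sequence ID when both are passed to one MinibatchSource. The helper names write_id_ctf and make_composite_source are hypothetical. Note also that the values are read as floating point numbers, so with the default 32-bit precision very large IDs (above 2**24) would not round-trip exactly.

```python
def write_id_ctf(path, seq_ids, stream_name="id"):
    """Write one CTF line per example: '<sequence id> |<name> <sequence id>'."""
    with open(path, "w") as f:
        for seq_id in seq_ids:
            f.write("%d |%s %d\n" % (seq_id, stream_name, seq_id))


def make_composite_source(image_tsv, id_ctf, num_classes):
    """Combine the base64 image file and the ID file (needs CNTK installed)."""
    # Imported lazily so the writer above can be used without CNTK.
    import cntk as C

    image_streams = C.io.StreamDefs(
        image=C.io.StreamDef(field="image", transforms=[]),
        label=C.io.StreamDef(field="label", shape=num_classes),
    )
    id_streams = C.io.StreamDefs(
        # Dense scalar stream that carries the sequence ID as a value.
        seq_id=C.io.StreamDef(field="id", shape=1, is_sparse=False),
    )
    return C.io.MinibatchSource(
        [
            C.io.Base64ImageDeserializer(image_tsv, image_streams),
            C.io.CTFDeserializer(id_ctf, id_streams),
        ],
        randomize=False,
    )
```

Each minibatch would then carry a seq_id stream next to image and label, so predictions can be mapped back to individual examples.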