docker miniconda

Building package based on miniconda


There is a package called "facets" that helps visualize data.

https://github.com/PAIR-code/facets

Is it possible to dockerize the installation of this using conda?

Currently I am using the following line to start a Docker container that has everything I need.

docker run -i -t -p 8888:8888 -v /tmp:/tmp continuumio/miniconda3 /bin/bash -c "/opt/conda/bin/conda install jupyter -y --quiet && cd /tmp/ && /opt/conda/bin/jupyter notebook --NotebookApp.token='passwd' --notebook-dir=/tmp --ip='*' --port=8888 --no-browser --allow-root"

How do I extend this line, or use a Dockerfile, to include the installation of facets?

I have found a Dockerfile, but it uses tensorflow as its base image.

https://github.com/gel/facets/blob/master/docker/Dockerfile

If I just change the base image to miniconda, the build fails immediately with this error:

Package 'openjdk-8-jdk' has no installation candidate

Is it possible to build the facets package on top of miniconda?


Solution

  • Facets is now included in the TensorFlow Data Validation (TFDV) module, so installing it with pip is enough.

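    # Run in a Jupyter notebook cell; the leading "!" runs the line as a shell command.
    !pip install -q tensorflow_data_validation tensorflow

    # Sample data (uncomment to download and unpack):
    # !wget https://storage.googleapis.com/tfx-colab-datasets/chicago_data.zip
    # !unzip chicago_data.zip

    import tensorflow_data_validation as tfdv

    # Compute summary statistics for the CSV and render them with Facets Overview.
    train_stats = tfdv.generate_statistics_from_csv(data_location='data/train/data.csv')
    tfdv.visualize_statistics(train_stats)

    # Infer a schema from the statistics and display it.
    schema = tfdv.infer_schema(statistics=train_stats)
    tfdv.display_schema(schema=schema)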
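
To fold that pip step into the original `docker run` one-liner, one option is to add it between the conda install and the notebook start. The sketch below only assembles and prints the command (it does not run Docker), which makes the nested quoting easy to check. It assumes `/opt/conda/bin/pip` exists in the `continuumio/miniconda3` image and that pip can find a `tensorflow_data_validation` wheel for the Python version shipped in that image. I have not verified the second point, so you may need to pin an older Python with conda first.

```python
import shlex

# Steps executed inside the container, in order. The pip line is the only
# addition to the one-liner from the question; the package names are the
# ones used in the snippet above.
inner_steps = [
    "/opt/conda/bin/conda install jupyter -y --quiet",
    # Assumption: pip lives next to conda in this image.
    "/opt/conda/bin/pip install -q tensorflow_data_validation tensorflow",
    "cd /tmp/",
    "/opt/conda/bin/jupyter notebook --NotebookApp.token='passwd' "
    "--notebook-dir=/tmp --ip='*' --port=8888 --no-browser --allow-root",
]

docker_cmd = [
    "docker", "run", "-i", "-t",
    "-p", "8888:8888",
    "-v", "/tmp:/tmp",
    "continuumio/miniconda3",
    "/bin/bash", "-c", " && ".join(inner_steps),
]

# shlex.join quotes each argument so the result can be pasted into a shell.
print(shlex.join(docker_cmd))
```

Paste the printed line into a terminal to start the container. If the pip step fails on the image's Python version, that is the place to adjust, for example by adding a `conda install python=<version>` step before it.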