Tags: docker, hadoop, hive, docker-compose, cloudera

Initialize Cloudera Hive Docker Container With Data


I am running the Cloudera suite in a Docker Container using the image described here: https://hub.docker.com/r/cloudera/quickstart/

I have the following configuration:

Dockerfile

FROM cloudera/quickstart:latest

Docker Compose file

version: '3.1'
services:

  db-hive:
    container_name: mobydq-test-db-hive
    image: mobydq-test-db-hive
    restart: always
    build:
      context: .
      dockerfile: ./db-hive/Dockerfile
    expose:
      - 10000
    networks:
      - default
    hostname: quickstart.cloudera
    privileged: true
    tty: true
    command: ["/usr/bin/docker-quickstart"]

networks:
  default:
    external:
      name: mobydq-network

When the container starts, I would like it to automatically create a new database and a table, and populate the table with data. What would be the best way to do that?


Solution

  • The solution I found was to copy the contents of the script /usr/bin/docker-quickstart into a new shell script, entrypoint.sh. I then added the CREATE TABLE and INSERT statements directly to entrypoint.sh.

    Example here: https://github.com/ubisoft/mobydq/blob/master/test/db-cloudera/init/entrypoint.sh

    Finally, I run entrypoint.sh as the command in the Docker Compose file, in place of the quickstart script.
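
For illustration, the part added to entrypoint.sh might look roughly like the sketch below. It is not the script from the linked repository. The database, table, and column names, the /tmp/init.hql path, and the HIVE_BIN, INIT_SQL, and RETRY_DELAY variables are all hypothetical. It assumes the hive CLI is on the PATH inside the image, that the image's Hive version accepts INSERT ... VALUES, and that Hive may need a few retries before it accepts statements after the services start.

```shell
#!/usr/bin/env bash
# Hypothetical entrypoint.sh sketch (names and paths are placeholders).
set -eu

HIVE_BIN="${HIVE_BIN:-hive}"           # assumption: Hive CLI is on the PATH
INIT_SQL="${INIT_SQL:-/tmp/init.hql}"  # hypothetical location for the statements

# Write the initialization statements to the file given as $1.
write_init_sql() {
  cat > "$1" <<'EOF'
CREATE DATABASE IF NOT EXISTS test_db;
CREATE TABLE IF NOT EXISTS test_db.people (id INT, name STRING);
INSERT INTO TABLE test_db.people VALUES (1, 'alice'), (2, 'bob');
EOF
}

# Run a command up to $1 times, pausing between failed attempts.
run_with_retry() {
  attempts="$1"
  shift
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "${RETRY_DELAY:-10}"
    fi
    i=$((i + 1))
  done
  return 1
}

# The service start-up steps copied from /usr/bin/docker-quickstart would go
# here (not reproduced in this sketch).

# Load the data only when the Hive CLI is available, so the script does
# nothing harmful when run outside the container.
if command -v "$HIVE_BIN" >/dev/null 2>&1; then
  write_init_sql "$INIT_SQL"
  run_with_retry 5 "$HIVE_BIN" -f "$INIT_SQL"
fi

# Whatever the copied script does last to keep the container running
# should stay after this point.
```

Using IF NOT EXISTS keeps the CREATE statements safe to re-run when the container restarts, although the INSERT would add the rows again each time. In the compose file, command would then point at the new script, for example command: ["/entrypoint.sh"], assuming the Dockerfile copies the script to that hypothetical path and makes it executable.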