Search code examples
hivesqlalchemyprestoapache-superset

Not able to make Apache Superset connect to Presto DB (this PrestoDB is connected to Apache Pinot)


I am new to Apache Pinot, PrestoDb and Superset. I have successfully setup PrestoDB and connected it to Apache Pinot using the following steps:

    docker run \
      --network pinot-demo \
      --name=presto-coordinator \
      -p 8080:8080 \
      -d apachepinot/pinot-presto:latest

I can query PrestoDB and have verified that PrestoDB is able to fetch records from Pinot.

$ presto/presto-cli --server localhost:8080 \
  --catalog pinot --schema default
    
  presto:default> show catalogs;
  presto:default> select * from my_table limit 10;
    col1   | col2
    --------------------
     val 1 | 1990380139 
     val 2 | 1990380130 
     val 3 | 1990380130 

Now, I start and init Apache Superset using the official docker image and steps. As Superset also runs on 8080 by default, I have used -p 8088:8080. (don't know if this is a problem)

docker run --network pinot-demo --name superset -p 8088:8080 -d apache/superset:latest

docker exec -it superset superset fab create-admin \
   --username admin \
   --firstname Superset \
   --lastname Admin \
   --email [email protected] \
   --password admin

docker exec -it superset superset db upgrade

docker exec -it superset superset init

On this Superset web app, I want to add a new PrestoDb database. The following are the SQLAlchemy URIs that I have tried but Superset won't connect to PrestoDB.

1. hive://hive@localhost:8080/pinot
2. presto://localhost:8080/
3. presto://localhost:8080/pinot
4. presto://localhost:8080/pinot/default
5. hive://hive@localhost:8080/pinot/default

Please help me find out the correct SQLAlchemy URI to make connection between Apache Superset and PrestoDB.

enter image description here

Notably, I face the following errors when I use hive and presto dialect/driver respectively.

When used

hive://hive@localhost:8000/pinot

INFO:thrift.transport.TSocket:Could not connect to ('127.0.0.1', 8000)
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/thrift/transport/TSocket.py", line 113, in open
    handle.connect(sockaddr)
ConnectionRefusedError: [Errno 111] Connection refused
INFO:thrift.transport.TSocket:Could not connect to ('127.0.0.1', 8000)
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/thrift/transport/TSocket.py", line 113, in open
    handle.connect(sockaddr)
ConnectionRefusedError: [Errno 111] Connection refused
ERROR:thrift.transport.TSocket:Could not connect to any of [('127.0.0.1', 8000), ('127.0.0.1', 8000)]

When used

presto://localhost:8000/pinot

INFO:pyhive.presto:SELECT 1
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): localhost:8000

Solution

  • When you try to access presto from superset, the network connection is between superset container to presto container, so localhost will not work.

    You will need to get the real ip of prestodb container, either container ip or host ip. Can you try the following?

    presto://{continer_ip}:8080/pinot
    presto://{host_id}:8080/pinot