DuckDB httpfs extension for .db files


I'm experimenting with DuckDB (0.10.1) and the httpfs extension. The documentation states:

With the httpfs extension, it is possible to directly query files over the HTTP(S) protocol. This works for all files supported by DuckDB or its various extensions, and provides read-only access.

However, the only examples I can find are for Parquet or CSV files. When I try to query a DuckDB .db file over HTTPS, the errors indicate I'm doing something wrong:

from 'https://.../file.db';
# Catalog Error: Table with name https://.../file.db does not exist!
# Did you mean "sqlite_master"?
# LINE 1: from 'https://...

and

from 's3://.../file.db';
# Catalog Error: Table with name s3://.../file.db does not exist!
# Did you mean "pg_attrdef"?
# LINE 1: from 's3://...

I have verified using curl and s3 that both URLs are publicly accessible. I can successfully query a Parquet file in the same S3 bucket using the httpfs extension, so I'm confident the extension is working.

Given that the errors refer to missing tables, is there a specific syntax to identify the name of a table within that remote data source? Or is the documentation wrong, and not all file types are supported?


Solution

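  • You can use ATTACH to achieve this. A .db file is a whole database rather than a single table, so it has to be attached before its tables can be queried.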

    For example:

    ATTACH 'https://herp.derp/my.db' AS db (READONLY 1);
    FROM db.some_table;
    

    Note that this part of DuckDB is being actively worked on: S3 support is coming, and more performance optimizations are planned. Also note that attaching database files over HTTP is currently read-only.
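
    As a slightly fuller sketch of the same approach, the session below loads httpfs, attaches the remote file, lists the tables it contains, and then queries one of them. The URL is the elided one from the question, and the alias remote_db and the table name some_table are placeholders rather than names from a real database. READ_ONLY is the option spelling used in the DuckDB documentation.

    ```sql
    -- Install and load the extension that provides HTTP(S) access.
    INSTALL httpfs;
    LOAD httpfs;

    -- Attach the remote database file under a local alias (remote_db is a placeholder).
    ATTACH 'https://.../file.db' AS remote_db (READ_ONLY);

    -- List the tables of every attached database to find the names to query.
    SHOW ALL TABLES;

    -- Query a table by qualifying it with the alias (some_table is a placeholder).
    FROM remote_db.some_table LIMIT 10;

    -- Detach the remote database when finished.
    DETACH remote_db;
    ```

    Listing the tables first addresses the original question of how to identify a table inside the remote file: once the database is attached, its tables are referenced as alias.table_name.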