Search code examples
rspatialduckdb

How to read and convert a CSV file it into an sf object using R and duckdb?


I am exploring duckdb the new spatial extension and I am trying to read a CSV file and convert it into a sf object.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(sf)
#> Linking to GEOS 3.11.1, GDAL 3.7.1, PROJ 9.2.0; sf_use_s2() is TRUE
library(duckdb)
#> Loading required package: DBI

conn <- DBI::dbConnect(duckdb::duckdb())
status <- DBI::dbExecute(conn, "INSTALL 'spatial';")
status <- DBI::dbExecute(conn, "LOAD 'spatial';")


dbExecute(conn, "CREATE OR REPLACE TABLE df AS FROM
  read_csv_auto('https://raw.githubusercontent.com/cboettig/duckdbfs/main/inst/extdata/spatial-test.csv')")
#> [1] 10

dbSendQuery(conn, "SHOW df;") |>
  dbFetch()
#>   column_name column_type null  key default extra
#> 1        site     VARCHAR  YES <NA>    <NA>    NA
#> 2    latitude      BIGINT  YES <NA>    <NA>    NA
#> 3   longitude      BIGINT  YES <NA>    <NA>    NA

Here I am creating the geometry column, looks like it worked.

dbExecute(conn, "CREATE OR REPLACE TABLE df_sf AS SELECT *,
  ST_Point(latitude, longitude) as geometry, FROM df;")
#> [1] 10

dbSendQuery(conn, "SHOW df_sf;") |>
  dbFetch()
#>   column_name column_type null  key default extra
#> 1        site     VARCHAR  YES <NA>    <NA>    NA
#> 2    latitude      BIGINT  YES <NA>    <NA>    NA
#> 3   longitude      BIGINT  YES <NA>    <NA>    NA
#> 4    geometry    GEOMETRY  YES <NA>    <NA>    NA

My question is how do I read the df_sf table into R as a sf object?

st_read(
  conn,
  query = "SELECT * FROM df_sf;",
  geometry_column = "geometry"
)
#> Error in CPL_read_wkb(x, EWKB, spatialite): reading wkb type 0 is not supported

# Any pointers would be greatly appreciated.

Created on 2023-09-12 with reprex v2.0.2


Solution

  • I found a solution using ST_asWKB.

      dbExecute(conn, "CREATE OR REPLACE TABLE df_sf AS SELECT *,
        ST_asWKB(ST_Point(latitude, longitude)) as geometry, FROM df;")
      
      st_read(
        conn,
        query = "SELECT * FROM df_sf;",
        geometry_column = "geometry"
      )