I am trying to interact with an SFTP server from inside R. The curl package came highly recommended (curl, not RCurl).
One of the things I am trying to do is get a list of directories/files at an address. I have the code working so far:
library(curl)
# create a new curl handle
han <- new_handle()
# enable verbose output (no SFTP-specific options are set here)
handle_setopt(han, verbose = TRUE)
# execute the request
result <- curl_fetch_memory(url = "{SFTP URL here}", handle = han)
# convert the raw response body to a character string
response <- rawToChar(result$content)
The SFTP server at this URL does not require a password. The remote uses SFTP protocol version 3.
The above code almost does what I am looking for: curl_fetch_memory(url = "{SFTP URL here}", handle = han) produces a list that contains, among other things, result$content, which holds the list of directories/files, but with everything (file names, dates, and permission data) mixed together in the characters.
How can I customize the request/handle to get the list of files in a cleaner manner? Just a plain list of files, akin to ls on SFTP servers, if this is at all possible. (Copies of result and response are attached below.)
If customizing the request is not possible, is there a way to make curl objects a bit more human readable?
Output for result
$url
[1] "sftp://data.cyverse.org/shared/"
$status_code
[1] 0
$type
[1] NA
$headers
raw(0)
$modified
[1] "2020-02-20 16:05:33 CST"
$times
redirect namelookup connect pretransfer starttransfer
0.000000 0.000029 0.000000 0.230600 0.000000
total
0.230608
$content
[1] 64 72 77 78 72 2d 78 72 2d 78 20 20 20 20 31 20 30 20 20 20 20 20 20 20 20
[26] 30 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 30 20 44 65 63 20 33 31 20
[51] 20 31 39 36 39 20 2e 0a 64 72 77 78 72 2d 78 72 2d 78 20 20 20 20 31 20 30
[76] 20 20 20 20 20 20 20 20 30 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 30
[101] 20 44 65 63 20 33 31 20 20 31 39 36 39 20 2e 2e 0a 64 72 77 78 72 2d 78 72
[126] 2d 78 20 20 20 20 31 20 30 20 20 20 20 20 20 20 20 30 20 20 20 20 20 20 20
[151] 20 20 20 20 20 20 20 20 30 20 46 65 62 20 32 30 20 20 32 30 32 30 20 61 6c
[176] 69 67 6e 6d 65 6e 74 73 5f 61 6e 64 5f 74 72 65 65 73 0a 64 72 77 78 72 2d
[201] 78 72 2d 78 20 20 20 20 31 20 30 20 20 20 20 20 20 20 20 30 20 20 20 20 20
[226] 20 20 20 20 20 20 20 20 20 20 30 20 46 65 62 20 32 30 20 20 32 30 32 30 20
[251] 67 65 6e 65 5f 66 61 6d 69 6c 79 5f 65 76 6f 6c 75 74 69 6f 6e 0a 64 72 77
[276] 78 72 2d 78 72 2d 78 20 20 20 20 31 20 30 20 20 20 20 20 20 20 20 30 20 20
[301] 20 20 20 20 20 20 20 20 20 20 20 20 20 30 20 46 65 62 20 32 30 20 20 32 30
[326] 32 30 20 6d 61 70 73 5f 73 63 72 69 70 74 73 0a 64 72 77 78 72 2d 78 72 2d
[351] 78 20 20 20 20 31 20 30 20 20 20 20 20 20 20 20 30 20 20 20 20 20 20 20 20
[376] 20 20 20 20 20 20 20 30 20 46 65 62 20 32 30 20 20 32 30 32 30 20 74 72 61
[401] 6e 73 63 72 69 70 74 5f 61 73 73 65 6d 62 6c 69 65 73 0a 64 72 77 78 72 2d
[426] 78 72 2d 78 20 20 20 20 31 20 30 20 20 20 20 20 20 20 20 30 20 20 20 20 20
[451] 20 20 20 20 20 20 20 20 20 20 30 20 46 65 62 20 32 30 20 20 32 30 32 30 20
[476] 77 68 6f 6c 65 5f 67 65 6e 6f 6d 65 5f 64 75 70 6c 69 63 61 74 69 6f 6e 73
[501] 0a 2d 72 77 2d 72 2d 2d 72 2d 2d 20 20 20 20 31 20 30 20 20 20 20 20 20 20
[526] 20 30 20 20 20 20 20 20 20 20 20 20 20 20 20 36 36 39 20 4f 63 74 20 31 32
[551] 20 20 32 30 31 39 20 67 65 6e 65 5f 66 61 6d 69 6c 69 65 73 5f 6f 72 74 68
[576] 6f 66 69 6e 64 65 72 2e 74 78 74 0a 2d 72 77 2d 72 2d 2d 72 2d 2d 20 20 20
[601] 20 31 20 30 20 20 20 20 20 20 20 20 30 20 20 20 20 20 20 20 20 20 20 20 20
[626] 31 32 37 33 20 4f 63 74 20 31 32 20 20 32 30 31 39 20 72 65 61 64 6d 65 2e
[651] 74 78 74 0a
Output for response
'drwxr-xr-x 1 0 0 0 Dec 31 1969 .\ndrwxr-xr-x 1 0 0 0 Dec 31 1969 ..\ndrwxr-xr-x 1 0 0 0 Nov 7 2020 curated\n'
You can set CURLOPT_DIRLISTONLY to list only names. You can also parse the default response as regular tabular text, e.g. with read.table() or readr::read_table(). Options for the curl package are the general libcurl options from upstream, so the libcurl documentation can be used as a reference: https://curl.se/libcurl/c/easy_setopt_options.html
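The parsing route can be tried offline first. The following is a minimal sketch that reuses the listing text shown in the question, so it needs no network access. The column names in `cols` are illustrative assumptions (libcurl returns no header row), and splitting on whitespace would break on file names that contain spaces.

```r
# Listing text copied from the question's output (no network access needed).
listing <- "drwxr-xr-x 1 0 0 0 Dec 31 1969 .\ndrwxr-xr-x 1 0 0 0 Dec 31 1969 ..\ndrwxr-xr-x 1 0 0 0 Nov 7 2020 curated\n"

# Hypothetical column names for an `ls -l`-style listing.
cols <- c("perms", "links", "owner", "group", "size",
          "month", "day", "year_or_time", "name")

# Read everything as character so "." and ".." are kept verbatim;
# disable quote and comment handling so unusual file names are not mangled.
files <- read.table(text = listing, col.names = cols,
                    colClasses = "character", quote = "", comment.char = "")

# Drop the "." and ".." pseudo-entries to get a plain list of names.
entries <- files$name[!files$name %in% c(".", "..")]
entries
#> [1] "curated"
```

A leading `d` in `perms` marks a directory, so `startsWith(files$perms, "d")` could be used to separate directories from regular files.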
Using the Rebex demo server as an example:
library(curl)
#> Using libcurl 7.84.0 with Schannel
# https://test.rebex.net/
SFTP_DEMO <- "sftp://demo:password@test.rebex.net:22"
han <- new_handle()
# list all libcurl options that include "list"
curl_options("list")
#> cookielist dirlistonly proxy_ssl_cipher_list
#> 10135 48 10259
#> ssl_cipher_list
#> 10083
# set dirlistonly
handle_setopt(han, dirlistonly = TRUE)
# dirlistonly request:
file_list <- curl_fetch_memory(url = SFTP_DEMO, handle = han)[["content"]] |> rawToChar()
cat(file_list)
#> .
#> ..
#> pub
#> readme.txt
read.table(text = file_list)
#> V1
#> 1 .
#> 2 ..
#> 3 pub
#> 4 readme.txt
strsplit(file_list, "\n") |> unlist()
#> [1] "." ".." "pub" "readme.txt"
# you can do the same with detailed file list:
handle_setopt(han, dirlistonly = FALSE)
curl_fetch_memory(url = SFTP_DEMO,
handle = han)[["content"]] |>
rawToChar() |>
read.table(text = _)
#> V1 V2 V3 V4 V5 V6 V7 V8 V9
#> 1 drwx------ 2 demo users 0 Mar 31 17:52 .
#> 2 drwx------ 2 demo users 0 Mar 31 17:52 ..
#> 3 drwx------ 2 demo users 0 Mar 31 17:52 pub
#> 4 -rw------- 1 demo users 405 Dec 17 2021 readme.txt
Created on 2023-05-12 with reprex v2.0.2
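If a plain vector of names is needed repeatedly, the dirlistonly steps can be wrapped in a small function. This is only a sketch: `parse_names()` and `sftp_ls()` are hypothetical helper names, not part of the curl package, and `sftp_ls()` assumes the server accepts the credentials embedded in the URL. The parsing step is kept separate so it can be checked without a connection.

```r
# Hypothetical helper: turn the raw body of a dirlistonly response
# into a character vector of names, without "." and "..".
parse_names <- function(content) {
  entries <- strsplit(rawToChar(content), "\r?\n")[[1]]
  entries[!entries %in% c("", ".", "..")]
}

# Hypothetical wrapper; calling it requires the curl package and network access.
sftp_ls <- function(url) {
  han <- curl::new_handle(dirlistonly = TRUE)
  parse_names(curl::curl_fetch_memory(url, handle = han)$content)
}

# Offline check, using the names from the demo listing above:
parse_names(charToRaw(".\n..\npub\nreadme.txt\n"))
#> [1] "pub"        "readme.txt"
```

With a connection available, `sftp_ls(SFTP_DEMO)` would be the equivalent call against the demo server.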