Downloading NOAA data

I'm trying to download NOAA data using the rnoaa package and I'm running into a bit of trouble.

I took a vector from a dataframe and it looks like this:

df <- dataframe$ghcnd  ## Grabbing necessary column

This gives me an output like:

[49] "GHCND:US1ALBW0079" "GHCND:US1ALBW0060"

In reality, I have about 22,000 weather stations. This is just showing the first 50.

rnoaa code

options("noaakey" = Sys.getenv("noaakey"))

weather <- ncdc(datasetid = 'GHCND', stationid = df, var = 'PRCP', startdate = "2020-05-30",
                enddate = "2020-05-30", add_units = TRUE)

Which produces the following error: Error: Request-URI Too Long (HTTP 414)

However, when I subset df to just the first 100 entries, I can't get data for more than the first 25, even though the package details say I should be able to run 10,000 queries a day.

Loop Attempt

df1 <- df[1:125] ## Subsetting the vector. Too big otherwise

for (i in 1:length(df1)) {
  weather2 <- ncdc(datasetid = 'GHCND', stationid = df1[i], var = 'PRCP',
                   startdate = '2020-06-30', enddate = '2020-06-30',
                   add_units = TRUE)
}

But this just produces a data frame with a single row, that row being the 125th weather station.

If anyone could give advice on what to try next, that would be great :)

Also, cross linked:


  • Figured it out, with a lot of help from @Dave2e and a bud on the ropensci link above.

    library(rnoaa)
    library(dplyr)  ## for bind_rows()

    df <- cleaned_emshr$ghcnd  ## Grabbing necessary column
    z <- split(df, ceiling(seq_along(df)/100))  ## Chunks of 100 station IDs
    out <- list()
    for (i in seq_along(z)) {
      out[[i]] <- ncdc(datasetid = 'GHCND', stationid = z[[i]], var = 'PRCP',
                       startdate = "2020-05-30", enddate = "2020-05-30",
                       add_units = TRUE, limit = 100)
    }
    weather <- bind_rows(lapply(out, "[[", "data"))
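
For anyone who wants to try the chunk-and-collect pattern without an API key, here is a minimal offline sketch. It assumes a hypothetical stand-in function, fetch_chunk(), in place of ncdc(), and the station IDs are made up. It splits the IDs into groups of 100 so each request stays short, and stores each result in its own list slot. The loop in the question reassigned weather2 on every pass, which is why only the last station survived. As far as I can tell from the rnoaa docs, ncdc() also defaults to limit = 25, which would explain the 25-row cap; treat that as my reading rather than a guarantee.

```r
# Hypothetical stand-in for ncdc() so the sketch runs without network access.
# It returns a list with a `data` element, mimicking the shape used above.
fetch_chunk <- function(ids) {
  list(data = data.frame(station = ids, value = seq_along(ids),
                         stringsAsFactors = FALSE))
}

# Made-up station IDs, for illustration only.
stations <- sprintf("GHCND:FAKE%04d", 1:250)

# Split into groups of at most 100 IDs.
chunks <- split(stations, ceiling(seq_along(stations) / 100))

# Keep every chunk's result in its own slot instead of overwriting one variable.
out <- vector("list", length(chunks))
for (i in seq_along(chunks)) {
  out[[i]] <- fetch_chunk(chunks[[i]])
}

# Extract each `data` element and stack the rows (base R stand-in for bind_rows).
weather <- do.call(rbind, lapply(out, `[[`, "data"))
```

With the real API you would swap fetch_chunk() for the ncdc() call and probably add a short pause between requests to stay under the rate limit.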