I am new to R, but I have looked around and tried everything I can think of. Here is the step by step:
I used read_csv() (much harder with read.csv()) to remove all (3 in this case) header rows, saved one to names(), and assigned the header with names().
Filtering on Duration (in seconds) based upon its numerical value does not work. That is, `filter('Duration (in seconds)' == 0)` yields a dataframe with no observations.
I have checked that:
typeof(test$'Duration (in seconds)') is "double"
read_csv() imports 'Duration (in seconds)' as double (i.e., 'Duration (in seconds)' = col_double())
Sample code:
library(readr)
library(dplyr)
df_names <- read_csv("file.csv", n_max = 0) %>% names()
test <- read_csv("file.csv", skip = 3, col_names = df_names, trim_ws = TRUE)
test2 <- test %>% filter('Duration (in seconds)' == 0) # returns no rows, but should return 6
test2 <- test %>% filter('Duration (in seconds)' > 0) # returns all rows, but should return 3
Data: file.csv
Try replacing your quotes with backticks when referencing your variable name:
test2 <- test %>% filter(`Duration (in seconds)` == 0) # should now return 6 rows
test2 <- test %>% filter(`Duration (in seconds)` > 0) # should now return 3 rows
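Explanation: quotation marks denote strings in R, so 'Duration (in seconds)' in quotes is a character string rather than a reference to your column. Your original filter command therefore compares that string itself with 0, which gives the same answer for every row: FALSE for == 0 (no rows) and TRUE for > 0 (all rows).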
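To see the difference side by side, here is a minimal sketch. Since file.csv isn't available here, the data frame below is a made-up stand-in with 6 zero durations and 3 positive ones, matching the counts in your comments; only the column name comes from your question.

```r
library(dplyr)

# Hypothetical stand-in for file.csv: 6 rows with duration 0 and 3 rows above 0.
# check.names = FALSE keeps the column name with its spaces and parentheses.
test <- data.frame(
  `Duration (in seconds)` = c(0, 0, 0, 0, 0, 0, 12, 30, 45),
  check.names = FALSE
)

# Quotes: the string itself is compared with 0 (0 is coerced to "0"),
# so filter() receives a single constant that applies to every row.
"Duration (in seconds)" == 0  # FALSE
quoted_eq <- test %>% filter("Duration (in seconds)" == 0)

# Backticks: the column's values are compared with 0, row by row.
ticked_eq <- test %>% filter(`Duration (in seconds)` == 0)
ticked_gt <- test %>% filter(`Duration (in seconds)` > 0)

nrow(quoted_eq)  # 0
nrow(ticked_eq)  # 6
nrow(ticked_gt)  # 3
```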
Backticks have a few uses in R, but one of them is to give you a way of referring to names that are otherwise non-syntactic. We need to use backticks in this example because your column name has spaces and parentheses in it. If we didn't, R would try to parse Duration (in seconds) as code rather than as a single name, which would throw a syntax error.
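As a rough illustration of that last point, the sketch below uses a small made-up data frame (not your file.csv) to show that the unquoted name fails to parse, while backticks work. It also shows dplyr's .data pronoun, which I'm adding as an alternative that takes the column name as a string; it isn't something your code needs.

```r
library(dplyr)

# Made-up data; only the column name comes from the question.
test <- data.frame(`Duration (in seconds)` = c(0, 0, 5), check.names = FALSE)

# Without backticks the expression is not valid R, so it fails at parse time.
parsed <- try(parse(text = "Duration (in seconds) == 0"), silent = TRUE)
inherits(parsed, "try-error")  # TRUE

# Backticks turn the whole phrase into a single name.
by_backtick <- test %>% filter(`Duration (in seconds)` == 0)

# The .data pronoun looks the column up by its name given as a string.
by_pronoun <- test %>% filter(.data[["Duration (in seconds)"]] == 0)

identical(by_backtick, by_pronoun)  # TRUE
```

The string form can be handy when the column name is stored in a variable, but for interactive use backticks are usually the simplest fix.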