I am new to R, but I have looked around and tried everything I can think of. Here is the step by step:
Used `read_csv()` (much harder with `read.csv()`) to remove all (3 in this case) header rows and saved one to `names()`.
Imported the data with `read_csv()` and assigned the header with `names()`.
Filtering on `Duration (in seconds)` based upon its numerical value does not work. That is, `filter('Duration (in seconds)' == 0)` yields a dataframe with no observations.
I have confirmed that:
`typeof(test$'Duration (in seconds)')` is "double".
`read_csv()` imports 'Duration (in seconds)' as double (i.e., `'Duration (in seconds)' = col_double()`).
Sample code:
df_names <- read_csv("file.csv", n_max=0) %>% names()
test <- read_csv("file.csv", skip=3, col_names=df_names, trim_ws = T)
test2 <- test %>% filter('Duration (in seconds)' == 0) #no rows but should be 6
test2 <- test %>% filter('Duration (in seconds)' > 0) #all rows but should be 3
Data: file.csv
Try replacing your quotes with backticks when referencing your variable name:
test2 <- test %>% filter(`Duration (in seconds)` == 0) # expected: 6 rows
test2 <- test %>% filter(`Duration (in seconds)` > 0) # expected: 3 rows
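Explanation: quotation marks denote strings in R, so `'Duration (in seconds)'` in your original `filter()` call is just a character literal, not a reference to your column. The condition is therefore evaluated once against that literal instead of row by row against the column's values: `'Duration (in seconds)' == 0` is always FALSE, so no rows are returned, and `'Duration (in seconds)' > 0` is a string comparison that comes out TRUE, so every row is returned.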
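To see why the quoted version behaves this way, the comparison can be run on its own, outside `filter()`. This is a minimal base R sketch; the variable names `col_as_string`, `eq_zero`, and `gt_zero` are made up for illustration.

```r
# The quoted "column" is only a character literal.
col_as_string <- 'Duration (in seconds)'

# Comparing a string with a number coerces the number to a string ("0"),
# so this asks whether "Duration (in seconds)" equals "0".
eq_zero <- col_as_string == 0
print(eq_zero)  # FALSE

# This is a string comparison as well. In typical locales letters sort
# after digits, so the result is TRUE.
gt_zero <- col_as_string > 0
print(gt_zero)
```

Because each result is a single logical value, `filter()` applies it to every row alike, which matches the "no rows" and "all rows" results in the question.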
Backticks have a few uses in R, but one of them is to give you a way of referring to names that are otherwise non-syntactic. We need to use backticks in this example because your column name contains spaces and parentheses. If we didn't, R would try to parse `Duration (in seconds)` as ordinary code made up of separate tokens rather than as a single name, and the expression would fail with a syntax error.
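As a quick check of the backtick approach, here is a small sketch that runs without the CSV file. The tibble below is invented data standing in for `file.csv`, with six zero durations and three positive ones, and the names `zero_rows`, `positive_rows`, and `same_zero_rows` are hypothetical. The last call shows one possible alternative, the `.data` pronoun from dplyr, which takes the column name as a string.

```r
library(dplyr)

# Invented stand-in for the imported data (assumption: not the real file.csv).
test <- tibble(`Duration (in seconds)` = c(0, 0, 12.5, 0, 30, 0, 0, 7, 0))

# Backticks make R treat the whole name as one column reference.
zero_rows <- test %>% filter(`Duration (in seconds)` == 0)
positive_rows <- test %>% filter(`Duration (in seconds)` > 0)

# Alternative: look the column up by its name as a string via .data.
same_zero_rows <- test %>% filter(.data[["Duration (in seconds)"]] == 0)

nrow(zero_rows)      # 6
nrow(positive_rows)  # 3
```

Renaming the column to a syntactic name after import (for example `duration_seconds`) would avoid the need for backticks altogether, at the cost of no longer matching the header in the file.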