r pipeline tidyr

Pipeline issues with a new column not being found


I am trying to run the following code and keep getting "Error: object 'caloriesdif' not found".

Here is my code:

library(dplyr)  # provides %>% and mutate()

activity <- read.csv("dailyActivity_merged.csv")
head(activity)

calories <- read.csv("dailyCalories_merged.csv")
head(calories)

activity_and_calories <- merge(activity, calories, by=c("Id", "ActivityDay"))
head(activity_and_calories)

activity_and_calories %>%
  mutate(caloriesdif = Calories.x - Calories.y) %>%
  max(caloriesdif)

I can get it to show the caloriesdif column, but not its maximum value.

I have tried rerunning the code and changing the variable names, but I still cannot get the maximum value.


Solution

  • You're piping the data frame activity_and_calories into mutate(), then piping the result of that into max(). The pipe passes that data frame as the first argument, so the call is effectively max(., caloriesdif), where the period "." stands for the piped data frame. max() is a base R function and doesn't look for caloriesdif among the data frame's columns. R searches for an object with that name in your environment instead, and it doesn't exist, hence the error.

    max() expects a vector of values, which you could supply using pull().

    activity_and_calories %>%
      mutate(caloriesdif = Calories.x - Calories.y) %>%
      pull(caloriesdif) %>%
      max()
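
If you would rather not pull the column out first, two other patterns should give the same number. This is only a sketch: the small data frame below is made-up toy data standing in for your merged data, and the names result_df, max_dif and the na.rm = TRUE argument are my own additions, not something from your code.

```r
library(dplyr)

# Hypothetical toy data standing in for the merged data frame
activity_and_calories <- data.frame(
  Id         = c(1, 1, 2),
  Calories.x = c(2000, 1800, 2500),
  Calories.y = c(1900, 1850, 2100)
)

# Option 1: summarise() evaluates max() inside the data frame,
# so caloriesdif is found as a column; the result is a one-row data frame
result_df <- activity_and_calories %>%
  mutate(caloriesdif = Calories.x - Calories.y) %>%
  summarise(max_dif = max(caloriesdif, na.rm = TRUE))

# Option 2: with() evaluates the expression using the columns of the
# piped data frame and returns a plain number
max_dif <- activity_and_calories %>%
  mutate(caloriesdif = Calories.x - Calories.y) %>%
  with(max(caloriesdif, na.rm = TRUE))

print(result_df)
print(max_dif)
```

summarise() is convenient if you want to keep working in a pipeline (for example after group_by(Id)), while pull() or with() is simpler when you just need the single value.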