
function applied to summarise + group_by doesn't work correctly


I read in my data:

fluo <- read.csv("data/ctd_SOMLIT.csv", sep=";", stringsAsFactors=FALSE)

I then split the original date (format Y-m-d) into three columns: the day, the month and the year.

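# day(), month() and year() are assumed to come from lubridate
# (the question does not show which package is loaded)
library(lubridate)
fluo$day <- day(as.POSIXlt(fluo$DATE, format = "%Y-%m-%d"))
fluo$month <- month(as.POSIXlt(fluo$DATE, format = "%Y-%m-%d"))
fluo$year <- year(as.POSIXlt(fluo$DATE, format = "%Y-%m-%d"))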

This is part of my data frame:

[image: data.frame]

Then I use summarise and group_by in order to apply this expression:

prof_DCM = fluo[max(fluo$FLUORESCENCE..Fluorescence.),2] 

=> I want the depth at which FLUORESCENCE is at its maximum, for each month of each year.

mean_fluo <- summarise(group_by(fluo, month, year), 
                       prof_DCM = fluo[max(fluo$FLUORESCENCE..Fluorescence.),2])
mean_fluo <- arrange(mean_fluo, year, month)
View(mean_fluo)

But it's not working: the value of prof_DCM is the same in every row of column 3 of the data frame:

[image: same value for column 3]


Solution

  • Try the following code. Within each month and year, it keeps the row(s) where the fluorescence equals the group maximum.

    library(dplyr)
    mean_fluo <- fluo %>%
      group_by(month, year) %>%
      filter(FLUORESCENCE..Fluorescence. == max(FLUORESCENCE..Fluorescence.)) %>%
      arrange(year, month)
    
    View(mean_fluo)
    

    You can keep only the variables you want with select():

    mean_fluo <- fluo %>%
      group_by(month, year) %>%
      filter(FLUORESCENCE..Fluorescence. == max(FLUORESCENCE..Fluorescence.)) %>%
      arrange(year, month) %>%
      select(month, year, PROFONDEUR)
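
For context on why the original attempt returns the same value everywhere: inside summarise(), the expression fluo$FLUORESCENCE..Fluorescence. refers to the whole data frame rather than the current group, and max() returns the largest value, not the row position where it occurs, so it cannot be used as a row index. If a single depth per month and year is wanted directly from summarise(), one possible sketch uses which.max() on the bare column names. The toy data below is invented purely for illustration, and it assumes that the depth column is the PROFONDEUR column used in the answer above.

```r
library(dplyr)

# Toy data, invented for illustration only (not the asker's measurements).
# Column names follow the question; PROFONDEUR is assumed to be the depth.
fluo_toy <- data.frame(
  year  = c(2015, 2015, 2015, 2015, 2016, 2016),
  month = c(1, 1, 2, 2, 1, 1),
  PROFONDEUR = c(5, 10, 5, 10, 5, 10),
  FLUORESCENCE..Fluorescence. = c(0.2, 0.9, 0.7, 0.3, 0.4, 0.8)
)

# Inside summarise(), bare column names refer to the current group only,
# and which.max() gives the position of the maximum within that group.
prof_dcm <- fluo_toy %>%
  group_by(year, month) %>%
  summarise(
    prof_DCM = PROFONDEUR[which.max(FLUORESCENCE..Fluorescence.)],
    .groups = "drop"
  ) %>%
  arrange(year, month)

print(prof_dcm)
```

Note that which.max() returns only the first position when several rows share the maximum, whereas the filter() version above keeps every tied row.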