Convert factor variable minute:second to numerical variable minute.seconds in R


I am struggling with a data frame I have been given:

  game.time.total game.time.first.half game.time.second.half
1           95:09                46:04                 49:05
2           95:09                46:04                 49:05
3           95:31                46:07                 49:23
4           95:31                46:07                 49:23
5           95:39                46:08                 49:31

These columns are currently factor variables (see the str() output):

'data.frame':   5 obs. of  3 variables:
 $ game.time.total      : Factor w/ 29 levels "100:22","100:53",..: 7 7 10 10 12
 $ game.time.first.half : Factor w/ 27 levels "45:18","46:00",..: 3 3 5 5 6
 $ game.time.second.half: Factor w/ 29 levels "48:01","48:03",..: 12 12 15 15 17

However, I wish to average each column using colMeans(). From my understanding, I need to convert each column to numeric, expressed as minutes.seconds, as shown here:

  game.time.total game.time.first.half game.time.second.half
1           95.09                46.04                 49.05
2           95.09                46.04                 49.05
3           95.31                46.07                 49.23
4           95.31                46.07                 49.23
5           95.39                46.08                 49.31

I understand that I could just type them out, but there are many more columns and rows with similar formatting. Is there a simple way to do this? Or do I need to adjust the format of the original file (.csv)?

EDIT: Thank you for the answers. My mistake: my original question did not include my actual data frame. I have now added it, along with the str() result.

@hello_friend, this is what is returned when I apply your second solution:

  game.time.total game.time.first.half game.time.second.half
1               7                    3                    12
2               7                    3                    12
3              10                    5                    15
4              10                    5                    15
5              12                    6                    17
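
The integers returned here match the level codes in the str() output (7 7 10 10 12 for game.time.total), which suggests the conversion was applied to the factors directly. A minimal sketch of the difference, using a hypothetical factor `f` built from only the five values shown, so its codes differ from those of the 29-level factor in the real data:

```r
# Hypothetical factor holding the five game.time.total values from the question
f <- factor(c("95:09", "95:09", "95:31", "95:31", "95:39"))

# as.numeric() on a factor returns the integer level codes, not the labels
codes <- as.numeric(f)
codes
# 1 1 2 2 3

# Converting to character first recovers the labels, which can then be parsed
values <- as.numeric(sub(":", ".", as.character(f), fixed = TRUE))
values
# 95.09 95.09 95.31 95.31 95.39
```

Going through as.character() before any numeric conversion is the usual way to avoid picking up the level codes.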

Thanks in advance.


Solution

  • Base R solution:

    # Replace ":" with "." in every column, convert each column to double,
    # then restore the original column names
    numeric_df <- setNames(
      data.frame(lapply(
        data.frame(Vectorize(gsub)(":", ".", DF), stringsAsFactors = FALSE),
        as.double
      )),
      names(DF)
    )
    

    Data:

    DF <- structure(list(
      game.time.total = c("95:09", "95:09", "95:31", "95:31", "95:39"),
      game.time.first.half = c("46:04", "46:04", "46:07", "46:07", "46:08"),
      game.time.second.half = c("49:05", "49:05", "49:23", "49:23", "49:31")
    ), class = "data.frame", row.names = c(NA, -5L))
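
Since the actual data frame in the question holds factors, here is a minimal sketch of a factor-safe variant followed by the colMeans() call the question is aiming for. The helper names `to_min_dot_sec` and `to_decimal_min` are hypothetical, not part of the original answer. The second helper is included on the assumption that a true average is wanted: in minutes.seconds notation the part after the dot is base 60, so averaging those values as plain decimals can be slightly off.

```r
# Data from the question, rebuilt as factor columns to mirror the str() output
DF <- data.frame(
  game.time.total       = c("95:09", "95:09", "95:31", "95:31", "95:39"),
  game.time.first.half  = c("46:04", "46:04", "46:07", "46:07", "46:08"),
  game.time.second.half = c("49:05", "49:05", "49:23", "49:23", "49:31"),
  stringsAsFactors = TRUE
)

# Hypothetical helper: "mm:ss" -> numeric mm.ss (the format asked for)
to_min_dot_sec <- function(x) {
  as.numeric(sub(":", ".", as.character(x), fixed = TRUE))
}

# Hypothetical helper: "mm:ss" -> decimal minutes (seconds divided by 60)
to_decimal_min <- function(x) {
  parts <- strsplit(as.character(x), ":", fixed = TRUE)
  vapply(parts, function(p) as.numeric(p[1]) + as.numeric(p[2]) / 60, numeric(1))
}

# Apply each helper column by column; column names are preserved
min_dot_sec <- as.data.frame(lapply(DF, to_min_dot_sec))
decimal_min <- as.data.frame(lapply(DF, to_decimal_min))

colMeans(min_dot_sec)  # averages of the mm.ss values
colMeans(decimal_min)  # averages in decimal minutes
```

Both helpers call as.character() first, so they behave the same whether the columns are factors or character vectors.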