I'd like to do time series clustering using the dtwclust package. The problem is converting my data.frame to a list of time series. Each of my block IDs (named STAND) has 180 days, stored as negative values (DATE_TIME). B2_MAX is my response variable. In my example:
library(dplyr)
library(ggplot2)
library(dtwclust)
all.B2_MAX.stands <- read.csv("https://raw.githubusercontent.com/Leprechault/trash/main/my_ts_data.csv")
all.B2_MAX.tsc <- all.B2_MAX.stands %>%
  group_by(STAND) %>%
  summarise(var = list(B2_MAX[order(DATE_TIME)]),
            var_ts = purrr::map(var, ts))
clusters <- tsclust(all.B2_MAX.tsc[-1],
type="partitional",
k=2L,
distance="dtw",
centroid = "pam")
# plot
plot(clusters, type = "sc")
#Error in lapply(series, base::as.numeric) :
# 'list' object cannot be coerced to type 'double'
Any help with this would be appreciated.
In this case, using split to divide the response variable by block ID (STAND) and then passing the resulting list to the tsclust function works very well:
d <- read.csv("https://raw.githubusercontent.com/Leprechault/trash/main/my_ts_data.csv")
l <- split(d$B2_MAX,d$STAND)
o <- tsclust(l,
type="partitional",
k=2L,
distance="dtw_basic",
centroid = "pam")
#plot
plot(o)
o
# partitional clustering with 2 clusters
# Using dtw_basic distance
# Using pam centroids
# Time required for analysis:
# user system elapsed
# 1.13 0.00 0.16
# Cluster sizes with average intra-cluster distance:
# size av_dist
# 1 14 3.518299e+198
# 2 50 4.526561e+08
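Note that the split call above keeps the rows in whatever order they appear in the file, whereas the question's code sorted each series with order(DATE_TIME). I have not checked whether the CSV is already sorted within each STAND, so if it is not, sorting before splitting should restore that behaviour. The original error most likely comes from passing a tibble with list columns rather than a plain list of numeric vectors. A minimal sketch in base R, using a small made-up data frame in place of the real file (the values are an assumption for illustration only):

```r
# Toy data standing in for my_ts_data.csv; the values are invented.
d <- data.frame(
  STAND     = rep(c("A", "B"), each = 3),
  DATE_TIME = c(-1, -3, -2, -2, -1, -3),
  B2_MAX    = c(10, 30, 20, 200, 100, 300)
)

# Sort by block and then by day, so each series is in chronological order.
d <- d[order(d$STAND, d$DATE_TIME), ]

# One plain numeric vector per block, the same shape as `l` in the answer.
l <- split(d$B2_MAX, d$STAND)

str(l)
```

The resulting l can then be passed to tsclust exactly as in the code above.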