I have tried many clustering algorithms on my dataset and would now like to apply managerial segmentation using 'which' statements. I am wondering which makes more sense: segmenting on customer_math, or on the years, which run from X1 to X8. Doing managerial segmentation on X1-X8 is clear to me, but I don't know how to do it on the string.
Here is my df:
  customer_id customer_math X1 X2 X3 X4 X5 X6 X7 X8
1       15251      10001010  1  0  0  0  1  0  1  0
2       10101      11111111  1  1  1  1  1  1  1  1
3       84787      10101010  1  0  1  0  1  0  1  0
For instance, I would like to answer the following questions:
Thank you very much for your feedback!
If I understood correctly:
library(stringr)
q1 <- df[str_count(df$customer_math, "0") == 1, ]  # exactly one "0" in the string
q2 <- df[grepl("00", df$customer_math), ]  # at least two zeros in a row; this also matches longer runs (e.g. "000"), not only an exact "00", but the pattern can be tightened if needed
q3 <- df[str_count(df$customer_math, "0") >= 1 & df$X8 == 1, ]  # at least one zero in the string and a 1 in the last position (X8)
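If you need a run of exactly two zeros rather than "two or more", one option is a regular expression with lookarounds. The sketch below is only an illustration in base R: it rebuilds the example data from the question, stores customer_math as character (an assumption, so that it cannot lose leading zeros or be converted to scientific notation), and assumes customer_math is simply X1 to X8 pasted together, which holds for the three rows shown. The names rebuilt, exactly_two, q2_exact, zero_count and q1_cols are made up for this example.

```r
# Example data from the question; customer_math is stored as character here
# (an assumption) so the 0/1 pattern is preserved exactly as written.
df <- data.frame(
  customer_id   = c(15251, 10101, 84787),
  customer_math = c("10001010", "11111111", "10101010"),
  X1 = c(1, 1, 1), X2 = c(0, 1, 0), X3 = c(0, 1, 1), X4 = c(0, 1, 0),
  X5 = c(1, 1, 1), X6 = c(0, 1, 0), X7 = c(1, 1, 1), X8 = c(0, 1, 0),
  stringsAsFactors = FALSE
)

# Assumption: customer_math is X1..X8 concatenated, so it can be rebuilt
# from the columns and compared with the stored string.
rebuilt <- do.call(paste0, df[paste0("X", 1:8)])
identical(rebuilt, df$customer_math)

# A run of exactly two zeros: "00" neither preceded nor followed by another 0.
# Lookbehind/lookahead require perl = TRUE in base R.
exactly_two <- "(?<!0)00(?!0)"
q2_exact <- df[grepl(exactly_two, df$customer_math, perl = TRUE), ]

# The same kind of question asked of the X columns instead of the string:
# count the zeros per row and keep rows with exactly one.
zero_count <- rowSums(df[paste0("X", 1:8)] == 0)
q1_cols <- df[zero_count == 1, ]

# Illustrative strings (not from the question): the first contains an
# isolated "00", the second only a run of three zeros.
grepl(exactly_two, c("11001011", "10001010"), perl = TRUE)
```

If the concatenation assumption holds for your full data, segmenting on the string and segmenting on X1-X8 give the same answers, so you can pick whichever is easier to express: regular expressions suit "in a row" questions, while column comparisons suit "which year" questions.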