
What happens when we specify a column name in the split function in R?


What is the difference between these two lines of code in R?

split = sample.split(dataset$Customer_Segment, SplitRatio = 0.8)

split = sample.split(dataset, SplitRatio = 0.8)

Solution

  • If you mean caTools::sample.split, the function sizes its result by the length of the object passed as its first argument.

    Assume dataset has 100 rows and 10 columns.

    length(dataset$Customer_Segment) is 100 (equal to nrow(dataset)), so the function returns a logical vector of length 100 with about 80 TRUE and 20 FALSE values, one per row.

    length(dataset) is 10 (equal to ncol(dataset)), so the function returns a logical vector of length 10 with 8 TRUE and 2 FALSE values, one per column rather than one per row.
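The length difference described above can be checked with a small sketch. The data here is made up for illustration (nine random numeric columns plus a Customer_Segment factor), and the caTools calls only run if the package happens to be installed, so treat this as a rough demonstration rather than the asker's actual setup.

```r
# Hypothetical data: 100 rows, 9 numeric columns plus a label column (10 columns total).
set.seed(42)
dataset <- as.data.frame(matrix(rnorm(100 * 9), nrow = 100, ncol = 9))
dataset$Customer_Segment <- factor(sample(1:3, 100, replace = TRUE))

# length() of a column counts rows; length() of a data frame counts columns.
n_from_column <- length(dataset$Customer_Segment)  # same as nrow(dataset)
n_from_frame  <- length(dataset)                   # same as ncol(dataset)
print(c(column = n_from_column, frame = n_from_frame))

# Only run the caTools part if the package is installed.
if (requireNamespace("caTools", quietly = TRUE)) {
  split_col <- caTools::sample.split(dataset$Customer_Segment, SplitRatio = 0.8)
  split_df  <- caTools::sample.split(dataset, SplitRatio = 0.8)
  print(length(split_col))  # one flag per row
  print(length(split_df))   # one flag per column

  # The per-row flags can be used directly to partition the rows.
  training_set <- dataset[split_col, ]
  test_set     <- dataset[!split_col, ]
}
```

Only the first form gives one flag per row, which is what a train/test split needs. A length-10 logical vector used as a row index would be recycled across the 100 rows, so it would not produce the intended random 80/20 partition.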