julia, flux.jl

How do I split a custom dataset into training and test datasets in Flux.jl?


I have a custom dataset and I would like to split that dataset into a "training" and "test" set (also potentially a "validation" set if possible). How would I achieve this using Flux.jl or other Julia machine learning packages?


Solution

  • You can import the TrainTestSplit function from the Lathe package, as in:

    using Lathe.preprocess: TrainTestSplit
    

    and then use it in your code, for example:

    dataset_id = TrainTestSplit(datasetmap[:], 0.8); # datasetmap is your label-encoded matrix
    

    I'm assuming you're using a Pluto notebook, but it should work in any other environment as well, e.g. Jupyter, Atom, etc.
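
If you also want a validation set, or would rather not add a dependency, a manual split is possible with the standard library alone. The following is only a minimal sketch and not part of Flux.jl or Lathe: `split_dataset` is a hypothetical helper, the fractions 0.7/0.15 are arbitrary defaults, and it assumes observations are stored as the columns of `X` with one label per column in `y`.

```julia
using Random

# Hypothetical helper (not from Flux.jl or Lathe): shuffle the observation
# indices once, then cut them into train / validation / test ranges.
# Assumes X is features x observations and y has one label per observation.
function split_dataset(X::AbstractMatrix, y::AbstractVector;
                       train=0.7, val=0.15, rng=Random.default_rng())
    n = size(X, 2)
    @assert length(y) == n "X must have one column per label"
    @assert 0 < train && 0 <= val && train + val < 1

    idx = randperm(rng, n)            # random permutation of 1:n
    n_train = floor(Int, train * n)
    n_val = floor(Int, val * n)

    train_idx = idx[1:n_train]
    val_idx = idx[n_train+1:n_train+n_val]
    test_idx = idx[n_train+n_val+1:end]   # whatever remains goes to test

    return (train = (X[:, train_idx], y[train_idx]),
            val = (X[:, val_idx], y[val_idx]),
            test = (X[:, test_idx], y[test_idx]))
end

# Toy data for illustration only: 4 features, 100 observations.
X = rand(Float32, 4, 100)
y = rand(0:1, 100)

splits = split_dataset(X, y; rng=MersenneTwister(42))
X_train, y_train = splits.train
X_val, y_val = splits.val
X_test, y_test = splits.test
```

Shuffling the indices once and slicing keeps the three subsets disjoint, and passing a seeded `rng` makes the split reproducible. Set `val=0` if you only need a training and a test set.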