Kaggle: TrackML Particle Tracking Challenge


I'm new to ML and Kaggle. I was going through the solution of a Kaggle Challenge.
Challenge: https://www.kaggle.com/c/trackml-particle-identification
Solution: https://www.kaggle.com/outrunner/trackml-2-solution-example

While going through the code, I noticed that the author used only the train_1 file (not train_2, train_3, …).

I assume there is some strategy behind using only the train_1 file. Can someone please explain why? Also, what are the blacklist_training.zip, train_sample.zip, and detectors.zip files used for?


Solution

  • I'm one of the organisers of the challenge. The train_1, train_2, train_3, ... files are all equivalent. Outrunner probably found that using more data gave no improvement.

    train_sample.zip is a small dataset, equivalent to train_1, train_2, train_3, ..., provided for convenience.

    blacklist_training.zip is a list of particles to ignore because of a small bug in the simulator (not very important).

    detectors.zip is the list of the geometrical surfaces on which the x, y, z measurements are made.

    David
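
As a rough illustration of how a blacklist like the one described above might be applied, here is a minimal sketch that drops blacklisted particles from an event's truth table. The column names `hit_id` and `particle_id`, the helper names `read_ids` and `drop_blacklisted`, and the tiny inline CSV data are all assumptions made up for this example; with the real dataset you would read the corresponding files from disk and check their actual headers first.

```python
import csv
import io


def read_ids(csv_text, column):
    """Return the set of values found in `column` of a CSV given as text."""
    return {row[column] for row in csv.DictReader(io.StringIO(csv_text))}


def drop_blacklisted(truth_rows, bad_particle_ids):
    """Keep only the truth rows whose particle is not blacklisted."""
    return [row for row in truth_rows if row["particle_id"] not in bad_particle_ids]


# Made-up stand-ins for one event's truth file and its blacklist file.
# Column names are assumed, not taken from the real dataset.
truth_csv = "hit_id,particle_id\n1,101\n2,102\n3,101\n4,103\n"
blacklist_csv = "particle_id\n102\n"

truth_rows = list(csv.DictReader(io.StringIO(truth_csv)))
bad_ids = read_ids(blacklist_csv, "particle_id")
clean_rows = drop_blacklisted(truth_rows, bad_ids)

# Hits belonging to particle 102 are gone; the rest are kept in order.
print([row["hit_id"] for row in clean_rows])
```

Since the blacklist is described as not very important, skipping this step should only change results slightly; it is shown here mainly to clarify what the file is for.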