All posts tagged #cross-validation

Efficient overfitting of training data (Kaggle Bowl 2017)

During the Kaggle Data Science Bowl 2017, the public leaderboard was based on only 198 samples. The opportunity for overfitting was quickly recognized, but initially only the naive option was discussed: testing one sample per submission, which at three submissions per day would take 66 days (still doable within the competition duration, but far from ideal). But then Oleg Trott got a perfect score in just 14 submissions! I was really curious how he managed to do this. Together with Cas, I found out one way it... full post»
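To make the naive option concrete, here is a minimal sketch of one-sample-per-submission leaderboard probing, assuming the leaderboard metric is mean binary log loss (as in that competition) and using randomly generated labels as a stand-in for the 198 hidden ones. The function names and the probe value 0.9 are purely illustrative and have nothing to do with Oleg Trott's 14-submission trick, which the full post covers.

```python
import numpy as np

def leaderboard_log_loss(y_true, p):
    """Mean binary log loss, the score reported back for each submission."""
    p = np.clip(p, 1e-15, 1 - 1e-15)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

# Hypothetical stand-in for the 198 hidden leaderboard labels.
rng = np.random.default_rng(0)
hidden = rng.integers(0, 2, size=198)

# With p = 0.5 everywhere the loss is log(2) regardless of the labels,
# so the reference score is known without spending a submission.
reference = np.log(2)

def probe_one_sample(i, q=0.9):
    """One submission that reveals the label of sample i:
    nudge only p[i] away from 0.5 and see which way the score moves."""
    p = np.full(198, 0.5)
    p[i] = q
    score = leaderboard_log_loss(hidden, p)   # what the leaderboard would report
    return 1 if score < reference else 0      # the score drops iff label i is 1

recovered = [probe_one_sample(i) for i in range(198)]  # 198 submissions
assert recovered == list(hidden)
```

Each probe spends one submission per sample, which is why this route needs 198 submissions, or 66 days at three per day; any 14-submission solution must instead recover on the order of 14 labels per submission.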