All posts tagged #machine-learning

Efficient overfitting of training data (Kaggle Bowl 2017)

During the Kaggle Data Science Bowl 2017, the leaderboard was based on only 198 samples. The opportunity for overfitting was quickly recognized, but initially only the naive approach was discussed: testing one submission per sample, which would take 66 days (still doable within the competition duration, but far from ideal). But then Oleg Trott got a perfect score in just 14 submissions! I was really curious how he managed to do this. Together with Cas, I found out one way it... full post»
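The teaser hinges on the fact that a single log-loss score can leak information about many labels at once. As a hedged sketch of the general idea (my own illustration, not necessarily Trott's exact scheme), one can give each probed sample a probability whose loss gap between label 0 and label 1 is a distinct power of two; the resulting total score then decodes uniquely back to the labels:

```python
import itertools
import math


def probe_probs(k, eps=1e-3):
    # Choose p_i so that the log-loss gap between y=1 and y=0 for sample i
    # is eps * 2**i: solving log((1-p)/p) = eps * 2**i gives the logistic form.
    # Distinct powers of two make every subset sum (i.e. label combination) unique.
    return [1.0 / (1.0 + math.exp(eps * 2 ** i)) for i in range(k)]


def total_logloss(labels, probs, n_total):
    # Log-loss contribution of the k probed samples, averaged over n_total
    # leaderboard samples (the remaining samples are assumed to contribute a
    # known constant, omitted here for simplicity).
    loss = 0.0
    for y, p in zip(labels, probs):
        loss += -math.log(p) if y == 1 else -math.log(1.0 - p)
    return loss / n_total


def decode(score, probs, n_total):
    # Brute-force the label combination that reproduces the observed score.
    # Uniqueness follows from the distinct power-of-two loss gaps above.
    for labels in itertools.product([0, 1], repeat=len(probs)):
        if math.isclose(total_logloss(labels, probs, n_total), score, rel_tol=1e-9):
            return list(labels)
    return None


probs = probe_probs(4)
score = total_logloss([1, 0, 1, 1], probs, n_total=198)
decode(score, probs, n_total=198)  # → [1, 0, 1, 1]
```

With roughly 15 samples probed per submission this way, 198 labels fit into about 14 submissions, which matches the count in the teaser; the real competition scheme had to work within Kaggle's score precision, which this sketch ignores.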

Linear model vs decision tree (in R)

I’ve used the R statistical language a bit, a long time ago. It was my first real encounter with data science, but later encounters used Matlab and Python. Lately I’ve been picking up R again, as it’s popular in the data science community. As practice / demo, I thought I’d do a simple exploration of the strengths and weaknesses of linear models versus decision trees. This was inspired by Claudia Perlich at KDnuggets. Let’s... full post»
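As a flavour of the comparison the post promises (the post itself uses R; this is my own dependency-free Python toy, not code from the post): an ordinary least-squares line extrapolates a linear trend, while a depth-1 regression tree (a "stump") can only predict leaf means, so it flattens out beyond the training range:

```python
def fit_line(xs, ys):
    # Ordinary least squares for y = a + b*x (closed form).
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return lambda x: a + b * x


def fit_stump(xs, ys):
    # Depth-1 regression tree: pick the single split minimizing squared error,
    # then predict the mean of each side.
    best = None
    pairs = sorted(zip(xs, ys))
    for i in range(1, len(pairs)):
        t = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for x, y in pairs if x <= t]
        right = [y for x, y in pairs if x > t]
        if not right:
            continue
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - ml) ** 2 for x, y in pairs if x <= t) \
            + sum((y - mr) ** 2 for x, y in pairs if x > t)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    _, t, ml, mr = best
    return lambda x: ml if x <= t else mr


xs, ys = [1, 2, 3, 4, 5], [2, 4, 6, 8, 10]  # exactly linear: y = 2x
line = fit_line(xs, ys)
stump = fit_stump(xs, ys)
line(10)   # → 20.0, the line extrapolates the trend
stump(10)  # → 8.0, the stump is stuck at its right-leaf mean
```

The flip side, which the post presumably explores, is that a deeper tree handles interactions and non-linearities that a plain linear model misses entirely.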


