The Challenge
Build ML models that predict football match outcomes better than random chance. The baseline for predicting draws is ~26%. Can we beat it significantly?
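To make the baseline concrete, here is a minimal sketch of how a draw base rate could be measured from historical results. The outcome labels ("H", "D", "A") and the toy sample are hypothetical stand-ins, not the data behind the ~26% figure.

```python
from collections import Counter


def outcome_base_rates(outcomes):
    """Return the share of each outcome label in a list of match results."""
    counts = Counter(outcomes)
    total = len(outcomes)
    return {label: counts[label] / total for label in counts}


# Hypothetical toy sample: H = home win, D = draw, A = away win.
sample = ["H", "D", "A", "H", "H", "D", "A", "H"]
rates = outcome_base_rates(sample)

# The draw share is the number any draw model has to beat.
draw_baseline = rates["D"]
print(f"Draw baseline: {draw_baseline:.1%}")
```

On a real dataset the same calculation would be run over the full match history, and the resulting draw share would play the role of the ~26% baseline.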
Feature Engineering > Model Complexity
We tried deep learning first. Neural networks with match embeddings, attention layers, the works. Accuracy: 38% on draws.
Then we switched to feature engineering. We created 62-65 features per match:
With Scikit-learn’s ensemble methods on these engineered features: 44.7% accuracy on draws. That is about 19 percentage points above the ~26% baseline, or roughly 1.7 times the baseline rate.
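The general shape of this approach can be sketched as follows. This is an illustration under stated assumptions, not the pipeline we actually ran: the post does not say which ensemble was used, so `RandomForestClassifier` is a stand-in, the data is synthetic (generated by `make_classification` with 62 columns and a draw share near 26%), and "accuracy on draws" is interpreted here as the share of predicted draws that really were draws.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for engineered match features (NOT real match data).
# Class 1 plays the role of "draw" with a share of roughly 26%.
X, y = make_classification(
    n_samples=2000,
    n_features=62,
    n_informative=10,
    n_classes=3,
    weights=[0.46, 0.26, 0.28],
    random_state=0,
)
DRAW = 1  # hypothetical label encoding: 0 = home win, 1 = draw, 2 = away win

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Assumption: a random forest as the ensemble; the original model is unspecified.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
pred = model.predict(X_test)

# Of the matches predicted as draws, how many actually were draws?
predicted_draw = pred == DRAW
if predicted_draw.any():
    draw_precision = float(np.mean(y_test[predicted_draw] == DRAW))
else:
    draw_precision = 0.0

# Compare against the draw base rate in the same test set, in percentage points.
draw_base_rate = float(np.mean(y_test == DRAW))
edge_points = (draw_precision - draw_base_rate) * 100

print(f"Draw precision: {draw_precision:.1%}")
print(f"Draw base rate: {draw_base_rate:.1%}")
print(f"Edge: {edge_points:+.1f} percentage points")
```

The numbers this prints describe the synthetic data only and say nothing about football. The point is the comparison at the end: the model's hit rate on draw predictions is measured against the draw base rate on the same held-out matches.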
Three Models, Three Markets
Why Scikit-learn Won
Key Takeaway
In ML, the quality of your features matters more than the complexity of your model. Spend 80% of your time on data and features, 20% on model selection.