A lesson that we software engineers learn early in our careers is that “premature optimization is the root of all evil.” This gem of advice from the inimitable Donald Knuth addresses the over-eagerness of rookie — and sometimes not-so-rookie — software engineers to optimize the efficiency of their code. Instead, the first step should be to ship code that works, and then figure out which critical components, if any, to optimize for performance.
These days, we’re increasingly likely to encounter opportunities for premature optimization in the context of machine learning models, where performance refers to accuracy rather than efficiency. As it turns out, Knuth’s advice is even more important for machine learning developers than it was for its originally intended audience.
Agile development: it’s not just for software engineering.
A cornerstone of modern software engineering is agile development, which tells us that we should first build a minimum viable product (MVP) and then iterate on it incrementally. In this context, it’s clear why you should avoid premature performance optimization: it violates the definition of an MVP. You should first build something that works and then incrementally improve it.
Good machine learning development follows a similar principle: every machine learning project should start with an MVP. In fact, as Monica Rogati explains in her “AI Hierarchy of Needs,” you shouldn’t even start with a machine learning approach, but rather with a simple heuristic as a baseline. Only then should you move on to the simplest machine learning models, like logistic regression.
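To make that progression concrete, here is a minimal sketch in Python of what "heuristic first, simplest model second" might look like. Everything in it is an assumption for illustration: the toy data, the hand-picked threshold of 1.0, and the function names are hypothetical, and the logistic regression is a bare-bones gradient descent loop rather than a library implementation.

```python
import math

# Toy, hand-made data (an assumption for illustration): one numeric feature
# and a binary label per example.
train_x = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
train_y = [0, 0, 0, 1, 1, 1]
test_x = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]
test_y = [0, 0, 0, 1, 1, 1]


def heuristic_predict(x, threshold=1.0):
    """Step 0: a hand-picked rule with no learning involved (hypothetical threshold)."""
    return 1 if x > threshold else 0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_regression(xs, ys, learning_rate=0.1, steps=500):
    """Step 1: one-feature logistic regression fit by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction error for each example under the current parameters.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
        b -= learning_rate * sum(errors) / n
    return w, b


def accuracy(predict, xs, ys):
    """Fraction of examples where the prediction matches the label."""
    return sum(predict(x) == y for x, y in zip(xs, ys)) / len(xs)


w, b = fit_logistic_regression(train_x, train_y)


def model_predict(x):
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


# Score both approaches on the same held-out examples.
baseline_acc = accuracy(heuristic_predict, test_x, test_y)
model_acc = accuracy(model_predict, test_x, test_y)
print(f"heuristic baseline: {baseline_acc:.2f}, logistic regression: {model_acc:.2f}")
```

The point of the sketch is the workflow rather than the numbers: the heuristic gives you a score to beat, and the simplest model only earns its place if it beats that score on the same held-out data.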
Performance isn’t the only objective for machine learning models.
You may be thinking that there’s a problem with my analogy: efficiency in software engineering is a secondary objective (after correctness), while model performance in machine learning is the primary objective. It’s certainly true that a wildly inaccurate machine learning model isn’t particularly useful.
But once a model achieves minimally acceptable accuracy, it’s important to consider other factors. Indeed, marginal improvements in model performance are often secondary to those other factors.
Specifically, a machine learning system should address the “three ex’s”:
- Expressing the utility and input distribution
- Explaining the results
- Experimenting to improve the model
If the objective function fails to adequately model utility, you’re optimizing for the wrong thing. If the input distribution is tainted by systematic bias, garbage in ensures garbage out. In either case, there’s no point in optimizing the performance of the resulting model. Meanwhile, explainability is critical in order to debug models and identify opportunities to improve them. Finally, the cost of experimentation determines how quickly you can improve models.
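Here is a small Python sketch of the first point, that accuracy and utility can disagree. The utility values, labels, and predictions are all made up for illustration; the only assumption that matters is that a missed positive costs far more than a false alarm.

```python
# Hypothetical utility of each (true label, predicted label) outcome.
# These values are assumed for illustration, not taken from any real system.
UTILITY = {
    (1, 1): 0.0,    # true positive
    (0, 0): 0.0,    # true negative
    (0, 1): -1.0,   # false positive: a cheap false alarm
    (1, 0): -10.0,  # false negative: an expensive miss
}


def accuracy(labels, preds):
    """Fraction of predictions that match the labels."""
    return sum(l == p for l, p in zip(labels, preds)) / len(labels)


def total_utility(labels, preds):
    """Sum of the utility of every (label, prediction) outcome."""
    return sum(UTILITY[(l, p)] for l, p in zip(labels, preds))


labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
model_a = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # never flags anything: misses both positives
model_b = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]  # flags aggressively: three false alarms

for name, preds in [("A", model_a), ("B", model_b)]:
    print(name, accuracy(labels, preds), total_utility(labels, preds))
```

Under these assumed costs, model A wins on accuracy (8 of 10 correct versus 7 of 10) but loses badly on utility (-20 versus -3). Squeezing more accuracy out of model A would be optimizing for the wrong thing.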
Model performance still matters. But it’s not the only thing that matters.
Donald Knuth was ahead of his time.
Here is Knuth’s full quote:
“The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming.”
He wrote those words in 1974, nearly three decades before the principles of agile development would be enshrined in the Agile Manifesto, and even longer before machine learning emerged as a core part of mainstream software development. But his advice is timeless, and it’s even more applicable to machine learning than to traditional software development. Build something that works and then improve it.
Premature optimization is (still) the root of all evil.