Reasonable linear models

Today I read in yet another (recent) machine learning paper:

It is reasonable to suppose that [super complex and interesting phenomenon] can be approximated by a linear model based on [whatever features we could get our hands on].

Wiktionary offers three meanings for the word reasonable:

  • Just; fair; agreeable to reason.
  • Not expensive; fairly priced.
  • Satisfactory.

In this case, the second one is our only safe bet. Otherwise, what we can say about a linear model is that it is:

  • Simple, and often the most convenient to work with.
  • Readable: if the regressands have semantics, then so have the linear coefficients.
  • Sometimes, able to capture the dynamics of your problem.

But do these make it reasonable per se? If you have measures that validate your choice of this model, yes. Otherwise, keep in mind that not everything is approximable by a linear function, even in the real world. Convenience is a big driver behind the tools and scientific ideas we try, and it is unfortunately easy to believe we are guided by reason while rolling down a purely contingent path.

If you like your readings mind-teasing and refreshing, on this topic you can take a look at Poincaré's Science and Hypothesis (1902). You are welcome :-)

© Stéphane Caron — All content on this website is licensed under the CC BY 4.0 license.
π