A common tactic you hear in data science is:
“start by building a simple model, then build a more complicated one and see if it improves performance”
For example, start by building a logistic regression classifier, then see whether a Random Forest performs better.
But how much improvement can we expect? If our logistic regression gets 60% accuracy and our random forest gets 90% accuracy, is that normal, or has something gone wrong?
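For concreteness, here is a minimal sketch of the kind of comparison I mean. It assumes scikit-learn, and it uses a synthetic dataset from `make_classification` as a stand-in for real data. The dataset, the hyperparameters, and the choice of 5-fold cross-validation are all placeholders for illustration, not a recommendation.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for a real dataset (an assumption for illustration).
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=5, random_state=0
)

# The "simple" model and the "more complicated" model.
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

# Score both models with the same 5-fold cross-validation so the
# accuracies are directly comparable.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy")
    for name, model in models.items()
}

for name, fold_scores in scores.items():
    print(f"{name}: {fold_scores.mean():.3f} +/- {fold_scores.std():.3f}")

# Difference in mean accuracy between the complex and the simple model.
gain = scores["random_forest"].mean() - scores["logistic_regression"].mean()
print(f"accuracy gain from the more complex model: {gain:+.3f}")
```

Scoring both models on the same folds and looking at the spread across folds makes it easier to tell a real gap from noise.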
I’m interested in everyone’s experiences: when you’ve used this “simple model → complex model” tactic, how big a performance boost did you see? Did you see any at all?