Model selection 101, using R
Quick and dirty markup of simple model selection using R
What are we doing?
Since this is a very introductory look at model selection, we assume the data you’ve acquired has already been cleaned and scrubbed and is ready to go. Data cleaning is a whole subject in and of itself, and it is often the biggest time sink for a data scientist. Go to the end of this article if you want to download the data for yourself and follow along!
Edit: I’ve made a “sequel” to this article about visualizing and plotting the model we find, if you want to check that out after reading this one!
Let’s look at the pipeline:
This is the skeleton I use for creating a simple linear model (LM) or generalized linear model (GLM):
- Create a base model using all available variables and data
- Factorize categorical variables if R didn’t do the job
- Add relevant…
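
To make the first two steps concrete, here is a minimal sketch in R. It is not the article's own data or code: it assumes the built-in `mtcars` dataset as a stand-in, and the choice of columns to factorize is only for illustration. The factors are converted before fitting so that the base model treats them as categorical from the start.

```r
# Illustrative stand-in: mtcars ships with base R and is NOT the article's dataset.
cars <- mtcars

# Check how R read each column; cyl and am come in as numeric here.
str(cars)

# Factorize categorical variables that R did not convert on its own.
cars$cyl <- factor(cars$cyl)
cars$am  <- factor(cars$am, labels = c("automatic", "manual"))

# Base LM: the "." on the right-hand side means "every other column".
base_lm <- lm(mpg ~ ., data = cars)
summary(base_lm)

# Base GLM: the same pattern with a binary response (logistic regression).
base_glm <- glm(am ~ hp + wt, data = cars, family = binomial)
summary(base_glm)
```

Calling `summary()` on either fit shows the coefficient table, which is the usual starting point for deciding which variables to keep or drop in later steps.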