Powerful Feature Selection with Sklearn's Recursive Feature Elimination (RFE)

Get the same model performance even after dropping 93 features

Bex T.
Towards Data Science

The basic feature selection methods mostly look at individual properties of features and how features interact with one another. Variance thresholding and pairwise correlation filtering, for example, remove features that carry too little variance or are too strongly correlated with another feature. A more pragmatic approach, however, selects features based on how they affect a particular model's performance. One such technique offered by Sklearn is Recursive Feature Elimination (RFE). It reduces model complexity by removing features one by one until only the desired number of features remains.
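
To make the contrast concrete, here is a minimal sketch of those two basic filters on a synthetic feature matrix; the cutoffs (0.01 and 0.9) are arbitrary choices for illustration, not values from this article:

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import VarianceThreshold

# Toy feature matrix (synthetic, for illustration only)
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "a": rng.normal(size=100),
    "b": np.full(100, 3.0),  # zero variance
    "c": rng.normal(size=100),
})
X["d"] = X["c"] * 2 + rng.normal(scale=0.01, size=100)  # almost a copy of "c"

# Variance thresholding: drop features whose variance is below the cutoff
X_vt = VarianceThreshold(threshold=0.01).fit_transform(X)  # drops "b"

# Pairwise correlation filtering: drop one feature from each pair whose
# absolute correlation exceeds the cutoff
corr = X.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape), k=1).astype(bool))
to_drop = [col for col in upper.columns if (upper[col] > 0.9).any()]
X_corr = X.drop(columns=to_drop)  # drops "d"
```

Neither filter ever asks whether a dropped feature actually mattered to the model, which is exactly the gap RFE fills.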

It is one of the most popular feature selection algorithms thanks to its flexibility and ease of use. The algorithm can wrap around any model that exposes feature importances or coefficients, and it uses those scores to recursively prune the weakest features until the strongest subset is left. By completing this tutorial, you will learn how to use its implementation in Sklearn.
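
As a preview, here is a minimal sketch of that pattern on synthetic data; the estimator, feature counts, and step size are illustrative choices, not the article's:

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression

# Synthetic regression data with only 5 truly informative features
X, y = make_regression(n_samples=200, n_features=20, n_informative=5, random_state=0)

# RFE fits the estimator, drops the weakest feature(s) by coefficient
# magnitude, and refits until n_features_to_select features remain
rfe = RFE(estimator=LinearRegression(), n_features_to_select=5, step=1)
rfe.fit(X, y)

print(rfe.support_)   # boolean mask of the kept features
print(rfe.ranking_)   # 1 = kept; larger ranks were eliminated earlier
```

If you don't know how many features to keep in advance, Sklearn's RFECV variant runs the same elimination under cross-validation and picks the number of features itself.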

The idea behind Recursive Feature Elimination

Consider a subset of the Ansur Male dataset.
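
Loading such a subset might look like the following hypothetical sketch; the file path and column names are assumptions, not taken from the article:

```python
import pandas as pd

# Hypothetical path and column names for the ANSUR II Male data;
# adjust both to match your local copy of the dataset
ansur_male = pd.read_csv("ansur_male.csv")
subset = ansur_male[["stature", "weightkg", "Weightlbs"]]
print(subset.head())
```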

