Is a Small Dataset Risky?

Some reflections and tests on the use of a small dataset for a Data Science project.

Angelica Lo Duca
Towards Data Science
6 min read · Feb 19, 2022

Photo by Martin Sattler on Unsplash

I recently wrote an article about the risks of using the train_test_split() function provided by the scikit-learn Python package. That article drew many comments, some positive and others raising concerns…


Researcher | +50k monthly views | I write on Data Science, Python, Tutorials, and, occasionally, Web Applications | Book Author of Comet for Data Science