Which Data Format to Use For Your Big Data Project?

Pickle, Parquet, CSV, Feather, HDF5, ORC, JSON: which one should you be using and why?

Armand Sauzay
Towards Data Science
6 min read · Oct 26, 2023

Image by Maarten van den Heuvel — Unsplash

Choosing the right data format is crucial in data science projects: it affects everything from read/write speeds to memory consumption and interoperability. This article explores seven popular serialization/deserialization formats…
