Author: Sarthak Sarbahi
-
Parquet vs ORC vs Avro vs Delta Lake
13 min read -
Streamline Data Pipelines: How to Use WhyLogs with PySpark for Data Profiling and Validation
Data Engineering: Learn to use whylogs with PySpark for data profiling and validation
10 min read -
Seamless Data Analytics Workflow: From Dockerized JupyterLab and MinIO to Insights with Spark SQL
Data Engineering: An engineered guide for data analytics with SQL
20 min read