Big Data
Python has grown to dominate data science, and its package Pandas has become the go-to…
14 min read
From AI Agent to Human-In-The-Loop – Master 12 critical data concepts and turn them into…
14 min read
From Data Lakehouses to Event-Driven Architecture – Master 12 data concepts and turn them into…
14 min read
Why scan yesterday’s data when you can increment today’s?
7 min read
PySpark techniques and strategies to tackle common performance challenges: A practical walkthrough
10 min read
How we use a shared Spark server to make our Spark infrastructure more efficient
19 min read
Learn a few basic commands to start transitioning from Pandas to PySpark
9 min read
How to use causal inference to improve key business metrics. Egor Kraev and Alexander Polyakov…
8 min read