Skewed Data in Spark? Add SALT to Compensate

A step-by-step guide to handling skewed data with the SALT technique

Chengzhi Zhao
Towards Data Science
4 min read · Dec 9, 2021

Image: @tangerinenewt on Unsplash

If you have been working with Apache Spark for a while, you have probably seen the following error: java.lang.OutOfMemoryError: Java heap space

The out-of-memory (OOM) error is one of the most common errors that prevent Spark jobs from completing successfully…
