Introduction to Apache Spark
Jul 11, 2019
MapReduce and Spark are both used for large-scale data processing. However, MapReduce has some shortcomings that make Spark the more useful choice in a number of scenarios.
Shortcomings of MapReduce
- Every workflow has to go through a map phase and a reduce phase: a single job can't naturally accommodate a join, a filter, or a more complicated workflow like map-reduce-map, so such pipelines have to be split into several chained jobs.