A Comprehensive Guide

Machine Learning With Spark

A distributed Machine Learning framework

MA Raza, Ph.D.
Towards Data Science
13 min read · Sep 11, 2020

This is a comprehensive tutorial on using Spark's distributed machine learning framework to build a scalable ML data pipeline. I will cover the basic machine learning algorithms implemented in the Spark MLlib library, and throughout the tutorial I will use PySpark in a Python environment.

Image by Author using Canva.com
