Map-Reduce: Gradient Descent
Using PySpark and vanilla Python
Some statistical models f(x) are learned by optimizing a loss function L(Θ) that depends on a set of parameters Θ. There are several ways of finding the optimal Θ for the loss function, one of which is to iteratively update following the gradient:

∇L(Θ) = ∂L(Θ) / ∂Θ
We then compute the update:

Θ ← Θ - α ∇L(Θ)

where α is the learning rate.
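As a minimal illustration (not the article's code), here is a vanilla-Python sketch of this update rule, assuming a hypothetical one-dimensional loss L(Θ) = (Θ - 3)²:

```python
def gradient_descent(grad_fn, theta0, alpha=0.1, n_iters=100):
    """Repeatedly apply the update: theta <- theta - alpha * grad L(theta)."""
    theta = theta0
    for _ in range(n_iters):
        theta = theta - alpha * grad_fn(theta)
    return theta

# Hypothetical loss L(theta) = (theta - 3)^2, whose gradient is 2 * (theta - 3)
theta_opt = gradient_descent(lambda t: 2.0 * (t - 3.0), theta0=0.0)
print(theta_opt)  # converges towards 3.0
```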
Because we assume independence between data points, the gradient becomes a summation:

∇L(Θ) = Σᵢ ∇Lᵢ(Θ)
where Lᵢ is the loss function for the i-th data point.
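Since each per-point gradient ∇Lᵢ(Θ) depends only on its own data point, the sum can be computed with a map step (one gradient per point) and a reduce step (adding them up). A rough PySpark sketch of that pattern, assuming a squared-error loss on toy (x, y) pairs; the point_gradient helper and the generated data are illustrative placeholders, not the article's code:

```python
import numpy as np
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("mapreduce-gradient-descent").getOrCreate()
sc = spark.sparkContext

# Toy dataset: features [1, x] and targets y = 2x + 1 (illustrative only)
data = [(np.array([1.0, x]), 2.0 * x + 1.0) for x in np.random.uniform(0.0, 1.0, 100)]
points = sc.parallelize(data)

def point_gradient(theta, point):
    """Gradient of the per-point loss L_i(theta) = 0.5 * (theta . x - y)^2."""
    x, y = point
    return (theta.dot(x) - y) * x

theta = np.zeros(2)
alpha = 0.01  # learning rate

for _ in range(100):
    # map: compute each point's gradient; reduce: sum them into the full gradient
    grad = points.map(lambda p: point_gradient(theta, p)).reduce(lambda a, b: a + b)
    theta = theta - alpha * grad

print(theta)  # should approach [1.0, 2.0]
spark.stop()
```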