Create Data Science pipelines with Luigi & PySpark and CI/CD
7 min read · Nov 12, 2019
This article walks through how to create a robust data pipeline using the following Python packages:
- Luigi, a package for building pipelines
- PySpark, a package to use Spark through a Python API
- Pandas, a package to manipulate data
- Unittest, a package to implement unit tests.