Introduction to Apache Spark RDDs using Python

Apache Spark is a must-have for Big Data enthusiasts. In short, Spark is a fast and powerful framework that provides an API for massive distributed processing over resilient, distributed datasets.

Prerequisites:

Before we begin, please set up Python and the Apache Spark environment on your machine. Head over to this blog to install them if you have…


Jaafar Benabderrazak (Human/Not A Robot)

Committed lifelong learner. I am passionate about machine learning and MLOps, and I currently work as a Machine Learning Engineer.