How to Connect a Local or Remote Machine to a Databricks Cluster
The intersection of Databricks, Python, and Docker
When you start working with Databricks, you will eventually want to write code outside of it while still tapping into its computation power, i.e., a Databricks cluster. Why? Mainly because one of Databricks' core features is its Spark job management, which can make your life easier. Thanks to its Spark engines, you can submit a series of Spark jobs against a large-scale dataset and get your results back quickly. In this article, I describe how to configure your local or remote machine to connect to a Databricks cluster as the first step.
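As a preview of where this setup leads, the connection is typically established with the Databricks Connect client. A rough sketch of the flow is below; the runtime version shown is only an example, and the `configure` step will prompt you for your own workspace URL, access token, and cluster ID:

```shell
# Install the Databricks Connect client; its version should match
# the Databricks Runtime of the target cluster (9.1 is an example).
pip install -U "databricks-connect==9.1.*"

# Interactive configuration: prompts for workspace URL, personal
# access token, cluster ID, org ID, and port.
databricks-connect configure

# Verify that the local client can reach the remote cluster.
databricks-connect test
```

The rest of this article walks through the prerequisites behind these steps in detail.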
— What is Databricks?
Databricks is an abstraction layer that sits on top of cloud infrastructure such as AWS and Azure and lets you easily manage computation power, data storage, job scheduling, and model management. It provides a development environment to obtain preliminary…