
Automate your Data Science projects in the cloud

or: How I Learned to Stop Worrying and Love the Cloud Part 1

  • Part 1 of 2

Serverless code automation in the cloud with Azure Functions

Having a working notebook with an interesting case and, ideally, some meaningful results is great for yourself. But when it comes to presenting your work to others, be it colleagues, potential future employers or just the internet, a fully automated, running showcase is simply a lot more "sexy" than sending someone a notebook via mail.

Apart from the extrinsic benefits (easier feedback, interactivity with results, leaving an impression), there is a surprising amount of intrinsic satisfaction in elevating your proof-of-concept-level project to a fully operational product.

This article is Part 1 of the automation series and tackles the concept of Azure Functions.

Cloud can be a beautiful thing sometimes, picture from pexels

Azure Functions

Azure Functions is a low-cost service from Microsoft Azure that lets you run your code serverless in the cloud, for less than a cent per execution. You can think of these functions as a framework for getting your code running in the cloud. Execution can be triggered by entering an HTTP address into your browser or by a timer. More detailed information on Azure Functions can be found here.

To make the concept more tangible, consider the following use case. You own a website and someone registers there. This sends an HTTP request to your Azure Function containing the person's alias, email and other interesting parameters. The request starts your function, which does some fancy classification with those parameters, queries some additional information, generates a personalised welcome mail and sends it to the person. All of this happens outside your website.
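As a rough sketch of what such a function could look like in Python: the query parameters arrive in req.params, and the classification and mail-sending helpers below are purely hypothetical placeholders for your own code.

```python
import logging
import azure.functions as func


def main(req: func.HttpRequest) -> func.HttpResponse:
    # Read the parameters the website sent along with the request
    alias = req.params.get("alias")
    email = req.params.get("email")
    if not alias or not email:
        return func.HttpResponse("Missing 'alias' or 'email' parameter.", status_code=400)

    logging.info("New registration: %s (%s)", alias, email)

    # Hypothetical helpers standing in for your own logic:
    # segment = classify_user(alias, email)
    # send_welcome_mail(email, segment)

    return func.HttpResponse(f"Welcome mail queued for {alias}.", status_code=200)
```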

The low price comes at a cost: on the cheapest plan the runtime is limited to a maximum of 10 minutes and defaults to about 5 minutes. To get a higher maximum runtime of 60 minutes, the Premium plan must be purchased at extra cost. Keep that in mind when designing your pipeline.

This of course makes Azure Functions hardly suitable for Big Data or enterprise-grade cases, but Azure offers its own suite of tools for those: Databricks, Synapse and Azure Data Factory, to name only a few.

Durable Functions are an extension of the Functions framework. A single function can be published and executed on its own, but to create a pipeline in which one function starts after another, which in turn starts a third, Durable Functions were invented to orchestrate the process.
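Part 2 will cover this in detail, but as a small preview, an orchestrator written with the Python Durable Functions SDK could look roughly like this; the activity names are placeholders.

```python
import azure.durable_functions as df


def orchestrator_function(context: df.DurableOrchestrationContext):
    # Each yield waits for the previous activity function to finish
    # before the next one is started.
    raw = yield context.call_activity("LoadData", None)
    clean = yield context.call_activity("TransformData", raw)
    result = yield context.call_activity("PublishResult", clean)
    return result


main = df.Orchestrator.create(orchestrator_function)
```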

Set up your environment

Since we are going to work with the Azure cloud, I highly recommend using Microsoft's Visual Studio Code. With native Azure integration and many templates to start from, this editor is the superior choice for the task. Of course, you can develop your model into a neat Python package with your editor of choice and only use VS Code for the cloud orchestration later on. You can get the editor for free here.

Extension manager of VS Code, Image by author

After installation you have a somewhat naked editor; on the left sits a blocky icon that lets you install extensions. Use it to install the Azure Functions extension: this helper will greatly reduce the amount of manual work we have to do. In this section of the editor you can find all sorts of helpful extensions. Go wild and personalise; after all, it is you who has to work with it. Additionally, make sure to install the Azure Core Tools from here.

Getting Started with Azure Functions

Create a new project

Create a project in your favoured location by using the Command Palette ⇧⌘P (Windows: Ctrl+Shift+P): enter "Azure Functions: Create New Project" and hit Enter. The Command Palette is your main hub for all of the editor's functionality. From settings and code templates to version control with GitHub, everything can be done from here.

You can perform almost every action from the Command Palette, Image by author

Choose a path (more on that later), choose Python as the language, select a Python interpreter, choose HTTP Trigger as the template for now, give your function a name (I named it MyTrigger) and choose Anonymous as the authorization level. The folder you choose for the new project must contain all the code that is supposed to run in the cloud; I was not able to find a way of importing code from a folder level above the Azure Functions project's base path.

File Structure

Creating the project and a function from these templates also created a slew of new files. This can seem overwhelming at first, but the following list explains the important files and what you can do with them.

File Structure after creating a project and Azure Function, Image by author
  • host.json This file contains some meta information about your project, most notably the version range of the extension bundle, i.e. the bundle of binding extensions (including the Durable Functions bindings) that your project pulls in. You can use Azure Functions with the default version [1.*, 2.0.0), but since we want to automate later on using Durable Functions, set the version to [2.*, 3.0.0) and restart the editor.

  • requirements.txt This should sound familiar. Whether you work with Conda, pip or whatever other solution suits you, at some point you created a Python environment containing all the packages you use. Azure Functions does NOT use that environment. Instead, during deployment to the cloud, the environment is created from the given requirements.txt. Luckily there are ways to generate this file automatically; pip freeze > requirements.txt would be one such solution. As your code grows, remember to add new packages to this file. Also make sure that azure-functions and azure-functions-durable>=1.0.0b12 are listed.

  • .venv This folder contains the Python environment which is used when debugging Azure Functions.

  • MyTrigger/function.json This file contains meta information about your function: what type it is, how it is triggered, etc. It can also be used to set global parameters which can be read from within your code.

  • MyTrigger/__init__.py This file contains the actual code in its main function. This is where you can cut out the middle part, import your package and execute it in the main area. But more on that later. A preconfigured func.HttpResponse is returned by the main function; this is the message that will be displayed after your function finishes.

Debugging

After the setup is done, we can now start executing our newly created function MyTrigger. First, open the Command Palette (⇧⌘P) and run "View: Toggle Integrated Terminal". A shell should open up in the bottom part of the editor. Execute the following statements to update the Python environment in the .venv folder:

source .venv/bin/activate (Windows: .venv\Scripts\activate)
python -m pip install -r requirements.txt

Now press the F5 key or open Run and select Start Debugging to start up the host. Notice that the blue bar at the bottom of the editor turns orange once the host is set up and your function is ready to be executed.

This is what it looks like when your function is ready to be started, Image by author

Your function's name is written in yellow and, in green, a link to trigger it. Command + click (Windows: Ctrl + click) on the link and a browser window opens, which executes your function. Note that after successful execution of the code, the returned func.HttpResponse (from the __init__.py file) is displayed. Error messages and logging are displayed in blue at the bottom of your editor.

Sometimes you have to ignore some of the error highlighting the editor does, since it does not always fully understand the architecture of Azure Functions and can, for example, become quite unhappy with how imports are handled.

On a side note: I highly recommend using logging.info for logging, like in the template. These messages are displayed during execution in the debugging process and can also be read easily, and for free, in the cloud once the process is deployed. If your code contains some custom logging, it would be a good idea to switch it to this logging style, or to add a parameter that makes it switch accordingly.
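If your existing code prints to the console or uses its own logger, switching over is usually just a matter of routing messages through the standard logging module; a minimal illustration with made-up values:

```python
import logging

# Instead of print("loaded 1000 rows"), write:
logging.info("Loaded %d rows from the source table", 1000)

# Warnings and errors show up highlighted in the local debug output
# and later in the Function App's Monitor / Application Insights logs.
logging.warning("Column 'price' contained missing values")

try:
    raise ConnectionError("database unreachable")
except ConnectionError:
    # logging.exception also records the full traceback
    logging.exception("Could not connect to the database")
```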

Using Azure Functions with your code

After some front-loaded work we come to the juicy part: adding your custom code, your cool model, your value-generating scripts into the mix.

Example import of existing code, Image by author

Make sure that your custom code is in the project folder created above and test it to make sure it runs smoothly. Now all that is left is to import your code and call its functions. Things get easy if you have already combined all your code into a main function, since then you just call it within __init__.py and you are done.
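As a sketch, assuming your existing code lives in a hypothetical package my_pipeline inside the project folder and exposes a run() function, the body of MyTrigger/__init__.py could shrink to something like this:

```python
import logging
import azure.functions as func

# Hypothetical package inside the project folder, e.g. <project>/my_pipeline/
from my_pipeline import run


def main(req: func.HttpRequest) -> func.HttpResponse:
    logging.info("Trigger received, starting the pipeline.")
    run()
    return func.HttpResponse("Pipeline finished successfully.", status_code=200)
```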

It’s that simple.

Use the F5 key to debug your code and make it run within the Azure Functions framework. If you would like an example as reference, you can check this file from my repo, [where](https://ai-for-everyone.org/?page_id=32) I automate the task of loading data from an Azure SQL DB to Google Sheets documents for a Tableau dashboard. If you want more details, you can visit my article on this topic here and check out the results on my homepage here.

Deployment and serverless execution

Now that your Azure Function is fully developed, one last step is left to make the magic happen: deployment.

In order to deploy to Azure you must have created an Azure account (obviously). Hit the Azure icon on the left side and connect to your account. Once the connection is established, hit the small blue arrow pointing upwards. This starts the deployment process of your Azure Functions.

Azure window within Visual Studio Code, circled is the button for deployment, Image by author

The first thing it asks is whether you want to create a new "Function App" or deploy to an already existing one. Creating and owning such an app is, to the best of my knowledge, completely free, so go ahead and create a new one. It basically works as a management tool for your various functions: you can view logs, check performance and, in some languages, edit the function code directly in the Azure Portal from within the Function App (Python is not one of those languages, though).

After the deployment finishes, open the Azure Portal and check your Function App. Clicking on the Functions menu gives you access to all your deployed functions, and clicking on one shows some nice meta information.

Overview over an exemplary Azure Function, Image by author

On the function's Overview page the actual URL can be acquired via the "Get Function URL" button. Copy the URL and paste it into your browser, and your function gets executed. After some delay you can enter the Monitor tab and check how the execution went: you can see when your function was executed, whether it executed successfully, and check the logs of each execution (just click on the blue timestamp). For a more detailed analysis of the logs, click on the "Run query in Application Insights" button. But I won't go into more detail on that in this article.

Now you have all the power in the world to execute your code in the cloud whenever and from wherever you want. Oh, and did I mention that you can add parameters to the URL and read them out in your code? Fully parameterised function calls from within any process: paste the URL and call it from within your favourite pipeline. Look at all the possibilities this offers.
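For example, calling a deployed function from another Python process is a one-liner with the requests library; the URL and parameter names below are hypothetical stand-ins for your own.

```python
import requests

# Hypothetical URL copied from the "Get Function URL" button in the portal
FUNCTION_URL = "https://my-function-app.azurewebsites.net/api/MyTrigger"

# Query parameters end up in req.params inside your function
response = requests.get(FUNCTION_URL, params={"alias": "ada", "email": "ada@example.com"})
print(response.status_code, response.text)
```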

Get creative and run wild!

Summary

Letting go of control and starting to rely on some nebulous framework feels kind of unsettling. Developers especially are used to having ultimate control over their code and might feel more than one ounce of suspicion. But after the front-loaded setup, Azure Functions turned out to be quite easy to handle and to open up a whole new set of possibilities.

With experience you will begin to understand the inner workings of the Azure Functions framework, making it less nebulous and instead a powerful new cutting-edge tool in your ever-evolving Data Science toolbox.

Elevate your data science work from Proof-of-Concept-in-a-notebook status to a fully working micro-product. Get ahead of your competition. With a little bit of trust in the cloud.

Sometimes it really is just that easy.

Next Article: Building a pipeline

Stay tuned for Part 2 of my automation series, where I explain how we can use Durable Functions to chain Azure Functions together and orchestrate whole execution pipelines from a simple trigger.

Shameless self-promotion

Until then read my other articles, visit my portfolio website and follow me on Twitter under @88Andreasd.

