
Serverless Alternative: Executing Python Functions using AWS, Terraform, and Github Actions

Automate the deployment and execution of a Python function without worrying about package size, execution time, or portability.

Photo by Alex Knight on Unsplash

What’s better than Serverless? Serverless is all the buzz these days, and for good reason. It is a simple yet powerful cloud resource for executing function calls without worrying about the underlying infrastructure. But every superhero has their kryptonite, and recently I’ve run into a few issues with AWS Lambda Serverless Functions:

  1. Package Size Limitation: My Python dependencies are larger than the 50 MB compressed (and 250 MB uncompressed) size limits.
  2. Execution Time Limitation: My Python function takes longer than the 15-minute limit.
  3. Lack of Portability: AWS Lambda functions aren’t easily portable to other cloud vendors.

The obvious alternative is provisioning an EC2 instance to install the dependencies and execute the function, but I don’t want the server to be on all the time. Time is money, and EC2 instances running 24/7 cost money. I also don’t want to manage the deployment by manually turning the instance on and off and executing the function. Finally, I want function portability in case I want to deploy this function in a different cloud.

Ultimately, I want to automate the process of provisioning an EC2 instance, executing the Python function, then destroying the EC2 instance and all underlying infrastructure. (If you simply turn off the EC2 instance, you will continue to pay for the volume.) Enter Terraform and Github Workflow.

Terraform and Github Workflow are tools any modern DevOps or Cloud engineer needs to build and deploy applications. Terraform quickly provisions cloud infrastructure to execute the function, and Terraform scripts are easily portable to other cloud vendors with changes to the services used. Github Workflow manages the deployment. We are also using a Github repository to hold all the Terraform and Python code used by Github Workflow.

Here is a video of me running the Github Actions, showing how the function is executed and how Terraform makes changes in the AWS console:

Outline:

  1. AWS Setup
  2. Terraform Script
  3. Github Secrets
  4. Github Workflow YAML Setup
  5. Executing Python Function
  6. Conclusion

AWS Setup

The first step is to set up AWS so we have the right user permissions and key pairs to use for the Terraform scripting later. I won’t delve too deeply into user permissions here. For this tutorial, I simply created a new user in IAM and gave that user administrative access (I don’t recommend this; you should always give a user the least amount of access required to accomplish their tasks). Copy the access key and secret key somewhere; they will be used later in this tutorial.

Next, you want to create a PEM key to use in the Terraform scripting and for Github Workflow to access AWS. From the AWS services homepage, select "EC2". On the left side of the console, select "Key Pairs". On the top right of the screen, there is a button labeled "Create Key Pair". Enter the name of the key and select "PEM" as the file format. Finally, hit the "Create Key Pair" button to create the PEM key. Your browser should automatically download the private key. Place this key somewhere accessible since it is integral to the entire process.

You will also need the public key that corresponds to your private key. To get it, open a terminal, change directory (cd) to the location of the private key, and run the following command:

ssh-keygen -y -f aws_private_key.pem > aws_public_key.pem

This command should output the corresponding public key in OpenSSH format. Copy it into your favorite text editor; it will be important later.

Note: I recommend testing the keys before running Terraform scripts by creating an EC2 instance and trying to SSH into the instance with the PEM key that we just created in AWS.
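For example, assuming an Amazon Linux AMI (whose default user is ec2-user), a quick test connection might look like the following; the hostname is a placeholder for your test instance’s public DNS:

chmod 400 aws_private_key.pem
ssh -i aws_private_key.pem ec2-user@<instance-public-dns>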

Terraform Script

Now that we have AWS properly configured, we can create Terraform scripts to provision the resources needed to execute the Python function:
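A minimal sketch of the kind of configuration this tutorial uses is shown below; the region, AMI ID, instance type, key name, and bucket name are placeholders to replace for your own project:

provider "aws" {
  region = "us-east-1" # placeholder region
}

# Key pair built from the public key generated earlier
resource "aws_key_pair" "deployer" {
  key_name   = "aws_key_pair"
  public_key = "ssh-rsa AAAA... paste the public key from the earlier step here"
}

# Wide-open ingress/egress rules (fine for a demo, not for production)
resource "aws_security_group" "allow_ssh" {
  name = "allow_ssh_demo"

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# EC2 instance that will run the Python function
resource "aws_instance" "python_runner" {
  ami                    = "ami-xxxxxxxxxxxxxxxxx" # pick an image suited to your workload
  instance_type          = "t2.micro"
  key_name               = aws_key_pair.deployer.key_name
  vpc_security_group_ids = [aws_security_group.allow_ssh.id]
}

# Optional S3 bucket (include only if your project needs it)
resource "aws_s3_bucket" "scratch" {
  bucket = "my-unique-scratch-bucket-name" # must be globally unique
}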

Notice that we included an S3 bucket, which isn’t really needed, but I wanted to provide some additional scripts in case this resource is applicable to your project. Also notice that the public key we created in the previous step is entered into the "public_key" argument of the key pair resource. The egress and ingress rules are not secure; they allow anyone with valid credentials to connect to the instance. Since the purpose of this tutorial is to provide an example, I haven’t configured security properly. I also selected a random AMI, but make sure to find the right image for your workload.

Note: I recommend test-running the Terraform scripts on your local machine before creating the Github Workflow. I created a folder on my Mac desktop and added the path to the Terraform executable to my Bash profile before successfully initializing Terraform. You can run the Terraform-related Github Workflow actions defined later in this tutorial in your terminal. Please use this link to install Terraform.

export PATH=/path/to/terraform/executable:$PATH
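With the executable on your PATH, the same Terraform steps that the workflow automates later can be exercised locally from the folder containing your .tf files:

terraform init      # download the AWS provider and initialize the working directory
terraform plan      # preview the resources that will be created
terraform apply     # provision the EC2 instance, key pair, and S3 bucket
terraform destroy   # tear everything back down when you are done testing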

Note: if you are completely new to Terraform I recommend this LinkedIn Learning Course on Terraform.

Github Secrets

Before using the Github Workflow to run the Terraform script, we need to set up Github secrets with a few keys related to AWS and Terraform. Here is a screenshot of my secrets:

My Github Repo Secrets

The "SSH_KEY" secret contains the private AWS Key automatically download when creating a key pair on the EC2 console. You can output the private key value by entering this command:

cat aws_private_key.pem

The "TF_API_TOKEN" key needed is for the Terraform API that Github Workflow will use to execute the scripts. Use this link to gain access to the Hashicorp Terraform API token (you may need to create an account).

Github Workflow YAML Setup

Now that our Github secrets are properly configured, we can create the YAML file in Github Workflow:
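The full workflow file is kept in the Github repository; what follows is a condensed, hypothetical sketch of the steps discussed below. The action versions, the AWS credential secret names, the my_function.py file name, and the aws_key_pair.deployer resource address are assumptions on my part, and the line numbers referenced in the next section correspond to the full file rather than this sketch.

name: provision-and-run

on: [push]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2

      - uses: hashicorp/setup-terraform@v1
        with:
          cli_config_credentials_token: ${{ secrets.TF_API_TOKEN }}
          terraform_wrapper: false # keep plain stdout so it can be redirected to a file

      # Write the private key from the Github secret and restrict its permissions
      - name: Configure SSH key
        run: |
          echo "${{ secrets.SSH_KEY }}" > aws_key_pair.pem
          chmod 400 aws_key_pair.pem

      # Provision the EC2 instance, key pair, and S3 bucket
      - name: Terraform apply
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        run: |
          terraform init
          terraform apply -auto-approve

      # Dump the state as JSON and let tf_parse.py export the instance's public IP
      - name: Export public IP
        run: |
          terraform show -json > tf_state.json
          python3 tf_parse.py

      # Give the instance time to initialize, then copy the function over and run it
      - name: Run the Python function
        run: |
          sleep 20
          scp -o StrictHostKeyChecking=no -i aws_key_pair.pem my_function.py ec2-user@$aws_compute_public_ip:~
          ssh -o StrictHostKeyChecking=no -i aws_key_pair.pem ec2-user@$aws_compute_public_ip "python3 my_function.py"

      # Keep the key pair in AWS, then tear down everything else
      - name: Terraform destroy
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        run: |
          terraform state rm aws_key_pair.deployer
          terraform destroy -auto-approve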

At a high level, when this YAML executes upon a new push to the Github repository, a new "runner" is created: a virtual environment on a Github host that "runs-on" the operating system you define. The runner then completes all the "jobs" defined, in parallel. In this case I only have one job, so all of its "steps" (consisting of "actions") are completed sequentially. Each "step" builds upon the last, which means that any changes made in previous steps are accessible to future steps. Some of the "steps" rely on pre-defined "actions" created by others that can be imported with "uses"; in this tutorial I am using 4 of them. The Github Workflow syntax is confusing, so I recommend spending some time understanding the key terms I put in quotes.

Note: This is a pretty good introduction to Github Actions. I also recommend this Github Actions course on LinkedIn Learning.

The YAML file commands are dense, so I will focus on some of the nuances and peculiarities of the code, starting from the top and working down:

  • On line 42, we must change permissions for the key in order to use it for SCP and SSH later.
  • On line 53, we must import the private key to Terraform before being able to provision infrastructure on AWS.
  • On line 59, I am using "auto-approve" to automatically create the infrastructure. If you try to run this command without "auto-approve", the terminal requires a "yes" to confirm the infrastructure creation.
  • On lines 62 and 65, we are setting environment variables that are needed in future steps. The command on line 62 stores the infrastructure created by Terraform in a JSON format. Then the Python script on line 65 iterates through the JSON text and creates a variable for the EC2 public IP address that we SSH into later. Each time we run this workflow, a new EC2 instance with a different public IP address is created, so we need a script to get the public IP address that we SSH and SCP to later.

Here is the Python Script which I call "tf_parse.py" in the YAML:
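The script itself is short; a minimal sketch, assuming the previous workflow step wrote the output of terraform show -json to a file called tf_state.json (as in the workflow sketch above), might look like this:

import json
import os

# Load the JSON representation of the Terraform state produced by `terraform show -json`
with open("tf_state.json") as f:
    state = json.load(f)

# Walk the root module's resources and pull out the EC2 instance's public IP
public_ip = None
for resource in state["values"]["root_module"]["resources"]:
    if resource["type"] == "aws_instance":
        public_ip = resource["values"]["public_ip"]
        break

# Expose the IP to later workflow steps as an environment variable
with open(os.environ["GITHUB_ENV"], "a") as env_file:
    env_file.write(f"aws_compute_public_ip={public_ip}\n")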

  • There is probably a lot of head-scratching on line 71. Why would anyone add time to the workflow? This took me the longest to debug. My assumption was that once Terraform finishes provisioning the infrastructure, I can SSH and SCP to the instance. I was wrong. You need to give the EC2 instance some time to initialize before running the subsequent commands. I’ve added 20 seconds, but it may take more or less time depending on the type of instance you’ve provisioned.
  • On lines 78 and 79, I’ve added some additional parameters to prevent the terminal from asking for confirmation before adding the host name to known hosts. If you prefer greater readability, you can use wrapper functions instead (a sketch follows this list).

    Note: Use the functions by entering a command like the following in the YAML file:

sshtmp -i aws_key_pair.pem ec2-user@ec2-$aws_compute_public_ip.compute-1.amazonaws.com
  • Finally, the command on line 83 prevents Terraform from destroying the aws_key_pair in the next step. Here is a useful resource for listing the Terraform state in case you want to prevent the destruction of other resources.
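The wrapper functions mentioned above would simply wrap ssh and scp with host-key checking disabled; a minimal sketch of that idea looks like this:

sshtmp () {
    # ssh without the interactive "add host to known_hosts?" prompt
    ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null "$@"
}

scptmp () {
    # the same idea for copying files to the instance
    scp -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null "$@"
}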

Executing Python Function

The Python function is executed on line 80 on the AWS EC2 instance. For this tutorial, I am executing a basic Python function, but the sky is the limit. If you want to install some dependencies before running the script, check out line 50 and beyond in the YAML file from my previous article on creating a CI/CD pipeline on AWS.

Note that the dependencies need to be installed on the EC2 instance and not the Github Workflow "runner".
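For example, the SSH step of the workflow could install the requirements on the instance before invoking the function. The file names below are placeholders, and the yum command assumes an Amazon Linux AMI:

scp -o StrictHostKeyChecking=no -i aws_key_pair.pem requirements.txt my_function.py ec2-user@$aws_compute_public_ip:~
ssh -o StrictHostKeyChecking=no -i aws_key_pair.pem ec2-user@$aws_compute_public_ip \
    "sudo yum install -y python3 python3-pip && pip3 install --user -r requirements.txt && python3 my_function.py"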

Conclusion

This tutorial showcases how to automate the deployment and execution of a Python function using AWS, Terraform, and Github Workflow. We highlighted some of the problems with Serverless functions and how this workflow can be a reasonable substitute or replacement. However, it’s important to remember that we pay for the time that the Terraform-initiated EC2 instance is running. It also takes much longer to use Terraform to provision the instance and run the function than to simply execute a Serverless function, because we have to provision the underlying infrastructure every time we want to execute the function.

Another reason I prefer Terraform and Github Workflow is that AWS Lambda functions lack portability. Once Lambda functions are used, it’s difficult to transport that function elsewhere. This is due, in part, to the syntax requirements for Lambda function returns, Lambda handlers, Layers, and other configurations. Leveraging AWS API Gateway to invoke the function further limits portability to another cloud vendor. Terraform makes it easier to find the corresponding services in another cloud vendor and deploy the workflow there. Serverless functions remain powerful tools for creating scalable services in the cloud, but they come with significant flaws and disadvantages.

What other possibilities are enabled by this approach to infrastructure creation and deployment? How about managing infrequent, time-insensitive services with this workflow? With some changes to this tutorial, we can create and deploy the underlying infrastructure for application servers, load balancers, and S3 buckets, and then destroy those resources when the services are complete. This might be crucial for any startup with large, data-intensive applications seeking an effective way to mitigate costs in their DEV and TEST environments, or even PROD.

Substack:

I recently created a Substack to learn how to program and solve LeetCode questions in Python.

