
AWS Serverless Hit n’ Run

AWS Lambda Functions + AWS Layers + S3 Buckets + API Gateway

A brief overview of leveraging AWS Lambda Functions for your applications. Integration with various AWS tools + Open-source Python libraries, along with some hacks & general usage.

Photo by Magda V on Unsplash

Serverless is like a trump card for all the minions putting in a shift in the world of tech. The good part is that you only pay for the resources your application consumes. You are not involved in deploying or maintaining a server; AWS takes care of all of it. (How convenient!? 👊)

Why Hit n’ Run? Well, this was my very first experience with Serverless, and it turned out to be very efficient!

Now, getting back to Lambda functions: assuming you have an AWS account (a free tier is available), navigate to the Lambda console page:

Image by Author

Go ahead and create a basic Lambda function. AWS offers various regions to select from; the catch is a trade-off between response latency and pricing.

After creation, you will come across the following panels: Designer, Function code, Basic Settings, and many more.

For the scope of this article, we will stick to the first three. Once you have set up and tested your use case a few times, the other panels will come in handy for optimizing performance.

  1. The Basic Settings panel is where you select the memory and timeout for your Lambda function. More memory generally means faster execution (CPU is allocated in proportion to memory), but the cost per millisecond goes up. With the recent introduction of millisecond billing, it is advisable to test and find an optimum combination.
  2. The Designer panel is a diagrammatic representation of your application architecture:
Image by Author
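To compare memory settings from the Basic Settings panel, it can help to estimate the billed compute for each one. The sketch below assumes the usual GB-seconds model (memory in GB multiplied by billed duration in seconds, with duration rounded up to a whole millisecond). The function names, the measured durations, and the price constant are all placeholders of mine, and the per-request charge is ignored, so treat the numbers as illustrative only and check the current pricing page for your region.

```python
import math


def billed_gb_seconds(memory_mb, duration_ms):
    """GB-seconds for one invocation, duration rounded up to a whole millisecond."""
    billed_ms = math.ceil(duration_ms)
    return (memory_mb / 1024) * (billed_ms / 1000)


def estimate_compute_cost(memory_mb, duration_ms, invocations, price_per_gb_second):
    """Rough compute cost for a number of identical invocations."""
    return billed_gb_seconds(memory_mb, duration_ms) * invocations * price_per_gb_second


# Assumed placeholder price per GB-second; not taken from the article.
PRICE_PER_GB_SECOND = 0.0000166667

# Hypothetical measurements of the same function at two memory settings.
low_memory = estimate_compute_cost(512, 800, 1_000_000, PRICE_PER_GB_SECOND)
high_memory = estimate_compute_cost(1024, 350, 1_000_000, PRICE_PER_GB_SECOND)

print(f"512 MB at 800 ms:  ${low_memory:.2f} per million invocations")
print(f"1024 MB at 350 ms: ${high_memory:.2f} per million invocations")
```

With these made-up timings, the larger memory setting ends up cheaper because the run time drops by more than half, which is exactly the kind of combination worth testing for.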

You can trigger your function in various ways. Some of the common ones are API Gateway (creating an API that triggers the Lambda is simple enough: say, a POST call sending some data, which the Lambda then manipulates and returns and/or writes to a database), S3 buckets (an object-created event, etc.), and Amazon Simple Notification Service. Once the trigger fires, code execution begins.

How about leveraging all the amazing open-source libraries out there? Well, yes, AWS has made provisions for this too. To use libraries efficiently across Lambda functions, we can create Layers.

AWS Lambda Layers

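You create a Lambda layer either by uploading a zip file of the libraries you want to import or by providing a layer version ARN. Two important points to keep in mind while creating the zip are the folder structure (for Python, the libraries must sit inside a top-level python/ folder) and the operating system (packages with compiled code must be built for the Linux environment Lambda runs on):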

Image by Author
Image by Author
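The folder structure for a Python layer can be sketched as below. This is a minimal helper of my own (build_layer_zip is a hypothetical name, not an AWS tool) that zips an already-installed package folder so that everything lands under a top-level python/ folder. It assumes you have installed the libraries into a local folder first, for example with `pip install requests -t build/`, ideally on a Linux machine or container when the packages contain compiled code.

```python
import zipfile
from pathlib import Path


def build_layer_zip(packages_dir, zip_path):
    """Zip every file in packages_dir under a top-level 'python/' folder.

    Keep zip_path outside packages_dir so the archive does not include itself.
    Returns the archive member names that were written.
    """
    root = Path(packages_dir)
    written = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # e.g. build/requests/api.py -> python/requests/api.py
                member = (Path("python") / path.relative_to(root)).as_posix()
                archive.write(path, member)
                written.append(member)
    return written
```

Calling `build_layer_zip("build", "layer.zip")` would then produce a zip you can upload on the Layers page.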

I have provided links at the end to some very useful articles and resources I came across. These include one that helps with deploying an ML model as a Lambda layer instead of using an expensive resource such as AWS SageMaker.

After you create a Layer and add it to your lambda function, you can immediately import it into the Function Code.

  3. The Function Code panel:
Image by Author

Once a Lambda is triggered, the function lambda_handler(event, context) in the script lambda_function.py gets executed. This is easily editable via the Runtime settings panel, where you can specify which function should run when the Lambda is triggered; just do not forget the two parameters ‘event’ and ‘context’. Of the two, event contains the data: for example, the JSON you posted via an API call, or, for an S3 object-created event, a JSON document with details such as the bucket name and file name.
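To make the ‘event’ parameter concrete, here is a minimal sketch of a handler that reads the bucket name and file name out of an S3 object-created event. It assumes the standard S3 notification layout (a Records list with s3.bucket.name and s3.object.key); the return shape is just my choice for illustration. Object keys arrive URL-encoded, which is why the key is decoded first.

```python
import json
from urllib.parse import unquote_plus


def lambda_handler(event, context):
    """Entry point: report which S3 objects triggered this invocation."""
    objects = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys are URL-encoded in the event (spaces arrive as '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append({"bucket": bucket, "key": key})
    return {"statusCode": 200, "body": json.dumps({"objects": objects})}
```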

Once you have configured your Lambda, you can put it to the test by creating test events, which simply send a JSON document to your function’s ‘event’ parameter. The output is the value your Lambda returns. Logs are available through another AWS service, CloudWatch. You can access them via the Monitoring tab, which is part of the Lambda function page; no additional configuration is required.
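A test event is nothing more than the JSON that ends up in ‘event’, so you can rehearse the same thing locally before pasting it into the console. The sketch below uses a made-up payload with a hypothetical "name" field and the standard logging module; on Lambda, anything logged this way should show up in the function's CloudWatch log stream.

```python
import json
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)


def lambda_handler(event, context):
    # This line ends up in the logs for the invocation.
    logger.info("Received event: %s", json.dumps(event))
    name = event.get("name", "world")
    return {"message": f"Hello, {name}!"}


# Hypothetical test event: the same JSON you would save as a console test event.
test_event = {"name": "Lambda"}

if __name__ == "__main__":
    # Local dry run; the context object is not used here, so None is enough.
    print(lambda_handler(test_event, None))
```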


Integration of AWS Lambda Functions with resources and general Usage

What next? Well, once you have created and tested your Lambda function, added layers for the libraries you want to import, and set up APIs or S3 buckets to trigger it, developing your application becomes a lot smoother.

S3 Buckets:

S3 (Simple Storage Service) is offered by Amazon to store files. To work with S3 buckets from Python, ‘boto3’ is the SDK to use. An important point to mention is that a few libraries, such as boto3, json, traceback, and re, are already available in the Lambda Python runtime; they need not be added as layers and can be imported directly in your function.

Using boto3, one can create a bucket, delete a bucket, and read or save a file from or to a bucket.
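Those four operations can be sketched with the boto3 S3 client as follows. The helper names and the idea of passing the client in as an argument are my own choices (it keeps the helpers easy to try out with a stand-in client); the underlying calls are create_bucket, delete_bucket, put_object, and get_object. Bucket names and keys are whatever you supply.

```python
def get_s3_client():
    """Create the S3 client; boto3 is imported lazily so the helpers load without it."""
    import boto3
    return boto3.client("s3")


def create_bucket(s3, bucket, region=None):
    # Outside us-east-1, S3 expects an explicit LocationConstraint.
    if region and region != "us-east-1":
        s3.create_bucket(
            Bucket=bucket,
            CreateBucketConfiguration={"LocationConstraint": region},
        )
    else:
        s3.create_bucket(Bucket=bucket)


def delete_bucket(s3, bucket):
    # The bucket has to be empty before S3 will delete it.
    s3.delete_bucket(Bucket=bucket)


def save_text(s3, bucket, key, text):
    s3.put_object(Bucket=bucket, Key=key, Body=text.encode("utf-8"))


def read_text(s3, bucket, key):
    response = s3.get_object(Bucket=bucket, Key=key)
    return response["Body"].read().decode("utf-8")
```

Inside a handler you would call `s3 = get_s3_client()` once and pass it to the helpers.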

Before you can manipulate files in S3 buckets from your Lambda function, you will have to attach the ‘AmazonS3FullAccess’ policy to your Lambda function’s role. Every Lambda function is assigned an execution role on creation.

You can view this by navigating to IAM -> Roles -> ‘your-lambda-name-role’. Here you can attach various policies, which act as permissions.

Amazon API Gateway:

You can trigger your Lambda function using this service. After creating an API, you can configure it on the API Gateway page, i.e., create new methods (POST, GET), select the Lambda function to trigger, and finally deploy it.

Image by Author
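On the Lambda side, a POST method wired up this way might be handled as in the sketch below. It assumes the method uses Lambda proxy integration, in which case the posted JSON arrives as a string in event["body"] and the function is expected to return a statusCode plus a string body. The "numbers" field and the summing logic are placeholders of mine for "some data that gets manipulated".

```python
import json


def _response(status, payload):
    """Build a proxy-integration style response with a JSON body."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def lambda_handler(event, context):
    try:
        payload = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return _response(400, {"error": "Request body must be valid JSON"})

    # Hypothetical input field: a list of numbers to add up.
    numbers = payload.get("numbers") if isinstance(payload, dict) else None
    if not isinstance(numbers, list) or not all(
        isinstance(n, (int, float)) for n in numbers
    ):
        return _response(400, {"error": "'numbers' must be a list of numbers"})

    return _response(200, {"total": sum(numbers)})
```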

/tmp Folder:

This folder comes in handy for temporary storage of files. It exists only for the lifetime of the function’s execution environment and offers limited space (512 MB by default), so do not rely on files still being there on a later invocation. It was useful while trying various open-source libraries for converting file formats (PDF creation, etc.).

Image by Author
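A small sketch of how /tmp can be used as scratch space is shown below. The helper names are hypothetical; the scratch_dir argument exists only so the same code can be tried on a local machine, and on Lambda the default of /tmp is the folder described above. Checking free space first and cleaning up afterwards are habits I would suggest, given how little room there is.

```python
import os
import shutil

SCRATCH_DIR = "/tmp"  # the writable scratch folder inside a Lambda environment


def free_scratch_bytes(scratch_dir=SCRATCH_DIR):
    """Bytes still available in the scratch folder."""
    return shutil.disk_usage(scratch_dir).free


def write_scratch_file(filename, data, scratch_dir=SCRATCH_DIR):
    """Write bytes to the scratch folder and return the full path."""
    if len(data) > free_scratch_bytes(scratch_dir):
        raise OSError("Not enough scratch space left for this file")
    path = os.path.join(scratch_dir, filename)
    with open(path, "wb") as handle:
        handle.write(data)
    return path


def remove_scratch_file(path):
    """Delete the file once it has been processed or uploaded elsewhere."""
    if os.path.exists(path):
        os.remove(path)
```

A typical flow would be: write the intermediate file, hand its path to the conversion library, upload the result to S3, then remove the file.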

These points only scratch the surface of what is possible with Lambda functions, but they should give you a basic understanding of how Lambdas work and how you can tune them for your own applications. There are various other resources that you can leverage alongside AWS Lambda.


Based on my experience, once you start working with AWS Lambda, it is simply like working in another IDE. I would suggest against declaring global variables in the function code: in my experience, data got mixed up when multiple instances were spawned 😵 (a likely cause is that Lambda reuses a warm instance, so module-level state carries over between invocations). I would also suggest waiting a few seconds after making changes in the browser console before deploying your code (or, simpler, make the changes in the zip file and upload it 😬).
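The global-variable pitfall can be reproduced without AWS at all. The sketch below is my own illustration, with hypothetical handler names and an "id" field: it calls a handler twice in the same Python process, which is roughly what happens when two invocations land on the same warm instance.

```python
# Module-level state: created once when the code is loaded, then reused by
# every later invocation that runs in the same warm instance.
seen_ids = []


def leaky_handler(event, context):
    """Appends to the global list, so earlier requests leak into later ones."""
    seen_ids.append(event["id"])
    return list(seen_ids)


def safe_handler(event, context):
    """Keeps request data in local variables, so each call starts clean."""
    seen = [event["id"]]
    return seen


if __name__ == "__main__":
    print(leaky_handler({"id": 1}, None))
    print(leaky_handler({"id": 2}, None))  # still contains the first request's id
    print(safe_handler({"id": 1}, None))
    print(safe_handler({"id": 2}, None))
```

Globals are still fine for things that are meant to be shared, such as a client object or a loaded model; it is per-request data that should stay local.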

I have listed some key points to keep in mind while working with Lambda functions:

  1. Concurrency:

If you make a request to the function while it is already executing a previous request, it spawns another instance.

By default, unreserved account concurrency is 1,000 (per region). This means that if you trigger your Lambda 1,000 times in one second, you will have up to 1,000 instances running, each executing the same code in your function.

  2. Serverless:

No hassle of maintaining a server and changing your plan every month based on usage in order to optimize costs.

  3. AWS Resources:

Integration with various other amazing AWS resources such as Textract, SNS, etc.
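Coming back to the concurrency point, a rough way to judge how close you are to the limit is to multiply the request rate by the average run time, since that is about how many instances are busy at once. The sketch below is a back-of-the-envelope helper with names of my own; the default limit of 1,000 is the figure quoted above and may differ for your account.

```python
import math


def estimate_concurrency(requests_per_second, avg_duration_seconds):
    """Approximate number of instances busy at the same time."""
    return math.ceil(requests_per_second * avg_duration_seconds)


def exceeds_limit(requests_per_second, avg_duration_seconds, limit=1000):
    """True if the estimated concurrency would go past the account limit."""
    return estimate_concurrency(requests_per_second, avg_duration_seconds) > limit


# 1,000 requests per second at one second each sits right at the limit,
# while the same rate with a quarter-second run time needs far fewer instances.
print(estimate_concurrency(1000, 1))
print(estimate_concurrency(1000, 0.25))
print(exceeds_limit(2000, 1))
```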

Limitations:

  1. Time – A Lambda function can run for a maximum of 15 minutes, after which the execution is terminated.
  2. Payload – You can send a maximum of 10 MB of data through AWS API Gateway.
  3. Layers – You can attach a maximum of 5 layers to a Lambda, and the combined unzipped size of the function and its layers must not exceed 250 MB.
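One way to live with the time limit is to stop cleanly before the timeout instead of being cut off mid-task. The sketch below relies on context.get_remaining_time_in_millis(), which the Python runtime's context object provides; the function name, the work callback, and the 10-second safety margin are assumptions of mine for illustration.

```python
def process_until_deadline(items, context, work, safety_margin_ms=10_000):
    """Process items in order, stopping while there is still time to wrap up.

    Returns (results_so_far, items_left_over).
    """
    results = []
    for index, item in enumerate(items):
        # Ask the runtime how long this invocation has before it is terminated.
        if context.get_remaining_time_in_millis() < safety_margin_ms:
            return results, items[index:]
        results.append(work(item))
    return results, []


def lambda_handler(event, context):
    # Hypothetical workload: upper-case each string passed in the event.
    done, left_over = process_until_deadline(
        event.get("items", []), context, lambda text: text.upper()
    )
    # Whatever is left over could be handed to a follow-up invocation.
    return {"processed": done, "remaining": left_over}
```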

I hope this provides some valuable insight to people beginning to use cloud-based solutions like AWS Lambda, or better yet, pushes you into trying it out. 😀

Informative links and resources: ⚡️

  1. Machine Learning Model Deployment and lambda layers creation – https://towardsdatascience.com/hosting-your-ml-model-on-aws-lambdas-api-gateway-part-1-9052e6b63b25
  2. For creating lambda layers using ARN – https://github.com/keithrozario/Klayers
  3. Configuration of AWS Textract with Lambdas – https://medium.com/@sumindaniro/aws-textract-with-lambda-walkthrough-ed4473aedd9d
  4. Document Processing Workflow – https://github.com/aws-samples/amazon-textract-serverless-large-scale-document-processing
