
A. Introduction
A.1. Background & Motivation
In the data science life cycle, deployment is the stage where we finally put our AI model into practice. After we build and evaluate the model, we need to deploy it as a solution that helps businesses solve real-world problems. Only by doing this can we gain feedback from users or stakeholders to refine the model and assess its performance and impact. In other words, the ability to manage an end-to-end data science project is a must for any data scientist out there.
A.2. Objectives
Now, imagine we need to deal with lots of real-world clothes images, and our responsibility is to create an automated image classifier for an e-commerce company. The challenge is not only to build a robust deep learning model but also to deploy it as a serverless app. The serverless approach lets us focus on the business solution rather than the heavy lifting of the infrastructure that hosts the app. Luckily, the combination of AWS Lambda and API Gateway can be used for hosting serverless APIs.
In this project, we will learn together how to:
- build a deep learning model to classify images using TensorFlow.
- convert the model into a more size-efficient format using TensorFlow Lite.
- deploy the model locally on our machine using Docker.
- deploy the model as a REST API using AWS Lambda and API Gateway.
A.3. Table of Contents
- Introduction > Background and Motivation > Objectives > Table of Contents
- Model Training > The Image Dataset > Build the Model > Train the Model > Evaluate the Model
- Model Conversion > Convert the Model > Use the Converted Model
- Model Deployment > Lambda Function > Deploy Locally with Docker > Deploy on AWS
Since this tutorial article will be quite extensive, feel free to jump into a specific section that suits your needs.
Notes: To follow along with this project, we expect you to have a basic understanding of how to build a deep learning model using TensorFlow, some familiarity with Docker [1] and AWS terminology, and an AWS account to access its services.
This project is well documented in my GitHub repository. For those who are curious about the full code, please do pay it a visit 👍 .
B. Model Training
B.1. The Image Dataset
The dataset contains 3,781 clothes images covering the 10 most popular categories, divided into train, test, and validation sets. Table 1 shows the dataset summary. We can access the data here [2].

To see the images using Python, we can use the matplotlib.pyplot.gcf() function to get the current figure and set it to have a specific number of rows and columns. Then, in each row and column, we can place an image as a subplot.
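For illustration, here is a minimal sketch of that idea; the dataset path, grid size, and file layout are assumptions, so adjust them to your setup.

import os
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

# Hypothetical path to the extracted dataset [2]
train_dir = 'clothing-dataset-small/train'
categories = sorted(os.listdir(train_dir))

nrows, ncols = 2, 5  # one sample image per category
fig = plt.gcf()
fig.set_size_inches(ncols * 3, nrows * 3)

for i, category in enumerate(categories[:nrows * ncols]):
    # Show the first image of each category as a sample
    img_path = os.path.join(train_dir, category)
    img_name = os.listdir(img_path)[0]
    img = mpimg.imread(os.path.join(img_path, img_name))
    sp = plt.subplot(nrows, ncols, i + 1)
    sp.axis('off')
    sp.set_title(category)
    plt.imshow(img)

plt.show()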
B.2. Build the Model
We will build a deep learning model using the transfer learning method and image augmentation to achieve good performance and prevent overfitting. The pre-trained model we use is InceptionV3, but feel free to experiment with other models as well. TensorFlow Keras has a built-in model definition for InceptionV3. We will use (150, 150, 3) as the desired input shape, exclude the fully connected layer at the top, and use the local weights that we can download here [3]. Import the class and instantiate it by specifying the mentioned configurations as follows:
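A minimal sketch of this step, assuming the weights file [3] has been downloaded into the working directory:

from tensorflow.keras.applications.inception_v3 import InceptionV3

# Path to the locally downloaded weights [3]
local_weights_file = 'inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5'

pre_trained_model = InceptionV3(
    input_shape=(150, 150, 3),  # the desired input shape
    include_top=False,          # exclude the fully connected layer at the top
    weights=None                # we load the local weights ourselves
)
pre_trained_model.load_weights(local_weights_file)

# Freeze the pre-trained layers so training only updates the new head
for layer in pre_trained_model.layers:
    layer.trainable = False

pre_trained_model.summary()  # lists each layer's name, e.g. mixed7, mixed10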
As we can see, each layer has its own name; the last layer is mixed10, whose output has been convolved down to 3 by 3. What's interesting is that we can move the cut-off point up to use a little more information, for instance, mixed7, with an output of 7 by 7. Feel free to experiment with the choice of last layer to suit your needs.
We will define a new model that takes the pre_trained_model of InceptionV3 as its base to classify clothes images into 10 different categories. Here, we can build the last layers for the new model as follows:
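One possible sketch of such a head follows; the hidden layer width, dropout rate, and optimizer settings are illustrative choices rather than the definitive configuration.

from tensorflow.keras import layers, Model
from tensorflow.keras.optimizers import RMSprop

# Cut the pre-trained model at mixed7 (7x7 feature maps)
last_output = pre_trained_model.get_layer('mixed7').output

# New classification head for the 10 clothes categories
x = layers.Flatten()(last_output)
x = layers.Dense(1024, activation='relu')(x)
x = layers.Dropout(0.2)(x)                     # regularization against overfitting
x = layers.Dense(10, activation='softmax')(x)  # one unit per category

model = Model(pre_trained_model.input, x)
model.compile(optimizer=RMSprop(learning_rate=1e-4),
              loss='categorical_crossentropy',
              metrics=['accuracy'])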
B.3. Train the Model
Now, we are ready to train the model. Notice that we normalize the image pixel values by dividing them by 255, set several parameters in the ImageDataGenerator to augment the training images and prevent overfitting, set the batch_size to 32, and set the target image size to (150, 150) to fit the model's input shape, as sketched below.
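Here is a sketch of that training setup; the augmentation values, directory paths, and epoch count are assumptions.

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Hypothetical dataset paths
train_dir = 'clothing-dataset-small/train'
val_dir = 'clothing-dataset-small/validation'

# Augment the training images to prevent overfitting
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,          # normalize pixel values
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

# Only normalize the validation images; no augmentation
val_datagen = ImageDataGenerator(rescale=1.0 / 255)

train_generator = train_datagen.flow_from_directory(
    train_dir, target_size=(150, 150), batch_size=32, class_mode='categorical')
val_generator = val_datagen.flow_from_directory(
    val_dir, target_size=(150, 150), batch_size=32, class_mode='categorical')

history = model.fit(train_generator,
                    validation_data=val_generator,
                    epochs=20,  # illustrative
                    verbose=1)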
Congrats! We just built a robust deep learning model using transfer learning and image augmentation. We achieved 90.59% test accuracy with a 0.273 test loss and managed to avoid overfitting. In fact, our test accuracy is ~5% higher than the training accuracy, which is great!
B.4. Evaluate the Model
Let’s validate the model by making new predictions on unseen images. As expected, the model makes a correct prediction for each test image.
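For illustration, here is a minimal prediction sketch that reuses the model and train_generator from the training step; the image path is hypothetical.

import numpy as np
from tensorflow.keras.preprocessing import image

# Load and preprocess an unseen image exactly as during training
img = image.load_img('test/pants/example.jpg', target_size=(150, 150))
x = np.expand_dims(image.img_to_array(img) / 255.0, axis=0)

preds = model.predict(x)
classes = list(train_generator.class_indices.keys())
print(classes[np.argmax(preds[0])])  # predicted category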
C. Model Conversion
After we build the model using TensorFlow, we will soon notice that the file size is too large and not optimized for deployment, especially on mobile or edge devices. This is where TensorFlow Lite (TFLite) comes into play. TFLite helps us convert the model to a more efficient .tflite format. This generates a small binary-sized model that is lightweight and low-latency, with only a minor impact on accuracy.
C.1. Convert the Model
Here are the steps we need to follow to convert our best trained model to a tflite file:
- load the model in the h5 file,
- instantiate a TFLiteConverter object from a loaded trained model,
- convert and save the converted model in tflite file format.
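A minimal sketch of these three steps; the file names are illustrative.

import tensorflow as tf
from tensorflow import keras

# 1. Load the best trained model from its h5 file
model = keras.models.load_model('clothes_classifier.h5')

# 2. Instantiate a TFLiteConverter from the loaded model
converter = tf.lite.TFLiteConverter.from_keras_model(model)

# 3. Convert and save the converted model in tflite format
tflite_model = converter.convert()
with open('clothes_classifier.tflite', 'wb') as f:
    f.write(tflite_model)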
C.2. Use the Converted Model
Once we have converted the model into the tflite format, we can load it with a TFLite interpreter to see how the model performs in making a prediction before deploying it on a mobile or edge device.
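A minimal inference sketch with the TFLite interpreter; the file paths are illustrative.

import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing import image

# Load the converted model and allocate its tensors
interpreter = tf.lite.Interpreter(model_path='clothes_classifier.tflite')
interpreter.allocate_tensors()
input_index = interpreter.get_input_details()[0]['index']
output_index = interpreter.get_output_details()[0]['index']

# Preprocess a test image exactly as during training
img = image.load_img('test/pants/example.jpg', target_size=(150, 150))
x = np.expand_dims(image.img_to_array(img) / 255.0, axis=0).astype(np.float32)

# Run the prediction
interpreter.set_tensor(input_index, x)
interpreter.invoke()
preds = interpreter.get_tensor(output_index)
print(preds[0])  # class probabilities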
D. Model Deployment
In this final step, we will deploy the model using Docker, AWS Lambda, and AWS API Gateway. First, we need to create a lambda_function.py to deploy the model either on AWS Lambda or with Docker, since both options need this file for the deep learning model to run.
D.1. Lambda Function
The lambda_function.py stores all the functions needed to run the app: defining the interpreter, receiving the input image, preprocessing the image, and using the saved model to make the prediction.
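A minimal sketch of what such a file could look like; the class names, file names, and preprocessing details are assumptions based on the steps described above.

# lambda_function.py
from io import BytesIO
from urllib import request

import numpy as np
from PIL import Image
import tflite_runtime.interpreter as tflite

# Assumed category labels, in the order the model was trained on
classes = ['dress', 'hat', 'longsleeve', 'outwear', 'pants',
           'shirt', 'shoes', 'shorts', 'skirt', 't-shirt']

# Define the interpreter once, outside the handler,
# so warm invocations can reuse it
interpreter = tflite.Interpreter(model_path='clothes_classifier.tflite')
interpreter.allocate_tensors()
input_index = interpreter.get_input_details()[0]['index']
output_index = interpreter.get_output_details()[0]['index']

def download_image(url):
    # Receive the input image from a URL
    with request.urlopen(url) as resp:
        return Image.open(BytesIO(resp.read()))

def preprocess(img):
    # Resize and normalize exactly as during training
    img = img.convert('RGB').resize((150, 150))
    x = np.array(img, dtype='float32') / 255.0
    return np.expand_dims(x, axis=0)

def predict(url):
    # Use the saved model to make the prediction
    interpreter.set_tensor(input_index, preprocess(download_image(url)))
    interpreter.invoke()
    preds = interpreter.get_tensor(output_index)
    return dict(zip(classes, preds[0].tolist()))

def lambda_handler(event, context):
    # The request body carries a key "url" pointing at the image
    return predict(event['url'])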
D.2. Deploy Locally with Docker
We just created the lambda_function.py. Next, we want to deploy it using AWS Lambda. For that, we will use Docker. AWS Lambda supports container images, so we can use one to deploy our function.
In this section, you will learn how to run the model locally using Docker within your machine.
D.2.1. Dockerfile
The next step is to create a Dockerfile. A Dockerfile is a way to put all the dependencies needed for running the code into one single image that contains everything. A Docker image is a private file system just for your container; it provides all the files and code your container needs, such as (see the sketch after this list):
- installing the python package management system.
- installing the pillow library to deal with image files.
- installing the TensorFlow Lite tflite_runtime interpreter.
- copying our tflite model into the Docker image.
- copying the lambda_function.py into the Docker image.
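A minimal sketch of such a Dockerfile; the base image tag, the tflite_runtime wheel source, and the file names are assumptions.

# Dockerfile (sketch): AWS's public Python base image for Lambda
FROM public.ecr.aws/lambda/python:3.8

# Upgrade pip and install the libraries the function needs
RUN pip install --upgrade pip
RUN pip install pillow
RUN pip install --extra-index-url https://google-coral.github.io/py-repo/ tflite_runtime

# Copy the converted model and the handler into the image
COPY clothes_classifier.tflite .
COPY lambda_function.py .

# Tell the Lambda runtime which handler to invoke
CMD ["lambda_function.lambda_handler"]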
What we need to do now is build this Docker image and run it locally.
D.2.2. Build the Docker Image
The following are the steps to run the application locally:
Run the Docker daemon. There are two ways to do this:
- The first option is to open cmd as administrator, then launch the following command:
"C:Program FilesDockerDockerDockerCli.exe" -SwitchDaemon
- The second option is to run the Docker Desktop from the start menu and validate that the docker is in a running state.
Build an image from a Dockerfile.
- The command below builds the image from the content of the folder you are currently in, with the tag name tf-lite-lambda.
$ docker build -t tf-lite-lambda .
D.2.3. Run the Container Image
Start a container based on the image you built in the previous step. Running a container launches your application with private resources, securely isolated from the rest of your machine.
$ docker run --rm -p 8080:8080 --name clothes-classifier tf-lite-lambda
- The -p flag (stands for publish) maps container port 8080 to host machine port 8080. The container opens a web server on port 8080, and we can map ports on our computer to ports exposed by the container.
- The --rm flag (stands for remove) automatically removes the container when it exits.
- The --name flag gives a name (clothes-classifier) to the new container, and tf-lite-lambda is the image we use to create the container.
Here are the screenshots of the results from the previous commands:

D.2.4. Test the Container Image
After we run the model, we want to test it. We need to create a special file that we can call to see what the model has predicted.
The file contains:
- the complete list of categories the model can predict.
- a pants (test) image obtained from this link: http://bit.ly/mlbookcamp-pants. We will send a request that has a key url holding the URL of the image.
- the URL address of the service deployed on localhost inside Docker.
- a procedure to send a POST request to the target URL address to obtain the prediction result.
- code for parsing the prediction result and showing it to the user.
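A minimal sketch of such a test.py; the invocation path below is the one exposed by the runtime interface emulator bundled with AWS Lambda base images, and the image URL is the pants picture mentioned above.

# test.py
import requests

# Localhost endpoint of the container started with docker run -p 8080:8080
url = 'http://localhost:8080/2015-03-31/functions/function/invocations'

# The request carries a key "url" with the test image's URL
data = {'url': 'http://bit.ly/mlbookcamp-pants'}

# Send the POST request and parse the prediction result
result = requests.post(url, json=data).json()
print(result)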
Run the test.py from your CLI and see the result for yourself:

D.3. Deploy on AWS
We just deployed the model locally with Docker. Now, we can bring the same container and deploy it on AWS. AWS has everything you need to deploy your deep learning model online. For this case, we will use AWS CLI, AWS ECR, AWS Lambda, and AWS API Gateway.
D.3.1. Install AWS CLI
Everything we do with AWS is an API call. Hence, we must have a tool that allows us to program or script these API calls. One such tool is the AWS Command Line Interface (CLI). Before we continue, make sure you have installed the AWS CLI on your local machine [4].
D.3.2. Configure Your AWS Account
If we want to deploy the app on AWS, we obviously need to set up an account there. After you create an AWS IAM user, set up your Access Key ID, Secret Access Key, Default Region, and Default Output Format (commonly JSON). Once we have done this, we can make programmatic calls to AWS from the AWS CLI.
$ aws configure

D.3.3. Create a Repo in AWS ECR (Elastic Container Registry)
AWS ECR is a place for us to put Docker images. By running the following command, we will create a private repository to store the Docker image we have built previously.
$ aws ecr create-repository --repository-name lambda-images

D.3.4. Publish the Image to the Repo
Now, we want to publish the image that we built locally. The following steps are cited directly from AWS (AWS ECR > Repositories > lambda-images > View Push Commands):
- Retrieve an authentication token and authenticate your Docker client to your registry.
$ aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin XXXXXXXXX474.dkr.ecr.us-east-1.amazonaws.com
- Build your Docker image using the following command (we already built ours as tf-lite-lambda, so this step can be skipped).
$ docker build -t lambda-images .
- Tag your image so you can push the image to this repository.
$ docker tag tf-lite-lambda XXXXXXXXX474.dkr.ecr.us-east-1.amazonaws.com/lambda-images:tf-lite-lambda
- Run the following command to push this image to your newly created AWS repository.
$ docker push XXXXXXXXX474.dkr.ecr.us-east-1.amazonaws.com/lambda-images:tf-lite-lambda
Check the pushed image on the AWS ECR web page. Make sure to copy the URI because we need it to create a Lambda Function.

D.3.5. Create Lambda Function
Now, we are ready to create a Lambda function. Go to AWS Lambda and click Create function. Choose Container Image.

Give your function a unique name and fill in the Container Image URI with the image URI you copied earlier. Leaving everything else at its default, click Create function.
You just created a Lambda function for a prediction task. However, the default configuration does not give us sufficient memory or timeout. We have a big model, and the function will take some time to load everything into memory on the first run, so we need to reconfigure it. Go to Configuration > General Configuration > click Edit, set the memory to 512 or 1024 MB and the timeout to 30 seconds, and save it.

D.3.6. Test the Lambda Function
Next, create a test event in this JSON format:
{
"url": "https://tinyurl.com/clothes-t-shirt"
}

Give the new event a name, save it, and click Test. Then, you will see the following result:

One thing to be aware of is that with AWS Lambda, you are charged based on the number of requests and the duration, that is, the time it takes for your code to execute. Please refer to this link for more pricing info.
D.3.7. API Gateway Integration
You just tested the function, and it seems to work well in making the prediction. What's left is to make it usable from outside (online). To do this, we need to create an API via AWS API Gateway.
1. Create a New API
- Visit AWS API Gateway, then choose REST API by clicking the Build button.
- Choose the protocol: REST. Choose New API under Create New API. Then, fill in the API name and add a description.


2. Create a resource "predict" and a POST method
- From Actions, choose Create Resource > fill in "predict".
- From Actions, choose Create Method > select POST.

3. Select the Lambda function and add some details.
- Click on POST, then make sure to write the correct name of the Lambda function and leave everything else at its default.

4. Test the API.
- From the method execution flow chart, click Test.

- To test it, input the JSON event in the Request Body and click Test. Then, we should see the result in the Response Body.


5. Deploy the API
- Finally, we need to deploy the API to use it from outside. From Actions, click Deploy API.

- Obtain the URL from the "Invoke URL" section. In this case, we have: https://xw2bv0y8mb.execute-api.us-east-1.amazonaws.com/test
- Open the Postman app or go to reqbin to test the REST API we just created. Notice that since we created the predict resource with a POST method, we need to add /predict at the end of the URL. Hence, the complete URL for making a prediction API call is: https://xw2bv0y8mb.execute-api.us-east-1.amazonaws.com/test/predict
- Copy and paste the link into the URL section of the app.
- Copy the following JSON object as the body of this POST request, then click Send.
{
"url": "https://tinyurl.com/clothes-t-shirt"
}
- You can see the prediction result in the content received from the POST request.

- Alternatively, we can use cURL (stands for client URL) to send the data (in this case, the t-shirt image URL) in a POST request to our service (the clothes image classifier) via a terminal (e.g., Git Bash).
$ curl -d '{"url": "https://tinyurl.com/clothes-t-shirt"}' -H "Content-Type: application/json" -X POST https://xw2bv0y8mb.execute-api.us-east-1.amazonaws.com/test/predict
- Running the command above will generate this prediction result:

Congrats! Now your deep learning model is totally online and ready to help the world become a better place!
E. Conclusion
Conducting an end-to-end deep learning project can be challenging due to the many procedures to follow. However, thanks to the rich features of TensorFlow, we can build and deploy the model with ease. Services such as Docker and AWS help data scientists quickly deliver the app, offline or online, to solve business problems.
Using the Serverless clothes image classifier as an example, I hope this project gives you a basic idea of how to deal with a similar case in the future. Keep inspiring!
Thank you,
Diardano Raihan LinkedIn Profile
Note: Everything you have seen is documented in my GitHub repository. For those who are curious about the full code, please do have a visit 👍 .
References
- [1] T. Soam, "Installing Docker on Window 10", Medium, https://medium.com/@tushar0618/installing-docker-desktop-on-window-10-501e594fc5eb, June, 2021.
- [2] A. Grigorev, "Clothing dataset (subset)", GitHub, https://github.com/alexeygrigorev/clothing-dataset-small, June, 2021.
- [3] Machine Learning Edu, https://storage.googleapis.com/mledu-datasets/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5, June, 2021.
- [4] "Installing, updating, and uninstalling the AWS CLI version 2 on Windows", Amazon Web Services (AWS), https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2-windows.html, June, 2021.