
FastAPI and Streamlit: The Python Duo You Must Know About

Lesson 6: Consume and Visualize your Model's Predictions using FastAPI and Streamlit. Dockerize Everything

THE FULL STACK 7-STEPS MLOPS FRAMEWORK

Photo by Hassan Pasha on Unsplash

This tutorial represents lesson 6 out of a 7-lesson course that will walk you step-by-step through how to design, implement, and deploy an ML system using MLOps good practices. During the course, you will build a production-ready model to forecast energy consumption levels for the next 24 hours across multiple consumer types from Denmark.

By the end of this course, you will understand all the fundamentals of designing, coding and deploying an ML system using a batch-serving architecture.

This course targets mid/advanced machine learning engineers who want to level up their skills by building their own end-to-end projects.

Nowadays, certificates are everywhere. Building advanced end-to-end projects that you can later show off is the best way to get recognition as a professional engineer.


Table of Contents:

  • Course Introduction
  • Course Lessons
  • Data Source
  • Lesson 6: Consume and Visualize your Model’s Predictions using FastAPI and Streamlit. Dockerize Everything.
  • Lesson 6: Code
  • Conclusion
  • References

Course Introduction

By the end of this 7-lesson course, you will know how to:

  • design a batch-serving architecture
  • use Hopsworks as a feature store
  • design a feature engineering pipeline that reads data from an API
  • build a training pipeline with hyperparameter tuning
  • use W&B as an ML Platform to track your experiments, models, and metadata
  • implement a batch prediction pipeline
  • use Poetry to build your own Python packages
  • deploy your own private PyPI server
  • orchestrate everything with Airflow
  • use the predictions to code a web app using FastAPI and Streamlit
  • use Docker to containerize your code
  • use Great Expectations to ensure data validation and integrity
  • monitor the performance of the predictions over time
  • deploy everything to GCP
  • build a CI/CD pipeline using GitHub Actions

If that sounds like a lot, don’t worry. After you finish this course, you will understand everything I said before. Most importantly, you will know WHY I used all these tools and how they work together as a system.

If you want to get the most out of this course, I suggest you access the GitHub repository containing all the lessons’ code. The course is designed for you to quickly read the articles and replicate the code along the way.

By the end of the course, you will know how to implement the diagram below. Don’t worry if something doesn’t make sense to you. I will explain everything in detail.

Diagram of the architecture you will build during the course [Image by the Author].

By the end of Lesson 6, you will know how to consume the predictions and the monitoring metrics from the GCP bucket within a web app using FastAPI and Streamlit.


Course Lessons:

  1. Batch Serving. Feature Stores. Feature Engineering Pipelines.
  2. Training Pipelines. ML Platforms. Hyperparameter Tuning.
  3. Batch Prediction Pipeline. Package Python Modules with Poetry.
  4. Private PyPi Server. Orchestrate Everything with Airflow.
  5. Data Validation for Quality and Integrity using GE. Model Performance Continuous Monitoring.
  6. Consume and Visualize your Model’s Predictions using FastAPI and Streamlit. Dockerize Everything.
  7. Deploy All the ML Components to GCP. Build a CI/CD Pipeline Using Github Actions.
  8. [Bonus] Behind the Scenes of an ‘Imperfect’ ML Project – Lessons and Insights

Check out Lesson 3 to learn how we computed and stored the predictions in a GCP bucket.

Also, in Lesson 5, you can see how we calculated the monitoring metrics, which are also stored in a GCP bucket.

You will consume the predictions and the monitoring metrics from the GCP bucket and display them in a friendly dashboard using FastAPI and Streamlit.


Data Source

We used a free & open API that provides hourly energy consumption values for all the energy consumer types within Denmark [1].

They provide an intuitive interface where you can easily query and visualize the data. You can access the data here [1].

The data has 4 main attributes:

  • Hour UTC: the UTC datetime when the data point was observed.
  • Price Area: Denmark is divided into two price areas: DK1 and DK2 – divided by the Great Belt. DK1 is west of the Great Belt, and DK2 is east of the Great Belt.
  • Consumer Type: The consumer type is the Industry Code DE35, owned and maintained by Danish Energy.
  • Total Consumption: Total electricity consumption in kWh.

Note: The observations have a lag of 15 days! But for our demo use case, that is not a problem, as we can simulate the same steps as we would in real time.

A screenshot from our web app showing how we forecasted the energy consumption for area = 1 and consumer_type = 212 [Image by the Author].

The data points have an hourly resolution. For example: "2023-04-15 21:00Z", "2023-04-15 20:00Z", "2023-04-15 19:00Z", etc.

We will model the data as multiple time series. Each unique combination of price area and consumer type represents its own time series.

Thus, we will build a model that independently forecasts the energy consumption for the next 24 hours for every time series.



Lesson 6: Consume and Visualize your Model’s Predictions using FastAPI and Streamlit. Dockerize Everything.

The goal of Lesson 6

In Lesson 6, you will build a FastAPI backend that consumes the predictions and monitoring metrics from GCS and exposes them through a RESTful API. More concretely, the backend will expose the data through a set of HTTP(S) endpoints.

Also, you will implement 2 different frontend applications using solely Streamlit:

  1. a dashboard showing the forecasts (aka your application),
  2. a dashboard showing the monitoring metrics (aka your monitoring dashboard).

Both frontend applications will request data from the FastAPI RESTful API through HTTP(S) and use Streamlit to render the data into some beautiful plots.

I want to highlight that you can use both frameworks (FastAPI & Streamlit) in Python. This is extremely useful for a DS or MLE, as Python is their holy grail.

Diagram of the final architecture with the Lesson 6 components highlighted in blue [Image by the Author].

Note that consuming the predictions from the bucket is completely decoupled from the 3-pipeline design. For example, running the 3 pipelines (feature engineering, training, and inference) takes ~10 minutes, while reading the predictions or the monitoring metrics from the bucket is almost instant.

Thus, by caching the predictions in the GCP bucket, you serve the ML model online from the client’s point of view: the predictions arrive in real time.

This is the magic of the batch architecture.

The next natural step is to move your architecture from a batch architecture to a request-response or streaming one.

The good news is that the feature engineering and training pipelines would stay almost the same; you would only have to move the batch prediction pipeline (aka the inference step) into your web infrastructure. Read this article to learn the basics of deploying your model in a request-response fashion using Docker.

Why?

Because the training pipeline uploads the weights of the trained model into the model registry. From there, you can use the weights however best fits your use case.


Theoretical Concepts & Tools

FastAPI: One of the latest and most famous Python API web frameworks. I have tried all of the top Python API web frameworks: Django, Flask, and FastAPI, and my heart goes to FastAPI.

Why?

First, it is natively async, which can boost performance with fewer computing resources.

Secondly, it is easy and intuitive to use, which makes it suitable for applications of all sizes. That said, for behemoth monoliths, I would still choose Django. But this is a topic for another time.

Streamlit: Streamlit makes coding simple UI components, mostly dashboards, extremely accessible using solely Python.

The scope of Streamlit is to let Data Scientists and ML engineers use what they know best, aka Python, to build a beautiful frontend for their models quickly.

Which is precisely what we did ✌️

Thus, you will use FastAPI as your backend and Streamlit as your frontend to build a web app solely in Python.


Lesson 6: Code

You can access the GitHub repository here.

Note: All the installation instructions are in the READMEs of the repository. Here, we will jump straight into the code.

The code within Lesson 6 is located under the app-api, app-frontend, and app-monitoring folders.

Using Docker, you can quickly spin up all 3 components at once:

docker compose -f deploy/app-docker-compose.yml --project-directory . up --build

Directly storing credentials in your git repository is a huge security risk. That is why you will inject sensitive information using a .env file.

The .env.default is an example of all the variables you must configure. It is also helpful to store default values for attributes that are not sensitive (e.g., project name).

A screenshot of the .env.default file [Image by the Author].

Prepare Credentials

For this lesson, the only service you need access to is GCS. In the Prepare Credentials section of Lesson 3, we already explained in detail how to do this. Also, you have more information in the GitHub README.

To keep things concise, in this lesson, I want to highlight that the web app GCP service account should have read access only for security reasons.

Why?

Because the FastAPI backend will only read data from the GCP buckets, and keeping the permissions to the bare minimum is good practice.

Thus, if your web app is hacked, the attacker can only read the data using the stolen service account credentials; they can’t delete or overwrite it, which would be much more dangerous.

Thus, repeat the same steps as in the Prepare Credentials section of Lesson 3, but instead of choosing the Storage Object Admin role, choose the Storage Object Viewer role.

Remember that now you have to download a different JSON file containing your GCP service account key with read-only access.

Check out the README to learn how to complete the .env file. I want to highlight that only the FastAPI backend will have to load the .env file. Thus, you must place the .env file only in the app-api folder.


FastAPI Backend

As a reminder, the FastAPI code can be found under app-api/api.

Step 1: Create the FastAPI application, where we configure the docs, the CORS middleware, and the root API router for the endpoints.
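As a rough sketch (module names such as api.views and the exact titles are assumptions, not the repository’s code), the application factory could look like this:

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

from api import views  # hypothetical module that holds the api_router


def get_app() -> FastAPI:
    """Create the FastAPI app and wire up the docs, CORS, and the root router."""
    app = FastAPI(title="Energy Consumption API", docs_url="/api/v1/docs")

    # Allow the Streamlit frontends (served from other origins) to call the API.
    app.add_middleware(
        CORSMiddleware,
        allow_origins=["*"],
        allow_methods=["*"],
        allow_headers=["*"],
    )

    # Attach every endpoint defined in views.py under the /api/v1 prefix.
    app.include_router(views.api_router, prefix="/api/v1")

    return app


app = get_app()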

Step 2: Define the Settings class. The scope of this class is to hold all the constants and configurations you need across your API code, such as:

  • generic configurations: the port, log level or version,
  • GCP credentials: bucket name or path to the JSON service account keys.
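Here is a minimal sketch of such a Settings class, assuming Pydantic’s BaseSettings; the field names and defaults are illustrative, not the exact ones from the repository:

from functools import lru_cache

from pydantic import BaseSettings


class Settings(BaseSettings):
    # Generic configurations.
    PORT: int = 8001
    LOG_LEVEL: str = "info"
    VERSION: str = "v1"

    # GCP credentials (illustrative names).
    GCP_PROJECT: str = "energy_consumption"
    GCP_BUCKET: str = "hourly-batch-predictions"
    GCP_SERVICE_ACCOUNT_JSON_PATH: str = "/app/credentials/gcp.json"

    class Config:
        # Look for a .env file in the current directory and load every
        # variable prefixed with APPAPI_ into the fields above.
        env_file = ".env"
        env_prefix = "APPAPI_"


@lru_cache()
def get_settings() -> Settings:
    """Build the Settings object once and reuse it across the project."""
    return Settings()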

You will use the Settings object across the project using the get_settings() function.

Also, inside the Config class, we tell Pydantic to look for a .env file in the current directory and load all the variables prefixed with APPAPI.

As you can see in the .env.default file, all the variables start with APPAPI.

A screenshot of the .env.default file [Image by the Author].

Step 3: Define the schemas of the API data using Pydantic. These schemas encode or decode data from JSON to a Python object or vice versa. Also, they validate the type and structure of your JSON object based on your defined data model.

When defining a Pydantic BaseModel, it is essential to add a type to every variable, which will be used at the validation step.
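For illustration only (the class and field names are assumptions, not the repository’s exact schemas), such models could look like this:

from datetime import datetime
from typing import List

from pydantic import BaseModel


class UniqueConsumerType(BaseModel):
    values: List[int]


class UniqueArea(BaseModel):
    values: List[int]


class PredictionResults(BaseModel):
    preds_datetime_utc: List[datetime]
    preds_energy_consumption: List[float]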

Step 4: Define your endpoints, known in web lingo as views. Usually, a view has access to some data storage and, based on a query, returns a subset of the data source to the requester.

Thus, a standard flow for retrieving (aka GET request) data looks like this:

"client → request data → endpoint → access data storage → encode to a Pydantic schema → decode to JSON → respond with requested data"

Let’s see how we defined an endpoint to GET all the consumer types:
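Here is a hedged sketch of that endpoint; the parquet file name, the index structure, and the import paths are assumptions based on the description that follows, not the exact repository code:

import gcsfs
import pandas as pd
from fastapi import APIRouter

from api import schemas  # the Pydantic models from Step 3
from api.config import get_settings  # the Settings helper from Step 2

api_router = APIRouter()


@api_router.get("/consumer_type_values", response_model=schemas.UniqueConsumerType)
def consumer_type_values() -> dict:
    """Return every unique consumer type found in the cached predictions."""
    settings = get_settings()

    # Access the GCS bucket as if it were a standard filesystem.
    fs = gcsfs.GCSFileSystem(
        project=settings.GCP_PROJECT,
        token=settings.GCP_SERVICE_ACCOUNT_JSON_PATH,
    )
    with fs.open(f"{settings.GCP_BUCKET}/X.parquet") as f:
        X = pd.read_parquet(f)

    # Assumes the data is indexed by (area, consumer_type, datetime_utc).
    unique_consumer_types = list(X.index.unique(level="consumer_type"))

    # Return a plain dictionary; FastAPI builds the Pydantic response from it.
    return {"values": unique_consumer_types}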

We used "gcsfs.GCSFileSystem" to access the GCS bucket as a standard filesystem.

We attached the endpoint to the api_router.

Using the api_router.get() Python decorator, we attached a basic function to the /consumer_type_values endpoint.

In the example above, when calling "https://<host>:8001/api/v1/consumer_type_values", the consumer_type_values() function will be triggered, and the response of the endpoint will be strictly based on what the function returns.

Another important thing to highlight is that by defining the response_model (aka the schema) in the Python decorator, you don’t have to instantiate the Pydantic object explicitly.

If you return a dictionary that is 1:1, respecting the schema structure, FastAPI will automatically create the Pydantic object for you.

That’s it. Now we will repeat the same logic to define the rest of the endpoints. FastAPI makes everything so easy and intuitive for you.

Now, let’s take a look at the whole views.py file, where we defined endpoints for the following:

  • /health → health check
  • /consumer_type_values → GET all possible consumer types
  • /area_values → GET all possible area types
  • /predictions/{area}/{consumer_type} → GET the predictions for a given area and consumer type. Note that using the {} syntax, you can add parameters to your endpoint – FastAPI docs [2] (see the sketch below).
  • /monitoring/metrics → GET the aggregated monitoring metrics
  • /monitoring/values/{area}/{consumer_type} → GET the monitoring values for a given area and consumer type

    I want to highlight again that the FastAPI backend only reads the GCS bucket’s predictions. The inference step is done solely in the batch prediction pipeline.
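To illustrate the {} path-parameter syntax, here is a hedged sketch of the predictions endpoint. It reuses the api_router, get_settings() and schemas from the snippets above, and the parquet layout is again an assumption:

@api_router.get(
    "/predictions/{area}/{consumer_type}",
    response_model=schemas.PredictionResults,
)
def get_predictions(area: int, consumer_type: int) -> dict:
    """Return the cached predictions for one (area, consumer_type) time series."""
    settings = get_settings()

    fs = gcsfs.GCSFileSystem(
        project=settings.GCP_PROJECT,
        token=settings.GCP_SERVICE_ACCOUNT_JSON_PATH,
    )
    with fs.open(f"{settings.GCP_BUCKET}/predictions.parquet") as f:
        predictions = pd.read_parquet(f)

    # FastAPI parses and validates {area} and {consumer_type} as integers
    # before this function is even called.
    subset = predictions.xs((area, consumer_type), level=["area", "consumer_type"])

    return {
        "preds_datetime_utc": subset.index.to_list(),
        "preds_energy_consumption": subset["energy_consumption"].to_list(),
    }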

You can also go to "http://<host>:8001/api/v1/docs" to access the Swagger docs of the API, where you can easily see and test all your endpoints:

Screenshot of the Swagger API docs [Image by the Author].

That’s it! Now you know how to build a FastAPI backend. Things might get more complicated when adding a database layer and user sessions, but you have learned all the main concepts that will get you started!


Streamlit Predictions Dashboard

Access the code under app-frontend/frontend.

Using Streamlit is quite simple. The whole UI is defined using the code below, which does the following:

  • it defines the title,
  • it makes a request to the backend for all possible area types & creates a dropdown based on it,
  • it makes a request to the backend for all possible consumer types & creates a dropdown based on it,
  • based on the currently chosen area and consumer type, it builds and renders a Plotly chart.
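Here is a minimal sketch of that UI; API_URL, TITLE, and build_data_plot are assumed names for the constants and the plotting helper discussed in this section:

import requests
import streamlit as st

from settings import API_URL, TITLE  # hypothetical constants module
from components import build_data_plot  # hypothetical plotting helper

st.set_page_config(page_title=TITLE)
st.title(TITLE)

# Request all possible area types from the backend and build a dropdown.
area_response = requests.get(f"{API_URL}/area_values")
area = st.selectbox("Area", options=area_response.json().get("values", []))

# Request all possible consumer types from the backend and build a dropdown.
consumer_type_response = requests.get(f"{API_URL}/consumer_type_values")
consumer_type = st.selectbox(
    "Consumer Type", options=consumer_type_response.json().get("values", [])
)

# Based on the current selection, build and render the Plotly chart.
if area is not None and consumer_type is not None:
    st.plotly_chart(build_data_plot(area, consumer_type))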

    Straightforward, right?

Note that we could have made additional checks for the status code of the HTTP requests. For example, if the request status code differs from 200, display a text with "The server is down." But we wanted to keep things simple and emphasize only the Streamlit code ✌️

We moved all the constants to a different file to be easily accessible all over the code. As a next step, you could make them configurable through a .env file, similar to the FastAPI setup.

Now, let’s see how we built the chart 🔥

This part contains no Streamlit code, only some Pandas and Plotly code.

The build_data_plot() function performs 3 main steps:

  1. It requests the prediction data for an area and consumer type from the FastAPI backend.
  2. If the response is valid (status_code == 200), it extracts the data from the response and builds a DataFrame from it. Otherwise, it creates an empty DataFrame to pass the same structure further.
  3. It builds a line plot – plotly graph using the DataFrame computed above.
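A hedged sketch of build_data_plot(), assuming the response fields from the FastAPI sketches above and the build_dataframe() helper described right after:

import plotly.graph_objects as go
import requests

from settings import API_URL  # hypothetical constants module


def build_data_plot(area: int, consumer_type: int) -> go.Figure:
    """Request the predictions for one time series and render them as a line plot."""
    response = requests.get(f"{API_URL}/predictions/{area}/{consumer_type}")

    if response.status_code == 200:
        json_data = response.json()
        df = build_dataframe(
            json_data["preds_datetime_utc"], json_data["preds_energy_consumption"]
        )
    else:
        # Keep the same DataFrame structure downstream, even if the request fails.
        df = build_dataframe([], [])

    fig = go.Figure()
    fig.update_layout(
        title=f"Predictions for area {area}, consumer type {consumer_type}"
    )
    fig.add_scatter(x=df["datetime_utc"], y=df["energy_consumption"], name="predictions")

    return fig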

The role of the build_dataframe() function is to take 2 lists:

  1. a list of datetimes, which will be used as the X-axis of the line plot;
  2. a list of values, which will be used as the Y-axis of the line plot;

…and to convert them into a DataFrame. If some data points are missing, we resample the datetimes to a 1H frequency so the data is continuous and the missing data points are highlighted.
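Under those assumptions, a minimal sketch of build_dataframe() could look like this:

from typing import List

import pandas as pd


def build_dataframe(datetime_utc: List[str], values: List[float]) -> pd.DataFrame:
    """Convert two parallel lists into a DataFrame resampled to an hourly frequency."""
    df = pd.DataFrame(
        {"datetime_utc": pd.to_datetime(datetime_utc), "energy_consumption": values}
    )

    # Resample to 1H so that gaps in the data show up as missing points in the plot.
    if not df.empty:
        df = df.set_index("datetime_utc").resample("1H").asfreq().reset_index()

    return df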

Quite simple, right? That is why people love Streamlit.


Streamlit Monitoring Dashboard

The monitoring code can be accessed under app-monitoring/monitoring.

You will see that the code is almost identical to the predictions dashboard.

When defining the Streamlit UI structure, we additionally implemented a plot containing the aggregated metrics and a divider.

The nice thing about decoupling the definition of the UI components from the data access is that you can inject any data into the UI without modifying it, as long as you respect the interface of the expected data.

The build_metrics_plot() function is almost identical to the build_data_plot() function from the predictions dashboard, except for the data we request from the API.

The same story goes for the build_data_plot() function from the monitoring dashboard:

As you can see, all the data access and manipulation are handled on the FastAPI backend. The Streamlit UI’s job is to request and display the data.

It is nice that we just reused 90% of the predictions dashboard code to build a friendly monitoring dashboard.


Wrap Everything with Docker

The final step is to Dockerize the 3 web applications and wrap them up in a docker-compose file.

Thus, we can start the whole web application with a single command:

docker compose -f deploy/app-docker-compose.yml --project-directory . up --build

Here is the FastAPI Dockerfile:
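In rough strokes, it follows the pattern below; this is a sketch with an assumed base image and paths, so check the repository for the exact file:

FROM python:3.9-slim

WORKDIR /app

# Install Poetry and copy only the dependency manifests first, so this
# expensive layer is cached as long as the dependencies don't change.
RUN pip install --no-cache-dir poetry
COPY pyproject.toml poetry.lock ./
RUN poetry config virtualenvs.create false && poetry install --no-root --no-interaction

# Only now copy the application code; editing it won't invalidate the layers above.
COPY . .

CMD ["bash", "run.sh"]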

One interesting thing to highlight is that we initially copy and install only the Poetry dependencies. Thus, when you modify the code, the Docker image is rebuilt only from the step that copies your code onwards (line 19 in the repository’s Dockerfile).

This is a common strategy to leverage the Docker caching features when building an image to speed up your development process, as you rarely add new dependencies and installing them is the most time-consuming step.

Also, inside run.sh we call:

/usr/local/bin/python -m api

But wait, there is no Python file in the command 😟

Well, you can actually define a __main__.py file inside a module, making your module executable.

Thus, when you call the api module, Python executes its __main__.py file.

In our case, in the __main__.py file, we use the uvicorn web server to start the FastAPI backend and configure it with the right IP, port, log_level, etc.
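A minimal sketch of such a __main__.py, assuming the get_settings() helper shown earlier and a hypothetical api.application:app module path:

import uvicorn

from api.config import get_settings  # the Settings helper sketched earlier


if __name__ == "__main__":
    settings = get_settings()
    uvicorn.run(
        "api.application:app",  # hypothetical module path to the FastAPI app
        host="0.0.0.0",
        port=settings.PORT,
        log_level=settings.LOG_LEVEL,
    )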

Here is the Streamlit predictions dashboard Dockerfile:
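Again as a sketch, with an assumed base image, entry file, and port:

FROM python:3.9-slim

WORKDIR /app

# Same dependency-caching pattern as for the FastAPI image.
RUN pip install --no-cache-dir poetry
COPY pyproject.toml poetry.lock ./
RUN poetry config virtualenvs.create false && poetry install --no-root --no-interaction

COPY . .

# The only notable difference: start the Streamlit app instead of the API.
CMD ["streamlit", "run", "main.py", "--server.port", "8501", "--server.address", "0.0.0.0"]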

As you can see, this Dockerfile is almost identical to the one used for the FastAPI backend, except for the last CMD command, which is a standard CLI command for starting your Streamlit application.

The Streamlit monitoring dashboard Dockerfile is identical to the predictions dashboard Dockerfile. So it is redundant to copy-paste it here.

The good news is that you can leverage the Dockerfile template I showed you above to Dockerize most of your Python applications ✌️

Finally, let’s see how to wrap up everything with docker-compose. You can access the file under deploy/app-docker-compose.yml:
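In broad strokes, it looks something like this; the service names, ports, and paths are assumptions, so refer to the repository for the exact file:

version: "3.9"

services:
  app-api:
    build:
      context: .
      dockerfile: app-api/Dockerfile
    ports:
      - "8001:8001"
    # Only the API loads sensitive values from a .env file.
    env_file:
      - app-api/.env

  app-frontend:
    build:
      context: .
      dockerfile: app-frontend/Dockerfile
    ports:
      - "8501:8501"
    # Wait for the API to start before starting the dashboard.
    depends_on:
      - app-api

  app-monitoring:
    build:
      context: .
      dockerfile: app-monitoring/Dockerfile
    ports:
      - "8502:8501"
    depends_on:
      - app-api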

As you can see, the frontend and monitoring services must wait for the API to turn on before starting.

Also, only the API needs to load the credentials from a .env file.

Now, you can run the entire web application using only the following command, and Docker will take care of building the images and running the containers:

docker compose -f deploy/app-docker-compose.yml --project-directory . up --build

Conclusion

Congratulations! You finished the sixth lesson from the Full Stack 7-Steps MLOps Framework course. It means that now you understand how to consume the predictions of your ML system to build your awesome application.

In this lesson, you learned how to:

  • consume the predictions & monitoring metrics from GCS,
  • build a FastAPI backend to load and serve the data from GCS,
  • implement a dashboard in Streamlit to show the predictions,
  • create a monitoring dashboard in Streamlit to visualize the performance of the model.

Now that you understand the flexibility of building an application on top of an ML system that uses a batch prediction architecture, you can easily design full-stack Machine Learning applications.

Check out Lesson 7 for the final step of the Full Stack 7-Steps MLOps Framework, which is to deploy everything to GCP and build a CI/CD pipeline using GitHub Actions.

Also, you can access the GitHub repository here.


💡 My goal is to help machine learning engineers level up in designing and productionizing ML systems. Follow me on LinkedIn or subscribe to my weekly newsletter for more insights!

🔥 If you enjoy reading articles like this and wish to support my writing, consider becoming a Medium member. Using my referral link, you can support me without extra cost while enjoying limitless access to Medium’s rich collection of stories.


Thank you ✌🏼 !


References

[1] Energy Consumption per DE35 Industry Code from Denmark API, Denmark Energy Data Service

[2] Path Parameters, FastAPI Documentation

