
Introduction
As a Data Scientist, you might want to build dashboards to visualize data, or even implement business applications that help stakeholders make actionable decisions.
Multiple tools and technologies can be used to perform those tasks, whether open-source or proprietary software. However, these might not be ideal for the following reasons:
- Some open-source technologies have a steep learning curve and require hiring individuals with the appropriate expertise. Consequently, organizations may face increased onboarding time for new employees, higher training costs, and potential challenges in finding qualified candidates.
- Other open-source solutions are great for prototypes but will not scale to a production-ready application.
- Similarly, proprietary tools also come with their own challenges, including higher licensing costs, limited customization, and difficulty for businesses to switch to other solutions.
Wouldn’t it be nice if there was a tool that is not only open-source but also easy to learn and able to scale into a full application?
That’s where Taipy comes in handy 🎉
This article will explain what Taipy is, along with some business cases that it can solve before exploring its key features. Furthermore, it will illustrate all the steps to create a full web application.
What is Taipy and why should you care?
Taipy is an open-source, 100% Python library that requires only basic knowledge of Python programming. It allows data scientists, machine learning engineers, and any other Python programmer to quickly turn their data and machine learning models into fully functional web applications.
In today’s rapidly changing environment, the demand for robust, flexible, and powerful tools becomes essential, and below are some of the features that make Taipy such a unique platform:
- It is not exclusively designed for pilots but can also be extended to industrialized projects.
- The simplicity of Taipy combined with powerful functionalities allows Python developers with a minimal programming background to build robust solutions in a short amount of time.
- A high level of customizability lets users quickly modify and adapt Taipy's functionalities to their needs, providing a personalized experience that many open-source tools fail to offer.
- The synchronous and asynchronous calls provided by Taipy allow the execution of multiple tasks simultaneously, which improves its overall performance.
- A Taipy application can be developed using Python scripts or Jupyter Notebooks.
- With Taipy’s pipeline versioning capability, users can effectively manage different project versions.
The Taipy Studio extension for Visual Studio Code can be installed to significantly accelerate the development of Taipy applications.
Key Features of Taipy
Even though Taipy is great for front-end or back-end development on its own, its true potential shines when developing a full web app with both front-end and back-end components.
Let’s have a closer look at the main features of each one of them:
Taipy Front-End Functionalities
- Creating a user interface requires only basic knowledge of Python programming.
- Taipy is designed to be user-friendly, which makes the user interface creation simple and intuitive.
- No web design knowledge is required: Taipy eliminates all CSS and HTML prerequisites.
- It leverages augmented markdown syntax to assist users in the creation of their desired web pages.
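For instance, the augmented syntax lets standard Markdown and Taipy visual elements live in the same page string. A small sketch (the bound variable name `record_count` is hypothetical):

```
# Breach Overview
Records loaded: <|{record_count}|text|>
```

The first line is a plain Markdown H1 heading; the second mixes text with a Taipy `text` element bound to a Python variable.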
Taipy Back-End Functionalities
- Taipy supports the creation of a robust pipeline to handle different scenarios.
- It makes the modeling of Directed Acyclic Graphs (DAGs) straightforward.
- The data caching feature improves the overall performance of Taipy applications.
- A registry of pipeline executions.
- Pipeline versioning.
- Users can track and evaluate their applications’ performance with Taipy’s KPI tracking tool.
- Built-in Visualization of your pipelines and associated data.
Getting started with Taipy
Now that you have a better understanding of Taipy, let’s dive into an end-to-end implementation.
The core Taipy documentation and community contributions contain relevant information; this article is by no means a replacement for them, but it can serve as an alternative starting point for learning about Taipy in a real-world scenario.
To better illustrate our case, we will use the health-related data breaches maintained by the U.S. Department of Health and Human Services Office for Civil Rights. It provides information on reported breaches of unsecured protected health information affecting 500 or more individuals.
This section will be two-fold:
- Build a graphical interface using Taipy to help end users have a global overview of different types of breaches for actionable decision-making.
- Develop a Taipy back-end framework to interact with a classification machine learning model in order to predict the type of breach for a given breach record.
Quick installation
Using Taipy requires Python 3.8 or above. In this article, the Anaconda Python distribution (conda) and the Visual Studio Code IDE are used, and Taipy is installed as follows:
Create the virtual environment with the name taipy-env and install Python 3.8
conda create --name taipy-env python=3.8
Activate the previously created environment
conda activate taipy-env
The following command installs the taipy library within the virtual environment
pip install taipy
Running a Taipy App
- Create a Python script file (e.g. taipy_app.py)
- Enter the following code, then save the file:
from taipy import Gui

analytics_choice = ["Breach types distribution",
                    "Breach by State",
                    "Top 10 Risky States",
                    "Covered Entity Type",
                    ""]
choice = ""

my_app_page = """
# Security Breach Analytics Dashboard
## Breach Analysis
Please choose from the list below to start your analysis
<|{choice}|selector|lov={analytics_choice}|dropdown|>
Your choice: <|{choice}|text|>
"""

if __name__ == '__main__':
    Gui(page=my_app_page).run(host="0.0.0.0", port=9696)
In the conda console, from the folder containing taipy_app.py, type the command below:
python taipy_app.py
Successful execution of the code above generates this URL and automatically opens a browser window:

That’s awesome!
Now, let’s understand the previous code.
- Import the Gui module used for creating dashboards.
- analytics_choice is the list of possible choices.
- The variable choice will hold a value from analytics_choice; the interpolation of these variables into the page is done using the <|…|> syntax.
- my_app_page contains the page content below in markdown format:
- "Security Breach Analytics Dashboard" is an H1 heading, marked with a single "#" symbol.
- "Breach Analysis" is an H2 heading, marked with a double "##" symbol, followed by the simple text "Please choose from … analysis".
- We create a dropdown list using the analytics_choice and choice variables.
- Display the choice made by the user.
Finally, run the application by passing my_app_page and specifying the host and port. If no server port is specified, the app opens on the default port (5000). In this specific example, the app opens on port 9696 at http://localhost:9696
Time to create a Taipy Dashboard from Scratch
Let’s take our Taipy knowledge to the next level by implementing a complete dashboard. The main sections of the dashboard will leverage the following visual elements of Taipy:
- Make a choice from a list of options using Selectors.
- Trigger an action by clicking the button using Buttons.
- Show the raw data in Tables.
- Display the graphical results with Charts.
All these visualization elements mentioned above are created by introducing the following markdown syntax:
<|{variable}|visual_element_name|param1=param1|param2=param2|…|>
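For instance, this generic pattern instantiates into concrete elements like the following sketch. The bound names `value`, `df`, and `on_refresh` are hypothetical; the element and parameter names follow Taipy's visual-element documentation:

```
<|{value}|slider|min=0|max=100|>
<|{df}|table|page_size=10|>
<|Refresh|button|on_action=on_refresh|>
```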
The final dashboard will appear as follows, and the final source code is available at the end of the article.
To perform a step-by-step illustration, an example of each component will be given in a separate file and each file is run with the following command:
python file_name.py
Selectors
Selectors give users the ability to choose from a dropdown list; they correspond to what we implemented in the "Running a Taipy App" section.
Buttons and Tables
Buttons in the user interface initiate a specific action when clicked or pressed: the function referenced by _on_action_ is triggered upon the button press.
Tables, on the other hand, are used to organize and display data, offering several display modes: _paginated_, _allow_all_rows_, _unpaginated_, and _auto_loading_. The official documentation provides more information about each of these modes.
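As a rough sketch of how a display mode is selected (assuming a `breach_data` dataframe, with property names as in Taipy's table documentation):

```
<|{breach_data}|table|page_size=20|>
<|{breach_data}|table|show_all|>
<|{breach_data}|table|auto_loading|>
```

Here `page_size=20` paginates the rows, `show_all` renders the table unpaginated, `auto_loading` loads rows lazily while scrolling, and `allow_all_rows` would add an "All" entry to the page-size selector.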
Create a new file button.py with the following code:
from taipy import Gui
import pandas as pd

breach_data = pd.read_csv("data/breach_report_data.csv")

def toggle_table_dialog(state):
    state.show_table_dialog = not state.show_table_dialog

show_table_dialog = False

my_app_page = """
<center> Security Breach Analytics Dashboard</center>
------------------------------
<br/>
<center> Click the Button below to display data </center>
<br/>
<center><|Display Raw Data|button|on_action=toggle_table_dialog|></center>
<|{show_table_dialog}|dialog|on_action=toggle_table_dialog|width=90vw|labels=Cancel|
<center><|{breach_data}|table|width=fit-content|height=65vh|></center>
|>
"""

if __name__ == "__main__":
    Gui(page=my_app_page).run(host="0.0.0.0", port=9696)
We start by loading the breach data into a Pandas dataframe. Then, selecting "Display Raw Data" displays the whole data in a table format as shown below:
Charts
With a better understanding of the above components, we can combine them to create charts, built upon the comprehensive plotly.js graphing library. Taipy's documentation provides great examples to serve as starting points. Similarly to the previous section, create a charts.py file with the following code:
A chart of type bar is created with State on the x-axis and Individuals_Affected on the y-axis.
from taipy import Gui
import pandas as pd

breach_df = pd.read_csv("data/breach_report_data.csv")

my_app_page = """
<center> Security Breach Analytics Dashboard</center>
------------------------------
<center> Graph 3: Top 10 Most Affected States</center>
<br/>
<|{breach_df}|chart|type=bar|x=State|y=Individuals_Affected|>
"""

if __name__ == "__main__":
    Gui(page=my_app_page).run(host="0.0.0.0", port=9696)
The final result is this dynamic chart of the number of individuals affected by State, and California seems to be the most affected.
Display an Image
Displaying an image in Taipy GUI is also straightforward and can be achieved with the image visual element. The following code displays the word cloud generated by the generate_word_cloud helper. The image has a width of 2400 pixels and a height of 1000 pixels. Whenever the user's mouse hovers over the image, the value of the hover_text attribute is shown: "Word Cloud of Breach Location" in this specific scenario.
<|{breach_location_image}|image|width="2400px"|height="1000px"|hover_text="Word cloud of Breach Location"|>

Also, the helper function generate_word_cloud is defined as follows:
from wordcloud import WordCloud
from io import BytesIO

def generate_word_cloud(data, column_name):
    # Join all the location information into one long string
    text = ' '.join(data[str(column_name)])
    wordcloud = WordCloud(
        background_color="#1E3043"
    )
    # Generate the word cloud and convert it to PNG bytes
    my_wordcloud = wordcloud.generate(text)
    image = my_wordcloud.to_image()
    my_buffer = BytesIO()
    image.save(my_buffer, format='PNG')
    return my_buffer.getvalue()
Callback function
The goal is to have a dynamic GUI that updates based on the user's selection. This is achieved using Taipy's callback mechanism, which automatically registers any function named on_change in the local namespace as the global callback function. The implementation is given as follows:
def update_Type_of_Breach(state, var_name, var_value):
    if var_name == "Type_of_Breach":
        state.df = breach_df[breach_df.Type_of_Breach == var_value]
Layouts
Multiple charts can provide valuable business insights, but displaying them vertically one after another may not be the most effective approach.
Instead, we can create a layout to organize the components into a regular grid, placed between the opening <|layout| tag and its closing |> tag. Each component is created within its own part block, either named (between <name| and |name>) or anonymous (between <| and |>).
The following basic syntax creates a two-column grid with a gap of 1.8rem (1.8 times the root element's font size):
<|layout|columns=1 2|gap=1.8rem|
<optional_id|
<|{first content}|>
|optional_id>
...
<|
<|{second content}|>
|>
|>
With this understanding of the layout, we can create the final dashboard with five main charts:
- Chart 1 gives the word cloud related to the location of breach information.
- Chart 2 shows the number of individuals affected by State.
- Chart 3 determines the total number of individuals affected by the Type of breach.
- Chart 4 gives for each year the total number of individuals affected.
- Chart 5 shows the number of individuals affected per Covered Entity.
# Preprocessing of the DateTime column
breach_df['Breach_Submission_Date'] = pd.to_datetime(breach_df['Breach_Submission_Date'])
breach_df["Year"] = breach_df["Breach_Submission_Date"].dt.year
markdown = """
<|toggle|theme|>
# <center>Security Breach Analytics Dashboard 🚨 </center>
<center>**Chart 1:** General Trend Location of Breached Information </center>
<center><|{breach_location_image}|image|width=2400px|height=1000px|hover_text=Word cloud of Breach Location|></center>
------------------------------
<|layout|columns=2 5 5|gap=1.5rem|
<column_1|
### Type of Breach:
<|{breach_type}|selector|lov={breach_types}|dropdown|width=100%|>
------------------------------
<|Display Raw Data|button|on_action=toggle_table_dialog|>
<|{show_table_dialog}|dialog|on_action=toggle_table_dialog|width=90vw|labels=Cancel|
<center><|{breach_df}|table|width=fit-content|height=65vh|></center>
|>
|column_1>
<column_2|
**Chart 2:** Individuals Affected by State
<|{df}|chart|type=bar|x=State|y=Individuals_Affected|>
**Chart 4:** Individuals Affected by Year
<|{df}|chart|type=bar|x=Year|y=Individuals_Affected|>
|column_2>
<column_3|
**Chart 3:** Individuals Affected by Type of Breach
<|{df}|chart|type=bar|x=Type_of_Breach|y=Individuals_Affected|>
**Chart 5:** Individuals Affected per Covered Entity Type
<|{df}|chart|type=bar|x=Covered_Entity_Type|y=Individuals_Affected|>
|column_3>
|>
"""
if __name__ == "__main__":
    gui = Gui(page=markdown)
    gui.run(dark_mode=False, host="0.0.0.0", port=9696)
Before configuring the dashboard, a new Year column is created from the Breach_Submission_Date column, which is then used as the x-axis in Chart 4.
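The Year derivation can be checked in isolation; here is a minimal sketch using two made-up submission dates in place of the real CSV column:

```python
import pandas as pd

# Two hypothetical submission dates standing in for Breach_Submission_Date
df = pd.DataFrame({"Breach_Submission_Date": ["2021-03-15", "2019-07-02"]})

# Parse the strings into datetimes, then derive the Year column
df["Breach_Submission_Date"] = pd.to_datetime(df["Breach_Submission_Date"])
df["Year"] = df["Breach_Submission_Date"].dt.year

print(df["Year"].tolist())  # → [2021, 2019]
```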
Running all the code should generate the first dashboard illustrated above.
Taipy Back-end in Action
In this section, you will use Taipy's back-end capabilities to easily and efficiently create, manage, and execute the data pipelines that train a Random Forest classifier to predict the type of breach from a given breach record.
There are two main parts in this section. First, you will build the complete graphical representation of the workflow using Taipy Studio. Then, you will write the corresponding Python code.
Taipy Studio
Taipy Studio is an extension to Visual Studio Code and can be installed as follows:
Restart VSCode after the installation is completed; a Taipy Studio interface will be displayed after clicking on the Taipy logo at the bottom left. This shows a Config Files panel along with four main sections: Data Nodes, Tasks, Pipelines, and Scenarios.
All these sections can be used to achieve our goal of implementing an end-to-end pipeline. The first step is to create a configuration file (taipy_config.toml); after selecting the "Taipy: Show View" icon, the four building blocks are represented by four logos at the top right.


Below are the main functions that will be implemented, along with a brief explanation of each:
- filter_columns: selects the relevant columns from the data and generates a Pandas dataframe.
- preprocess_columns: performs feature engineering.
- encode_features: encodes the relevant features in the correct format.
- split_data: splits the data into training and testing datasets.
- train_model: trains the model.
- show_performance: the final stage, displaying the performance of the model.
Scenarios and Pipelines
This is the first thing to set up in a pipeline. A scenario is made up of one or more pipelines and works as a registry of executions. Let's create a scenario named DATA_BREACH_SCENARIO, followed by the pipeline DATA_BREACH_PIPELINE, as follows:

Tasks
A task refers to a Python function that can be executed, and there are six tasks to implement overall, from filter_columns to show_performance.
The output of each task is connected to the input of the next as follows:

The next step is to configure these tasks in Taipy Studio by connecting each Python function to the corresponding task. But before that, we need to create those functions' signatures in the data_breach_tasks.py file as follows:
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (
    accuracy_score,
    precision_score,
    recall_score,
    f1_score
)

def filter_columns(df, list_columns_to_skip):
    filtered_df = df.drop(list_columns_to_skip, axis=1)
    return filtered_df

def preprocess_columns(df):
    df['Breach_Submission_Date'] = pd.to_datetime(df['Breach_Submission_Date'])
    df['Breach_Submission_Month'] = df['Breach_Submission_Date'].dt.month
    df['Breach_Submission_Year'] = df['Breach_Submission_Date'].dt.year
    df.drop("Breach_Submission_Date", axis=1, inplace=True)
    return df

def encode_features(df):
    list_columns_to_encode = ['State', 'Location_of_Breached_Information',
                              'Business_Associate_Present',
                              'Covered_Entity_Type']
    le = LabelEncoder()
    for col in list_columns_to_encode:
        df[col] = le.fit_transform(df[col])
    X = df.drop('Type_of_Breach', axis=1)
    y = le.fit_transform(df['Type_of_Breach'])
    return {"X": X, "y": y}

def split_data(features_target_dict):
    X_train, X_test, y_train, y_test = train_test_split(
        features_target_dict["X"],
        features_target_dict["y"],
        test_size=0.3,
        random_state=42)
    return {
        "X_train": X_train, "X_test": X_test,
        "y_train": y_train, "y_test": y_test
    }

def train_model(train_test_dictionary):
    classifier = RandomForestClassifier()
    classifier.fit(train_test_dictionary["X_train"],
                   train_test_dictionary["y_train"])
    # predict() only takes the features, not the labels
    predictions = classifier.predict(train_test_dictionary["X_test"])
    return predictions

def show_performance(train_test_dictionary, predictions):
    y_test = train_test_dictionary["y_test"]
    accuracy = accuracy_score(y_test, predictions)
    # Type_of_Breach is multiclass, so an averaging strategy is required
    precision = precision_score(y_test, predictions, average="weighted")
    recall = recall_score(y_test, predictions, average="weighted")
    f1score = f1_score(y_test, predictions, average="weighted")
    return pd.DataFrame({
        "Metrics": ['accuracy', 'precision', 'recall', 'f1_score'],
        "Values": [accuracy, precision, recall, f1score]
    })
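Before wiring these functions into Taipy, the pandas-only steps can be sanity-checked by chaining them directly. A minimal sketch on a tiny made-up sample (the column values are hypothetical; the function bodies match those in data_breach_tasks.py):

```python
import pandas as pd

def filter_columns(df, list_columns_to_skip):
    # Drop the unwanted columns, as in data_breach_tasks.py
    return df.drop(list_columns_to_skip, axis=1)

def preprocess_columns(df):
    # Derive month/year features from the submission date, as in data_breach_tasks.py
    df["Breach_Submission_Date"] = pd.to_datetime(df["Breach_Submission_Date"])
    df["Breach_Submission_Month"] = df["Breach_Submission_Date"].dt.month
    df["Breach_Submission_Year"] = df["Breach_Submission_Date"].dt.year
    df.drop("Breach_Submission_Date", axis=1, inplace=True)
    return df

# Tiny hypothetical sample standing in for the breach CSV
sample = pd.DataFrame({
    "Name_of_Covered_Entity": ["A", "B"],
    "State": ["CA", "TX"],
    "Breach_Submission_Date": ["2021-03-15", "2019-07-02"],
})

out = preprocess_columns(filter_columns(sample, ["Name_of_Covered_Entity"]))
print(list(out.columns))
# → ['State', 'Breach_Submission_Month', 'Breach_Submission_Year']
```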
Next, we link each task to the corresponding Python function following the 3 steps below. The illustration is given for the filter_columns task but has to be performed for every task.

Data Nodes
Data nodes do not contain the actual data but contain all the necessary information to read and write those data. They can be the reference to any data type such as text, CSV, JSON, and more.
For instance, the filter_columns function has:
- One input node (filtering_node), which is a .CSV file, and
- One output node (filtered_df), also stored as a .CSV file. This is then used as the input of the preprocess_columns function.
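As an illustration of what such a node looks like in the generated configuration, here is a hedged sketch of a CSV data-node entry; the exact keys depend on the Taipy version, and the `default_path` value is an assumed example:

```toml
[DATA_NODE.filtered_df]
storage_type = "csv"
default_path = "data/filtered_df.csv"
```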
The data node configuration is shown below, including the modification of the storage type from the default pickle to .csv:


The next step is to define the path to the original input dataset. This is done with the help of the "New property" attribute in the data node: press Enter, then provide the path to the .CSV file.


Repeat the same process for all the inputs where a .CSV file is required, and the final diagram will look like this after specifying all the data nodes and their relationships.

After the configuration of the pipeline, a .toml script format of the whole diagram is generated in the taipy_config.toml file and looks like the one shown in the animation below.
Then, this .toml file can be loaded in any Python script to execute the pipeline. Let's create such a file with the name run_pipeline.py.
from taipy import Core, create_scenario
from taipy.core.config import Config

config_file_name = "./taipy_config.toml"
scenario_name = "DATA_BREACH_SCENARIO"

Config.load(config_file_name)
scenario_config = Config.scenarios[scenario_name]

if __name__ == "__main__":
    Core().run()
    pipeline_scenario = create_scenario(scenario_config)
    pipeline_scenario.submit()  # This executes the scenario
    model_metrics = pipeline_scenario.performance_data.read()
    print(model_metrics)
We start by importing the relevant modules, followed by the definition of the configuration file and the name of the scenario to trigger.
Then, the pipeline is executed using the submit() function.
Finally, we retrieve the model’s performance and print the results, as shown below:

This dataframe can be further integrated into the initial dashboard to display these numerical values graphically.
Conclusion
This article has provided a complete overview of Taipy and how it brings a front end and back end to your data and machine learning models to create fully functional web applications.
Furthermore, with the new release, Taipy provides Core visual elements that allow seamless integration between the front end and the back end, empowering users to create powerful business objects effortlessly; these integrations are documented on the official website.
If you are still hesitant about using Taipy, it is time to give it a try to save time, energy, and most importantly, money. Finally, these awesome tutorials from Taipy can help you further your learning and strengthen your skill sets.