
TUTORIAL – TABLES – R
I am a Data Scientist, and most of the time, I think about a perfect way to visualize a vast amount of data to convey interesting findings to clients and team members. And to be honest, in most cases, if not in every case, showing the data and its structure in the form of a simple table is necessary and will help to improve the overall understanding.
However, in most cases, I use PowerPoint or Excel to make this table presentable and/or publishable. This, of course, means the result can no longer be reproduced automatically. For one of my latest projects, I learned about and applied a package that allowed me to create beautiful and publication-ready data tables without leaving my Data Science platform.
1 Introduction
In this article, I will show you how to use the Grammar of Tables (gt) package to create flawless and publication-ready tables, how to turn your settings into a theme for quick reuse, and how to apply this theme in your next data science project.

2 Setup
Most of my client work involves Python and Pandas. However, by training, I am an R person. I worked this out for the R data science platform, but I will investigate how this can be achieved using Python and Pandas for one of my upcoming articles.
Nevertheless, here is the list of software and packages that I am using:
- R & RStudio – the data science platform and IDE of choice.
- tidyverse package – this package allows me to write elegant, readable, and efficient code to manipulate data frames
- gt package – the Grammar of Tables (gt) package to create flawless table designs
- Gapminder package – excerpt of the Gapminder data on life expectancy, GDP per capita, and population by country
2.1 Brief overview of Grammar of Tables (gt)
The gt package follows a declarative approach to creating tables, similar to the Grammar of Graphics. In short, we specify what should happen rather than how it should happen, which makes for readable code.

The gt package defines a comprehensive set of table parts (areas) that you can add to your table and style individually. In the examples below, I will explain how to use these areas.
It is also important to note that you may create a table for your R notebook and save the table in several formats, including HTML and PNG, which is helpful if you need to report your tables in different publications, e.g., a website or a PowerPoint document.
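As a minimal sketch of the export step: gtsave() infers the output format from the file extension. The file name and the temporary output directory are assumptions for illustration. PNG export additionally relies on the webshot2 package (webshot in older gt releases), so it is only shown as a comment.

```r
library(gt)
library(gapminder)

# Build a small table from the first rows of the Gapminder data
tbl <- gt(head(gapminder))

# Save as HTML; the format is inferred from the file extension
out_dir <- tempdir()  # assumed location, use any directory you like
gtsave(tbl, filename = "gapminder_table.html", path = out_dir)

# PNG export works the same way but needs the webshot2 package:
# gtsave(tbl, filename = "gapminder_table.png", path = out_dir)
```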
2.2 Packages and Constants
Before I start creating a table, I share the code that loads, and if necessary installs, the required packages:
I also use some constants that help me write a flexible R script. I use c_rn to specify the maximum number of rows to include in the table, c_save to determine whether to save every step of the table creation process as a file (which takes a little time), and c_format to specify the output format.
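One way to write such a loader is sketched below. The package vector mirrors the list in the setup section; the loop itself is my assumption about how to do it, not necessarily the original code.

```r
# Packages used in this article
required_packages <- c("tidyverse", "gt", "gapminder")

for (pkg in required_packages) {
  # Install the package only if it is not available yet
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Attach the package; character.only is needed because pkg is a string
  library(pkg, character.only = TRUE)
}
```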
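The constants could be defined along these lines. All values are hypothetical placeholders, and c_col (a blue color palette used later for cell coloring) is included here with assumed hex codes.

```r
c_rn     <- 30       # maximum number of rows to include in the table (assumed value)
c_save   <- FALSE    # whether to save each step of the table as a file
c_format <- "html"   # output format, e.g. "html" or "png"

# Light-to-dark blue palette for conditional coloring (assumed colors)
c_col <- c("#deebf7", "#9ecae1", "#3182bd")
```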
The general output of the Gapminder data set looks like this:

3 Create a Flawless and Publication-ready Table
The most basic use of the gt package is just to pass the filtered data frame to the gt function. This is not too exciting, though, nor does it add any benefit over the standard console output.
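A minimal sketch of this step follows. The filter (year 2007, first c_rn rows) is my assumption; any filtered data frame works the same way.

```r
library(dplyr)      # part of the tidyverse
library(gt)
library(gapminder)

c_rn <- 30  # assumed maximum number of rows

# Assumed filter: a single year, limited to the first c_rn rows
gap_tbl <- gapminder %>%
  filter(year == 2007) %>%
  head(c_rn)

# The most basic gt table: pass the data frame to gt()
tbl <- gt(gap_tbl)
```

Printing tbl in RStudio or a notebook renders the table.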

3.1 Adding a Grouping Column
This might not be relevant to the majority of your data frames. However, I would like to show how grouping works and how it helps readers understand a table. To do so, I pass the column continent as the grouping column and specify the country column as the row label column.
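In gt, both settings are arguments of the gt() function itself. A short sketch, again assuming the 2007 filter and 30 rows:

```r
library(dplyr)
library(gt)
library(gapminder)

gap_tbl <- gapminder %>%
  filter(year == 2007) %>%  # assumed filter
  head(30)

tbl <- gap_tbl %>%
  gt(
    rowname_col   = "country",    # row labels (the table stub)
    groupname_col = "continent"   # one row group per continent
  )
```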

3.2 Adding Summary Rows
The following code allows us to add summary rows. What the summary contains is up to you and depends on what is worthwhile for your audience. I decided to add the functions sum, average, and standard deviation. Although not all summary functions make sense for this data set, I would like to show how to implement them.
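A sketch of per-group summary rows, assuming gt 0.9.0 or later (older releases used a formatter argument instead of fmt, and a different default for groups). The filter and the labels Sum, Average, and SD are my choices for illustration.

```r
library(dplyr)
library(gt)
library(gapminder)

gap_tbl <- gapminder %>%
  filter(year == 2007) %>%           # assumed filter
  head(30) %>%
  mutate(pop = as.numeric(pop))      # avoid integer overflow when summing

tbl <- gap_tbl %>%
  gt(rowname_col = "country", groupname_col = "continent") %>%
  summary_rows(
    groups  = everything(),                  # summarize every row group
    columns = c(lifeExp, pop, gdpPercap),
    fns = list(
      Sum     = ~ sum(.),
      Average = ~ mean(.),
      SD      = ~ sd(.)
    ),
    fmt = ~ fmt_number(., decimals = 1)      # format the summary values
  )
```

Groups that contain a single country have no defined standard deviation, so that summary cell stays empty.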

3.3 Changing the Label for each Column
I believe you have experienced this in your projects as well: how should you label your columns? Your data frame often uses technical names (i.e., short names without blank spaces), yet there are descriptive names that are more meaningful for your audience. An example in the Gapminder data set is the column lifeExp, which stands for "Life Expectancy". The gt package allows us to change the labels in the resulting table without changing them in the data set.
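Relabeling is done with cols_label(), which maps column names to display labels. The labels other than "Life Expectancy" are my own suggestions.

```r
library(dplyr)
library(gt)
library(gapminder)

gap_tbl <- gapminder %>%
  filter(year == 2007) %>%  # assumed filter
  head(30)

tbl <- gap_tbl %>%
  gt(rowname_col = "country", groupname_col = "continent") %>%
  cols_label(
    year      = "Year",
    lifeExp   = "Life Expectancy",
    pop       = "Population",
    gdpPercap = "GDP per Capita"
  )
```

The column names of gap_tbl itself stay untouched; only the rendered table shows the new labels.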

3.4 Formatting Columns
Formatting columns includes several things. In this example, I tell the package which columns are numbers and which are currencies, how they should be aligned, and how much space (in px) they should have. The function opt_row_striping() creates banded rows that improve the table’s readability.
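The sketch below combines these pieces. The number of decimals, the currency, and the pixel widths are assumed values.

```r
library(dplyr)
library(gt)
library(gapminder)

gap_tbl <- gapminder %>%
  filter(year == 2007) %>%  # assumed filter
  head(30)

tbl <- gap_tbl %>%
  gt(rowname_col = "country", groupname_col = "continent") %>%
  # Number columns
  fmt_number(columns = lifeExp, decimals = 1) %>%
  fmt_number(columns = pop, decimals = 0) %>%
  # Currency column
  fmt_currency(columns = gdpPercap, currency = "USD") %>%
  # Alignment and widths (in px)
  cols_align(align = "right", columns = c(lifeExp, pop, gdpPercap)) %>%
  cols_width(
    lifeExp   ~ px(120),
    pop       ~ px(150),
    gdpPercap ~ px(150)
  ) %>%
  # Banded rows
  opt_row_striping()
```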

3.5 Adding Titles, Footnotes, and Sources
If you plan to have all relevant meta-information as part of the table layout, the gt package will help you. The footnote options are especially useful because the location of a footnote can be determined by an expression. In the following example, a footnote is added to the country with the lowest population. Please note that it is possible to use Markdown to modify the text layout using the md function.
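A sketch of the three elements. The title, subtitle, source note, and footnote texts are placeholders of mine; the footnote is attached to the population cell of the row where pop equals the minimum.

```r
library(dplyr)
library(gt)
library(gapminder)

gap_tbl <- gapminder %>%
  filter(year == 2007) %>%  # assumed filter
  head(30)

tbl <- gap_tbl %>%
  gt(rowname_col = "country", groupname_col = "continent") %>%
  tab_header(
    title    = md("**Gapminder Data**"),   # md() renders Markdown
    subtitle = "Life expectancy, population, and GDP per capita"
  ) %>%
  tab_source_note(
    source_note = md("Source: *gapminder* R package")
  ) %>%
  tab_footnote(
    footnote  = "Lowest population in this table.",
    locations = cells_body(
      columns = pop,
      rows    = pop == min(pop)   # expression evaluated on the table data
    )
  )
```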

3.6 Applying Formatting to the Table
This is a rather long one, but I hope the code explains what is happening here. Please note that I use the functions tab_options() as well as tab_style(). As far as I can tell, tab_options() controls general, table-wide settings, while tab_style() is used to style specific locations. Please share your ideas to simplify the following code. Very much appreciated.
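A much shorter sketch than the original illustrates the split between the two functions. The chosen options, sizes, and colors are assumptions for illustration only.

```r
library(dplyr)
library(gt)
library(gapminder)

gap_tbl <- gapminder %>%
  filter(year == 2007) %>%  # assumed filter
  head(30)

tbl <- gap_tbl %>%
  gt(rowname_col = "country", groupname_col = "continent") %>%
  # Table-wide settings
  tab_options(
    table.font.size           = px(14),
    heading.align             = "left",
    column_labels.font.weight = "bold",
    row_group.font.weight     = "bold",
    table.border.top.color    = "black",
    table.border.bottom.color = "black",
    data_row.padding          = px(4)
  ) %>%
  # Styling of specific locations
  tab_style(
    style     = cell_text(style = "italic"),
    locations = cells_stub()
  ) %>%
  tab_style(
    style     = cell_fill(color = "#f0f0f0"),
    locations = cells_row_groups()
  )
```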

3.7 Applying Conditional Cell Coloring
Another helpful feature of the gt package is the ability to color cells based on values. In the following example, I will make use of it in two different ways. First, I apply blue shading to the column "Life Expectancy". For this, I use the color palette named c_col, which was specified as one of the constants at the beginning.
Second, I color the row with the minimum population blue.
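Both variants are sketched below, assuming gt 0.9.0 or later, where data_color() takes a palette argument (older releases expect a colors argument instead). The palette values and the highlight color are assumed.

```r
library(dplyr)
library(gt)
library(gapminder)

# Assumed light-to-dark blue palette
c_col <- c("#deebf7", "#9ecae1", "#3182bd")

gap_tbl <- gapminder %>%
  filter(year == 2007) %>%  # assumed filter
  head(30)

tbl <- gap_tbl %>%
  gt(rowname_col = "country", groupname_col = "continent") %>%
  # 1) Shade the life expectancy column according to its values
  data_color(columns = lifeExp, palette = c_col) %>%
  # 2) Highlight the whole row with the minimum population
  tab_style(
    style     = cell_fill(color = "lightblue"),
    locations = cells_body(
      columns = everything(),
      rows    = pop == min(pop)
    )
  )
```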

4 Creating a Reusable gt Theme
To create a theme, I needed to differentiate between settings that relate to the general look and settings that are tied to data-specific columns (which change with every data set). To showcase this, I will use the data set "Daily S&P 500 Index data" that is part of the gt package.
4.1 Create a gt Table from the S&P Data Set
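The data set ships with gt under the name sp500. A minimal sketch, assuming the table is limited to the first 30 rows:

```r
library(gt)

c_rn <- 30  # assumed maximum number of rows

# sp500 is included in the gt package
sp_tbl <- head(sp500, c_rn)

tbl_sp <- gt(sp_tbl)
```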

4.2 Create a Theme
I created a function my_theme() that can then be quickly applied to any of your gt tables.
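A theme is simply a function that takes a gt table and returns it with look-related settings applied. The sketch below is a reduced, hypothetical version of my_theme(); the specific options and colors are assumptions and contain nothing tied to particular data columns.

```r
library(gt)

# Hypothetical, reduced theme: only settings that do not depend on the data
my_theme <- function(gt_tbl) {
  gt_tbl %>%
    opt_row_striping() %>%
    tab_options(
      table.font.size           = px(14),
      heading.align             = "left",
      column_labels.font.weight = "bold",
      table.border.top.color    = "black",
      table.border.bottom.color = "black"
    ) %>%
    tab_style(
      style     = cell_text(color = "grey30"),
      locations = cells_column_labels(columns = everything())
    )
}

# Usage: apply the theme to any gt table
tbl_sp <- my_theme(gt(head(sp500, 30)))
```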

Please note that this theme is built with my limited knowledge. So please share ideas on how to improve and simplify this code. Very much appreciated.
4.3 Apply Column-specific formats
The remaining step is to format the S&P-specific columns. Firstly, I specify columns with a currency format.
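A sketch of the currency step, assuming the price columns of sp500 are formatted as US dollars and the volume as a whole number:

```r
library(gt)

tbl_sp <- gt(head(sp500, 30)) %>%   # 30 rows is an assumed limit
  # Price columns as currency
  fmt_currency(
    columns  = c(open, high, low, close, adj_close),
    currency = "USD"
  ) %>%
  # Trading volume as a whole number with thousands separators
  fmt_number(columns = volume, decimals = 0)
```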

Lastly, I add summary rows to the table, including mean and standard deviation.
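Because this table has no row groups, the sketch below uses grand_summary_rows(), which summarizes all rows at once. That function choice, the labels Mean and SD, and the fmt argument (gt 0.9.0 or later) are my assumptions.

```r
library(gt)

tbl_sp <- gt(head(sp500, 30)) %>%   # 30 rows is an assumed limit
  fmt_currency(
    columns  = c(open, high, low, close, adj_close),
    currency = "USD"
  ) %>%
  # Summary over all rows: mean and standard deviation
  grand_summary_rows(
    columns = c(open, high, low, close, adj_close),
    fns = list(
      Mean = ~ mean(.),
      SD   = ~ sd(.)
    ),
    fmt = ~ fmt_currency(., currency = "USD")
  )
```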

5 Conclusion
In this article, I introduced you to the gt package. Then, I showed you how to format a table, include summary rows, and apply conditional cell formatting. Further, I explained how to create an individual theme, which you might reuse for every data project.
Please note that I only scratched the surface of the gt package. Also, my knowledge is limited, and there may be better ways to achieve these results with less code. If you are interested, please consult other tutorials to learn more. I am happy to share some great ones in the following list:
- http://www.danieldsjoberg.com/gt-and-gtsummary-presentation/#1
- https://themockup.blog/static/slides/intro-tables.html#1
- https://malco.io/2020/05/16/replicating-an-nyt-table-of-swedish-covid-deaths-with-gt/
- https://themockup.blog/posts/2020-05-16-gt-a-grammer-of-tables/
What do you think? Do you have any feedback for me?
Please feel free to contact me with any questions and comments. Thank you.