
Data visualizations are great tools for inferring meaningful results from plain data. They are widely used in exploratory data analysis to better understand the data at hand. What if we integrate a few visualization structures into pandas dataframes? I think it makes them look better than plain numbers. Furthermore, it can add some informative power to the display of a dataframe.
We can achieve this with the style property of pandas dataframes. It returns a Styler object, which provides many options for formatting and displaying dataframes. In this post, we will walk through several examples and see how a dataframe can be displayed in different styles.
There are built-in style functions that we can use by adjusting parameters. We can also write our own style functions and pass them to the Styler object, which applies the styles before rendering.
There are two ways to use Styler objects. One is element-wise styling, which is done with the applymap method (renamed to map in pandas 2.1). The other is column- or row-wise styling, which requires the apply method.
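To make the difference concrete, here is a minimal sketch of what each kind of style function receives and returns. The names color_sign, mark_max and demo are made up for illustration; the functions are called directly, without a Styler, just to show the shape of their output.

```python
import pandas as pd


def color_sign(val):
    """Element-wise: receives one scalar, returns one CSS string."""
    return "color: red" if val < 0 else "color: black"


def mark_max(col):
    """Column- or row-wise: receives a Series, returns one CSS string per element."""
    return ["font-weight: bold" if v == col.max() else "" for v in col]


# Hypothetical demo data, not the dataframe used in the rest of the post.
demo = pd.DataFrame({"x": [1.0, -2.0, 3.0], "y": [-4.0, 5.0, 0.5]})

# Calling the functions directly shows what the Styler would get back.
scalar_css = color_sign(demo.loc[1, "x"])
column_css = mark_max(demo["y"])
print(scalar_css)
print(column_css)
```

An element-wise function returns a single string, while a column- or row-wise function must return as many strings as the Series has elements.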
Let’s first create a sample dataframe with numpy and pandas.
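import numpy as np
import pandas as pd

df = pd.DataFrame({'A': np.linspace(1, 8, 8),
                   'B': np.random.random(8),
                   'C': np.random.randn(8),
                   'D': np.random.randn(8),
                   'E': np.random.randint(-5, 5, 8)})
# Put a few missing values in columns "B" and "D"
df.iloc[[1, 5], [1, 3]] = np.nan
df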

It looks plain and simple. We can write a function that displays some values in a different color based on a condition. For instance, we can choose to display negative values in red. Here is the function to accomplish this task.
def color_negative_values(val):
    # Return a CSS rule: red text for negative values, black otherwise
    color = 'red' if val < 0 else 'black'
    return 'color: %s' % color
Then we just pass it to the applymap method.
df.style.applymap(color_negative_values)

Applymap executes element-wise operations, whereas apply works on whole columns or rows. Here is a function that changes the background color of the maximum value in a column.
def color_max(s):
    # s is one column (or row) as a Series; return one CSS string per element
    is_max = s == s.max()
    return ['background-color: lightblue' if v else '' for v in is_max]
We just need to pass it to the apply method.
df.style.apply(color_max)

We can also apply this function to rows by setting the axis parameter to 1.
df.style.apply(color_max, axis=1)

The maximum value of each row is colored. They all happen to be in column "A" in this case.
We can combine different style functions by chaining them.
df.style.applymap(color_negative_values).apply(color_max)

Style functions can be applied to only part of a dataframe by selecting particular rows or columns with the subset parameter.
df.style.apply(color_max, subset=['B','C'])

The color_max function is applied only to columns "B" and "C".
In addition to custom functions, pandas has some built-in style functions that cover common tasks. For instance, the highlight_null function marks missing values.
df.style.highlight_null(null_color='yellow')

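We can change the color with the null_color parameter; pandas 1.5 and later call it color instead. Another useful built-in function is background_gradient, which shades each cell in proportion to its value. It requires matplotlib, and here we get the color map with some help from seaborn.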
import seaborn as sns
cm = sns.light_palette("green", as_cmap=True)
df.style.background_gradient(cmap=cm)

The bigger the value, the darker the background color. Missing values are separated from the rest.
The highlight_max and highlight_min functions mark the maximum and minimum values in a column or row, like our custom color_max function.
df.style.highlight_min(color='lightgreen', axis=1)

df.style.highlight_max()

The default value of the axis parameter is 0, which performs column-wise operations.
The set_properties function lets us set several style properties at once. The same properties are applied to every cell.
df.style.set_properties(**{'background-color': 'lightblue',
                           'color': 'black',
                           'border-color': 'white'})

Another highly useful function is bar, which draws a bar in each cell whose length is proportional to the value in the cell.
df.style.bar(color='lightgreen')

By using the align parameter together with a list of two colors, we can show negative and positive values in different colors.
df.style.bar(align='mid', color=['red', 'lightgreen'])

Transferring styles
The style functions we used here are pretty simple ones. However, we can also create more complex style functions that enhance the informative power of dataframes. We may want to use the same styling on all the dataframes we work on. Pandas offers a way to transfer styles between dataframes.
We first save the style to a Styler object.
style = df.style.applymap(color_negative_values).apply(color_max)
style

Let’s create another sample dataframe to work on.
df2 = pd.DataFrame({'col1': np.random.random(8),
                    'col2': np.random.randn(8),
                    'col3': np.random.randint(-5, 5, 8)})
df2

We can then create another Styler object and reuse the styles saved in the previous one.
style2 = df2.style
style2.use(style.export())
style2

We have covered:
- How to create custom styling functions and apply to dataframes
- How to use built-in style functions
- How to transfer styles from one styler object to another
There are other styling and formatting options available, which are described in the styling section of the pandas user guide.
Thank you for reading. Please let me know if you have any feedback.