
How to Make Stunning Bar Charts in R: A Complete Guide with ggplot2

Make data visualization people will remember. The only guide for R you'll need.

Photo by Edward Howell on Unsplash

The language of data visualization is universal. Not everyone will recognize a great visualization, but everyone will remember a terrible one. If you use the tools and techniques discussed in this article, the chances of your visualization being classified as "terrible" will be close to zero.

This article shows you how to make all sorts of bar charts with R and ggplot2. You’ll also learn how to make them aesthetically pleasing with colors, themes, titles, and labels.

Today you’ll learn how to:

  • Make your first bar chart
  • Change colors and themes
  • Add titles, subtitles, and captions
  • Edit axis labels
  • Make stacked, grouped, and horizontal bar charts
  • Add labels

Make your first bar chart

There are plenty of datasets built into R and thousands of others available online. Still, you’ll declare your own. The reasoning is simple – you’re here to learn how to make bar charts, not how to aggregate data.

Here’s the dataset you’ll use today:

library(ggplot2)

data <- data.frame(
  quarter=c("Q1", "Q1", "Q2", "Q2", "Q3", "Q3", "Q4", "Q4"),
  product=c("A", "B", "A", "B", "A", "B", "A", "B"),
  profit=c(10, 14, 12, 11, 13, 15, 16, 18)
)

The most widely used R package for data visualization is ggplot2. It isn’t part of base R, so you need to install it once with install.packages("ggplot2"). It’s based on the layering principle: you first declare a data layer and then a visualization layer.

These two are mandatory for any visualization. You’ll see later how additional layers can make charts more informative and appealing.

To start, you’ll make a bar chart that has the column quarter on the x-axis and profit on the y-axis. That’s declared in the first layer (data), and the second layer (visualization) specifies which type of visualization you want.

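The geom_bar and geom_col layers are used to create bar charts. By default, geom_bar counts the rows in each category, so you need to specify stat = "identity" for it to plot the values from the y column. geom_col plots those values out of the box, so the latter is used throughout the article.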
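To see that the two layers are interchangeable here, the sketch below builds the same chart both ways. It redeclares the dataset from above so it runs on its own, and the names p_bar and p_col are placeholders chosen for this example.

```r
library(ggplot2)

# Same dataset as declared earlier in the article
data <- data.frame(
  quarter = c("Q1", "Q1", "Q2", "Q2", "Q3", "Q3", "Q4", "Q4"),
  product = c("A", "B", "A", "B", "A", "B", "A", "B"),
  profit = c(10, 14, 12, 11, 13, 15, 16, 18)
)

# geom_bar() counts rows by default; stat = "identity" makes it use profit as-is
p_bar <- ggplot(data, aes(x = quarter, y = profit)) +
  geom_bar(stat = "identity")

# geom_col() uses the y values directly, with no extra argument
p_col <- ggplot(data, aes(x = quarter, y = profit)) +
  geom_col()

# Both objects should draw the same chart
print(p_bar)
print(p_col)
```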

You can create a simple bar chart with this code:

ggplot(data, aes(x = quarter, y = profit)) + 
  geom_col()

Here’s the corresponding visualization:

Image 1 – Simple bar chart (image by author)

This one gets the job done but doesn’t look like something you’d want to show to your boss. You’ll fix it in the following sections.
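
One thing worth noticing: the dataset holds two rows per quarter (one per product), so each bar in the chart is really two values stacked on top of each other. As a quick sanity check, this base R sketch sums the profit per quarter, which should match the bar heights. The totals variable name is just for illustration.

```r
# Same dataset as declared earlier in the article
data <- data.frame(
  quarter = c("Q1", "Q1", "Q2", "Q2", "Q3", "Q3", "Q4", "Q4"),
  product = c("A", "B", "A", "B", "A", "B", "A", "B"),
  profit = c(10, 14, 12, 11, 13, 15, 16, 18)
)

# Sum profit over both products within each quarter
totals <- aggregate(profit ~ quarter, data = data, FUN = sum)
print(totals)
# Totals per quarter: Q1 = 24, Q2 = 23, Q3 = 28, Q4 = 34
```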


Colors and themes

Tweaking colors and themes is the simplest thing you can do to make a visualization look better. The geom_col() layer has two useful parameters:

  • color – outline color of the bars
  • fill – fill color of the bars

Here’s how to use fill to make your chart blue:

ggplot(data, aes(x = quarter, y = profit)) + 
  geom_col(fill = "#0099f9")
Image 2 – Using fill to change the bar color (image by author)

The color parameter changes only the outline. The dataset you’re using has two products per quarter, and stacked bar charts are used by default, so R draws an outline around each product’s segment. That shows up as a dividing line inside every bar. You’ll learn more about stacked charts later.

Here’s what this means in practice. The code snippet below sets the fill color to white and outline color to blue:

ggplot(data, aes(x = quarter, y = profit)) +
  geom_col(color = "#0099f9", fill = "#ffffff")
Image 3 – Changing the outline color (image by author)

In case coloring doesn’t do the trick, you can completely change the theme. That’s yet another layer to add after the initial visualization layer. Here’s how to do it:

ggplot(data, aes(x = quarter, y = profit)) +
  geom_col(fill = "#0099f9") +
  theme_classic()
Image 4 – Changing the visualization theme (image by author)

If this theme isn’t your thing, there are plenty more to pick from. You can find the entire list in the ggplot2 documentation.
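
As a rough illustration, here are two other themes that ship with ggplot2, theme_minimal() and theme_bw(), applied to the same chart. Storing the base plot in a variable (p is a name picked for this sketch) lets you swap themes without repeating the rest of the code.

```r
library(ggplot2)

# Same dataset as declared earlier in the article
data <- data.frame(
  quarter = c("Q1", "Q1", "Q2", "Q2", "Q3", "Q3", "Q4", "Q4"),
  product = c("A", "B", "A", "B", "A", "B", "A", "B"),
  profit = c(10, 14, 12, 11, 13, 15, 16, 18)
)

# Base chart without a theme layer
p <- ggplot(data, aes(x = quarter, y = profit)) +
  geom_col(fill = "#0099f9")

# Each theme is just one more layer on top of the base chart
p_minimal <- p + theme_minimal()
p_bw <- p + theme_bw()

print(p_minimal)
print(p_bw)
```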


Titles, subtitles, and captions

A visualization without a title is hard to interpret. Without one, there’s no way to know whether you’re looking at election votes in general or 2020 USA election votes in California. Subtitles can hold additional information, but they aren’t mandatory. Captions are useful for placing visualization credits and sources.

The most convenient way to add these is through a labs() layer. It takes in values for title, subtitle, and caption.

Let’s see how to add all three:

ggplot(data, aes(x = quarter, y = profit)) +
  geom_col(fill = "#0099f9") +
  labs(
    title = "Quarterly Profit (in million U.S. dollars)",
    subtitle = "A simple bar chart",
    caption = "Source: ImaginaryCo"
  )
Image 5 – Title, subtitle, and caption with default styles (image by author)

It’s a good start, but what if you want to add styles? Let’s see how to color the title, bold the subtitle, and italicize the caption:

ggplot(data, aes(x = quarter, y = profit)) +
  geom_col(fill = "#0099f9") +
  labs(
    title = "Quarterly Profit (in million U.S. dollars)",
    subtitle = "A simple bar chart",
    caption = "Source: ImaginaryCo"
  ) +
  theme(
    plot.title = element_text(color = "#0099f9", size = 20),
    plot.subtitle = element_text(face = "bold"),
    plot.caption = element_text(face = "italic")
  )
Image 6 – Styling title, subtitle, and caption (image by author)

Let’s take this a step further. Here’s how to align the title to the middle, subtitle to the right, and caption to the left:

ggplot(data, aes(x = quarter, y = profit)) +
  geom_col(fill = "#0099f9") +
  labs(
    title = "Quarterly Profit (in million U.S. dollars)",
    subtitle = "A simple bar chart",
    caption = "Source: ImaginaryCo"
  ) +
  theme(
    plot.title = element_text(hjust = 0.5),
    plot.subtitle = element_text(hjust = 1),
    plot.caption = element_text(hjust = 0)
  )
Image 7 – Aligning title, subtitle, and caption (image by author)

You’ve learned how to add a nicely formatted title, but the default axis labels still hold your visualization back. You’ll learn how to change them next.


Axis labels

Long story short – it works the same way as with titles and subtitles. The labs() layer also takes in values for the x-axis and y-axis labels.

Here’s how to change the text:

ggplot(data, aes(x = quarter, y = profit)) +
  geom_col(fill = "#0099f9") +
  labs(
    x = "Quarter of 2020",
    y = "Profit in 2020"
  )
Image 8 – Changing X and Y axis labels (image by author)

You can change the styles the same way you did with titles, subtitles, and captions. The following code snippet will make your x-axis label blue and bold, and y-axis label italic:

ggplot(data, aes(x = quarter, y = profit)) +
  geom_col(fill = "#0099f9") +
  labs(
    x = "Quarter of 2020",
    y = "Profit in 2020"
  ) +
  theme(
    axis.title.x = element_text(color = "#0099f9", size = 15, face = "bold"),
    axis.title.y = element_text(size = 15, face = "italic")
  )
Image 9 – Changing stylings of X and Y axis labels (image by author)

And that does it for changing the basic visuals. You’ll learn how to work with different bar charts next – stacked, grouped, and horizontal.


Stacked, grouped, and horizontal bar charts

The ggplot2 package uses stacked bar charts by default. Stacked bar charts are best used when all portions are colored differently.

To color each portion differently, you only need to map the fill aesthetic to a column inside aes() in the data layer. Here’s an example:

ggplot(data, aes(x = quarter, y = profit, fill = product)) +
  geom_col()
Image 10 – Default stacked bar chart (image by author)

There’s a visible distinction between products, and you can now see how much profit each product made quarterly.

There are two ways to change each portion’s color:

  • Manually – by specifying a vector of color names or color hex codes
  • With palettes – by using built-in color palettes

Let’s cover the manual approach first. You have to add a layer with scale_fill_manual:

ggplot(data, aes(x = quarter, y = profit, fill = product)) +
  geom_col() +
  scale_fill_manual(values = c("#69c6ff", "#0099f9"))
Image 11 – Stacked bar chart with custom colors (image by author)

Palettes are a bit easier because you don’t need to know exact color values. For the same reason, they can also be considered a limitation. The scale_fill_brewer layer works with the ColorBrewer palettes, such as "Set1", "Set2", and "Blues":

ggplot(data, aes(x = quarter, y = profit, fill = product)) +
  geom_col() +
  scale_fill_brewer(palette = "Set1")
Image 12 – Stacked bar chart colored with a built-in palette (image by author)
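
If you’re not sure which palette names are available, you can list them from R. The sketch below assumes the RColorBrewer package is installed (it usually comes along as a dependency when you install ggplot2), and the variable names are placeholders.

```r
# Assumption: RColorBrewer is installed; scale_fill_brewer() draws on its palettes
library(RColorBrewer)

# Names of all ColorBrewer palettes, usable as the palette argument
palette_names <- rownames(brewer.pal.info)
print(palette_names)

# Hex codes behind a palette, handy if you later switch to scale_fill_manual()
set1_colors <- brewer.pal(n = 3, name = "Set1")
print(set1_colors)

# Draw a visual preview of every palette
display.brewer.all()
```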

Onto the grouped bar charts now. They display bars corresponding to a group next to each other instead of on top of each other. To use grouped bar charts, you need to put position = position_dodge() into the geom_col layer:

ggplot(data, aes(x = quarter, y = profit, fill = product)) +
  geom_col(position = position_dodge())
Image 13 – Grouped bar chart (default) (image by author)

You can change the coloring the same way you did with stacked bar charts – through the scale_fill_manual or scale_fill_brewer layers. Here’s an example:

ggplot(data, aes(x = quarter, y = profit, fill = product)) +
  geom_col(position = position_dodge()) +
  scale_fill_manual(values = c("#3db5ff", "#0099f9"))
Image 14 – Grouped bar chart with custom colors (image by author)

Finally, let’s cover horizontal bar charts. They are useful when there are many categories on the x-axis or when the category names are long. The coord_flip() layer turns any vertical bar chart into a horizontal one:

ggplot(data, aes(x = quarter, y = profit)) +
  geom_col(fill = "#0099f9") +
  coord_flip()
Image 15 – Horizontal bar chart (default) (image by author)
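
As an alternative sketch, recent ggplot2 versions (3.3.0 and newer, which is an assumption about your setup) let you skip coord_flip() and map the categories to the y-axis directly. The p_horizontal name is only for illustration.

```r
library(ggplot2)

# Same dataset as declared earlier in the article
data <- data.frame(
  quarter = c("Q1", "Q1", "Q2", "Q2", "Q3", "Q3", "Q4", "Q4"),
  product = c("A", "B", "A", "B", "A", "B", "A", "B"),
  profit = c(10, 14, 12, 11, 13, 15, 16, 18)
)

# Swap the aesthetics: the numeric value goes on x, the category on y
p_horizontal <- ggplot(data, aes(x = profit, y = quarter)) +
  geom_col(fill = "#0099f9")

print(p_horizontal)
```

With this approach, axis-related settings such as labs(x = ...) refer to the axes as they appear on screen, which can be easier to reason about than a flipped coordinate system.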

Flipping works for grouped charts too, and you can still use the scale_fill_manual or scale_fill_brewer layers to change the colors. Here’s a horizontal grouped bar chart with custom colors:

ggplot(data, aes(x = quarter, y = profit, fill = product)) +
  geom_col(position = position_dodge()) +
  scale_fill_manual(values = c("#3db5ff", "#0099f9")) +
  coord_flip()
Image 16 – Horizontal bar chart with custom colors (image by author)

Now you know how to make every type of bar chart – but there’s still one thing you can improve. Let’s see what that is in the next section.


Labels

Bar charts can be hard to read precisely. Knowing the exact value is often a requirement. If the y-axis is on a scale of millions, reading values from a chart becomes an approximation (at best). That’s where labels come in.

You can put text somewhere near the top of each bar to show the exact value. That solves the problem of reading values from the chart. It also makes it more user-friendly, as you don’t have to divert your view to the y-axis constantly.

You’ll learn how to put labels on top of bars. For the first example, you’ll need to filter the dataset so only product A is shown. The reason is simple – ggplot2 uses stacked bar charts by default, and there are two products in the stack for each quarter.

You’ll learn how to add labels for multiple stacks later, but let’s start with the basics.

Here’s the code:

library(dplyr)

data_a <- data %>%
  filter(product == "A")

ggplot(data_a, aes(x = quarter, y = profit)) +
  geom_col(fill = "#0099f9") +
  geom_text(aes(label = profit), vjust = -0.5, size = 5)
Image 17 – Labels on top of bars (image by author)
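
If you’d rather not load dplyr for a single filter, base R’s subset() gives the same rows. This is just an alternative sketch of the filtering step, and it redeclares the dataset so it runs on its own.

```r
library(ggplot2)

# Same dataset as declared earlier in the article
data <- data.frame(
  quarter = c("Q1", "Q1", "Q2", "Q2", "Q3", "Q3", "Q4", "Q4"),
  product = c("A", "B", "A", "B", "A", "B", "A", "B"),
  profit = c(10, 14, 12, 11, 13, 15, 16, 18)
)

# Keep only the rows for product A, without any extra packages
data_a <- subset(data, product == "A")

# Same labeled chart as before, built from the filtered rows
p_labels <- ggplot(data_a, aes(x = quarter, y = profit)) +
  geom_col(fill = "#0099f9") +
  geom_text(aes(label = profit), vjust = -0.5, size = 5)

print(p_labels)
```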

But what if you want to put the labels inside? Just play with vjust a bit. Setting it to 2 does the trick:

ggplot(data_a, aes(x = quarter, y = profit)) +
  geom_col(fill = "#0099f9") +
  geom_text(aes(label = profit), vjust = 2, size = 5, color = "#ffffff")
Image 18 – Labels inside the bars (image by author)

Things get a bit trickier if you need labels for multiple stacks. You have to specify position = position_stack() inside the geom_text layer. Passing vjust = 0.5 to position_stack() centers each label within its segment:

ggplot(data, aes(x = quarter, y = profit, fill = product, label = profit)) +
  geom_col() +
  scale_fill_manual(values = c("#3db5ff", "#0099f9")) +
  geom_text(position = position_stack(vjust = 0.5), size = 4, color = "#ffffff")
Image 19 – Labels inside the stacked bar chart (image by author)

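Grouped bar charts need a slightly different approach. You’ll have to specify position = position_dodge() inside the geom_text layer for it to work. A dodge width of 0.9 matches the default bar width, so every label lines up with its own bar. This code centers the labels horizontally inside every bar: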

ggplot(data, aes(x = quarter, y = profit, fill = product)) +
  geom_col(position = position_dodge()) +
  scale_fill_manual(values = c("#3db5ff", "#0099f9")) +
  geom_text(aes(label = profit), position = position_dodge(0.9), vjust = 2, size = 4, color = "#ffffff")
Image 20 – Labels centered inside the grouped bar chart (image by author)
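
One way to keep bars and labels in sync is to create the dodge position once and pass it to both layers. The sketch below does that with an explicit width of 0.9; the dodge and p_grouped names are placeholders for this example.

```r
library(ggplot2)

# Same dataset as declared earlier in the article
data <- data.frame(
  quarter = c("Q1", "Q1", "Q2", "Q2", "Q3", "Q3", "Q4", "Q4"),
  product = c("A", "B", "A", "B", "A", "B", "A", "B"),
  profit = c(10, 14, 12, 11, 13, 15, 16, 18)
)

# A single position object shared by the bars and the labels
dodge <- position_dodge(width = 0.9)

p_grouped <- ggplot(data, aes(x = quarter, y = profit, fill = product)) +
  geom_col(position = dodge) +
  scale_fill_manual(values = c("#3db5ff", "#0099f9")) +
  geom_text(aes(label = profit), position = dodge, vjust = 2, size = 4, color = "#ffffff")

print(p_grouped)
```

If you later change the width, both layers pick it up, so the labels don’t drift away from their bars.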

And that’s all there is about labels and bar charts. Let’s wrap things up next.


Conclusion

Today you’ve learned how to make stacked, grouped, and horizontal bar charts in R and how to customize them with colors, themes, titles, subtitles, and labels. You’re now able to use bar charts for basic visualizations, reports, and dashboards.

Are you completely new to R? Check out this detailed R guide for programmers.




Originally published at https://appsilon.com on December 7, 2020.

