A note on long labels

A collection of common dataviz caveats by Data-to-Viz.com




The issue


Most of the time, barplots and lollipop plots are drawn vertically, with the Y-axis representing the value of the numeric variable. If the labels on the X-axis are long, they have to be rotated so that they do not overlap.

As a result, these labels become hard to read:

# Libraries
library(tidyverse)
library(hrbrthemes)
library(kableExtra)
options(knitr.table.format = "html")

# Load dataset from github
data <- read.table("https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/7_OneCatOneNum.csv", header=TRUE, sep=",")

# Vertical barplot of the 20 largest values, with rotated X-axis labels
data %>%
  filter(!is.na(Value)) %>%
  arrange(Value) %>%
  tail(20) %>%
  mutate(Country=factor(Country, Country)) %>%
  ggplot( aes(x=Country, y=Value) ) +
    geom_bar(stat="identity", fill="#69b3a2") +
    theme_ipsum() +
    theme(
      panel.grid.minor.x = element_blank(),
      panel.grid.major.x = element_blank(),
      legend.position="none",
      axis.text.x = element_text(angle = 80, hjust=1)
    ) +
    xlab("") +
    ylab("Weapon quantity (SIPRI trend-indicator value)")

Note: this barplot shows the quantity of weapons exported by the 20 largest exporters in 2017.

Solving the issue


The workaround is pretty simple: why not consider a horizontal version of the chart?

# Horizontal barplot: same chart, flipped with coord_flip() so labels read left to right
data %>%
  filter(!is.na(Value)) %>%
  arrange(Value) %>%
  tail(20) %>%
  mutate(Country=factor(Country, Country)) %>%
  ggplot( aes(x=Country, y=Value) ) +
    geom_bar(stat="identity", fill="#69b3a2") +
    theme_ipsum() +
    theme(
      panel.grid.minor.y = element_blank(),
      panel.grid.major.y = element_blank(),
      legend.position="none"
    ) +
    xlab("") +
    ylab("Weapon quantity (SIPRI trend-indicator value)") +
    coord_flip()
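
As a side note, here is a minimal sketch of another way to get the same horizontal layout, assuming ggplot2 3.3.0 or later: map the category to the Y axis directly instead of calling coord_flip(). The `toy` data frame and its values are made up for illustration and are not the dataset used above.

```r
library(ggplot2)

# Hypothetical toy data, made up for illustration only (not the SIPRI dataset)
toy <- data.frame(
  Country = c("United States of America", "Russian Federation",
              "United Kingdom", "Netherlands"),
  Value   = c(40, 25, 10, 5)
)

# Order the categories by value so that the largest bar ends up on top
toy$Country <- factor(toy$Country, levels = toy$Country[order(toy$Value)])

# Category on y, number on x: the bars are drawn horizontally
p <- ggplot(toy, aes(x = Value, y = Country)) +
  geom_col(fill = "#69b3a2") +
  xlab("Value (made-up units)") +
  ylab("")

print(p)
```

With this approach the axis arguments keep their natural meaning (xlab() labels the horizontal axis), which can be easier to reason about than a flipped coordinate system.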

Warning


Note that the horizontal version is not always an option though. If your categorical variable has a natural order, it is better to stick to the vertical version. This happens when:

  • you’re representing a time series: by convention, time must be shown on the X axis. Not doing so could mislead your audience.
  • you have an ordinal variable like age range.
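
For the ordinal case, a minimal sketch of a vertical barplot that keeps the natural order could look like this. The `ages` data frame, its age ranges and its counts are hypothetical, made up for illustration only.

```r
library(ggplot2)

# Hypothetical toy data, made up for illustration only
ages <- data.frame(
  AgeRange = c("65 and over", "Under 18", "35-64", "18-34"),
  Count    = c(12, 20, 35, 28)
)

# Declare the natural order explicitly; by default factor levels
# are sorted alphabetically, which would put "Under 18" last
ages$AgeRange <- factor(
  ages$AgeRange,
  levels  = c("Under 18", "18-34", "35-64", "65 and over"),
  ordered = TRUE
)

# Vertical bars: the X axis reads from youngest to oldest
p <- ggplot(ages, aes(x = AgeRange, y = Count)) +
  geom_col(fill = "#69b3a2") +
  xlab("Age range") +
  ylab("Count (made-up values)")

print(p)
```

Because the labels here are short and the order carries meaning, the vertical layout stays readable without any rotation.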


A work by Yan Holtz for data-to-viz.com