Multi-variable visualisation

Topic 2 · Day 5 · 2 hours

Christian González Martel

Department of Quantitative Methods in Economics and Management · ULPGC

Juan M. Hernández Guerra

Department of Quantitative Methods in Economics and Management · ULPGC

April 29, 2026

Outline

  • Mapping aesthetics: x, y, colour, size, shape, alpha.
  • Two continuous variables: scatter, with smoothers.
  • One continuous + one categorical: side-by-side distributions.
  • Facets: small multiples.
  • Correlation matrices and heatmaps.

Setup · sample data

Run this once at the start of the session; every example below uses the hotels tibble, a five-row toy sample with one row per island.

library(tibble)

hotels <- tibble(
  island = c("Gran Canaria", "Tenerife", "Lanzarote",
             "Fuerteventura", "La Palma"),
  stars  = c(4L, 5L, 4L, 3L, 3L),
  price  = c(82, 95, 110, 100, 78),
  rating = c(8.2, 9.1, 7.5, 6.9, 8.0),
  nights = c(12.5, 18.3, 9.8, 11.2, 6.4)
)

Scatter · price vs rating

library(ggplot2)

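ggplot(hotels, aes(x = rating, y = price)) +
  # colour only the points: with one row per island a per-island fit is impossible
  geom_point(aes(colour = island), alpha = 0.7) +
  # a single linear fit across all hotels
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(x = "Rating (1–10)", y = "Nightly price (€)", colour = "Island") +
  theme_minimal()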
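The outline also lists size, shape and alpha as aesthetics. As a minimal sketch, assuming the hotels tibble from the setup slide, the same scatter can carry two more variables. Using nights for size and stars for shape is only an illustrative choice, and p_aes is a name introduced here.

```r
library(tibble)
library(ggplot2)

# Same toy data as the setup slide, repeated so the block runs on its own
hotels <- tibble(
  island = c("Gran Canaria", "Tenerife", "Lanzarote",
             "Fuerteventura", "La Palma"),
  stars  = c(4L, 5L, 4L, 3L, 3L),
  price  = c(82, 95, 110, 100, 78),
  rating = c(8.2, 9.1, 7.5, 6.9, 8.0),
  nights = c(12.5, 18.3, 9.8, 11.2, 6.4)
)

p_aes <- ggplot(hotels, aes(x = rating, y = price)) +
  geom_point(
    # variables go inside aes(); stars is discrete, so convert it to a factor
    aes(colour = island, size = nights, shape = factor(stars)),
    alpha = 0.7  # a constant, so it is set outside aes()
  ) +
  labs(x = "Rating (1–10)", y = "Nightly price (€)",
       colour = "Island", size = "Nights", shape = "Stars") +
  theme_minimal()

p_aes
```

Five variables in one panel is close to the limit of what a reader can decode; in practice two or three mapped aesthetics are usually enough.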

Facets · one panel per island

ggplot(hotels, aes(x = nights, y = price)) +
  geom_point(alpha = 0.5) +
  facet_wrap(~ island, ncol = 4) +
  theme_minimal()

Categorical × continuous · violin + boxplot

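# Violins need at least two observations per group. With the one-row-per-island
# sample, ggplot2 drops them with a warning and each box collapses to a line;
# the shapes appear once every island has several hotels.
ggplot(hotels, aes(x = island, y = price)) +
  geom_violin(fill = "#f4f3ee", linewidth = 0.3) +
  geom_boxplot(width = 0.2, fill = "#0067a2", colour = "white") +
  coord_flip() +
  labs(x = NULL, y = "Nightly price (€)") +
  theme_minimal()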

Correlation heatmap

library(dplyr)
library(ggplot2)
library(tidyr)

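hotels |>
  select(where(is.numeric)) |>            # stars, price, rating, nights
  cor(use = "pairwise.complete.obs") |>   # Pearson correlation matrix
  as.data.frame() |>
  tibble::rownames_to_column("var1") |>
  # long format: one row per (var1, var2) pair
  pivot_longer(-var1, names_to = "var2", values_to = "r") |>
  ggplot(aes(var1, var2, fill = r)) +
  geom_tile() +
  # diverging palette centred on zero, spanning the full range of r
  scale_fill_gradient2(midpoint = 0, limits = c(-1, 1)) +
  theme_minimal()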
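A heatmap is easier to read when each tile also shows its coefficient. The sketch below, assuming the same hotels tibble, stores the long-format correlation table as a separate object and overlays rounded values with geom_text(). The names cor_long and p_cor are introduced here for illustration.

```r
library(tibble)
library(dplyr)
library(tidyr)
library(ggplot2)

# Same toy data as the setup slide, repeated so the block runs on its own
hotels <- tibble(
  island = c("Gran Canaria", "Tenerife", "Lanzarote",
             "Fuerteventura", "La Palma"),
  stars  = c(4L, 5L, 4L, 3L, 3L),
  price  = c(82, 95, 110, 100, 78),
  rating = c(8.2, 9.1, 7.5, 6.9, 8.0),
  nights = c(12.5, 18.3, 9.8, 11.2, 6.4)
)

# Long-format table: one row per pair of numeric variables
cor_long <- hotels |>
  select(where(is.numeric)) |>
  cor(use = "pairwise.complete.obs") |>
  as.data.frame() |>
  rownames_to_column("var1") |>
  pivot_longer(-var1, names_to = "var2", values_to = "r")

p_cor <- ggplot(cor_long, aes(var1, var2, fill = r)) +
  geom_tile() +
  geom_text(aes(label = sprintf("%.2f", r)), size = 3) +  # coefficient on each tile
  scale_fill_gradient2(midpoint = 0, limits = c(-1, 1)) +
  coord_fixed() +                                         # square tiles
  labs(x = NULL, y = NULL, fill = "r") +
  theme_minimal()

p_cor
```

With only five hotels the coefficients are very unstable, so the values serve to check the plumbing rather than to support conclusions.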

Principles

  • One story per plot. If it takes a paragraph to read, split it.
  • Colour should encode one variable at a time.
  • Prefer facets over a forest of overlaid lines.
  • Check for overplotting — use alpha, jitter, or 2D binning.
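
The three overplotting remedies in the last point can be compared side by side. This is a minimal sketch on simulated data (the hotels sample is too small to overplot); the data frame sim and its values are invented purely for illustration.

```r
library(ggplot2)

set.seed(1)
# Simulated data, for illustration only: rounding creates many identical points
sim <- data.frame(
  rating = round(rnorm(5000, mean = 8, sd = 0.8), 1),
  price  = round(rnorm(5000, mean = 100, sd = 20))
)

base <- ggplot(sim, aes(rating, price)) + theme_minimal()

# 1. Transparency: dense regions appear darker
p_alpha <- base + geom_point(alpha = 0.1)

# 2. Jitter: small random offsets separate identical coordinates
p_jitter <- base + geom_jitter(width = 0.05, height = 0.5, alpha = 0.1)

# 3. 2D binning: count the points in each rectangle and map the count to fill
p_bin <- base + geom_bin_2d(bins = 40)

p_alpha
p_jitter
p_bin
```

Transparency and jitter keep individual points visible, while binning scales better once the point count reaches the tens of thousands.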

Hands-on

Using the Eurostat monthly nights file already downloaded (datasets/raw/eurostat-nights_monthly.csv), pick 4-6 EU countries and produce a faceted line plot:

  • one panel per country (facet_wrap(~ geo)),
  • x = month, y = monthly nights,
  • one line per year (colour = factor(year)),
  • years 2019-2024 only.

The plot should make seasonality and the COVID dip visible at a glance.
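
As a starting point, the structure of the requested plot can be sketched on simulated data. Everything here is an assumption: the column names geo, year, month and nights may not match the Eurostat file, the country codes are arbitrary, and the numbers are invented so that a seasonal pattern and a 2020 drop are visible.

```r
library(ggplot2)

set.seed(42)
# Simulated stand-in for the Eurostat file; column names are assumptions
nights_sim <- expand.grid(
  geo   = c("ES", "FR", "IT", "PT"),
  year  = 2019:2024,
  month = 1:12
)

# Seasonal factor peaking in July, and an artificial drop from March 2020
season <- 1 + 0.6 * sin((nights_sim$month - 4) * pi / 6)
covid  <- ifelse(nights_sim$year == 2020 & nights_sim$month >= 3, 0.3, 1)
nights_sim$nights <- 10 * season * covid * runif(nrow(nights_sim), 0.9, 1.1)

p_nights <- ggplot(
  nights_sim,
  aes(x = month, y = nights, colour = factor(year), group = year)
) +
  geom_line() +
  facet_wrap(~ geo) +                    # one panel per country
  scale_x_continuous(breaks = 1:12) +
  labs(x = "Month", y = "Monthly nights (simulated units)", colour = "Year") +
  theme_minimal()

p_nights
```

With the real file, the same plotting code should apply once the data are imported, filtered to the chosen countries and years, and reshaped to one row per country, year and month.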