Chapter 1 Introduction


1.1 Overview

Want to learn how to effectively visualise your data in R using the elegant {ggplot2} package? With {ggplot2} it is easy to customise everything from plot layouts and themes to scales, colours and more! This course takes you through basic plot types, such as bar and line charts, and shows you how to customise charts using the DfT-specific package {dftplotr}.

1.2 Session aims

  • understand how to create widely used visualisations using the {ggplot2} package
  • customise charts using themes (including DfT-specific themes via the {dftplotr} package)
  • use annotations to explain charts better

1.3 Course instructions

This course is available as a self-directed learning option. To complete it, you will need to:

  • get access to a Cloud R account

  • download the exercise repository as a ZIP file and upload it into your RStudio environment, then work through the book in order, trying out all the exercises as you go

    1. Download the exercise repository ZIP file (ggplot_training.zip) to your local computer
    2. Log into your Cloud R account and open RStudio
    3. In RStudio, go to the Files pane (usually bottom right)
    4. Click Upload, then browse to select the ZIP file from your computer
    5. After uploading, click on the ZIP file in the Files pane and select More > Extract to unzip the contents
    6. You can now navigate into the extracted folder and start working on the exercises

Each exercise has a SOLUTION dropdown, where you can find helpful prompts and answers.

After completing the course, please fill out the feedback form to help us improve the experience of DfT coding courses. You can also join the DfT R community to post questions, share knowledge and get support from other R users.

1.4 Presumed knowledge

Users of this book should feel comfortable using R, including basic data manipulation and data tidying. Our Introduction to R course provides a brief introduction to data manipulation and the concept of tidy data. Our Tidy Data in R course covers the tidying functions in more detail.

1.5 What is {ggplot2}?

{ggplot2} is a very powerful data visualisation package that provides a flexible and elegant way to create static visuals, which can also be made interactive with the help of extension packages. It is well suited to tasks such as exploring data and producing publication-quality figures. {ggplot2} charts are created by combining visual elements like geometric shapes, lines and colours to represent different types of data. These elements can be layered with built-in themes and additional components to enhance the clarity and explanatory power of even the most complex charts.
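
As a minimal sketch of this layering idea, the example below builds a scatter plot one layer at a time. It assumes {ggplot2} is installed and uses the mtcars dataset that ships with R; the choice of variables, colour and theme is purely illustrative and is not taken from the course exercises.

```r
library(ggplot2)

# Start with the data and the aesthetic mappings (which column goes on which axis)
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  # Layer 1: a geometric shape (points), one per car
  geom_point(colour = "steelblue") +
  # Layer 2: a fitted straight line drawn over the points
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  # Additional components: axis labels and a built-in theme
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon") +
  theme_minimal()

# Printing the object draws the chart
p
```

Each line joined with + adds one element to the chart, so removing or swapping a line changes only that element and leaves the rest of the chart as it was.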

This course will focus primarily on creating static charts.

Advantages:

  • it is a popular package with a large, active user community, offering extensive resources, tutorials and support
  • it is easy to learn and use because it follows a consistent grammatical structure
  • it is flexible for creating and customising visualisations to suit different needs
  • it produces aesthetically appealing, high-quality charts that are both clear and elegant
  • it enables advanced statistical transformations and mappings, such as logarithmic scales, distributions and geographic visualisations

1.6 Reproducibility standard in the Civil Service

A key advantage of using {ggplot2} is reproducibility: the ability for others to recreate the same visualisation from the original code and data. In the Civil Service, where routine and recurring publications are common, reproducibility is essential.

Benefits:

  • others can replicate and build on existing work, avoiding duplicated effort
  • clear code shows exactly how data was processed, modelled and visualised
  • code provides an audit trail, allowing verification of results, reducing errors, and boosting confidence in processes
  • sharing code makes it easier for colleagues to work together and improve analyses
  • well-documented code preserves expertise and methods, ensuring work remains accessible even when staff move to new roles

1.6.1 How does {ggplot2} help with the reproducibility standard?

{ggplot2} offers a consistent syntax for creating visualisations, making code easier to read, understand, and share. Its declarative style allows you to describe how a chart should look, rather than focusing on the steps to create it. This consistency makes it simpler for colleagues to learn, reuse, and adapt {ggplot2} code, ensuring charts are reproducible across the department.

Key benefits include:

  • built-in themes: Easily apply consistent styles to plots with a single line of code, ensuring a uniform look and feel
  • layered components: Charts are made up of layers, so they’re easy to update when data changes or tweaks are needed, with no need to start from scratch
  • tidyverse integration: As part of the tidyverse, {ggplot2} aligns with DfT’s preferred coding style. This means staff benefit from training, technical support, and consistent coding practices, making code easier to debug and share
  • consistency for users: Charts created with {ggplot2} have a consistent appearance, making it quicker and easier for readers to understand key messages
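
The sketch below illustrates how the same chart code can be rerun when the data changes. It assumes {ggplot2} is installed and uses the economics dataset bundled with it; make_chart() is a hypothetical helper written for this illustration, not a function from {ggplot2} or {dftplotr}, and theme_minimal() stands in for whichever theme your team uses.

```r
library(ggplot2)

# Hypothetical helper: describes the chart once, for any data frame
# that has `date` and `unemploy` columns
make_chart <- function(data) {
  ggplot(data, aes(x = date, y = unemploy)) +
    geom_line() +
    labs(x = "Date", y = "Number unemployed (thousands)") +
    theme_minimal()  # one line sets the overall look of the chart
}

# Pretend an earlier publication only had data up to the end of 2009
earlier_data <- economics[economics$date < as.Date("2010-01-01"), ]

# The same code produces the earlier chart and the refreshed chart
chart_earlier <- make_chart(earlier_data)
chart_updated <- make_chart(economics)

chart_updated
```

Because the chart is defined in one place, a style change (for example, swapping the theme line) or a data refresh only needs a single edit, and anyone with the code and data can recreate the same output.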

By using {ggplot2}, we improve the reproducibility, clarity, and quality of the charts we create and publish.