Data Visualisation with ggplot2
2025-01-23
Chapter 1 Introduction
Want to learn how to effectively visualise your data in R using the elegant ggplot2 package? With ggplot2 it is easy to customise everything from plot layouts and themes to scales, colours and more! This course will take you through basic plot types, such as bar and line charts, and show you how to customise charts using the DfT-specific package dftplotr.
1.1 Session aims
At the end of this course, you will be able to:
- understand how to create widely used visualisations using the ggplot2:: package
- customise charts using themes (including DfT-specific themes via the dftplotr:: package)
- use annotations to explain charts better
1.2 Course instructions
This book is designed to accompany the ggplot-focus training that we run at DfT. To complete this course, you will need a Cloud R account and a GitHub account to clone the training repository.
If you’re running through this book solo, we recommend cloning the exercise repository, working through the book in order and trying out all of the exercises as you go.
1.3 What is ggplot2?
ggplot2:: is a very powerful data visualisation package that provides a flexible and elegant way to create static visuals and, with extension packages such as plotly, interactive ones. It is well suited to, for example, exploring data and producing publication-quality figures. ggplot2 charts are created by combining visual elements like geometric shapes, lines and colours to represent different types of data. These elements can be layered with built-in themes and additional components to enhance the clarity and explanatory power of even the most complex charts.
This course will focus primarily on creating static charts.
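As a rough illustration of how these elements combine, the sketch below builds a static chart in layers. It assumes the mpg example dataset that ships with ggplot2; the choice of variables, labels and theme is purely illustrative and is not taken from the course exercises.

```r
library(ggplot2)

# mpg is an example dataset bundled with ggplot2 (an assumption for this sketch)
p <- ggplot(mpg, aes(x = displ, y = hwy)) +      # data and aesthetic mappings
  geom_point(aes(colour = class)) +              # layer 1: points coloured by vehicle class
  geom_smooth(method = "lm", se = FALSE) +       # layer 2: straight-line trend
  labs(
    x = "Engine displacement (litres)",
    y = "Highway miles per gallon",
    colour = "Vehicle class"
  ) +
  theme_minimal()                                # a built-in theme

p  # printing the object draws the chart
```

Each `+` adds one component, so a layer can be added, removed or changed without touching the rest of the chart.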
Advantages:
- a popular package with a large, active user community, offering extensive resources, tutorials and support
- it follows a consistent grammatical structure, which makes it easier to learn and use
- it is flexible for creating and customising visualisations to suit different needs
- produces aesthetically appealing, high-quality charts that are both clear and elegant
- enables advanced statistical transformations and mappings, such as logarithmic scales, distributions and geographic visualisations
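As one small example of the last point, a logarithmic mapping can be applied by adding scale layers to a chart. This is a minimal sketch, assuming the diamonds example dataset bundled with ggplot2; the variables chosen are illustrative only.

```r
library(ggplot2)

# diamonds is an example dataset bundled with ggplot2 (an assumption for this sketch)
p_log <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(alpha = 0.1) +   # semi-transparent points to reduce overplotting
  scale_x_log10() +           # plot carat on a log10 axis
  scale_y_log10()             # plot price on a log10 axis

p_log
```

The data itself is unchanged; only the way values are mapped to the axes is transformed.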
1.4 Reproducibility Standard in the Civil Service
A key advantage of using ggplot2 is reproducibility: the ability for others to recreate the same visualisation from the original code and data. In the Civil Service, where routine and recurring publications are common, reproducibility is essential.
Benefits:
- others can replicate and build on existing work, avoiding duplicated effort
- clear code shows exactly how data was processed, modelled and visualised
- code provides an audit trail, allowing verification of results, reducing errors, and boosting confidence in processes
- sharing code makes it easier for colleagues to work together and improve analyses
- well-documented code preserves expertise and methods, ensuring work remains accessible even when staff move to new roles
1.4.1 How does ggplot2 help with the Reproducibility Standard?
ggplot2 offers a consistent syntax for creating visualisations, making code easier to read, understand, and share. Its declarative style allows you to describe how a chart should look, rather than focusing on the steps to create it. This consistency makes it simpler for colleagues to learn, reuse, and adapt ggplot2 code, ensuring charts are reproducible across the department.
Key benefits include:
- built-in themes: Easily apply consistent styles to plots with a single line of code, ensuring a uniform look and feel
- layered components: Charts are made up of layers, so they’re easy to update when data changes or tweaks are needed, with no need to start from scratch
- tidyverse integration: As part of the tidyverse, ggplot2 aligns with DfT’s preferred coding style. This means staff benefit from training, technical support, and consistent coding practices, making code easier to debug and share
- consistency for users: Charts created with ggplot2 have a consistent appearance, making it quicker and easier for readers to understand key messages
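The points above can be sketched as a small reusable chart definition. The helper name make_chart is hypothetical, and the example assumes the economics dataset bundled with ggplot2. theme_bw() stands in for a house style here; the DfT-specific themes from dftplotr are not shown because their interface is not described in this chapter.

```r
library(ggplot2)

# Hypothetical helper: describes the chart once, for whichever data frame is supplied
make_chart <- function(data) {
  ggplot(data, aes(x = date, y = unemploy)) +
    geom_line() +                                      # a single line layer
    labs(x = "Date", y = "Unemployed (thousands)") +
    theme_bw()                                         # built-in theme applied in one line
}

# The same code recreates the chart when the data changes
chart_all    <- make_chart(economics)
chart_recent <- make_chart(subset(economics, date >= as.Date("2000-01-01")))

chart_recent
```

Because the chart is described in code rather than edited by hand, rerunning the function with updated data gives a consistent result each time.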
By using ggplot2, we improve the reproducibility, clarity, and quality of the charts we create and publish.