Chapter 6 Practice
6.1 Exercise
Use the transport_expenditure data to create a time series chart showing household expenditure on transport over time. The data shows weekly household spending, in pounds, on different types of transport and their subcategories, such as petrol, between 2002 and 2021.
In your analysis, try to use as many as possible of the components for creating a complex chart in ggplot2 covered in this course.
- Inspect the transport_expenditure data and decide on an area you would like to analyse or compare.
- Tidy and manipulate the data in preparation for creating a time series chart.
- Create a multi-line time series chart in ggplot2.
- Adapt or change the scale of your chart to one you find appropriate for your analysis.
- Add a title, and label your x and y axes.
- Create a theme by changing text elements, line elements, legend and background elements.
- Add an annotation to enhance the readability or narrative of your chart.
This is a preview of the data you will be using:
transport_group_category | spend_category | spend_subcategory | year | spend_per_week_GBP |
---|---|---|---|---|
Motoring and bicycle costs | Purchase of vehicles | New cars and vans | 2002 | 10.7 |
Motoring and bicycle costs | Purchase of vehicles | New cars and vans | 2003 | 11.3 |
Motoring and bicycle costs | Purchase of vehicles | New cars and vans | 2004 | 11.4 |
Motoring and bicycle costs | Purchase of vehicles | New cars and vans | 2005 | 10.1 |
Motoring and bicycle costs | Purchase of vehicles | New cars and vans | 2006 | 8.0 |
Motoring and bicycle costs | Purchase of vehicles | New cars and vans | 2007 | 7.8 |
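A first inspection might look like the sketch below. Because the full transport_expenditure object is not reproduced here, the sketch rebuilds only the six preview rows shown above as a stand-in tibble. The object names preview, category_counts and year_range are hypothetical and not part of the course material; with the real data you would run the same calls on transport_expenditure.

```r
library(dplyr)

# Stand-in for transport_expenditure: only the six preview rows shown above
# (the object name `preview` is hypothetical)
preview <- dplyr::tibble(
  transport_group_category = "Motoring and bicycle costs",
  spend_category = "Purchase of vehicles",
  spend_subcategory = "New cars and vans",
  year = 2002:2007,
  spend_per_week_GBP = c(10.7, 11.3, 11.4, 10.1, 8.0, 7.8)
)

# Column names and types
dplyr::glimpse(preview)

# Which group / category / subcategory combinations exist, and how many rows each has
category_counts <- preview %>%
  dplyr::count(transport_group_category, spend_category, spend_subcategory)

# First and last year covered
year_range <- range(preview$year)
```

Counting the category columns in this way is one quick route to deciding which group or category to compare before any tidying.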
Solution
# Packages used in this solution
library(dplyr)
library(ggplot2)
library(ggrepel)

# Q1 & Q2: Data tidying
practice_data <- transport_expenditure %>%
  dplyr::filter(transport_group_category == "Motoring and bicycle costs") %>%
  dplyr::select(year, spend_category, spend_per_week_GBP) %>%
  dplyr::group_by(year, spend_category) %>%
  # Total weekly spend per year and category, then drop the grouping
  dplyr::summarise(spend_per_week_GBP = sum(spend_per_week_GBP), .groups = "drop") %>%
  dplyr::mutate(year = as.numeric(year))

# Q7: Create the labels used for the annotation (one per category, at the final year)
practice_labels <- practice_data %>%
  dplyr::filter(year == 2021)

ggplot(
  practice_data,
  aes(x = year, y = spend_per_week_GBP, colour = spend_category, group = spend_category)
) +
  # Q3: Use geom_line() to create a multi-line time series chart
  geom_line(linewidth = 1) +
  # Q4: Adapt the scale of the chart using the continuous scale functions for x and y
  scale_y_continuous(
    limits = c(0, 30),
    breaks = seq(from = 0, to = 30, by = 5),
    expand = c(0, 0)
  ) +
  # Extend the x axis past the final year to leave room for the line labels
  scale_x_continuous(
    limits = c(min(practice_data$year), max(practice_data$year) + 6),
    breaks = seq(from = 2006, to = 2021, by = 5)
  ) +
  dftplotr::scale_colour_dft(palette = "joyful.journey") +
  # Q5: Add a title and label the x and y axes
  labs(
    title = "Household spend (£) on motor and cycling vehicles per week, 2002 to 2021",
    x = "Year",
    y = "GBP (£)"
  ) +
  # Q6: Create a theme
  theme(
    axis.text.x = element_text(size = 10),
    axis.text.y = element_text(size = 10),
    axis.title.y = element_text(size = 12),
    axis.title.x = element_text(size = 12),
    plot.title = element_text(lineheight = 0.8, face = "bold", hjust = 0.5),
    panel.background = element_rect(fill = "white"),
    panel.grid.major.y = element_line(colour = "grey", linetype = "dashed", linewidth = 0.2),
    legend.position = "none"
  ) +
  # Q7: Add an annotation to enhance the readability of the chart
  # (direct labels at the end of each line replace the legend)
  ggrepel::geom_text_repel(
    data = practice_labels,
    aes(
      x = year,
      y = spend_per_week_GBP,
      group = spend_category,
      label = spend_category
    ),
    colour = "grey8",
    segment.color = NA, # no connecting segment between label and line
    nudge_x = 1.2,
    nudge_y = 0,
    size = 3.5,
    hjust = 0
  )
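
To see what the group_by() and summarise() step in the solution does, the sketch below runs the same pattern on a tiny stand-in table, with .groups = "drop" so that the result is ungrouped. The object names toy and toy_totals, the subcategory labels and all spend values are invented for illustration and are not taken from transport_expenditure; only the column names match the real data. The year column is stored as text here so that the as.numeric() conversion has something to do; whether the real column is text is an assumption.

```r
library(dplyr)

# Hypothetical stand-in data: labels and values are invented for illustration only
toy <- dplyr::tibble(
  spend_category = "Purchase of vehicles",
  spend_subcategory = c("Subcategory A", "Subcategory B", "Subcategory A", "Subcategory B"),
  year = c("2002", "2002", "2003", "2003"),
  spend_per_week_GBP = c(10, 15, 11, 16)
)

# Collapse subcategories into one total per year and category
toy_totals <- toy %>%
  dplyr::group_by(year, spend_category) %>%
  dplyr::summarise(spend_per_week_GBP = sum(spend_per_week_GBP), .groups = "drop") %>%
  dplyr::mutate(year = as.numeric(year))

toy_totals
```

Each year and category pair ends up as a single row, which is the shape geom_line() needs to draw one line per category.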