The data
The data we’re using today is the daily transport usage stats, initially produced for Covid reporting. This data provides a daily figure for usage by mode, indexed to a pre-Covid baseline figure.
The data is available here in a tidy, machine-readable format, and you can see the first 10 rows of the data below:
date | transport_type | value |
---|---|---|
2020-03-01 | cars | 1.03 |
2020-03-01 | light_commercial_vehicles | 1.11 |
2020-03-01 | heavy_goods_vehicles | 1.08 |
2020-03-01 | all_motor_vehicles | 1.04 |
2020-03-01 | tfl_tube | 1.03 |
2020-03-01 | tfl_bus | 1.02 |
2020-03-01 | national_rail | 0.95 |
2020-03-01 | national_rail_noCR | 0.95 |
2020-03-02 | cars | 1.02 |
2020-03-02 | light_commercial_vehicles | 1.06 |
The data covers the following modes:
- all_motor_vehicles
- tfl_tube
- tfl_bus
- national_rail
- cycling
- bus_excluding_london
The all_motor_vehicles series is further broken down by vehicle type:
- cars
- light_commercial_vehicles
- heavy_goods_vehicles
There are two series available for rail, one which includes Crossrail and one which excludes it:
- national_rail
- national_rail_noCR
Where no data is available for a specific date, you can treat that as an NA value.
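One way to make those gaps explicit is to fill in the missing date and mode combinations, so that each absent observation becomes a row with an NA value. The sketch below assumes the three columns shown in the preview table and uses `complete()` and `full_seq()` from tidyr. The `tfl_tube` row for 2020-03-02 is left out of the sample on purpose, purely to create a gap for the demonstration.

```r
library(tidyr)

# A few rows copied from the preview table; the tfl_tube row for
# 2020-03-02 is deliberately omitted to simulate a missing observation.
sample_rows <- data.frame(
  date = as.Date(c("2020-03-01", "2020-03-01", "2020-03-02")),
  transport_type = c("cars", "tfl_tube", "cars"),
  value = c(1.03, 1.03, 1.02)
)

# Build every date x mode combination over the full daily range;
# combinations that were absent get value = NA.
completed <- complete(
  sample_rows,
  date = full_seq(date, 1),
  transport_type
)

completed
```

With explicit NA rows, `geom_line()` leaves a break in the line rather than drawing straight across the gap.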
The task
Read the data into R. You don’t need to save the file locally; you can read it directly from the web link (https://assets.publishing.service.gov.uk/media/65257d612548ca0014ddf09b/full_data_clean.csv).
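A minimal sketch of this step, assuming the file has the three columns shown in the preview table (`date`, `transport_type`, `value`). `read_transport()` is a hypothetical helper name, not part of any package. Wrapping literal text in `I()` needs readr 2.0 or later and is used here only so the same call can be tried without a connection.

```r
library(readr)

# URL given in the task; read_csv() accepts a URL directly,
# so nothing needs to be saved locally.
data_url <- "https://assets.publishing.service.gov.uk/media/65257d612548ca0014ddf09b/full_data_clean.csv"

# Hypothetical helper: the column types assume the three columns
# shown in the preview table.
read_transport <- function(src) {
  read_csv(
    src,
    col_types = cols(
      date = col_date(),
      transport_type = col_character(),
      value = col_double()
    )
  )
}

# transport <- read_transport(data_url)  # requires internet access

# The same call on a few rows copied from the preview table
preview <- read_transport(I("date,transport_type,value
2020-03-01,cars,1.03
2020-03-01,tfl_tube,1.03
2020-03-02,cars,1.02
"))
```

Declaring the column types up front means a malformed date or a non-numeric value surfaces as a parsing problem at read time rather than later in the chart.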
Check that the data is clean and that the mode names are in a publication-ready format. You can use mutate with gsub or str_replace_all to swap underscores for spaces (str_replace on its own only replaces the first underscore in each name).
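The relabelling could look something like the sketch below. The `mode_label` column name and the "(excl. Crossrail)" wording are illustrative choices, and reading the `noCR` suffix as "excluding Crossrail" is an assumption based on the series description above.

```r
library(dplyr)
library(stringr)

# A handful of the mode names from the data description
modes <- tibble(
  transport_type = c("cars", "light_commercial_vehicles",
                     "tfl_tube", "national_rail_noCR")
)

clean <- modes %>%
  mutate(
    mode_label = transport_type %>%
      str_replace_all("_", " ") %>%                 # every underscore, not just the first
      str_to_sentence() %>%                         # "tfl tube" becomes "Tfl tube"
      str_replace("^Tfl", "TfL") %>%                # restore the usual capitalisation
      str_replace("nocr$", "(excl. Crossrail)")     # assumed meaning of the noCR suffix
  )

clean$mode_label
```

Running `distinct()` on the mode column and `sum(is.na(value))` on the full data is a quick way to spot unexpected mode names or missing values before relabelling.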
Create a ggplot line chart of the data, with dates on the x axis and value on the y axis, with one line per transport mode.
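As a starting point, a basic version of this chart might look like the following. The four rows are copied from the preview table so the block runs on its own; with the full data you would pass your own data frame instead. The axis and legend labels are placeholders.

```r
library(ggplot2)

# Four rows taken from the preview table
transport <- data.frame(
  date = as.Date(c("2020-03-01", "2020-03-01", "2020-03-02", "2020-03-02")),
  transport_type = c("cars", "light_commercial_vehicles",
                     "cars", "light_commercial_vehicles"),
  value = c(1.03, 1.11, 1.02, 1.06)
)

# Mapping colour to transport_type gives one line per mode
p <- ggplot(transport, aes(x = date, y = value, colour = transport_type)) +
  geom_line() +
  labs(x = "Date", y = "Usage relative to baseline", colour = "Mode")

p
```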
Make your chart publication-worthy! Aspects you may want to consider include:
- The theme and colours used in your charts
- The formatting and labelling of your chart axes
- What should the date range of your data be?
- Do you want to include weekends and bank holidays?
- Do you want to include every mode? Could you split the chart to show different groupings on different charts?
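One possible way to combine several of these ideas is sketched below: a date window, weekdays only, one panel per mode and a cleaner theme. `plot_weekday_usage()` is a hypothetical helper, the demo values are placeholders rather than real figures, and drawing the reference line at 1 assumes the index equals 1 at the baseline. `linewidth` needs ggplot2 3.4 or later.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: keep weekdays inside a date window and
# draw one panel per transport mode.
plot_weekday_usage <- function(df, from, to) {
  weekdays_only <- df %>%
    filter(date >= from, date <= to) %>%
    filter(!as.POSIXlt(date)$wday %in% c(0, 6))  # 0 = Sunday, 6 = Saturday

  ggplot(weekdays_only, aes(x = date, y = value, colour = transport_type)) +
    geom_hline(yintercept = 1, linetype = "dashed", colour = "grey50") +  # assumed baseline
    geom_line(linewidth = 0.8, show.legend = FALSE) +
    facet_wrap(vars(transport_type)) +
    scale_x_date(date_breaks = "1 week", date_labels = "%d %b") +
    scale_y_continuous(labels = scales::label_percent()) +
    labs(
      title = "Transport usage relative to pre-Covid baseline",
      x = NULL,
      y = "Share of baseline usage"
    ) +
    theme_minimal()
}

# Placeholder data: two weeks, two modes, constant value (not real figures)
demo <- tibble(
  date = rep(seq(as.Date("2020-03-01"), as.Date("2020-03-14"), by = "day"), times = 2),
  transport_type = rep(c("Cars", "TfL tube"), each = 14),
  value = 1
)

p <- plot_weekday_usage(demo, as.Date("2020-03-02"), as.Date("2020-03-13"))
p
```

Bank holidays are not handled here; excluding them would need a separate list of dates to filter against.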