The data

The data we’re using today is the now-defunct Google Mobility Data Series, which was produced so that people could monitor how travel changed in different countries across the world, based on Google Maps data. It provides a daily percentage change in mobility compared to a baseline, for countries, regions, and sub-regions around the world.

The data is available here in a semi-tidy, machine-readable format, and you can see the first few rows of the data below:

| country_region_code | country_region | sub_region_1 | sub_region_2 | metro_area | iso_3166_2_code | census_fips_code | place_id | date | retail_and_recreation_percent_change_from_baseline | grocery_and_pharmacy_percent_change_from_baseline | parks_percent_change_from_baseline | transit_stations_percent_change_from_baseline | workplaces_percent_change_from_baseline | residential_percent_change_from_baseline |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AE | United Arab Emirates | | | | | NA | ChIJvRKrsd9IXj4RpwoIwFYv0zM | 2020-02-15 | 0 | 4 | 5 | 0 | 2 | 1 |
| AE | United Arab Emirates | | | | | NA | ChIJvRKrsd9IXj4RpwoIwFYv0zM | 2020-02-16 | 1 | 4 | 4 | 1 | 2 | 1 |
| AE | United Arab Emirates | | | | | NA | ChIJvRKrsd9IXj4RpwoIwFYv0zM | 2020-02-17 | -1 | 1 | 5 | 1 | 2 | 1 |
| AE | United Arab Emirates | | | | | NA | ChIJvRKrsd9IXj4RpwoIwFYv0zM | 2020-02-18 | -2 | 1 | 5 | 0 | 2 | 1 |
| AE | United Arab Emirates | | | | | NA | ChIJvRKrsd9IXj4RpwoIwFYv0zM | 2020-02-19 | -2 | 0 | 4 | -1 | 2 | 1 |
| AE | United Arab Emirates | | | | | NA | ChIJvRKrsd9IXj4RpwoIwFYv0zM | 2020-02-20 | -2 | 1 | 6 | 1 | 1 | 1 |
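The preview shows why the format is only "semi-tidy": each type of area has its own column, so one row holds six separate measurements. As a rough sketch of what a fully tidy layout could look like, the example below reshapes two rows copied from the preview with tidyr's pivot_longer(). The output column names `area_type` and `value` are assumptions chosen for illustration, not names used by the dataset.

```r
library(tidyr)

# Two rows from the preview above, keeping only the date and the six
# category columns (values copied from the preview).
wide <- data.frame(
  date = as.Date(c("2020-02-15", "2020-02-16")),
  retail_and_recreation_percent_change_from_baseline = c(0, 1),
  grocery_and_pharmacy_percent_change_from_baseline = c(4, 4),
  parks_percent_change_from_baseline = c(5, 4),
  transit_stations_percent_change_from_baseline = c(0, 1),
  workplaces_percent_change_from_baseline = c(2, 2),
  residential_percent_change_from_baseline = c(1, 1)
)

# Stack the six category columns into name/value pairs.
# `area_type` and `value` are illustrative names (an assumption).
long <- pivot_longer(
  wide,
  cols = ends_with("_percent_change_from_baseline"),
  names_to = "area_type",
  values_to = "value"
)

# Drop the shared suffix so the labels are short enough for a legend.
long$area_type <- sub("_percent_change_from_baseline$", "", long$area_type)

print(long)
```

Each date now appears six times, once per type of area, which is the shape that plotting functions with one line per group generally expect.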

The data covers travel in six types of area: retail and recreation, grocery and pharmacy, parks, transit stations, workplaces, and residential.

Part of the challenge of this data is its size: you will need to get used to handling it solely in R rather than looking at the CSV in Excel, and to working efficiently so your code doesn’t take ages to run!
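One way to keep a large CSV manageable in R is to tell the reader up front which columns you need and what type each one has, so that nothing is guessed and unused columns are never loaded. The sketch below does this with readr's cols_only(). The helper name `read_mobility` and the choice of columns to keep are assumptions made for illustration, and the inline sample is three rows copied from the preview so the example runs without downloading anything.

```r
library(readr)

# Parse only the columns needed for a country-level chart, with explicit
# types. Which columns to keep is an assumption for this sketch.
mobility_spec <- cols_only(
  country_region_code = col_character(),
  sub_region_1 = col_character(),
  date = col_date(format = "%Y-%m-%d"),
  retail_and_recreation_percent_change_from_baseline = col_double(),
  grocery_and_pharmacy_percent_change_from_baseline = col_double(),
  parks_percent_change_from_baseline = col_double(),
  transit_stations_percent_change_from_baseline = col_double(),
  workplaces_percent_change_from_baseline = col_double(),
  residential_percent_change_from_baseline = col_double()
)

# Hypothetical helper: `file` can be a local path, a URL, or literal text
# wrapped in I().
read_mobility <- function(file) {
  read_csv(file, col_types = mobility_spec)
}

# Three rows from the preview; blank and NA cells are written as empty fields.
header <- paste(
  c("country_region_code", "country_region", "sub_region_1", "sub_region_2",
    "metro_area", "iso_3166_2_code", "census_fips_code", "place_id", "date",
    "retail_and_recreation_percent_change_from_baseline",
    "grocery_and_pharmacy_percent_change_from_baseline",
    "parks_percent_change_from_baseline",
    "transit_stations_percent_change_from_baseline",
    "workplaces_percent_change_from_baseline",
    "residential_percent_change_from_baseline"),
  collapse = ","
)
sample_csv <- I(paste(
  header,
  "AE,United Arab Emirates,,,,,,ChIJvRKrsd9IXj4RpwoIwFYv0zM,2020-02-15,0,4,5,0,2,1",
  "AE,United Arab Emirates,,,,,,ChIJvRKrsd9IXj4RpwoIwFYv0zM,2020-02-16,1,4,4,1,2,1",
  "AE,United Arab Emirates,,,,,,ChIJvRKrsd9IXj4RpwoIwFYv0zM,2020-02-17,-1,1,5,1,2,1",
  sep = "\n"
))

mobility_sample <- read_mobility(sample_csv)
str(mobility_sample)
```

With readr's default settings, empty fields such as sub_region_1 in these rows are read as NA, which is worth keeping in mind when you check for missing values. Inspecting the result with str() or head() takes the place of scrolling through the file in Excel.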

The task

  1. Read the data into R. You don’t need to save the file locally; you can read it directly from the web link (https://www.gstatic.com/covid19/mobility/Global_Mobility_Report.csv).

  2. Filter the data down to the United Kingdom, keeping country-level data only. The country_region_code for the United Kingdom is GB, and country-level rows are those with no sub_region_1 value.

  3. Bring the data into a tidy format, so that there is only one column for the value and one for the type of area. You will want to use the tidyr pivot_longer() function for this.

  4. Create a ggplot chart of the data, with dates on the x axis and value on the y axis, with one line per type of area.

  5. Make your chart publication-worthy! Aspects you may want to consider include: