The data

The data we’re using today is a set of new experimental statistics showing the percentage of taxis and PHVs (private hire vehicles) in different age brackets, by region.

The data is available in the data folder associated with this project (“Data/taxi0116.xlsx”) in a non-tidy, human-readable format. You can see the first rows of the data below:

| Region | Up to 1 year | 1 up to 2 years | 2 up to 3 years | 3 up to 4 years | 4 up to 5 years | 5 up to 6 years | 6 up to 10 years | 10 up to 13 years | 13 years and over | Unknown [note 2] | Average age (years) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| East Midlands | 0.3 | 2.0 | 1.4 | 2.5 | 7.6 | 10.4 | 49.2 | 16.1 | 8.6 | 1.8 | 7.3 |
| East of England | 0.4 | 3.4 | 2.2 | 2.5 | 8.2 | 10.8 | 44.7 | 14.6 | 12.1 | 1.1 | 7.5 |
| London | 2.1 | 10.3 | 6.8 | 5.8 | 12.4 | 6.5 | 32.3 | 20.0 | 3.9 | 0.0 | 6.1 |
| North East | 0.3 | 2.2 | 1.8 | 3.0 | 6.9 | 10.6 | 51.9 | 16.0 | 6.3 | 1.1 | 7.1 |
| North West | 0.1 | 1.9 | 1.4 | 1.3 | 5.1 | 6.5 | 32.2 | 22.1 | 28.3 | 1.1 | 9.4 |
| South East | 0.4 | 2.1 | 1.7 | 2.8 | 7.5 | 11.2 | 47.1 | 16.3 | 8.3 | 2.6 | 7.2 |

The data covers both taxis and PHVs, held in separate sheets (called taxi and phv respectively). Each vehicle age range is recorded in a separate column, and each row represents a different region.

Each cell contains the percentage of vehicles in that region or country that fall within that age bracket. The age-bracket percentages for each region or country sum to 100%.
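
As a quick check of the layout described above, the base R sketch below rebuilds two of the preview rows by hand and sums their age-bracket columns. It assumes that the Unknown column counts towards the total and that the Average age column does not. The names taxi_sample, age_cols and row_totals are illustrative only and are not part of the exercise.

```r
# Two rows copied by hand from the preview table; the real sheet has more regions.
taxi_sample <- data.frame(
  Region                = c("East Midlands", "London"),
  `Up to 1 year`        = c(0.3, 2.1),
  `1 up to 2 years`     = c(2.0, 10.3),
  `2 up to 3 years`     = c(1.4, 6.8),
  `3 up to 4 years`     = c(2.5, 5.8),
  `4 up to 5 years`     = c(7.6, 12.4),
  `5 up to 6 years`     = c(10.4, 6.5),
  `6 up to 10 years`    = c(49.2, 32.3),
  `10 up to 13 years`   = c(16.1, 20.0),
  `13 years and over`   = c(8.6, 3.9),
  `Unknown [note 2]`    = c(1.8, 0.0),
  `Average age (years)` = c(7.3, 6.1),
  check.names = FALSE   # keep the human-readable column names as they are
)

# Assumption: every column except Region and Average age is an age bracket.
age_cols <- setdiff(names(taxi_sample), c("Region", "Average age (years)"))

# Sum the age-bracket percentages for each region.
row_totals <- rowSums(taxi_sample[, age_cols])
names(row_totals) <- taxi_sample$Region

print(row_totals)
```

Because the published figures are rounded to one decimal place, the totals should land close to 100 rather than exactly on it.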

The task

  1. Read the data into R using the readxl read_excel() function. The file is saved in the Data folder of this repository and is called taxi0116.xlsx. Read in sheet 1 for the taxi data, and include the folder name in the file path, e.g. “Data/taxi0116.xlsx”.

  2. Pivot the data longer into a tidy format, so that the taxi ages are in one column and the percentage of the total is in another. You will want to use the tidyr function pivot_longer() to do this.

  3. Filter the data to remove the “England and Wales” and “England” total rows.

  4. Create a stacked bar chart in ggplot of the data, with region on the x axis and percentages on the y axis, splitting the data by taxi age.

  5. Make your chart publication-worthy! Aspects you may want to consider include: