The (spooky) data

The data we’re looking at today comes from the Office of Rail and Road (ORR) figures on passenger footfall at train stations over time. Some UK stations record very low footfall over the course of a year, and are often termed “ghost stations”. We’re going to take a look at how footfall at these stations has changed over time.

| station_name | NLC | TLC | region | local_authority | 1998 | 1999 | 2000 | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | 2021 | 2022 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Abbey Wood | 5131 | ABW | London | Greenwich | 2060584 | 2284585 | 2355943 | 2443651 | 2400216 | 2425400 | NA | 2201541 | 2089975 | 2804493 | 3096498 | 3029176 | 2882868 | 3030212 | 3134250 | 3175430 | 3282240 | 3319408 | 2929472 | 2988802 | 3124850 | 3769402 | 3825206 | 1412638 | 2638456 |
| Aber | 3813 | ABE | Wales | Caerphilly - Caerffili | 88714 | 87910 | 112812 | 115079 | 115667 | 134397 | NA | 134191 | 136549 | 169463 | 183136 | 192180 | 192788 | 202486 | 203432 | 209622 | 219868 | 212546 | 214996 | 227270 | 251108 | 245218 | 228480 | 15712 | 73642 |
| Abercynon | 3801 | ACY | Wales | Rhondda Cynon Taf - Rhondda Cynon Taf | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 194164 | 195702 | 214492 | 240070 | 243948 | 251688 | 265458 | 275404 | 293638 | 298358 | 289008 | 282886 | 33006 | 105822 |
| Abercynon North | NA | NA | NA | NA | 43073 | 42890 | 47417 | 52014 | 62184 | 82961 | NA | 112811 | 114833 | 123455 | 127598 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
| Abercynon South | NA | NA | NA | NA | 51191 | 49553 | 56520 | 67107 | 72290 | 68360 | NA | 92256 | 82208 | 70294 | 64660 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
| Aberdare | 3982 | ABA | Wales | Rhondda Cynon Taf - Rhondda Cynon Taf | 158945 | 178680 | 219986 | 260244 | 302859 | 331006 | NA | 464026 | 465675 | 469086 | 471586 | 506004 | 507820 | 524592 | 537542 | 552436 | 557992 | 551928 | 566904 | 569364 | 571746 | 555780 | 504622 | 53668 | 177350 |
| Aberdeen | 8976 | ABD | Scotland | Aberdeen City | 1550570 | 1590804 | 1609251 | 1607702 | 1762708 | 1761041 | NA | 1931973 | 2107855 | 2278872 | 2470281 | 2568810 | 2657014 | 2964302 | 3170464 | 3338072 | 3599431 | 3742646 | 3459944 | 3058268 | 2948150 | 2616142 | 2497108 | 393982 | 1536720 |
| Aberdour | 9090 | AUR | Scotland | Fife | 78964 | 87940 | 95220 | 103545 | 95850 | 96376 | NA | 112941 | 109580 | 120420 | 121724 | 128074 | 131874 | 129786 | 126000 | 124298 | 127470 | 129474 | 127312 | 125208 | 135240 | 140048 | 126340 | 14726 | 62990 |
| Aberdovey | 4435 | AVY | Wales | Gwynedd - Gwynedd | 20938 | 19424 | 21204 | 21030 | 21040 | 21302 | NA | 23299 | 20461 | 23365 | 25093 | 25418 | 27996 | 32190 | 36696 | 33612 | 34450 | 36684 | 38094 | 35960 | 37706 | 40390 | 36560 | 5396 | 22886 |
| Abererch | 4440 | ABH | Wales | Gwynedd - Gwynedd | 513 | 426 | 516 | 735 | 473 | 498 | NA | 1038 | 1027 | 1095 | 1261 | 1258 | 1326 | 1620 | 1786 | 1214 | 1380 | 326 | 1984 | 2140 | 2506 | 2228 | 2148 | 0 | 396 |

The data provides details for each station, including name, codes (NLC and TLC), region and local authority, followed by a footfall figure for each year from 1998 onwards. The footfall figures are calculated from tickets purchased, and the years are financial years. The data is provided as a CSV file in a wide, human-readable layout, with one column per year.
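
To make the wide layout concrete, here is a minimal sketch in base R. It parses two rows copied from the preview above, trimmed to four year columns, rather than the real file. It assumes the CSV is comma-separated and that the year headers are plain four-digit numbers, which the preview suggests but does not confirm. The names sample_csv, stations and year_cols are illustrative only.

```r
# Two rows copied from the preview above, trimmed to four year columns
# to keep the example short (the preview shows one column per year, 1998 to 2022).
# ASSUMPTION: the real file is comma-separated with these header names.
sample_csv <- "station_name,NLC,TLC,region,local_authority,2019,2020,2021,2022
Aberdovey,4435,AVY,Wales,Gwynedd - Gwynedd,40390,36560,5396,22886
Abererch,4440,ABH,Wales,Gwynedd - Gwynedd,2228,2148,0,396"

# check.names = FALSE keeps the year headers as "2019", "2020", ...
# (by default read.csv() would rename them to "X2019", "X2020", ...)
stations <- read.csv(text = sample_csv, check.names = FALSE)

# The year columns are the ones whose names are exactly four digits
year_cols <- grep("^[0-9]{4}$", names(stations), value = TRUE)

# Inspect the column types and the detected year columns
str(stations)
print(year_cols)
```

Because the year names are not syntactic R names, a single year column has to be referred to with backticks or double brackets, for example stations[["2020"]].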

The task

  1. Read the data into R. It is saved in the Data folder of this repository as ghost_stations.csv. If you haven’t cloned this repository, you can also read it in directly from: https://raw.githubusercontent.com/department-for-transport/learn_r_by_doing/main/Data/ghost_stations.csv

  2. Order the data by footfall in 2020 (the last normal year before Covid!), with the stations with the lowest footfall at the top of the table.

  3. Keep only the 10 stations with the lowest footfall in 2020. You will want to use the head() function with n = 10 to do this.

  4. Pivot the data longer into a tidy data format, so you have the year in one column and the footfall in another. You will want to use the tidyr function pivot_longer() to do this.

  5. Create a line chart of the data using ggplot2, with year on the x axis and footfall on the y axis.

  6. Make your chart publication-worthy! Aspects you may want to consider include: