Chapter 5 Weighting
5.1 Introduction
Following a recommendation in the 2000 National Statistics Quality Review of the NTS, a strategy for weighting the NTS data to reduce the effect of non-response bias was developed using the NTS data for 2002. The weighting methodology was published in 2005, together with a report showing comparisons between weighted and unweighted data for 2002. The methodology was subsequently revised slightly and applied to data back to 1995. The revised methodology, together with a report comparing weighted and unweighted trend data from 1995 to 2004, was published in 2006. These reports are available from DfT. As well as adjusting for non-response bias, the weighting strategy adjusts for the drop-off in the number of trips recorded by respondents during the course of the travel week. The weighting strategy was reviewed in 2013 (in advance of the NTS 2013 weighting) using data from the NTS 2012 survey. For further information, see Morris, S, et al. (2014) National Travel Survey 2013 Technical Report.
5.2 The interview sample weights
The interview sample weights were developed to be used for analyses of all participating households with completed individual interviews for all household members (either in person or by proxy), regardless of the amount of travel diary information collected. This sample is referred to as the ‘interview sample’. In 2023, the number of households included in the interview sample was 7,427 and the number of individuals and vehicles covered were 16,822 and 9,450 respectively.
The approach for generating weights for the interview sample was to:
generate the weights (w1) for the selection of the dwelling unit and/or household at the sampled address (if sampling was required), as covered in section 5.2.1
produce weights for household-level non-participation (w2), as covered in section 5.2.2
select the participating households
generate weights for the exclusion of participating households in which not every individual completed the interview (w3), as covered in section 5.2.3
select the interview sample households
compute composite weights for selection and participation in the interview survey: w5 = w1 x w2 x w3
generate calibration weights which adjust the households and individuals in the interview sample to known household population estimates for age and/or sex and region, and which calibrate each half-year to 50% of the total weight to account for more addresses being issued in the latter six months, using the final composite weights (w5) as initial estimates, as covered in section 5.2.4
5.2.1 Selection weights for addresses and multiple dwelling units and households
For addresses at which more than one dwelling unit or household is identified, there is a defined procedure for selecting the dwelling units and households to be included (see section 2.9 for further information).
Most addresses consist of a single dwelling unit, and for these addresses no selection is required. For the relatively few addresses (less than 1%) that contain more than one dwelling unit, interviewers list these dwelling units in the electronic Address Record Form system (eARF) on their laptops so that the computer can randomly sample one of them. This selection needs to be corrected by applying an appropriate selection weight, otherwise dwelling units at split addresses would be under-represented in the final sample. The dwelling unit weight (Wdu) was calculated to be equal to the number of dwelling units identified at the address.
An adjustment also needs to be made for addresses that contain more than one household. Again, where more than one household is identified, the interviewer lists the households in the eARF and the computer selects one at random. A household selection weight (Whh) is calculated as the number of households identified at the address and/or dwelling unit.
The address selection, dwelling unit and household weights are then combined (w1 = Wdu x Whh) to give the composite household and/or dwelling unit selection weight. Note that the selection weight w1 was trimmed at 4 to avoid a small number of very high weights, which would inflate the standard errors, reduce the precision of the survey estimates and make the weighted sample less efficient.
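As a minimal sketch of the selection weight described above, the function below multiplies the two counts and applies the trim at 4. The function and parameter names are hypothetical and chosen only for illustration; they are not taken from the NTS processing code.

```python
def selection_weight(n_dwelling_units: int, n_households: int, cap: float = 4.0) -> float:
    """Composite selection weight w1 = Wdu x Whh, trimmed at `cap`.

    Hypothetical helper: Wdu is the number of dwelling units found at the
    address and Whh the number of households found, as described in 5.2.1.
    """
    if n_dwelling_units < 1 or n_households < 1:
        raise ValueError("counts must be at least 1")
    w_du = float(n_dwelling_units)  # one dwelling unit sampled out of n
    w_hh = float(n_households)      # one household sampled out of n
    return min(w_du * w_hh, cap)    # trim very large weights at the cap


single = selection_weight(1, 1)      # no sampling needed
split = selection_weight(2, 1)       # two dwelling units, one household
trimmed = selection_weight(3, 2)     # product is 6, trimmed to the cap
print(single, split, trimmed)
```

The cap is a parameter here so that the effect of trimming can be inspected by comparing capped and uncapped weights.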
5.2.2 Weighting for household participation
The aim of the household participation weights is to attempt to reduce bias caused by systematic differences between the households that participated (that is, for which a household interview was obtained) in the NTS and those that did not. To generate the non-response weights, a logistic regression model was fitted with whether or not an eligible household participated as the outcome measure and terms associated with household participation as the covariates. Note that all NTS non-response models were fitted unweighted, as a result of the 2013 weighting review mentioned above.
From this model, the predicted propensity to participate was estimated for each household. The weights for household participation (w2) were calculated as the reciprocal of these propensities.
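The step from a fitted logistic model to a non-response weight can be sketched as follows. The linear predictor values are invented for illustration and do not come from the models in Appendix L; the function names are likewise hypothetical.

```python
import math


def participation_propensity(linear_predictor: float) -> float:
    """Inverse logit: converts a logistic-model linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def nonresponse_weight(propensity: float) -> float:
    """Weight w2 as the reciprocal of the predicted propensity to participate."""
    if not 0.0 < propensity <= 1.0:
        raise ValueError("propensity must lie in (0, 1]")
    return 1.0 / propensity


# Illustrative linear predictors only (assumed values, not NTS estimates).
for xb in (-0.5, 0.0, 0.8):
    p = participation_propensity(xb)
    print(f"xb={xb:+.1f}  propensity={p:.3f}  w2={nonresponse_weight(p):.3f}")
```

Households with a low predicted propensity receive a larger weight, so that they stand in for similar households that did not participate.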
The models for household participation are shown in Appendix L. Items in the models were: region, Acorn group, rural-urban status (6 categories), and month that the address was issued. This model was developed based on analysis of the NTS 2002 (see Pickering et al., 2006) and was reviewed for the NTS 2013 weighting. For further information on the NTS 2013 weighting review, see Morris, S, et al. (2014) National Travel Survey 2013 Technical Report. This model is being reviewed again for future survey years as part of the 2023 weighting review process, using 2023 data and census 2021 data.
5.2.3 Weighting for the removal of households with missing individual interviews
The aim of these weights is to reduce the bias from the removal of households that did not have a completed individual interview for all household members. The proportion of households that did not have a complete individual interview for all household members was small. It was therefore decided to base the weights solely on the size of the household, the main predictor of complete household participation. To generate the weights, a logistic regression model was fitted which included the size of the household as the only covariate. Note that because interviews for the participating single-person households were completed for all household members, these were assigned a weight of 1 and excluded from the logistic regression model. The weights (w3) were again calculated as the reciprocal of the propensities (for having complete individual interviews for all household members) estimated from this model.
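The sketch below illustrates this step under one stated assumption: that household size enters the model as a categorical term. In that case the fitted propensity for each size equals the observed completion rate for that size, so no model-fitting library is needed. The source does not say how household size was coded, and the data are invented.

```python
from collections import defaultdict


def completion_weights(households):
    """Return {household_size: w3} from (size, all_interviews_complete) pairs.

    Assumption: size is treated as categorical, so each propensity is the
    observed completion rate for that size. Single-person households are
    excluded from the estimation and given a weight of 1.
    """
    totals = defaultdict(int)
    completes = defaultdict(int)
    for size, complete in households:
        if size == 1:
            continue  # excluded from the model, weight fixed at 1
        totals[size] += 1
        completes[size] += int(complete)
    weights = {1: 1.0}
    for size, n in totals.items():
        if completes[size] == 0:
            raise ValueError(f"no complete households of size {size}")
        weights[size] = n / completes[size]  # reciprocal of completion rate
    return weights


# Invented example: 9 of 10 two-person and 4 of 5 three-person households complete.
sample = [(1, True)] * 6 + [(2, True)] * 9 + [(2, False)] + [(3, True)] * 4 + [(3, False)]
w3_by_size = completion_weights(sample)
print(w3_by_size)
```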
5.2.4 Calibration weighting
The final stage of the weighting procedure for the interview sample was to adjust the weights using calibration weighting in Stata. The approach used followed Deville, J and Särndal, C (1992) Calibration Estimators in Survey Sampling, Journal of the American Statistical Association, Volume 87, 376-382. Calibration weighting adjusts the weights so that characteristics of the weighted achieved sample match population estimates. This reduces (but does not completely remove) any residual non-response bias and, to a lesser extent, any impact of sampling and coverage error.
One of the advantages of calibration weighting is that it generates household-level weights that are actually based on the characteristics of the household members. A second advantage of calibration weighting is that the household-level weight produced can also be applied for analyses of household members (that is, at the individual level).
For the NTS 2023, the composite (household-level) weight was adjusted from the previous stages (w5) so that the distribution for groups defined by age and sex and region matched 2022 mid-year population estimates of household residents (see Appendix M).
The population estimates used were based on Census data in England, with an adjustment to estimate household residents only.
In addition to the population estimates above, each half-year of the weights was calibrated to 50% of the total to account for more addresses being issued, and therefore more responses received, in the latter six months of the year (see Chapter 2 for further information on the reduced sample in quarters 1 and 2).
After calibrating the interview sample to the population estimates, some regional bias remained, particularly in the North-West of England within each half year. The final calibration weights were scaled accordingly to bring this region closer to the population estimates in each half year.
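To illustrate the idea of calibrating to several sets of control totals at once, the sketch below uses raking (iterative proportional fitting), which is one member of the family of calibration estimators discussed by Deville and Särndal. It is a simplification under stated assumptions: it rakes independent records rather than producing a single weight shared by all members of a household, the category labels and totals are invented, and it is not the Stata procedure used for the NTS.

```python
def rake(records, margins, max_iter=100, tol=1e-10):
    """Rake starting weights so weighted totals match each margin.

    records: list of dicts holding 'weight' plus one key per margin variable.
    margins: {variable: {category: target_total}}; targets must share a total.
    Returns the list of adjusted weights in record order.
    """
    weights = [float(r["weight"]) for r in records]
    for _ in range(max_iter):
        max_change = 0.0
        for var, targets in margins.items():
            totals = {category: 0.0 for category in targets}
            for r, w in zip(records, weights):
                totals[r[var]] += w
            for i, r in enumerate(records):
                factor = targets[r[var]] / totals[r[var]]
                weights[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:
            break  # all margins matched to within tolerance
    return weights


# Invented data: two demographic groups and two half-years at 50% each.
records = [
    {"weight": 10.0, "group": "A", "half": "H1"},
    {"weight": 20.0, "group": "A", "half": "H2"},
    {"weight": 15.0, "group": "B", "half": "H1"},
    {"weight": 25.0, "group": "B", "half": "H2"},
]
margins = {"group": {"A": 60.0, "B": 40.0}, "half": {"H1": 50.0, "H2": 50.0}}
calibrated = rake(records, margins)
print([round(w, 3) for w in calibrated])
```

The composite weights play the role of the starting values, so records that were up-weighted at earlier stages remain relatively up-weighted after calibration.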
5.3 Fully responding sample weights
Weights were also produced for the analyses of the fully responding (co-operating) sample. In the NTS 2023, 6,314 households were defined as fully co-operating with completed individual interviews and travel diaries for 14,102 household members and 7,925 vehicle questionnaires.
The approach for generating weights for the fully responding sample was to:
generate the weights (w1) for the selection of the dwelling unit or household at the sampled address (if sampling was required), as covered in section 5.2.1
produce weights for household-level non-participation (w2), as covered in section 5.2.2
select the participating households
generate weights for the exclusion of participating households in which not every individual completed the interview (w3), as covered in section 5.2.3
select the interview sample households
generate weights for the removal of households which did not fully respond (w4), as covered in section 5.3.1
select the fully responding sample
compute composite weights for selection and being fully productive: w6 = w1 x w2 x w3 x w4
generate calibration weights which adjust the households and individuals in the fully responding sample to known household population estimates for age and/or sex and region, using the final composite weights (w6) as initial estimates, and additionally calibrate each half-year to 50% of the total weight to account for more addresses being issued in the latter six months, as covered in section 5.3.2
The calibration weights (wt_fully) were then the final weights for households, individuals and vehicles in the fully responding sample.
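The composite weights used as starting values for calibration are simple products of the stage weights. A minimal sketch, with invented stage weights for a single household:

```python
from math import prod


def composite_weight(*stage_weights: float) -> float:
    """Product of the stage weights supplied, for example w1 x w2 x w3."""
    return prod(stage_weights)


# Hypothetical stage weights for one household (not NTS values).
w1, w2, w3, w4 = 1.0, 1.8, 1.1, 1.2

w5 = composite_weight(w1, w2, w3)        # interview sample composite
w6 = composite_weight(w1, w2, w3, w4)    # fully responding sample composite
print(round(w5, 3), round(w6, 3))
```

Because w6 is w5 multiplied by w4, a household's fully responding composite weight is never smaller than its interview composite weight when w4 is at least 1.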
5.3.1 Weighting for the removal of households which did not fully respond
The aim of these weights is to reduce the bias from the removal of households that did not fully respond. Of the 7,427 interview sample households in the NTS 2023, 1,113 (15%) would be excluded from the analyses of the fully responding households while 6,314 were defined as fully responding.
The non-response model was fitted with whether a household in the interview sample fully responded as the response variable and pre-determined measures as covariates.
These measures had been originally identified from analysis of the NTS 2002 (see Pickering et al., 2006), and updated based on the review for NTS 2013. For further information on the NTS 2013 weighting review, see Morris, S, et al. (2014) National Travel Survey 2013 Technical Report. Measures included in the model were: region, tenure, number of adults, any married couples, any cohabiting couples, use of a vehicle, age category of youngest household member, ethnic groups of household members, a rural-urban measure (ru11ind), and month that the address was issued. See Appendix N for further details. This model is being reviewed again for future survey years as part of the 2023 weighting review process.
The weights (w4) were calculated as the reciprocal of the propensity to fully respond estimated from this model.
5.3.2 Calibration weighting
The next stage of the weighting procedure was to adjust the weights using calibration weighting in Stata.
As in previous years, these composite (household-level) weights were adjusted from the previous stages (w6) so that the distribution for groups defined by age and sex and region matched 2022 mid-year population estimates of household residents (see Appendix O). The population estimates used were based on Census data in England, with an adjustment to estimate household residents only.
As in the interview sample weighting, each half-year of the weights was calibrated to 50% of the total to account for more addresses being issued, and therefore more responses received, in the latter six months of 2023 (see Chapter 2 for further information on the reduced sample in quarters 1 and 2).
After calibrating the fully responding sample to the population estimates, some regional bias remained, particularly in the North-West of England within each half year. The final calibration weights (wt_fully) were scaled accordingly to bring this region closer to the population estimates in each half year.
5.4 Weighting the travel data
5.4.1 The travel diary
Table 5.1 shows the average number of journeys recorded for each day of the travel diary (excluding short walks which were only collected on the first day). This indicates that there was a gradual reduction in the (weighted) number of journeys recorded throughout the travel diary week from an average of 2.00 per person on the first day to 1.72 on the seventh day, a fall of about 14%. In 2023 this pattern was broadly consistent with previous years. In order to reduce any biases from the under-reporting of journeys during the course of the travel diary week, appropriate weights were produced.
Table 5.1: Average number of journeys (weighted and unweighted) recorded on each day of the travel diary (excluding short walks)
Day of travel diary | Average number of journeys (weighted) | Average number of journeys (unweighted) |
---|---|---|
Day 1 | 2.00 | 2.05 |
Day 2 | 1.92 | 1.97 |
Day 3 | 1.86 | 1.91 |
Day 4 | 1.83 | 1.86 |
Day 5 | 1.79 | 1.82 |
Day 6 | 1.75 | 1.79 |
Day 7 | 1.72 | 1.77 |
Note: Weighted figures are based on 14,923 individuals and unweighted figures are based on 14,102 individuals. Weighted figures use the adjusted version of weight variable wt_fully.
The strategy to reduce the bias from the drop-off in reporting in the travel diary was to generate weights so that the weighted total number of journeys made on a particular day of the travel diary always equalled the number reported for the first day of the travel diary. This was done separately for each journey purpose, because the rate of drop-off varied by journey purpose. For reference, see Table 5.2 below for the average number of journeys (weighted) recorded on each day of the travel diary, by purpose of journey.
For example, the number of shopping journeys reported fell from 0.359 on the first day to 0.272 on the seventh day of the travel diary, whereas for business journeys or education journeys the number remained fairly constant over the seven days of the travel week. This approach assumes that the reporting on the first day of the travel diary is the most accurate and that the drop-off on the following days of the travel diary is only a result of under-reporting. The NTS 2023 diaries showed a broadly similar pattern of drop-off in reporting for all journey types to that seen in previous years.
During NTS 2023, it was not always possible to follow rules for the start date of the diary due to fieldwork constraints. This meant that diary start days were not evenly spread across all seven days of the week in NTS 2023, which was also the case in 2020, 2021 and 2022. To adjust for this, the fully responding start weight was rescaled to give an even spread of diary start days across the week (14.3% of diaries starting per day). This start weight was then used as the basis of the diary weights.
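The rescaling of the start weight can be sketched as a per-day factor that brings each start day's weighted total to one seventh of the overall total. The weighted totals below are invented; the source does not publish the distribution of start days.

```python
def start_day_factors(weighted_totals):
    """Return {day: factor} so every start day carries an equal weighted total.

    weighted_totals: {day: sum of fully responding weights for diaries
    starting on that day}. Multiplying each diary's weight by the factor
    for its start day gives each day an equal share of the total.
    """
    target = sum(weighted_totals.values()) / len(weighted_totals)
    return {day: target / total for day, total in weighted_totals.items()}


# Invented weighted totals by diary start day (assumption, not NTS data).
start_totals = {
    "Sunday": 1800.0, "Monday": 2300.0, "Tuesday": 2200.0, "Wednesday": 2100.0,
    "Thursday": 2000.0, "Friday": 1900.0, "Saturday": 1800.0,
}
factors = start_day_factors(start_totals)
for day, factor in factors.items():
    print(f"{day:<10} factor={factor:.3f}")
```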
There are two special cases for the diary weighting. First, because the number of journeys reported for business remained constant through the diary week for all years of the NTS (1995 to 2022), the weights were set to 1 for the whole week for this journey purpose. Second, the weights for journeys made at the weekend for education and escort education, which are relatively rare, were also set to 1. These two adjustments were still made in 2023. For historical context, note that up to the NTS 2016 the weights for holidays were also set to 1 because the number of holiday journeys remained constant through the diary week. Since 2017 there has been an observed drop-off in the number of journeys reported for holidays, so the holiday weights are no longer set to 1.
Table 5.2: Average number of journeys (weighted) recorded on each day of the travel diary, by purpose of journey
Day of the travel diary | Commuting | Business | Education | Escort: Education | Shopping | Other | Social | Holiday: Leisure |
---|---|---|---|---|---|---|---|---|
Day 1 | 0.294 | 0.050 | 0.119 | 0.107 | 0.359 | 0.373 | 0.463 | 0.232 |
Day 2 | 0.295 | 0.056 | 0.115 | 0.099 | 0.328 | 0.361 | 0.448 | 0.214 |
Day 3 | 0.304 | 0.051 | 0.107 | 0.085 | 0.305 | 0.358 | 0.445 | 0.202 |
Day 4 | 0.304 | 0.049 | 0.114 | 0.091 | 0.304 | 0.343 | 0.421 | 0.205 |
Day 5 | 0.300 | 0.053 | 0.110 | 0.097 | 0.281 | 0.335 | 0.413 | 0.201 |
Day 6 | 0.281 | 0.052 | 0.110 | 0.093 | 0.286 | 0.316 | 0.419 | 0.192 |
Day 7 | 0.266 | 0.047 | 0.109 | 0.095 | 0.272 | 0.328 | 0.413 | 0.191 |
Note: These weighted figures use the adjusted version of weight variable wt_fully.
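The drop-off weighting described in section 5.4.1 can be illustrated with the shopping and business columns of Table 5.2. This is a sketch only: it uses the rounded averages printed in the table rather than the underlying weighted totals, it ignores the weekend exception for education journeys, and the function name is hypothetical.

```python
# Daily averages copied from Table 5.2 (Day 1 to Day 7).
SHOPPING = [0.359, 0.328, 0.305, 0.304, 0.281, 0.286, 0.272]
BUSINESS = [0.050, 0.056, 0.051, 0.049, 0.053, 0.052, 0.047]


def dropoff_weights(daily_averages, fixed_at_one=False):
    """Weight for each diary day so the weighted figure equals the Day 1 figure.

    fixed_at_one=True mimics the special case where weights are set to 1
    for the whole week (as the text describes for business journeys).
    """
    if fixed_at_one:
        return [1.0] * len(daily_averages)
    day_one = daily_averages[0]
    return [day_one / value for value in daily_averages]


shopping_weights = dropoff_weights(SHOPPING)
business_weights = dropoff_weights(BUSINESS, fixed_at_one=True)
for day, weight in enumerate(shopping_weights, start=1):
    print(f"Day {day}: shopping weight {weight:.2f}")
```

On these rounded figures the Day 7 shopping weight is about 1.32, reflecting the fall from 0.359 to 0.272 journeys per person.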
5.4.2 Short walks
From 2017, short walks have been recorded only on the first day of the travel diary.
Analyses of short walks are not carried out at the individual level; only aggregated information is produced. Therefore, the fact that the information on short walks is collected on different days for different people should, in theory, average out for the aggregated estimates produced, assuming that the information collected is distributed approximately evenly over the seven days of the week. However, this is not the case in reality, mainly due to differential non-response between those allocated different start days.
Table 5.3 shows the distribution of individuals reporting a short walk across the days of the week (weighted by the adjusted fully responding weights). To balance the analyses over the days of the week, weights were generated that adjusted the number of respondents providing data on short walks for each day of the week to be equal to the weighted mean across the seven days (2,132). These adjustments and the resulting weights are shown in the last two columns of Table 5.3. The proportion of individuals reporting a short walk for each day of the week, after weighting, is also shown.
Table 5.3: Weighting for short walks
Day of the week | Number of individuals reporting a short walk (weighted) | Proportion across the week (weighted) | Adjustment | Weight |
---|---|---|---|---|
Sunday | 2,148 | 14.4% | 0.993 | 6.949 |
Monday | 2,131 | 14.3% | 1.001 | 7.004 |
Tuesday | 2,146 | 14.4% | 0.993 | 6.953 |
Wednesday | 2,052 | 13.7% | 1.039 | 7.273 |
Thursday | 2,149 | 14.4% | 0.992 | 6.944 |
Friday | 2,151 | 14.4% | 0.991 | 6.937 |
Saturday | 2,147 | 14.4% | 0.993 | 6.951 |
Note: These weighted figures use the adjusted version of weight variable wt_fully.
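The adjustments in Table 5.3 can be approximately reproduced from the published counts, as sketched below. The adjustment is the weekly mean divided by the day's count. The Weight column in the table is close to seven times the Adjustment column; the sketch adopts that relationship as an assumption, since the formula is not stated in the text. Small differences from the published figures are expected because the printed counts are rounded.

```python
# Weighted counts of individuals reporting a short walk, from Table 5.3.
COUNTS = {
    "Sunday": 2148, "Monday": 2131, "Tuesday": 2146, "Wednesday": 2052,
    "Thursday": 2149, "Friday": 2151, "Saturday": 2147,
}

mean_count = sum(COUNTS.values()) / len(COUNTS)

# Adjustment brings each day's count to the weekly mean.
adjustments = {day: mean_count / n for day, n in COUNTS.items()}

# Assumption: the final weight scales the adjustment by 7 (one recorded day
# standing in for a seven-day week), which matches the table approximately.
short_walk_weights = {day: 7 * a for day, a in adjustments.items()}

for day in COUNTS:
    print(f"{day:<10} adjustment={adjustments[day]:.3f} weight={short_walk_weights[day]:.3f}")
```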
5.4.3 Long-distance travel records
Information about all journeys is collected in the travel diary week. In order to obtain additional information about long-distance journeys (LDJ), defined as journeys of 50 miles or more within Great Britain, the NTS collects information on long-distance journeys made in the one-week period prior to the interview. This information is collected during the placement interview itself. However, the number of LDJ reported in the week prior to the interview (3,113) was lower than the number reported in the travel diary (4,921).
As the information collected in the travel diary was likely to be more accurate, the LDJ figures collected in the week prior to the interview were weighted so that the number of LDJ reported on each day equalled the average number (for a day) reported in the travel diary. Tables 5.4 to 5.6 below compare the unweighted figures by showing the number of diary-recorded LDJ next to the interview-recorded LDJ for each day of the travel diary, along with the accompanying weights that were produced for each day. This was done separately for the following categories of journey length: 50 to 75 miles (Table 5.4), 75 to 100 miles (Table 5.5), and 100 miles or more (Table 5.6).
Revised weights using this methodology have been calculated for LDJ data from NTS 2006. Prior to this, the weighting did not take journey length into account.
Table 5.4: Number of long-distance journeys (LDJ) made between 50 and 75 miles
Day of the travel diary | LDJ recorded in the diary during the travel week | LDJ recorded in placement interview for the week prior | Weight |
---|---|---|---|
Day 1 | 298 | 137 | 2.12 |
Day 2 | 254 | 180 | 1.62 |
Day 3 | 292 | 217 | 1.34 |
Day 4 | 275 | 148 | 1.97 |
Day 5 | 289 | 188 | 1.55 |
Day 6 | 302 | 189 | 1.54 |
Day 7 | 330 | 132 | 2.21 |
Note: The average number of LDJ between 50 and 75 miles recorded in the diary during the travel week is 291.
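The weights in Table 5.4 can be approximately reproduced by dividing the average daily number of diary-recorded LDJ by the number recorded in the placement interview for each day. The sketch uses the rounded counts printed in the table, so the results can differ from the published weights in the second decimal place.

```python
# Counts copied from Table 5.4 (journeys of 50 to 75 miles, Day 1 to Day 7).
DIARY = [298, 254, 292, 275, 289, 302, 330]
INTERVIEW = [137, 180, 217, 148, 188, 189, 132]

# Average number of diary-recorded LDJ per day of the travel week.
diary_daily_mean = sum(DIARY) / len(DIARY)

# Weight for each day scales the interview count up to the diary daily mean.
ldj_weights = [diary_daily_mean / n for n in INTERVIEW]

for day, (n, weight) in enumerate(zip(INTERVIEW, ldj_weights), start=1):
    print(f"Day {day}: interview LDJ={n:>3}  weight={weight:.2f}")
```

The same calculation applies to the other two distance bands, using the counts in Tables 5.5 and 5.6.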
Table 5.5: Number of long-distance journeys (LDJ) made between 75 and 100 miles
Day of the travel diary | LDJ recorded in the diary during the travel week | LDJ recorded in placement interview for the week prior | Weight |
---|---|---|---|
Day 1 | 103 | 77 | 1.79 |
Day 2 | 137 | 86 | 1.62 |
Day 3 | 174 | 79 | 1.75 |
Day 4 | 124 | 85 | 1.62 |
Day 5 | 134 | 87 | 1.60 |
Day 6 | 129 | 78 | 1.77 |
Day 7 | 168 | 57 | 2.42 |
Note: The average number of LDJ between 75 and 100 miles recorded in the diary during the travel week is 138.
Table 5.6: Number of long-distance journeys (LDJ) made of 100 miles or more
Day of the travel diary | LDJ recorded in the diary during the travel week | LDJ recorded in placement interview for the week prior | Weight |
---|---|---|---|
Day 1 | 275 | 192 | 1.42 |
Day 2 | 299 | 186 | 1.47 |
Day 3 | 227 | 233 | 1.17 |
Day 4 | 268 | 195 | 1.40 |
Day 5 | 269 | 201 | 1.36 |
Day 6 | 322 | 194 | 1.41 |
Day 7 | 253 | 173 | 1.58 |
Note: The average number of LDJ of 100 miles or more recorded in the diary during the travel week is 273.
5.5 CASI weights
Starting in NTS 2017, a Computer Assisted Self Interviewing (CASI) module for transport satisfaction questions was added, where one adult from those present during the household interview is asked to complete the satisfaction questions. The methodology for incorporating the CASI module into the NTS sample was based on the methodological development work that NatCen carried out in 2016. See Appendix Q of the NTS 2017 Technical Report.
Respondents to the transport satisfaction questions (the ‘satisfaction sample’) need to be weighted to be representative of the NTS interview sample (and by extension representative of the adult population in England).
The satisfaction sample comprises one adult per household, randomly selected from those present during the interview. Adults were selected with equal probability, except in households where both people aged 16 to 29 and people aged 30 or over were present. In such households, those aged 16 to 29 were selected with an 80% probability (the sampling methodology is described in Chapter 2). Sampling in this way introduces bias, as some individuals (those who are absent) have a zero probability of selection. To overcome the zero probability of selection, absent individuals can be treated as non-respondents with the application of appropriate non-response weights.
The CASI weights were developed to be used for analyses of the satisfaction sample (that is, all individuals in the interview sample who have completed the self-completion questionnaire regardless of the amount of travel diary information collected). Of the 7,427 households in the interview sample, 7,369 were eligible for the CASI questionnaire. One adult per eligible household was selected and the satisfaction sample comprised the 7,194 individuals who responded to the CASI questionnaire and already had an interview weight.
The approach to generating the CASI weights was to:
generate weights (casi_w1) for the exclusion of individuals who were not present during the interview, as covered in section 5.5.1
produce weights (casi_w2) for the selection of one present individual per household, as covered in section 5.5.2
compute sets of composite weights for selection and CASI participation: casi_w3 = casi_w1 x casi_w2
select the responding individuals
generate calibration weights (casi_wt_calib) which adjust the individuals in the CASI sample to known household population estimates for age and/or sex and region, using the composite weights (casi_w3) as initial estimates, as covered in section 5.5.3
5.5.1 Weighting for the exclusion of not present individuals
The aim of presence weighting is to reduce bias caused by systematic differences between those adults who were present during the interview and those that were not. Of the 14,257 adults aged 16 or over in the NTS 2023 interview sample, 9,638 (67.6%) were present during the interview.
To correct for differences between the profiles of the present and not present groups, stepwise logistic regression models were fitted with whether or not an interview sample (adult) respondent was present during the interview as the outcome measure and terms associated with being present as covariates. These included:
age-by-gender
region
household size
tenure
income group
marital status
economic status
frequency of travelling by car
ethnicity
quarter of issue.
From the final model, the predicted propensity of being present was estimated for each individual. The weights (casi_w1) to adjust for non-presence bias were calculated as the reciprocal of these propensities for those who were present. Note that the model was restricted to households with two or more adults. Those present in single-adult households were assigned a probability (and a weight) of 1. The weights were trimmed at the top 0.5% to reduce excess variance inflation due to a small number of large weights. Weighting in this way would remove any bias from the present sample that is linked to the variables included in the model, so that any remaining bias can be considered ignorable, and make it representative of the total NTS interview sample.
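Trimming at the top 0.5% can be sketched as capping the largest weights at the highest value outside the top 0.5% of cases. The exact trimming rule used for the NTS is not given in the text, so the rule below is an assumption, and the weights are generated purely for illustration.

```python
def trim_top(weights, per_thousand=5):
    """Cap the largest weights at the highest value outside the top share.

    per_thousand=5 corresponds to the top 0.5% of cases. Integer arithmetic
    is used for the count so the cut-off does not depend on float rounding.
    Assumed rule: the source does not specify how the trim point is chosen.
    """
    ordered = sorted(weights)
    n_top = len(ordered) * per_thousand // 1000  # number of cases to trim
    cap = ordered[-(n_top + 1)]                  # largest untrimmed weight
    return [min(w, cap) for w in weights]


# Invented presence weights: reciprocals of propensities between 0.2 and 1.0.
propensities = [0.2 + 0.8 * i / 999 for i in range(1000)]
casi_w1 = [1.0 / p for p in propensities]
casi_w1_trimmed = trim_top(casi_w1)
print(round(max(casi_w1), 3), round(max(casi_w1_trimmed), 3))
```

With 1,000 cases the five largest weights are reduced to the value of the sixth largest, and all other weights are unchanged.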
The final model is shown in Appendix P.
5.5.2 Weighting for the selection of one adult per household
Adults in the satisfaction sample were selected with equal probability, except in households where both people aged 16 to 29 and people aged 30 or over were present. In such households, those aged 16 to 29 were selected with an 80% probability.
To correct for the unequal probabilities of selection, selection weights (casi_w2) were defined as the inverse of each person’s selection probability. Note that in households containing only people aged 16 to 29, or only people aged 30 or over, the selection weight was simply the number of present adults in the household. Additionally, casi_w2 was trimmed at 6 to avoid a small number of very high weights, which would inflate the standard errors, reduce the precision of the survey estimates and make the weighted sample less efficient.
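The selection weight can be sketched as below. The sketch rests on an interpretation that the text does not spell out: in mixed-age households, the selected adult is assumed to come from the 16 to 29 group with probability 0.8, with equal chances within each age group. The function and parameter names are hypothetical.

```python
def casi_selection_weight(is_young, n_young_present, n_older_present,
                          young_group_prob=0.8, cap=6.0):
    """Selection weight casi_w2 = 1 / selection probability, trimmed at `cap`.

    Assumption (not confirmed by the source): in mixed-age households the
    16 to 29 group as a whole is chosen with probability `young_group_prob`,
    then one person is drawn with equal probability within the chosen group.
    """
    n_present = n_young_present + n_older_present
    if n_present < 1:
        raise ValueError("at least one adult must be present")
    if n_young_present > 0 and n_older_present > 0:
        if is_young:
            probability = young_group_prob / n_young_present
        else:
            probability = (1.0 - young_group_prob) / n_older_present
    else:
        probability = 1.0 / n_present  # single age group: equal probability
    return min(1.0 / probability, cap)


young_mixed = casi_selection_weight(True, 1, 2)    # 1 / 0.8
older_mixed = casi_selection_weight(False, 1, 2)   # 1 / 0.1 = 10, trimmed to 6
young_only = casi_selection_weight(True, 3, 0)     # three present adults
print(young_mixed, older_mixed, young_only)
```

Under this interpretation, older adults in mixed-age households receive the largest weights, which is where the trim at 6 takes effect.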
5.5.3 Calibration weighting
The final stage of the weighting procedure was to adjust the weights using calibration weighting in Stata. Specifically, the composite weight from the previous stages (casi_w3) was adjusted so that the distribution for groups defined by age and sex and region matched 2022 mid-year population estimates of household residents (see Appendix Q). The population estimates used were based on Census data in England, with an adjustment to estimate household residents only. After calibration the weights were checked for outliers and the top weight trimmed.