Materials and methods
Study design and data source
A prospective cohort study was conducted using data from the Canadian Longitudinal Study on Aging (CLSA). The CLSA is a national cohort study that collects data on the health of Canadian adults every 3 years.23 24 Baseline data collection was completed from 2011 to 2015, with the participation of 51 338 community-dwelling adults between the ages of 45 and 85. Follow-up 1 (FUP1) data collection was completed from 2015 to 2018, with 44 817 adults participating (response rate=87%). At baseline, participants were required to independently complete the surveys, reside in 1 of the 10 provinces and respond in French or English. Individuals who were residing in the Canadian territories and some remote regions, First Nations reserves and other First Nations settlements in the provinces or institutions at the time of recruitment were excluded. Additionally, full-time members of the armed forces and individuals with cognitive impairment at time of recruitment were excluded. Detailed information on the CLSA is available at https://www.clsa-elcv.ca/data-collection.
In April 2020, the CLSA launched the COVID-19 Study. All pre-existing CLSA participants with valid contact information who could independently complete the surveys (N=42 511) were invited to participate by email (N=34 428) or telephone (N=8083). Of the eligible participants, 28 559 completed the COVID-19 baseline survey from 15 April 2020 to 30 May 2020 (response rate=67%). Afterwards, two biweekly (if participating via telephone) or four weekly (if participating via web) surveys were administered. Participants continued to complete the surveys via telephone or web on a monthly basis in July, August and September 2020 before completing the final COVID-19 exit survey from 29 September to 30 December 2020 (N=24 114).
Patient and public involvement
There was no patient or public involvement at any stage of this study. Participants were required to provide informed consent prior to participation in the CLSA.
Exposures and covariates
For the primary objective, the main exposure was the period of time: prepandemic (2011–2018) versus during the pandemic (2020). In the regression models, the prepandemic period was treated as the reference group. The following covariates were selected a priori for inclusion in the models based on commonly identified smoking risk factors: age group, sex, region of residence, urban or rural residence, immigrant background, racial background, marital status, household income and education level.25–27 All covariate data except age were taken from the CLSA baseline so that the prevalence of smoking could be compared while holding all other characteristics constant. Age was included as a time-varying covariate and imputed for individuals with missing data. Region of residence was determined by asking participants which province they resided in; provinces were categorised as Ontario, Quebec, British Columbia, Atlantic (Newfoundland, Nova Scotia, New Brunswick, Prince Edward Island) and Prairies (Alberta, Manitoba, Saskatchewan). Participants were categorised as residing in an urban or rural area based on postal code linkage with the Statistics Canada Postal Code Conversion File. Participants were categorised as immigrants if they indicated they were not born in Canada. Racial background was measured by asking participants what racial or cultural background they best identified with. All non-white participants were grouped together because of small sample sizes. For marital status, participants were categorised into the following groups: single/never married/never lived with a partner, married/in a common-law relationship, widowed/divorced/separated. For education, participants were categorised into the following groups: less than secondary school, secondary school graduation, some postsecondary (eg, started but did not graduate) and postsecondary.
For the secondary objective, we considered tobacco smoking during the COVID-19 pandemic as the main exposure. Data on tobacco smoking during the pandemic were collected in the CLSA COVID-19 baseline survey (15 Apr 2020–30 May 2020). Participants were first asked if they had ever smoked in their lifetime. If participants indicated that they had, they were asked if they currently smoked daily, occasionally or not at all. For our analysis, participants who reported not currently smoking (formerly smoked) and participants who had never smoked in their lifetime (never smoked) were grouped together as not smoking. Participants who smoked daily or occasionally were grouped together, due to the small number of participants who reported occasional smoking. The original proportions are displayed in online supplemental figure 1.
Outcomes
For the primary objective, the outcome was daily or occasional tobacco smoking prior to and during the COVID-19 pandemic. As noted earlier, data on smoking during the COVID-19 pandemic were collected in the COVID-19 baseline survey. Data on tobacco smoking were also collected at CLSA baseline and FUP1, allowing us to calculate the odds of smoking early in the pandemic, relative to before the pandemic. The questions at FUP1 were identical to those at COVID-19 baseline, asking if participants had ever smoked in their lifetime and if they were currently smoking daily, occasionally or not at all. Participants who indicated smoking daily or occasionally were grouped together. The questions at CLSA baseline were structured differently. Participants were first asked if they had ever smoked 100 cigarettes in their lifetime. If they indicated yes, they were asked if they were currently smoking daily, occasionally or not at all. Participants who reported smoking not at all (formerly smoked) and participants who reported never smoking 100 cigarettes in their lifetime (never smoked) were grouped together, while participants who smoked daily or occasionally were grouped together.
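As an illustration, the grouping of the smoking questions into a binary outcome could be expressed as follows. This is a minimal sketch rather than the study's own code: the function name, argument names and response labels are hypothetical, and treating an unanswered question as missing is an assumption.

```python
from typing import Optional

# Response labels are hypothetical; the surveys' actual coding may differ.
CURRENT_SMOKING = {"daily", "occasionally"}


def smokes_currently(ever_smoked: Optional[bool],
                     current_frequency: Optional[str]) -> Optional[int]:
    """Return 1 for daily or occasional smoking, 0 for former or never smoking.

    `ever_smoked` stands for the screening question: ever smoked (FUP1 and
    COVID-19 baseline) or ever smoked 100 cigarettes (CLSA baseline).
    None is returned when the status cannot be determined (assumption).
    """
    if ever_smoked is None:
        return None
    if not ever_smoked:
        return 0  # never smoked: grouped with not smoking
    if current_frequency is None:
        return None
    # "not at all" (formerly smoked) falls through to 0
    return 1 if current_frequency in CURRENT_SMOKING else 0
```

The same function can be applied at each of the three time points because only the wording of the screening question differs between CLSA baseline and the later surveys.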
For the secondary objective, the outcome was a PHM score developed by De Rubeis et al,28 which summarised adherence to the guidelines during the COVID-19 pandemic.29 30 The summary measure included data from the COVID-19 baseline and three monthly follow-up surveys. Data on five behaviours were collected: self-quarantining, attending a public gathering, leaving home, masking and handwashing. Handwashing was only measured in the baseline survey and masking was only included in the monthly surveys. The other behaviours were included in all four surveys. For each behaviour, participants were assigned a score ranging from 0 to 1 depending on whether they had adhered (1) or did not adhere (0) to the guidance given by public health authorities. A score for each of the four surveys was calculated by averaging the scores for the behaviours measured in that survey. A final PHM score was calculated by averaging the scores on the baseline and monthly surveys. The final PHM score was then divided into quartiles and the middle two quartiles were grouped together because they had similar adherence scores. Overall, participants had either a low, medium or high level of adherence. Scores for the individual behaviours were also averaged across time and converted into a three-level outcome using the same quartile method. Participants with missing data for more than one of the surveys were not included. Details on the PHM score, including the questions and the behaviour frequencies, are included in online supplemental tables 1 and 2.
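The scoring steps described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the scoring code of De Rubeis et al: the function names and behaviour keys are hypothetical, a missing behaviour within a survey is simply skipped, and scores exactly on a quartile boundary are assigned to the lower group.

```python
from statistics import mean, quantiles


def survey_score(behaviours):
    """Average the 0-1 adherence scores of the behaviours in one survey."""
    observed = [v for v in behaviours.values() if v is not None]
    return mean(observed) if observed else None


def phm_score(surveys, max_missing=1):
    """Average the per-survey scores across the four surveys.

    `surveys` holds one dict per survey, or None for a missing survey.
    Participants missing more than `max_missing` surveys get None.
    """
    scores = [survey_score(s) if s is not None else None for s in surveys]
    if sum(s is None for s in scores) > max_missing:
        return None
    return mean([s for s in scores if s is not None])


def adherence_levels(final_scores):
    """Label each score low, medium or high using quartile cut points."""
    q1, _median, q3 = quantiles(final_scores, n=4)
    levels = []
    for score in final_scores:
        if score <= q1:      # boundary handling is an assumption
            levels.append("low")
        elif score > q3:
            levels.append("high")
        else:                # middle two quartiles grouped together
            levels.append("medium")
    return levels
```

For example, a participant scoring 0.75, 0.5, 0.5 and 0.25 on the four surveys would receive a final PHM score of 0.5 before the quartile grouping is applied.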
Statistical analyses
We described the characteristics of the 44 139 participants at CLSA baseline (used to model the change in smoking using weighted generalised estimating equations (WGEEs)) and the 27 929 participants at COVID-19 baseline (used to model adherence to PHMs).
For the primary objective, we used WGEE regression models with a logit link to calculate the prevalence of smoking in the prepandemic period and during the COVID-19 pandemic for 44 139 participants. WGEE models were used to account for longitudinal data missing at random and to reduce bias in prevalence estimates from survey non-response over time.31–33 We structured the data so that participants could only have missing information on their smoking status in a monotonic pattern.34 Participants did not need to report their smoking status at all three time points, as WGEEs are able to incorporate their available values by using subject-specific weights that account for the probability of dropout.34 However, participants were excluded if they had missing data on any of the SES characteristics at baseline, answered the smoking questions intermittently or lacked information on smoking because of death. This information is summarised in figure 1. Of the 44 139 participants, 39 830 answered the smoking questions at FUP1 (response rate=89%) and 25 767 answered the smoking questions at COVID-19 baseline (response rate=58%).
Figure 1. Flow chart showing the selection of Canadian Longitudinal Study on Aging (CLSA) participants for the weighted generalised estimating equation models. SES, socioeconomic status.
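To make the monotone missingness requirement concrete, the sketch below checks whether a participant's sequence of smoking responses is monotone and shows one common form of inverse probability weighting for dropout. It is an assumption-laden illustration: the function names are hypothetical, the response probabilities are taken as given rather than estimated, and the observation-level weights shown here are not necessarily the subject-specific weights used in the study.

```python
from typing import List, Optional, Sequence


def is_monotone(responses: Sequence[Optional[int]]) -> bool:
    """True if no observed value follows a missing one (dropout only)."""
    seen_missing = False
    for value in responses:
        if value is None:
            seen_missing = True
        elif seen_missing:
            return False  # intermittent response: missing, then observed
    return True


def observation_weights(stay_probs: Sequence[float],
                        n_observed: int) -> List[float]:
    """Inverse of the cumulative probability of still responding.

    stay_probs[k] is a hypothetical, externally estimated probability of
    responding at time point k given a response at time point k - 1
    (1.0 at the first time point). One weight is returned per observed
    time point.
    """
    weights = []
    cumulative = 1.0
    for p in stay_probs[:n_observed]:
        cumulative *= p
        weights.append(1.0 / cumulative)
    return weights
```

Under this scheme, a response pattern such as observed, missing, observed would fail the check, whereas observed, observed, missing would pass and contribute two weighted observations.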
We used the WGEE models to determine the prevalence of smoking during the two time periods. We ran the unadjusted models, in which only the primary exposure period was included, and the adjusted model, in which we controlled for all SES characteristics. Then, we examined the interaction of the period variable with each of the SES characteristics. When the interaction was significant (p<0.10 using the Wald statistic),35 we stratified the sample by the different SES subgroups and calculated the prevalence of smoking at the different periods. We reported when the period variable was significant (p<0.05) for the different subgroups, suggesting a change in smoking across time for the subgroup. The odds of reporting smoking during the pandemic, relative to before the pandemic, were also modelled.
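For reference, the adjusted model described above can be written in a generic form. The notation is ours and is an assumption, not taken from the original analysis: Y_{it} is the smoking indicator for participant i at time point t, Pandemic_t equals 1 at COVID-19 baseline and 0 at CLSA baseline and FUP1, and x_{it} is the vector of SES covariates.

```latex
\begin{align}
\operatorname{logit}\,\Pr(Y_{it}=1)
  &= \beta_0 + \beta_1\,\mathrm{Pandemic}_t
     + \boldsymbol{\gamma}^{\top}\mathbf{x}_{it}, \\
\mathrm{OR}_{\text{pandemic vs prepandemic}} &= \exp(\beta_1), \\
\Pr(Y_{it}=1)
  &= \frac{1}{1+\exp\!\bigl[-\bigl(\beta_0 + \beta_1\,\mathrm{Pandemic}_t
     + \boldsymbol{\gamma}^{\top}\mathbf{x}_{it}\bigr)\bigr]}.
\end{align}
```

The unadjusted model omits the covariate term, and an interaction model adds a product term between Pandemic_t and one SES characteristic, whose Wald test corresponds to the p<0.10 screening step.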
For the secondary objective, we used multinomial logistic regression to evaluate the association of tobacco smoking with the PHM score. We assessed the odds of being categorised into the low, medium and high adherence groups, treating the low adherence group as the reference group. The models were adjusted for the following SES characteristics: age group, sex, region of residence, urban or rural residence, immigrant background, racial background, marital status, household income and education level. More recent data were available on some of the characteristics. Therefore, data on household income and marital status were extracted from FUP1, while age, region of residence and urban or rural residence were extracted from the COVID-19 baseline survey.
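In generic notation, a multinomial logistic model with low adherence as the reference category takes the following form. The symbols are illustrative assumptions: A_i is the adherence level of participant i, S_i is the smoking indicator (1 for daily or occasional smoking) and z_i is the vector of SES covariates.

```latex
\begin{equation}
\log\frac{\Pr(A_i = k)}{\Pr(A_i = \text{low})}
  = \alpha_k + \theta_k\,S_i + \boldsymbol{\eta}_k^{\top}\mathbf{z}_i,
  \qquad k \in \{\text{medium}, \text{high}\}.
\end{equation}
```

Here exp(θ_k) is the odds ratio of being in adherence group k rather than the low adherence group for participants who smoked compared with those who did not, holding the covariates fixed.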