Comparison of seven models for the progression patterns of multiple chronic conditions in longitudinal studies
Abstract
Introduction Studies investigating the relationship between patterns of multimorbidity and risk of a new condition have typically defined the patterns at a baseline time and used Kaplan-Meier (KM) or Cox proportional hazards regression. These methods do not consider the competing risk of death or the changes in the patterns of conditions over time. This study illustrates how these methodological limitations can be overcome in the setting of progression from cardiometabolic conditions to dementia.
Methods Data from 11 930 women who participated in the Australian Longitudinal Study on Women’s Health were used to define patterns of diabetes, heart disease and stroke and estimate the cumulative incidence or HRs of subsequent dementia. Seven methods were compared. For cumulative incidence these were KM method, cumulative incidence function (CIF) (to account for the competing risk of death) and multistate model with Aalen-Johansen estimates (to account also for the progression of conditions over time). For HRs, the corresponding methods were Cox model and Fine and Gray model (for sub-HRs) with the cardiometabolic patterns treated as time-invariant (from baseline) or as time-varying predictors.
Results The estimated cumulative incidence of dementia using the KM method declined when the competing risk of death was considered. For example, for women with no cardiometabolic condition at baseline, the KM and CIF estimates were 35.7% (95% CI 34.6%, 36.8%) and 27.3% (26.4%, 28.2%) but these women may have developed cardiometabolic conditions during the study which would increase their risk. The Aalen-Johansen multistate estimate for women with no cardiometabolic condition over the whole study period was 11.0% (10.4%, 11.7%). Comparing models to estimate HRs, the estimates in the Fine and Gray models were lower than those in the Cox models.
Conclusions Multistate and time-varying survival analysis models should be used to study the natural development of multimorbidity.
What is already known on this topic
Previous studies have defined the patterns of multimorbidity based on reports of multiple chronic conditions at a baseline time and employed standard survival analysis methods to estimate the risk of diagnosis of a new condition. Specifically, they have used the Kaplan-Meier (KM) method to estimate the cumulative incidence or the Cox proportional hazards model to estimate HRs. These methods do not consider the competing risk of death or changing risk if the patterns of multimorbidity change after the baseline.
What this study adds
These standard methods overestimate cumulative incidence and HRs. The effects were illustrated using data from a 25-year population-based longitudinal study showing the progression of cardiometabolic multimorbidity (from diabetes, heart disease and stroke) and the risk of dementia. For cumulative incidence, multistate models with Aalen-Johansen estimates can be used to account for competing risks (such as death) and changes in risk that may occur with changing patterns of multimorbidity. For HRs, competing risk models with time-varying predictors, such as that proposed by Fine and Gray, can be used to take into account the progression of multimorbidity.
How this study might affect research, practice or policy
Estimates of cumulative incidence and HRs of disease progression in the context of multimorbidity must take into account both the competing risk of death and changes in patterns of disease that can affect the risk of further chronic disease.
Introduction
The association between multimorbidity, defined as the existence of at least two chronic conditions, and poor health outcomes and greater use of health services has been well-documented.1 Multimorbidity and increasing age may put people at higher risk of additional chronic conditions.2 For example, several studies have investigated the relationship between patterns of multimorbidity (eg, patterns of cardiometabolic multimorbidity) and the risk of being diagnosed with a new condition (eg, dementia).3–7 However, as Hu has recently pointed out, the association will be affected by both the competing risk of death and the changes in the patterns of multimorbidity over time or as the study population ages.8
The commonly used analytic approach involves defining the patterns of multimorbidity at a baseline time and estimating the risk of diagnosis with a new condition using standard survival analyses such as the Kaplan-Meier method (KM) (to estimate cumulative incidence) or the Cox proportional hazards regression model (to estimate HRs). These methods have drawbacks. The first limitation is the failure to consider the competing risk of death. A competing risk is an event whose occurrence precludes the occurrence of the primary outcome of interest.9 For example, death is a competing risk that changes the probability of diagnosis with other conditions.10 The second limitation is that the progression of conditions between the baseline and the occurrence of the outcome is not captured.8 Often, the incidence of conditions increases with age. Therefore, if the subjects are recruited at different ages, the distribution of the conditions and hence the patterns of multimorbidity will differ.11
This paper describes the statistical methods that can be used to overcome these limitations and shows how they reduce bias when the aim is to estimate the cumulative incidence or improve the goodness of fit when the aim is to estimate HRs associated with different patterns of multimorbidity.
For illustration purposes, this research examines the risk of incident dementia (as the outcome) for subjects with different progression patterns of cardiometabolic multimorbidity (as the exposures). For simplicity, covariates (such as smoking) are not considered. Dementia was selected as the main outcome as this disease is a potential accelerator of death.12 Cardiometabolic conditions were selected as exposure variables as the association between these conditions and the risk of dementia has been confirmed in several studies.13 14
As a case study, patterns of multimorbidity are illustrated by combinations of three cardiometabolic conditions: diabetes (D), heart disease (H) and stroke (S). At any time, the study participants can be categorised into eight mutually exclusive patterns: no disease (none), diabetes only (D), heart disease only (H), stroke only (S), diabetes and heart disease (D+H), diabetes and stroke (D+S), heart disease and stroke (H+S), and diabetes and heart disease and stroke (D+H+S). Seven different methods of analysis are explained and illustrated using data from participants in the Australian Longitudinal Study on Women’s Health (ALSWH).
Materials and methods
Survival analysis is based on the time (shown by t) from a baseline until the occurrence of an event of interest. Two main functions in survival analysis are the hazard function (shown by h) and the survival function (shown by S). In the absence of competing risks, the hazard function describes the instantaneous rate of occurrence of the event of interest in subjects who are still at risk of the event. The survival function at time t describes the probability of being event-free at least up to time t. Technical details to estimate hazard and survival functions (ie, h and S) are provided in online supplemental materials.
Estimation of the cumulative incidence
The KM method gives a simple estimate of the survival function. It relies on the assumption of non-informative censoring: individuals who are censored (ie, are no longer contributing data after time t) are assumed to have the same future risk of the outcome as those who remain under observation. This assumption is violated when the reason for censoring is the occurrence of a competing event such as death, after which the outcome can no longer occur. Therefore, KM estimates of cumulative incidence tend to be biased upward.15 16
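As a rough illustration of the product-limit calculation behind the KM method, the following minimal sketch uses only the Python standard library. The function name `kaplan_meier` and the toy data are illustrative assumptions and are not taken from the study, which used the R package ‘survival’.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up time for each subject
    events : 1 if the outcome was observed at that time, 0 if censored
    Returns a list of (t, S(t)) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d = 0        # outcomes observed at time t
        leaving = 0  # subjects leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            d += data[i][1]
            leaving += 1
            i += 1
        if d > 0:
            surv *= 1.0 - d / n_at_risk
            curve.append((t, surv))
        n_at_risk -= leaving
    return curve


# Hypothetical toy data: five subjects, two of them censored.
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
for t, s in curve:
    print(f"t={t}: S(t)={s:.3f}, cumulative incidence={1 - s:.3f}")
```

In this sketch, censored subjects simply leave the risk set, which is exactly where the non-informative censoring assumption enters.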
More generally, in the presence of competing risks, a person can experience one of K different events (including the outcome of interest). The cumulative incidence function (CIF) method allows for the estimation of the cumulative incidence while taking the competing risks into account. Assume status is an indicator variable denoting the type of event that occurred. CIF for the kth event (eg, dementia) is an estimate of the probability of experiencing the kth event before time t and before the occurrence of any competing events (eg, death) (see online supplemental materials for technical details).
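The difference between the CIF and a KM estimate that treats competing events as censoring can be sketched as follows. This is a minimal standard-library illustration of the usual nonparametric CIF estimator, not the study's code (the authors used the R package ‘tidycmprsk’); the function name `cif_vs_naive_km` and the toy data are assumptions made for the example.

```python
def cif_vs_naive_km(times, status, cause=1):
    """Compare the CIF for `cause` with 1 - KM that censors competing events.

    status : 0 = censored, 1, 2, ... = type of event that occurred
    Returns rows (t, CIF(t), 1 - KM(t)) at each distinct event time.
    """
    data = sorted(zip(times, status))
    n_at_risk = len(data)
    event_free = 1.0  # probability of no event of any type just before t
    km_surv = 1.0     # KM survival treating competing events as censored
    cif = 0.0
    rows = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d_cause = d_any = leaving = 0
        while i < len(data) and data[i][0] == t:
            s = data[i][1]
            d_cause += int(s == cause)
            d_any += int(s != 0)
            leaving += 1
            i += 1
        if d_any > 0:
            # Only subjects still free of every event can contribute.
            cif += event_free * d_cause / n_at_risk
            event_free *= 1.0 - d_any / n_at_risk
            km_surv *= 1.0 - d_cause / n_at_risk
            rows.append((t, cif, 1.0 - km_surv))
        n_at_risk -= leaving
    return rows


# Hypothetical toy data: status 1 = dementia, 2 = death, 0 = censored.
for t, cif, naive in cif_vs_naive_km([2, 3, 4, 5, 8], [1, 2, 1, 2, 0]):
    print(f"t={t}: CIF={cif:.3f}, 1-KM={naive:.3f}")
```

In the toy data the naive 1 − KM value is never below the CIF, which mirrors the upward bias of the KM method described above.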
In contrast to the competing risk models in which people can only experience one of the outcomes of interest or a competing event, the multistate model provides a flexible framework that captures the progression of events and allows transition between several events over the life course.17 The term state is used to describe any one or combination of events or conditions (eg, corresponding to different multimorbidity patterns). Multistate models are described by two main quantities: transition hazard and transition probability. The hazard for the transition between one state and the next (say from state m to state n at time t) is the instantaneous rate of occurrence of the second state among people who are in the first state.18 The estimated transition hazards are then used to obtain the Aalen-Johansen estimates of the cumulative incidence.17 This differs from the KM and CIF methods which estimate the cumulative incidence of the outcome from baseline pattern (ie, ignoring changes in these patterns that may have occurred during the study period).
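To make the Aalen-Johansen idea concrete, the sketch below multiplies, over the ordered transition times, matrices of the form (identity + observed transition proportions). It is a simplified standard-library illustration under stated assumptions: the data structure `paths`, the state numbering and the toy histories are hypothetical, and the study itself used the R package ‘mstate’.

```python
def aalen_johansen(paths, n_states):
    """Aalen-Johansen estimate of the transition probability matrix.

    paths : one list per subject of (time, state) visits in time order;
            a final (time, None) entry marks censoring.
    Returns P as nested lists, where P[m][n] estimates the probability of
    being in state n at the last transition time, given state m at time 0.
    """
    # One record per sojourn: (state, entry time, exit time, next state).
    sojourns = []
    for path in paths:
        for (t0, s0), (t1, s1) in zip(path, path[1:]):
            sojourns.append((s0, t0, t1, s1))
    event_times = sorted({b for _, _, b, nxt in sojourns if nxt is not None})

    identity = [[float(i == j) for j in range(n_states)] for i in range(n_states)]
    P = [row[:] for row in identity]
    for t in event_times:
        step = [row[:] for row in identity]
        for m in range(n_states):
            at_risk = sum(1 for s, a, b, _ in sojourns if s == m and a < t <= b)
            if at_risk == 0:
                continue
            for s, a, b, nxt in sojourns:
                if s == m and b == t and nxt is not None:
                    step[m][nxt] += 1.0 / at_risk
                    step[m][m] -= 1.0 / at_risk
        P = [[sum(P[i][k] * step[k][j] for k in range(n_states))
              for j in range(n_states)] for i in range(n_states)]
    return P


# Hypothetical states: 0 = none, 1 = stroke, 2 = dementia, 3 = death.
paths = [
    [(0, 0), (2, 1), (5, 2)],  # none -> stroke -> dementia
    [(0, 0), (3, 2)],          # none -> dementia
    [(0, 0), (4, 3)],          # none -> death
    [(0, 0), (6, None)],       # censored while in 'none'
    [(0, 0), (1, 1), (7, 3)],  # none -> stroke -> death
]
P = aalen_johansen(paths, 4)
print([round(p, 3) for p in P[0]])
```

Unlike the KM and CIF sketches, subjects here can move through intermediate states, so the at-risk count is taken separately for each state at each transition time.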
Estimation of the HR
The Cox proportional hazards regression model relates the hazard function (ie, rate of occurrence of the outcome) to the independent variables (eg, baseline patterns of multimorbidity). Like the KM method, the Cox regression relies on the assumption of noninformative censoring. To overcome this limitation, competing risk models can be used such as that suggested by Fine and Gray.9 They defined a subdistributional hazard function to describe the instantaneous rate of occurrence of the kth event in people who have not yet experienced this event including those who have experienced a competing event. The estimated HRs should be interpreted in terms of the relative incidence of the outcome rather than hazards.
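The distinction between the two hazards can be written out as follows. The notation is an assumption introduced here for illustration (T for the event time and \varepsilon for the event type); the source gives its own technical details in the online supplemental materials.

```latex
% Cause-specific hazard for event type k:
% the risk set contains only subjects who have had no event of any type.
h_k^{\mathrm{cs}}(t) = \lim_{\Delta t \to 0}
  \frac{P\left(t \le T < t + \Delta t,\ \varepsilon = k \mid T \ge t\right)}{\Delta t}

% Subdistribution hazard (Fine and Gray):
% the risk set also retains subjects who have already had a competing event.
h_k^{\mathrm{sd}}(t) = \lim_{\Delta t \to 0}
  \frac{P\left(t \le T < t + \Delta t,\ \varepsilon = k \mid
        T \ge t \ \text{or}\ (T < t \ \text{and}\ \varepsilon \ne k)\right)}{\Delta t}

% The subdistribution hazard maps directly onto the cumulative incidence:
\mathrm{CIF}_k(t) = 1 - \exp\left(-\int_0^t h_k^{\mathrm{sd}}(u)\,du\right)
```

The last identity is why HRs from a Fine and Gray model are read in terms of relative incidence rather than hazards among those still event-free.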
Both Cox and Fine and Gray models can be used to estimate relative hazards of the outcome of people with various combinations of conditions at baseline compared with the rate for people with none of these conditions. However, these analyses ignore the progression of multimorbidity during the study period. This limitation can be overcome using the time-varying Cox regression and time-varying Fine and Gray models. For these models, the patterns of conditions are treated as time-varying covariates.19
Study sample
ALSWH is a prospective nationwide population-based study of four cohorts born in 1989–1995, 1973–1978, 1946–1951 and 1921–1926.20 The current analyses relate to women born in 1921–1926 who consented to data linkage.21 The baseline survey was conducted in 1996, when the women were aged 70–75 years, and subsequent surveys were conducted in a 3-year cycle up to 2011. Since November 2011, the women have been surveyed every 6 months. Data on the first record of cardiometabolic conditions (ie, diabetes, heart disease and stroke) and dementia have been obtained from multiple sources, including hospital and medication records, aged care assessments, death certificates and survey data completed by the participants or other informants (online supplemental table S1). Diabetes included type 1 and 2 diabetes mellitus and excluded gestational diabetes. Heart disease included heart surgery (heart bypass, angioplasty and angiography) and acute coronary syndrome but not heart failure. Stroke included ischaemic and haemorrhagic stroke. Dementia included Alzheimer’s dementia, vascular dementia and unspecified dementia. For data analysis, it was assumed that once a condition was reported, it persisted for the rest of the follow-up period. This assumption may be realistic given the chronic nature of the conditions.
Of the 12 432 women who participated in the first ALSWH survey in 1996, 12 070 women consented to data linkage. Removing 125 women for whom the first record of a condition was after or the same as the date of death and another 15 women with a record of dementia before the baseline survey, the sample size for data analysis was 11 930. The final date for follow-up was 31 December 2019 (when the surviving women would have been aged 93–98).
Statistical analysis
Seven methods of analysis were used: three methods to estimate the cumulative incidence of dementia (and competing events) up to the age of 90, and four methods to estimate HRs. Methods applied, the measurement of the exposure variables (ie, patterns of cardiometabolic multimorbidity), and the limitations of each method are summarised in table 1. The different methods need different data layouts and time metrics. An example of the appropriate data layout for different methods is provided in online supplemental materials.
Table 1 List of methods applied, measurement of exposure and limitations
The KM method was used to estimate the cumulative incidence of dementia from the baseline combinations of D, H and S. In the KM method, status was defined as 1 for women with a record of dementia and 0 otherwise. Time was defined as the age at dementia for those with a record of dementia, or the age at death or censoring (ie, age on 31 December 2019) for other women.
The CIF method was used to estimate the cumulative incidence of dementia from the baseline combinations of D, H and S while considering the competing risk of death. In the CIF method, status was defined as 1 for women with a record of dementia, 2 for women who died without a record of dementia and 0 otherwise. Time was defined as the age at the corresponding events.
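The status and time coding described for the KM and CIF methods can be sketched as a small helper. The function name `code_outcome` and its arguments are hypothetical; the coding rules themselves follow the two paragraphs above.

```python
def code_outcome(age_dementia, age_death, age_censor):
    """Return (time, km_status, cif_status) for one woman.

    age_dementia / age_death are None when the event was not recorded;
    age_censor is the age at the end of follow-up.
    KM status : 1 = dementia, 0 = otherwise (death is treated as censoring).
    CIF status: 1 = dementia, 2 = death without dementia, 0 = censored.
    """
    if age_dementia is not None:
        return age_dementia, 1, 1
    if age_death is not None:
        return age_death, 0, 2
    return age_censor, 0, 0


# Hypothetical examples.
print(code_outcome(85.2, 88.0, 95.0))  # dementia recorded before death
print(code_outcome(None, 83.5, 95.0))  # died without a record of dementia
print(code_outcome(None, None, 95.0))  # alive and dementia-free at the end
```

The only difference between the two codings is how death without dementia is labelled, which is what separates a censored observation from a competing event.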
Neither the KM nor the CIF method captures the progression of conditions (and hence the progression of multimorbidity patterns). The multistate model with the Aalen-Johansen method was used to estimate the transition probabilities between all states (more details in online supplemental materials), and the cumulative incidence of dementia (and all other competing events) from the last combination of D, H and S. State was defined as a condition or combination of conditions (eg, ‘diabetes’ or ‘diabetes and stroke’). Death was defined as the absorbing state. Cox regression with separate baseline hazard functions for each transition was used to estimate the transition hazards for all possible transitions (more details in online supplemental materials). To capture the effect of ageing while retaining the Markov assumption, survival time was measured from the beginning of the study (called clock-forward) rather than from the age at entry to each state (called clock-reset). In other words, the time at which each state was reached was set as the age on entering that state.
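The two time scales can be contrasted with a short sketch. The helper `sojourn_interval` is a hypothetical name, and the ages are made up for illustration.

```python
def sojourn_interval(age_entry, age_exit, clock="forward"):
    """Return the (start, stop) interval for one stay in a state.

    clock="forward": time keeps running from the origin, so the interval
                     is expressed on the age scale.
    clock="reset"  : time restarts at zero on entry to each state.
    """
    if clock == "forward":
        return (age_entry, age_exit)
    if clock == "reset":
        return (0, age_exit - age_entry)
    raise ValueError("clock must be 'forward' or 'reset'")


# A hypothetical woman who enters the stroke state at 77 and leaves it at 85.
print(sojourn_interval(77, 85, clock="forward"))
print(sojourn_interval(77, 85, clock="reset"))
```

With the clock-forward scale, two women who enter the same state at different ages are compared at the same age rather than at the same time since entry, which is how the effect of ageing is retained.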
Standard Cox and Fine and Gray methods were used to estimate HRs of the baseline combinations of D, H and S compared with no baseline conditions, ignoring changes in the patterns during the study period. For these analyses, time was defined as the difference between the last follow-up time (ie, age of dementia, death or censoring whichever was reported first) and the age at baseline. In the standard Cox method, status was defined as 1 for women with a record of dementia and 0 otherwise. In the Fine and Gray model, status was defined as 1 for women with a record of dementia, 2 for women who died without a record of dementia and 0 otherwise.
Finally, time-varying Cox and Fine and Gray methods were applied to take the changing patterns into account, with the multimorbidity pattern treated as a time-varying covariate. In these time-dependent models, start and stop variables were used to define the periods during which women contributed to the analysis with different progression patterns. For example, for a woman with a report of diabetes at the age of 75, stroke at the age of 77 and dementia at the age of 85, the following intervals were defined: (0, 75), (75, 77) and (77, 85) (more details in online supplemental materials). The status was defined for each start and stop interval. In the time-dependent Cox model, for the last interval, status was coded as 1 for women with a record of dementia and 0 otherwise; status was 0 for all earlier intervals. In the time-dependent Fine and Gray model, for the last interval, status was coded as 1 for women with a record of dementia, 2 for women who died without a record of dementia and 0 otherwise; again, status was 0 for all earlier intervals. For all four regression models, the Bayesian information criterion (BIC) was estimated as a measure of the goodness of fit.
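The start and stop layout in the worked example can be generated with a sketch like the one below. The function name `split_follow_up` and its arguments are hypothetical. The sketch assumes that conditions persist once reported, and it groups conditions first recorded at the same age into one interval rather than splitting ties randomly as the study did.

```python
def split_follow_up(onsets, end_age, end_status):
    """Build (start, stop, pattern, status) rows for one woman.

    onsets     : dict mapping 'D', 'H' or 'S' to the age at first record
    end_age    : age at dementia, death or censoring, whichever came first
    end_status : status code for the final interval (eg, 1 = dementia)
    """
    def label(conditions):
        return "+".join(c for c in "DHS" if c in conditions) or "None"

    rows = []
    start = 0
    current = set()
    # Only onsets before the end of follow-up change the exposure pattern.
    for age in sorted({a for a in onsets.values() if a < end_age}):
        if age > start:
            rows.append((start, age, label(current), 0))
            start = age
        current |= {c for c, a in onsets.items() if a == age}
    rows.append((start, end_age, label(current), end_status))
    return rows


# The example from the text: diabetes at 75, stroke at 77, dementia at 85.
for row in split_follow_up({"D": 75, "S": 77}, end_age=85, end_status=1):
    print(row)
```

Each row carries the pattern that applied during that interval, and only the final row can carry a non-zero status.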
Assumptions
The proportional hazards assumption was checked using a test of interaction with time. Moreover, an assumption behind multistate and time-dependent models is that no more than one event per person can happen at the same time. In our data, some participants had two conditions reported on the same date (known as a tie); the number of ties ranged from 59 (0.5%) for ischaemic heart disease (IHD) and dementia to 184 (1.5%) for IHD and stroke. When a tie occurred, the tied events were split randomly. As a sensitivity analysis, the analysis was repeated excluding the ties, and there were only minor changes (at the level of decimal places) in the estimated risks.
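One possible way to split ties randomly is sketched below. The source does not describe the exact procedure, so this is an assumption: tied events are put in a random order and spread over fractions of the shared day. The function name `split_ties` and the integer day coding are hypothetical.

```python
import random
from collections import defaultdict


def split_ties(events, seed=0):
    """Separate events recorded on the same day by a random ordering.

    events : list of (day, label) with integer day numbers
    Returns (time, label) pairs in which all times for a person are distinct.
    Tied events are shuffled and offset by fractions of a day, so they never
    overtake an event recorded on a later day.
    """
    rng = random.Random(seed)
    by_day = defaultdict(list)
    for day, label in events:
        by_day[day].append(label)

    result = []
    for day in sorted(by_day):
        labels = by_day[day]
        rng.shuffle(labels)
        for i, label in enumerate(labels):
            result.append((day + i / len(labels), label))
    return result


# Hypothetical record: heart disease and stroke reported on the same day.
print(split_ties([(100, "H"), (100, "S"), (250, "dementia")]))
```

Repeating the analysis with different seeds, or with the ties excluded as in the sensitivity analysis, gives a sense of how much the arbitrary ordering matters.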
Software
The following packages in R were used for data analysis: ‘tidyverse’22 and ‘lubridate’23 for data cleaning and preparation, ‘ggplot2’ for visualisation,24 ‘survival’ for KM, standard Cox and time-varying Cox models, ‘tidycmprsk’ for estimation of CIF,25 and ‘mstate’ for the multistate models.18
Results
Patterns of cardiometabolic multimorbidity
Of the 11 930 women in the study sample, 83.8% had no cardiometabolic condition at baseline (ie, at ages 70–75) and the proportions with patterns of ‘S’, ‘D+S’ and ‘D+H+S’ were all below 1.0% (table 2). To show the effects of age on the changing patterns of conditions, corresponding numbers based on the reports of conditions up to the ages 76–81 (ie, age at survey 3) and ages 82–87 (ie, age at survey 5) are given in online supplemental tables S2 and S3. For example, the number (proportion) of women with the pattern of ‘S’ was 101 (0.8%) at the baseline survey (ages 70–75 years), 419 (3.6%) at survey 3 (ages 76–81 years) and 546 (5.3%) at survey 5 (ages 82–87 years).
Table 2 Frequency (percentages) of women with each pattern of cardiometabolic condition at baseline (age 70–75)
Online supplemental figure S1 depicts the progression of cardiometabolic conditions over the study period and before the first report of dementia or death. A total of 3786 women (31.7%) did not have a report of any cardiometabolic condition by the end of the study. The number (proportion) of women with each cardiometabolic condition as the first condition were ‘D’ 1884 (15.8%), ‘H’ 4840 (40.5%) and ‘S’ 1420 (11.9%). Among states of two conditions, the most common was 1476 women (12.4%) whose first and second reported conditions were ‘D’ and ‘H’ in either order (ie, ‘D+H’). Moreover, 534 women (4.5%) had all three conditions before having a report of dementia or death (state of ‘D+H+S’ in online supplemental figure S1). These figures (that contributed to the multistate model) were much higher than the figures where the pattern was defined based on the history of conditions up to a particular survey.
Estimates of cumulative incidence
The estimated cumulative incidence of dementia at different ages for all three methods and all eight patterns is depicted in figure 1. Estimates from the KM and CIF were similar up to the age of 80. After the age of 80, the KM estimates increased at a faster rate. Estimates from the multistate model were lower than those from the other two methods.
Figure 1 Comparison of estimated cumulative incidence of dementia for women with different patterns of cardiometabolic multimorbidity using different methods: Kaplan-Meier (red solid line), cumulative incidence function (blue dashed line) and multistate model (black dotted line).
The estimated cumulative incidence of different outcomes using different methods is summarised in table 3. For example, for women with the pattern of ‘None’, the only possible outcomes in the KM and CIF methods were dementia or death (since these methods defined the pattern based on baseline reports and did not consider the progression of cardiometabolic conditions). However, in the multistate model, the possible outcomes from the pattern of ‘None’ were ‘D’, ‘H’, ‘S’, ‘dementia’ or ‘death’.
Table 3 Cumulative incidence of cardiometabolic conditions, dementia and death up to the age of 90 for women with different progression patterns by different methods
The estimated cumulative incidence of dementia up to the age of 90 using the KM method declined when the competing risk of death was considered using the CIF method (table 3). For example, for women with the baseline pattern of ‘None’, the KM and CIF estimates of cumulative incidence of dementia were 35.7% (95% CI 34.6%, 36.8%) and 27.3% (26.4%, 28.2%), respectively (but these women may have developed cardiometabolic conditions during the study period). The Aalen-Johansen cumulative incidence of dementia as the first event for women with no cardiometabolic condition over the whole study period—not only at the baseline—was 11.0% (95% CI 10.4%, 11.7%).
Comparing the patterns representing one single cardiometabolic condition, the highest cumulative incidence of dementia was seen for women with the pattern of ‘S’. The KM and CIF estimates of cumulative incidence for those with ‘S’ at baseline were 56.6% (40.5%, 68.4%) and 31.7% (22.7%, 40.9%), respectively (but these women may have developed other cardiometabolic conditions before dementia). The Aalen-Johansen cumulative incidence of dementia as the second event for women who first had stroke (at baseline or some other time during the study period) was 17.9% (16.0%, 19.9%). Indeed, many women with the pattern of ‘S’ had a report of heart disease as their next event after stroke (cumulative incidence 25.7%). Such changes in the pattern were ignored in the KM and CIF methods.
The confidence intervals for the Aalen-Johansen estimates are much narrower than for KM or CIF estimates because they are based on data from more women; for example, only 101 women had ‘S’ at baseline (table 2) whereas 1420 women had ‘S’ as their only condition before a diagnosis of dementia (online supplemental figure S1).
Estimates of HRs
No departure from the proportional hazards (PH) assumption was seen. The standard and time-varying Cox and Fine and Gray models show that, compared with women with the baseline pattern of ‘None’, women with the other seven patterns usually had a higher risk of incident dementia (table 4). Online supplemental table S4 shows how the choice of baseline changed the estimated HRs for dementia in the standard Cox regression and Fine and Gray models. Generally, HRs declined when the baseline patterns of cardiometabolic conditions were determined at older ages.
Table 4 Risk of incident dementia for women with different patterns of cardiometabolic conditions
Compared with the standard or time-varying Cox models, the HRs estimated in the Fine and Gray models were lower. The BIC of the time-dependent models was lower than that of the models in which the pattern was treated as a fixed variable (ie, determined based on baseline reports) (table 4).
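For readers unfamiliar with the criterion, the BIC comparison rests on the standard formula sketched below, in which a lower value indicates a better trade-off between fit and complexity. The log-likelihood values and the sample size in the example are hypothetical and are not results from this study. Conventions also differ on whether n is the number of observations or the number of events in survival models, so n is left as an argument.

```python
import math


def bic(log_likelihood, n_params, n):
    """Bayesian information criterion: -2 * logL + k * ln(n)."""
    return -2.0 * log_likelihood + n_params * math.log(n)


# Hypothetical values: two models with seven coefficients each
# (seven patterns compared with 'None'), fitted to the same data.
bic_fixed = bic(log_likelihood=-5000.0, n_params=7, n=3000)
bic_time_varying = bic(log_likelihood=-4950.0, n_params=7, n=3000)
print(round(bic_fixed, 1), round(bic_time_varying, 1))
print("time-varying preferred:", bic_time_varying < bic_fixed)
```

When two models have the same number of parameters and the same n, as in this toy comparison, the difference in BIC reduces to the difference in -2 times the log-likelihood.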
Discussion
Despite the methodological advances in survival analysis, research on disease progression in multimorbidity is still dominated by the use of the KM method and Cox regression. In this research, we used progression from cardiometabolic multimorbidity toward dementia as an example to compare the performance of alternative modelling strategies. The practice of categorising the patients into different patterns from baseline reports of conditions or biological variables is common in other settings such as the progression from mental health disorders toward diabetes,26 from body mass index trajectories in childhood to cardiometabolic conditions in adulthood27–29 or from the trajectory of blood pressure toward cardiovascular diseases.30
Several studies have recommended adjustments for competing risks in fields such as cardiovascular disease, coronary diseases9 31 32 or nephrology,16 especially for elderly populations which may be more frail and at greater risk of competing events occurring before the main outcome.33 However, no study has compared the performance of traditional methods with more advanced methods that take into account the progression of conditions over the follow-up time.
Comparison of methods to estimate the cumulative incidence
Compared with CIF estimates, the estimated cumulative incidence in the KM method was biased upwards. This was consistent with findings from previous studies in other settings.9 31 32 In the KM and CIF methods, the only possible destinations for women with any of the baseline cardiometabolic patterns were dementia or death. This was not the case in the multistate model, as the method captures the progression of conditions after the baseline survey. For example, based on the multistate model, a large proportion of women with the pattern of ‘D’ (at any time during the follow-up) had a report of ‘H’ as their next event (cumulative incidence=44.7%), indicating that women with the pattern of ‘D’ were likely to move to the pattern of ‘D+H’ before having a report of dementia (table 3).
An advantage of the Aalen-Johansen multistate method is that it takes into account the progression of conditions over the study period. This is important, especially for conditions with low prevalence at mid-age. For example, 101 women had the pattern of ‘S’ at baseline (and contributed to the KM and CIF analyses), whereas 1420 women had ‘S’ as their only condition before a diagnosis of dementia (and contributed to the multistate analysis). Moreover, confidence intervals for the Aalen-Johansen estimates were narrower than those for the KM or CIF estimates (as the method captured the progression of conditions after baseline). This indicates that the multistate model provides more precise estimates, as it incorporates information on reports of conditions over the whole study period into the analysis.
Comparison of methods to estimate the HR
Compared with women with the pattern of ‘None’, women with other patterns were at higher risk of incident dementia. This was consistent with the results of previous studies (in which the baseline pattern was used as the exposure and the standard Cox regression as the modelling approach).4 7 34–36 Moreover, HRs in the Fine and Gray models (that consider the competing risk of death) were lower than in the Cox models. Only one of the five previous studies of the association between patterns of cardiometabolic conditions at baseline survey and risk of incident dementia compared results of the standard Cox model with the Fine and Gray model.34 Using the standard Cox model, relative to the participants with no cardiometabolic condition, the HRs for participants with one condition and with cardiometabolic multimorbidity (two or more conditions) were 1.42 (95% CI 1.27, 1.58) and 2.10 (1.73, 2.57), respectively. In the competing risk analysis, the HRs declined to 1.07 (0.97, 1.17) and 0.92 (0.77, 1.09).
In addition to the competing risk of death, Hu recently pointed out that multimorbidity tends to increase with age so that baseline patterns of conditions are likely to change over time.8 This issue is illustrated by comparing table 2 and online supplemental tables S2 and S3, which show how the distribution of cardiometabolic multimorbidity patterns among ALSWH participants changed from 1996, when they were aged 70–75 at the time of survey 1, to 2002, when they were aged 76–81 (corresponding to survey 3), and 2008, when they were aged 82–87 (corresponding to survey 5). Moreover, online supplemental table S4 showed a decline in the estimated HRs for dementia in the standard Cox or Fine and Gray regression when the baseline patterns of cardiometabolic conditions were determined at older ages. This effect is also apparent in the literature.37 This same scenario of baseline patterns of the cardiometabolic conditions of diabetes, heart disease and stroke used to predict the subsequent risk of dementia has been examined by several authors.4 7 34–36 The HRs calculated by standard Cox models are shown in online supplemental table S5.
These examples show the importance of taking changing patterns of multimorbidity into account. This can be done in various ways. One approach is to use the different patterns of conditions as time-varying covariates or predictor variables in Cox or Fine and Gray models. However, there are alternative time-varying models with different assumptions. For example, each of the cardiometabolic conditions—but not the pattern—could be defined as a time-varying covariate. Alternatively, the conditions could be treated as recurrent events.38 To compare results from the time-varying and standard form of the models, for this paper the pattern of conditions was taken as a time-varying covariate. Our results found that the time-dependent models (that captured the progression of conditions over the study period) had better goodness of fit. The data layout required to apply advanced methods discussed in this paper is presented in online supplemental tables S6–S8.
Austin et al have urged caution in the interpretation of estimates based on time-varying covariates, especially from a Fine and Gray model.39 They argued that such variables should be included only if the value of the covariate is known for the entire time that the subject remains in the risk set. In our study, it was assumed that once a subject has a report of a condition, it remains switched on. This assumption is justifiable due to the chronic nature of the cardiometabolic conditions.
Other methods
In this paper, we used baseline reports of cardiometabolic conditions and categorised the subjects into eight mutually exclusive patterns to address the pitfalls of traditional methods in which competing risks and the progression of conditions are not considered. Some other studies used group-based trajectory modelling or latent class growth analysis to categorise the participants according to the temporal patterns of longitudinal change of multimorbidity status based on the history of chronic conditions up to a baseline time.3 For example, Chen et al classified the subjects into mutually exclusive categories based on the speed of growth of conditions and estimated the risk of incident dementia using the standard Cox model. Methods based on grouping common trajectory patients have been used in other settings as well, for example, to examine the association between body mass index trajectories in childhood and the risk of cardiometabolic conditions in adulthood,27–29 or to study the trajectory of blood pressure and risk of cardiovascular diseases.30
The use of such methods in conjunction with the standard regression models results in the same shortcomings. This is because if the clustering or trajectory analysis had been performed at a different time point, the distribution of the trajectory categories could have changed. The aim of this paper was not to compare different algorithms to classify the subjects based on the history of conditions up to a time point, but rather to demonstrate the bias in estimation of risk when competing risks and progression of conditions are not considered.
Joint modelling of longitudinal and survival data is an alternative approach with a longitudinal component and a survival component.40 Usually, the longitudinal component examines the trajectory of a given continuous risk factor as a function of other risk factors using linear mixed regression models. For example, the method has been used to estimate the association between viral load and survival of HIV/AIDS patients.41 The authors examined the association between some risk factors and the trajectory of viral load (as a longitudinal outcome) using linear mixed regression, as well as the association of the same risk factors and the estimated trajectory of viral load with the survival of the patients (as the survival outcome). The application of such methods was beyond the scope of this study.
Another advanced method is latent transition analysis (LTA) in which several variables are used to extract latent classes and estimate latent transition probabilities from one latent class to another. Similar to the multistate models, the LTA relies on the Markov assumption.42 We did not apply this method as the patterns of cardiometabolic conditions were observable.
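The Markov assumption mentioned above can be sketched in a discrete-time setting: the probability of moving to the next state depends only on the current state, so multi-step probabilities follow from repeated multiplication of a one-step transition matrix. The states and all probabilities below are hypothetical values chosen for illustration and are not estimates from this study; the multistate models in this paper are fitted in continuous time.

```python
# Hypothetical one-step (e.g. per survey wave) transition probabilities.
# Rows = current state, columns = next state; dementia and death are absorbing.
STATES = ["no condition", "cardiometabolic", "dementia", "death"]
P = [
    [0.85, 0.08, 0.02, 0.05],
    [0.00, 0.80, 0.08, 0.12],
    [0.00, 0.00, 1.00, 0.00],
    [0.00, 0.00, 0.00, 1.00],
]


def matmul(a, b):
    """Plain-Python matrix product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def k_step(p, k):
    """k-step transition matrix under the Markov assumption (p to the power k)."""
    n = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, p)
    return result


P2 = k_step(P, 2)
P10 = k_step(P, 10)
dementia = STATES.index("dementia")
print(f"P(dementia within 2 steps | no condition)  = {P2[0][dementia]:.4f}")
print(f"P(dementia within 10 steps | no condition) = {P10[0][dementia]:.4f}")
```

Because dementia is absorbing in this sketch, the probability of occupying that state after k steps is also the cumulative probability of having developed dementia by step k.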
Limitations
Our study had some limitations. First, differences in the periods when data were available for various conditions (online supplemental table S1) mean that the HRs and the cumulative incidence may have been underestimated. Second, the diagnosis of cardiometabolic conditions and dementia was based on self-report (from ALSWH surveys) or administrative health records. Therefore, some cases of cardiometabolic conditions or dementia may not have been diagnosed. For example, estimating the prevalence of dementia in 1921–1926 ALSWH women using the capture-recapture methodology found an underestimation of 5.6 percentage points in prevalence (20.4% in linked data vs 26.0% in the capture-recapture method).43 It should be noted that these limitations were consistent across all seven methods evaluated in this study. Therefore, the comparative analysis and the conclusions about the methods remain valid, as all methods were subject to the same constraints.
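The idea behind the capture-recapture methodology can be sketched with the simplest two-source (Lincoln-Petersen) estimator. The cited study used multiple linked sources, so this is a simplification; the counts and the function name below are hypothetical, and the estimator assumes that the two sources identify cases independently.

```python
def lincoln_petersen(n_source_a, n_source_b, n_both):
    """Two-source capture-recapture estimate of the total number of cases.

    Assumes independent sources and a closed population (illustrative only).
    """
    if n_both <= 0:
        raise ValueError("At least one case must appear in both sources.")
    return n_source_a * n_source_b / n_both


# Hypothetical counts of dementia cases identified by two data sources.
n_a, n_b, n_ab = 120, 90, 60

observed = n_a + n_b - n_ab                     # cases seen in at least one source
estimated = lincoln_petersen(n_a, n_b, n_ab)    # estimated true number of cases
missed_fraction = 1 - observed / estimated      # share of cases missed by both sources

print(observed, estimated, round(missed_fraction, 3))  # 150 180.0 0.167
```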
Another limitation of this study, which may also be a strength, was that the sample comprised only women. It has been shown that there are gender differences in the risk factors for cardiometabolic conditions44 and in the patterns of cardiometabolic multimorbidity.45 Therefore, the estimates should not be generalised to the whole population.
In this research, left censoring (when an event occurs before the start of the study) and left truncation (when subjects at risk prior to baseline do not remain observable until the start of follow-up) may not be an issue because in the first ALSWH survey, women were asked ‘Have you ever been diagnosed with …?’. Therefore, events that had occurred before the ALSWH surveys were identified, and it was assumed that women were at risk from the starting date of the coverage period of the data sources used to extract records of conditions. It should be acknowledged that the ALSWH survey experienced moderate non-response rates during each follow-up. As summarised in online supplemental table S1, multiple sources of data were used to identify reports of conditions. Therefore, even if women were no longer completing the ALSWH survey, the incidence of hypertension, diabetes, stroke or dementia could be identified using other sources.
Conclusion
There are tutorials on the utility of competing risks and multistate models.17 46 Moreover, previous studies that applied multistate models mainly emphasised the risk factors that were associated with disease progression47–49 or cumulative incidence of being in each state.50 To our knowledge, this is the first study that has compared the performance of seven alternative methods to demonstrate the flexibility of multistate and time-varying models to investigate the progression from cardiometabolic conditions to dementia. In conclusion, the KM and Cox regression models do not consider the competing risks or the progression of exposure variables. On the other hand, multistate and competing risk models with time-varying covariates are flexible tools that make the most of the data by allowing for the transitions between conditions over the study period. These models may provide further insight into the progression patterns of multiple chronic conditions.
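To make the idea of a time-varying pattern concrete, the sketch below splits one woman's follow-up into (start, stop] intervals, updating her cardiometabolic pattern whenever a new condition is reported. The function name, field names, status coding (0 = censored, 1 = dementia, 2 = death) and example values are hypothetical; the data layouts actually used for the seven models are described in the online supplemental materials.

```python
def split_follow_up(exit_time, event, onsets):
    """Return counting-process rows for one subject.

    exit_time: time of dementia, death or censoring (years since baseline)
    event:     0 = censored, 1 = dementia, 2 = death (assumed coding)
    onsets:    dict mapping condition name to onset time, or None if never reported
    """
    # Only onsets strictly inside the follow-up window create a new interval.
    cut_points = sorted({t for t in onsets.values()
                         if t is not None and 0 < t < exit_time})
    rows, start = [], 0.0
    for stop in cut_points + [exit_time]:
        # Conditions present at the start of this interval define the pattern.
        current = sorted(c for c, t in onsets.items()
                         if t is not None and t <= start)
        rows.append({
            "start": start,
            "stop": stop,
            "pattern": "+".join(current) or "none",
            # The event is attached only to the final interval.
            "event": event if stop == exit_time else 0,
        })
        start = stop
    return rows


# Hypothetical subject: diabetes at year 3, stroke at year 8, dementia at year 12.
rows = split_follow_up(
    exit_time=12.0,
    event=1,
    onsets={"diabetes": 3.0, "heart disease": None, "stroke": 8.0},
)
for row in rows:
    print(row)
```

With the pattern fixed at baseline, this subject would contribute all 12 years to the "none" group; in the split layout only the first 3 years are attributed to that group, and the dementia event is attributed to the "diabetes+stroke" pattern.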
Contributors: All authors (MRB, AD and GM) contributed to the design of the study. MRB undertook the statistical analyses, and all authors interpreted the results. All authors (MRB, AD and GM) read and revised the manuscript and accepted the final version of the manuscript and were accountable for all aspects of the work. MRB is the guarantor for this work and accepts full responsibility for the finished work.
Funding: GM and MRB are supported by an NHMRC Investigator grant (APP2009577). The funding bodies played no role in the design of the study; the collection, analysis and interpretation of data; or the writing of the manuscript.
Competing interests: None declared.
Patient and public involvement: Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.
Provenance and peer review: Not commissioned; externally peer reviewed.
Supplemental material: This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.
Data availability statement
No data are available. The data that support the findings of this study are not openly available to protect the privacy of study participants. Sample codes as well as data layouts required to fit each of the seven models are provided in online supplemental materials.
Ethics statements
Patient consent for publication:
Not applicable.
Ethics approval:
This study involves human participants and the ALSWH has been granted ethics clearance by the Human Research Ethics Committees at the University of Newcastle (ref no. H-0760-0795) and the University of Queensland (ref no. 2004000224). Details of ALSWH ethical approvals for linked administrative datasets are provided here: https://alswh.org.au/alswh-hrec-approvals/. This study has been approved by the University of Queensland (Project ID: A1392). Informed consent was obtained from participants for each survey. All participants consented to data linkage. All methods were carried out in accordance with relevant guidelines and regulations. The authors declare that this work was in compliance with ethical standards.
Acknowledgements
We acknowledge that artificial intelligence assisted technologies were not used in this manuscript. The research on which this paper is based was conducted as part of the Australian Longitudinal Study on Women’s Health by the University of Queensland and the University of Newcastle. We are grateful to the Australian Government Department of Health and Aged Care for funding and to the women who provided the survey data. The authors also acknowledge: The Australian Government Department of Health and Aged Care for providing MBS and PBS data, and Aged Care data; and the Australian Institute of Health and Welfare (AIHW) as the integrating authority. The assistance of the Data Linkage Unit at the Australian Institute of Health and Welfare (AIHW) for undertaking the data linkage to the National Death Index (NDI). The Centre for Health Record Linkage (CHeReL), NSW Ministry of Health and ACT Health, for the NSW Admitted Patients, Emergency Department, and the ACT Admitted Patient Care, Emergency Department. Queensland Health as the source for Queensland Hospital Admitted Patient, and Emergency Collections; and the Statistical Analysis and Linkage Unit (Queensland Health) for the provision of data linkage. The Department of Health Western Australia, including the Data Linkage Service (WA), and the WA Hospital Morbidity and Emergency Department Data Collections. SA NT DataLink, SA Health and Northern Territory Department of Health, for the SA Public Hospital Separations, SA Public Hospital Emergency Department, NT Public Hospital Inpatient Activity, and NT Public Hospital Emergency Department. The Department of Health Tasmania, and the Tasmanian Data Linkage Unit, for the Public Hospital Admitted Patient Episodes, and Tasmanian Emergency Department Presentations. 
Victorian Department of Health as the source of the Victorian Admitted Episodes Dataset and the Victorian Emergency Minimum Dataset; and the Centre for Victorian Data Linkage (Victorian Department of Health) for the provision of data linkage.
References
Cassell A, Edwards D, Harshfield A, et al. The epidemiology of multimorbidity in primary care: a retrospective cohort study. Br J Gen Pract 2018; 68:e245–51. doi:10.3399/bjgp18X695465
Xu X, Mishra GD, Dobson AJ, et al. Progression of diabetes, heart disease, and stroke multimorbidity in middle-aged women: A 20-year cohort study. PLoS Med 2018; 15. doi:10.1371/journal.pmed.1002516
Chen H, Zhou Y, Huang L, et al. Multimorbidity burden and developmental trajectory in relation to later-life dementia: A prospective study. Alzheimers Dement 2023; 19:2024–33. doi:10.1002/alz.12840
Chen Y, Zhang Y, Li S, et al. Cardiometabolic diseases, polygenic risk score, APOE genotype, and risk of incident dementia: A population-based prospective cohort study. Arch Gerontol Geriatr 2023; 105:104853. doi:10.1016/j.archger.2022.104853
Khondoker M, Macgregor A, Bachmann MO, et al. Multimorbidity pattern and risk of dementia in later life: an 11-year follow-up study using a large community cohort and linked electronic health records. J Epidemiol Community Health 2023; 77:285–92. doi:10.1136/jech-2022-220034
Veronese N, Koyanagi A, Dominguez LJ, et al. Multimorbidity increases the risk of dementia: a 15 year follow-up of the SHARE study. Age Ageing 2023; 52. doi:10.1093/ageing/afad052
Wang Z, Marseglia A, Shang Y, et al. Leisure activity and social integration mitigate the risk of dementia related to cardiometabolic diseases: A population‐based longitudinal study. Alzheimer's & Dementia 2020; 16:316–25. doi:10.1016/j.jalz.2019.09.003
Austin PC, Fine JP. Practical recommendations for reporting Fine-Gray model analyses for competing risk data. Stat Med 2017; 36:4391–400. doi:10.1002/sim.7501
Cheng G, Huang C, Deng H, et al. Diabetes as a risk factor for dementia and mild cognitive impairment: a meta-analysis of longitudinal studies. Intern Med J 2012; 42:484–91. doi:10.1111/j.1445-5994.2012.02758.x
Wolters FJ, Segufa RA, Darweesh SKL, et al. Coronary heart disease, heart failure, and the risk of dementia: A systematic review and meta-analysis. Alz Dement 2018; 14:1493–504. doi:10.1016/j.jalz.2018.01.007
Noordzij M, Leffondré K, van Stralen KJ, et al. When do we need competing risks methods for survival analysis in nephrology? Nephrol Dial Transplant 2013; 28:2670–7. doi:10.1093/ndt/gft355
Putter H, Fiocco M, Geskus RB, et al. Tutorial in biostatistics: competing risks and multi-state models. Stat Med 2007; 26:2389–430. doi:10.1002/sim.2712
de Wreede LC, Fiocco M, Putter H, et al. mstate: an R package for the analysis of competing risks and multi-state models. J Stat Softw 2011; 38:1–30. doi:10.18637/jss.v038.i07
Therneau T, Crowson C, Atkinson E, et al. Using time dependent covariates and time dependent coefficients in the Cox model. 2023.
Lee C, Dobson AJ, Brown WJ, et al. Cohort Profile: the Australian Longitudinal Study on Women’s Health. Int J Epidemiol 2005; 34:987–91. doi:10.1093/ije/dyi098
Dobson AJ, Hockey R, Brown WJ, et al. Cohort Profile Update: Australian Longitudinal Study on Women’s Health. Int J Epidemiol 2015; 44:1547. doi:10.1093/ije/dyv110
Wickham H. ggplot2: elegant graphics for data analysis. Springer-Verlag New York 2016.
Sjoberg D, Fei T. tidycmprsk: competing risks estimation. 2023.
Deschênes SS, Burns RJ, Graham E, et al. Prediabetes, depressive and anxiety symptoms, and risk of type 2 diabetes: A community-based cohort study. J Psychosom Res 2016; 89:85–90. doi:10.1016/j.jpsychores.2016.08.011
Aris IM, Rifas-Shiman SL, Li L-J, et al. Association of Weight for Length vs Body Mass Index During the First 2 Years of Life With Cardiometabolic Risk in Early Adolescence. JAMA Netw Open 2018; 1. doi:10.1001/jamanetworkopen.2018.2460
Wibaek R, Vistisen D, Girma T, et al. Body mass index trajectories in early childhood in relation to cardiometabolic risk profile and body composition at 5 years of age. Am J Clin Nutr 2019; 110:1175–85. doi:10.1093/ajcn/nqz170
Yuan Y, Chu C, Zheng W-L, et al. Body Mass Index Trajectories in Early Life Is Predictive of Cardiometabolic Risk. J Pediatr 2020; 219:31–7. doi:10.1016/j.jpeds.2019.12.060
Li F, Lin Q, Li M, et al. The Association between Blood Pressure Trajectories and Risk of Cardiovascular Diseases among Non-Hypertensive Chinese Population: A Population-Based Cohort Study. Int J Environ Res Public Health 2021; 18. doi:10.3390/ijerph18062909
Feakins BG, McFadden EC, Farmer AJ, et al. Standard and competing risk analysis of the effect of albuminuria on cardiovascular and cancer mortality in patients with type 2 diabetes mellitus. Diagn Progn Res 2018; 2. doi:10.1186/s41512-018-0035-4
Hageman SHJ, Dorresteijn JAN, Pennells L, et al. The relevance of competing risk adjustment in cardiovascular risk prediction models for clinical practice. Eur J Prev Cardiol 2023; 30:1741–7. doi:10.1093/eurjpc/zwad202
Wolbers M, Koller MT, Witteman JCM, et al. Prognostic models with competing risks: methods and application to coronary risk prediction. Epidemiology 2009; 20:555–61. doi:10.1097/EDE.0b013e3181a39056
Dove A, Guo J, Marseglia A, et al. Cardiometabolic multimorbidity and incident dementia: the Swedish twin registry. Eur Heart J 2023; 44:573–82. doi:10.1093/eurheartj/ehac744
Dove A, Marseglia A, Shang Y, et al. Cardiometabolic multimorbidity accelerates cognitive decline and progression to dementia in older adults. Alzheimer's & Dementia 2021; 17. doi:10.1002/alz.050473
Tai XY, Veldsman M, Lyall DM, et al. Cardiometabolic multimorbidity, genetic risk, and dementia: a prospective cohort study. Lancet Healthy Longev 2022; 3:e428–36. doi:10.1016/S2666-7568(22)00117-9
Fayosse A, Nguyen D-P, Dugravot A, et al. Risk prediction models for dementia: role of age and cardiometabolic risk factors. BMC Med 2020; 18. doi:10.1186/s12916-020-01578-x
Amorim L, Cai J. Modelling recurrent events: a tutorial for analysis in epidemiology. Int J Epidemiol 2015; 44:324–33. doi:10.1093/ije/dyu222
Austin PC, Latouche A, Fine JP, et al. A review of the use of time-varying covariates in the Fine-Gray subdistribution hazard competing risk regression model. Stat Med 2020; 39:103–13. doi:10.1002/sim.8399
Ibrahim JG, Chu H, Chen LM, et al. Basic concepts and methods for joint models of longitudinal and survival data. J Clin Oncol 2010; 28:2796–801. doi:10.1200/JCO.2009.25.0654
Luvanda HB, Mukyanuzi EN, Akarro RRJ, et al. A joint survival model for estimating the association between viral load outcome and survival time to death among HIV/AIDS patients attending health care and treatment centers in Tanzania. BMC Public Health 2023; 23. doi:10.1186/s12889-023-16977-x
Nylund-Gibson K, Garber AC, Carter DB, et al. Ten frequently asked questions about latent transition analysis. Psychol Methods 2023; 28:284–300. doi:10.1037/met0000486
Waller M, Mishra GD, Dobson AJ, et al. Estimating the prevalence of dementia using multiple linked administrative health records and capture-recapture methodology. Emerg Themes Epidemiol 2017; 14. doi:10.1186/s12982-017-0057-3
Zhang D, Tang X, Shen P, et al. Multimorbidity of cardiometabolic diseases: prevalence and risk for mortality from one million Chinese adults in a longitudinal cohort study. BMJ Open 2019; 9. doi:10.1136/bmjopen-2018-024476
Andersen PK, Geskus RB, de Witte T, et al. Competing risks in epidemiology: possibilities and pitfalls. Int J Epidemiol 2012; 41:861–70. doi:10.1093/ije/dyr213
Hazewinkel A-D, Lancia C, Anninga J, et al. Disease progression in osteosarcoma: a multistate model for the EURAMOS-1 (European and American Osteosarcoma Study) randomised clinical trial. BMJ Open 2022; 12. doi:10.1136/bmjopen-2021-053083
Neumann JT, Thao LTP, Callander E, et al. A multistate model of health transitions in older people: a secondary analysis of ASPREE clinical trial data. Lancet Healthy Longev 2022; 3:e89–97. doi:10.1016/s2666-7568(21)00308-1
Siriwardhana C, Lim E, Davis J, et al. Progression of diabetes, ischemic heart disease, and chronic kidney disease in a three chronic conditions multistate model. BMC Public Health 2018; 18. doi:10.1186/s12889-018-5688-y