Original Research

Assessing disparity in the distribution of HIV and sexually transmitted infections in Australia: a retrospective cross-sectional study using Gini coefficients

Abstract

Introduction The risk of HIV and sexually transmitted infections (STIs) varies substantially across population groups in Australia. We examined this disparity in HIV/STI distribution using Gini coefficients, where scores closer to one indicate greater disparity.

Methods We used demographic and sexual behaviour data from the Melbourne Sexual Health Centre, between 2015 and 2018. We examined 88 642 HIV consultations, 92 291 syphilis consultations, 97 473 gonorrhoea consultations and 115 845 chlamydia consultations. We applied a machine learning-based risk assessment tool, MySTIRisk, to determine the risk scores. Based on individuals’ risk scores and HIV/STIs diagnoses, we calculated the Gini coefficients for these infections for different subgroups.

Results Overall, Gini coefficients were highest for syphilis (0.60, 95% CI 0.57 to 0.64) followed by HIV (0.57, 95% CI 0.52 to 0.62), gonorrhoea (0.38, 95% CI 0.36 to 0.42) and chlamydia (0.31, 95% CI 0.28 to 0.35). Gay, bisexual and other men who have sex with men (GBMSM) had lower Gini coefficients than heterosexual men or women for HIV (0.54 vs 0.94 vs 0.96), syphilis (0.50 vs 0.86 vs 0.93), gonorrhoea (0.24 vs 0.57 vs 0.57) and chlamydia (0.23 vs 0.42 vs 0.40), respectively. The Gini coefficient was lower among those aged 25–34 years than in other age groups for HIV (0.66 vs 0.83–0.90) and gonorrhoea (0.38 vs 0.43–0.47). For syphilis, the oldest age group (≥45 years) had a lower Gini coefficient than those aged 18–24 years (0.61 vs 0.70).

Conclusions Our study demonstrated that HIV/STIs are more evenly distributed among GBMSM, suggesting that widely disseminated interventions are appropriate for GBMSM communities. In contrast, interventions for heterosexual men and women should be more targeted at individuals with higher risk scores.

What is already known on this topic

  • Disparities exist in the distribution of HIV and sexually transmitted infections (STIs) among different population groups.

What this study adds

  • Our analysis provides specific insights into the distribution of HIV and STIs among different groups. Notably, we found higher Gini coefficients for heterosexual men and women, highlighting a concentrated risk, especially for HIV and syphilis. Conversely, lower Gini coefficients for gay, bisexual and other men who have sex with men (GBMSM) indicate that infections, particularly chlamydia and gonorrhoea, are widespread among GBMSM regardless of their risk scores.

How this study might affect research, practice or policy

  • Our findings indicate that interventions will be most cost-effective if they are focused on small, high-risk groups of heterosexuals. In contrast, interventions for GBMSM should involve the entire GBMSM community, particularly for chlamydia and gonorrhoea, where risk scores appear to play a minimal role in identifying those most likely to be infected.

Introduction

The WHO declared the target of ending the pandemic of sexually transmitted infections (STIs) by 2030. This is an enormous task considering that the WHO projected that in 2020 more than one million STIs were contracted each day, leading to a total of 374 million new infections with one of the four curable STIs: chlamydia, gonorrhoea, syphilis and trichomoniasis.1 Australia has observed a rapid rise in the incidence of STIs over the last decade, particularly in gay and bisexual men.2–5

In response to the rise in STIs, governments have implemented interventions to control these infections.6 7 A key part of implementing these interventions is deciding how best to target populations at particularly high risk for these infections.8–10 The effectiveness of interventions for HIV/STIs depends on the distribution of these infections within a population. If infections are widespread across all risk levels, yet interventions only target those at highest risk, these efforts may not yield desired results. Conversely, if infections primarily affect high-risk individuals, but interventions are distributed across all risk groups, resources might not be used effectively.

Gini coefficients have primarily been used to measure economic inequality in countries. More recently, however, several studies in Canada, the UK and the USA have used Gini coefficients to investigate inequalities in the geographical distribution of STIs to facilitate geographically specific interventions.11–14 A limited number of recent studies have used Gini coefficients to measure the disparity in the distribution of STIs by risk15 16 and by geographical location.17 Gsteiger et al15 compared the disparity of chlamydia by risk over time using two population datasets in the UK and reported that the Gini coefficients for chlamydia among females had not changed over time: 0.30 (1999–2001) and 0.33 (2010–2012).15 van Wees et al16 calculated the Gini coefficients for chlamydia, gonorrhoea and syphilis among men who have sex with men (MSM) before and after the implementation of pre-exposure prophylaxis (PrEP) for HIV in the Netherlands. van Wees et al demonstrated that the Gini coefficients had increased for chlamydia (from 0.37 to 0.43) and syphilis (from 0.50 to 0.66) after PrEP became available, while the Gini coefficient for gonorrhoea remained stable.16 To date, however, no study has assessed the Gini index within different risk populations in the same study to compare different infections within different populations, or used a composite measure of risk.15 16 These studies estimated the risk of HIV/STIs based on only up to five risk measures.15 16 We have recently developed machine-learning approaches to calculating a composite risk score based on many individual risks.18–20

We aimed to estimate the disparity in the distribution of four infections (HIV, chlamydia, gonorrhoea and syphilis) across different population groups (GBMSM, heterosexual men and women) and different age groups using a composite measure of risk.

Methods

The Melbourne Sexual Health Centre (MSHC) is Australia’s largest public sexual health clinic and offers free HIV/STI services to the general public. At MSHC, the individuals’ demographic and sexual behavioural data are recorded through computer-assisted self-reported interviews at the initial visit and any subsequent visit that is at least 3 months apart. Diagnoses are recorded in the clinic database using predetermined data fields.19

Data

We conducted a retrospective cross-sectional study using data from the MSHC. We extracted demographic and behavioural data from the medical record for individuals attending between 2 March 2015 and 31 December 2018. The datasets for each infection included consultations where individuals were tested for specific infections at that consultation. The HIV dataset included a total of 88 642 consultations, the syphilis dataset included 92 291 consultations, the gonorrhoea dataset included 97 473 consultations and the chlamydia dataset had 115 845 consultations.20

Estimating the risk scores of HIV/STIs

We applied MySTIRisk, a risk prediction tool developed in a previous study, to the dataset.20 MySTIRisk is a machine learning-based risk assessment tool that estimates a risk score for each infection based on the predictors; the score ranges from 0 to 1, with 1 indicating the highest risk.20 During the development of MySTIRisk, rigorous training and testing procedures were conducted. For the development process (training and testing) of the tool, we used clinic consultations tested for HIV/STIs at the MSHC between 2015 and 2018. The results demonstrated that the tool performed at an acceptable to excellent level, as indicated by the area under the curve (AUC) values: HIV (0.78), syphilis (0.84), gonorrhoea (0.78) and chlamydia (0.70).20

To ensure the reliability and generalisability of the tool, external validation was performed using two separate datasets. The first external validation dataset, from 2019, showed consistent and stable performance with AUC values of HIV (0.79), syphilis (0.85), gonorrhoea (0.81) and chlamydia (0.69). The second external validation dataset, covering the years 2020–2021, also demonstrated reliable performance with AUC values of HIV (0.71), syphilis (0.84), gonorrhoea (0.79) and chlamydia (0.69).
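To make the AUC values above concrete, the following minimal sketch computes an AUC from risk scores and diagnoses using the pairwise (Mann-Whitney) interpretation: the probability that a randomly chosen positive consultation has a higher score than a randomly chosen negative one. It is an illustration only, not the evaluation code used for MySTIRisk; the function name `auc` and the toy data are assumptions.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count half).

    scores: risk scores in [0, 1]; labels: 1 = infection diagnosed, 0 = not.
    Quadratic in the number of consultations, so suitable only for small examples.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: three of the four positive/negative pairs are ranked correctly.
example_auc = auc([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1])
```

An AUC of 0.5 corresponds to a score that ranks consultations no better than chance, and 1.0 to a score that ranks every positive above every negative.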

The most important predictors in our models were gender, age, country of birth, men who reported having sex with men, presence of STI symptoms at the time of visit, the number of casual sexual partners, condom use, the last time of drug injection (if present), past STIs, contact with someone diagnosed with STIs, and having sex with someone outside Australia and New Zealand. Since infection risk scores generated by our models were not equivalent to the probability of infection, we calibrated and fitted the data using a logistic function to provide the prevalence for each infection for the clients. MSHC is working on deploying the MySTIRisk tool as a public-facing web app21 that will allow people to determine their risks and encourage early STI testing and diagnosis. We use the infection risk scores in the study to calculate Gini coefficients.
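The calibration step described above can be pictured as fitting a logistic curve that maps a raw risk score to a probability of diagnosis. The sketch below fits an intercept and slope by Newton's method (in the style of Platt scaling). The text does not give the exact calibration procedure, so this particular form, the function names and the toy data are all assumptions.

```python
import math


def fit_logistic(scores, labels, iterations=25):
    """Fit p = 1 / (1 + exp(-(a + b * score))) by maximum likelihood (Newton's method).

    Assumes the data are not perfectly separable by score; otherwise the
    maximum-likelihood estimate does not exist and the updates diverge.
    """
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(-(a + b * s)))
            w = p * (1.0 - p)
            g0 += y - p            # gradient with respect to the intercept
            g1 += (y - p) * s      # gradient with respect to the slope
            h00 += w
            h01 += w * s
            h11 += w * s * s
        det = h00 * h11 - h01 * h01
        if det <= 1e-12:
            break                  # information matrix is numerically singular
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def calibrated_probability(score, a, b):
    """Map a raw risk score to a calibrated probability."""
    return 1.0 / (1.0 + math.exp(-(a + b * score)))


# Hypothetical toy data: higher scores are more often positive, with overlap.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 1, 0, 0, 1, 0, 1, 1]
a_hat, b_hat = fit_logistic(toy_scores, toy_labels)
```

A useful property of this fit is that the calibrated probabilities sum to the observed number of positives, so the mean calibrated probability matches the observed positivity in the fitting data.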

Estimating Gini coefficients

Based on the above risk scores, we used Lorenz curves to plot the cumulative proportion of HIV/STIs as a function of the cumulative proportion of the clinic consultations ranked from lowest to highest risk scores that we generated from MySTIRisk. By definition, the Gini coefficient is a single number that measures the disparity of the distribution of HIV/STIs. The coefficient is calculated as the area beneath the line of perfect equality minus the area beneath the Lorenz curve, divided by the area beneath the line of perfect equality.22 A Gini coefficient of 0 indicates perfect equality, with a homogeneous distribution of HIV/STIs over the infection risk scores of the population. In contrast, Gini coefficients closer to 1 indicate that STI diagnoses are concentrated in parts of the population with higher infection risk scores.
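The definition above can be sketched in a few lines. The example below ranks consultations by risk score, builds the Lorenz curve of cumulative diagnoses, and computes the Gini coefficient as the area between the line of equality and the Lorenz curve, divided by the area under the line of equality (0.5). The study's analyses were run in MATLAB; this Python version, its function names and its toy data are illustrative assumptions, and the trapezoidal rule is one reasonable way to integrate the curve.

```python
def lorenz_curve(risk_scores, diagnoses):
    """Return (x, y): cumulative share of consultations vs cumulative share of diagnoses.

    Consultations are ordered from lowest to highest risk score.
    diagnoses: 1 if the infection was diagnosed at that consultation, else 0.
    """
    total = sum(diagnoses)
    if total == 0:
        raise ValueError("Lorenz curve is undefined without any diagnoses")
    order = sorted(range(len(risk_scores)), key=lambda i: risk_scores[i])
    n = len(order)
    xs, ys = [0.0], [0.0]
    running = 0
    for rank, i in enumerate(order, start=1):
        running += diagnoses[i]
        xs.append(rank / n)
        ys.append(running / total)
    return xs, ys


def gini_coefficient(risk_scores, diagnoses):
    """(area under equality line - area under Lorenz curve) / area under equality line."""
    xs, ys = lorenz_curve(risk_scores, diagnoses)
    # Trapezoidal rule for the area under the Lorenz curve.
    area = sum((xs[k] - xs[k - 1]) * (ys[k] + ys[k - 1]) / 2.0
               for k in range(1, len(xs)))
    return (0.5 - area) / 0.5


# Hypothetical toy data: the single diagnosis falls on the highest-risk consultation.
concentrated = gini_coefficient([0.1, 0.2, 0.3, 0.4], [0, 0, 0, 1])
# Diagnoses spread evenly over all risk scores.
even = gini_coefficient([0.1, 0.2, 0.3, 0.4], [1, 1, 1, 1])
```

In the first toy case the Lorenz curve stays at zero until the last consultation, giving a Gini coefficient of 0.75; in the second it follows the diagonal, giving 0. Tied risk scores are ordered arbitrarily here, which can shift the curve slightly when ties are common.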

We computed Lorenz curves and estimated Gini coefficients along with their 95% bootstrap CIs for HIV, syphilis, gonorrhoea and chlamydia infections for each risk group (heterosexual men, women and GBMSM). The bootstrap method, a resampling technique where each sample is generated through random selection with replacement from the original dataset, was employed to account for variability and provide robustness in the estimation of Gini coefficients.11 12 15 We calculated the 2.5th and 97.5th percentiles from 5000 bootstrapped Gini coefficients to form the 95% bootstrap CIs. This methodology allowed us to ascertain the precision of our estimates and understand the distribution of these infections across different risk groups. We also conducted similar Gini coefficient analyses for various age groups (18–24 years, 25–34 years, 35–44 years, and 45 years and above) and compared the corresponding Gini coefficients and the positivity percentages across different subsets of the population.
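The percentile bootstrap described above can be sketched as follows: resample consultations with replacement, recompute the Gini coefficient for each resample, and take the 2.5th and 97.5th percentiles of the resulting values. This is a minimal illustration rather than the study's MATLAB code; the function names, the fixed seed, the simple sorted-index percentile rule and the decision to redraw resamples with no diagnoses are assumptions.

```python
import random


def gini(risk_scores, diagnoses):
    """Gini coefficient of diagnoses over consultations ranked by risk score."""
    order = sorted(range(len(risk_scores)), key=lambda i: risk_scores[i])
    total = sum(diagnoses)
    n = len(order)
    area, previous, running = 0.0, 0.0, 0
    for i in order:
        running += diagnoses[i]
        current = running / total
        area += (previous + current) / (2.0 * n)  # trapezoid of width 1/n
        previous = current
    return 1.0 - 2.0 * area


def bootstrap_gini_ci(risk_scores, diagnoses, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the Gini coefficient."""
    if sum(diagnoses) == 0:
        raise ValueError("need at least one diagnosis")
    rng = random.Random(seed)
    n = len(risk_scores)
    estimates = []
    while len(estimates) < n_boot:
        # Draw n consultations with replacement from the original data.
        idx = [rng.randrange(n) for _ in range(n)]
        resampled_dx = [diagnoses[i] for i in idx]
        if sum(resampled_dx) == 0:
            continue  # Gini undefined for this resample; draw again
        estimates.append(gini([risk_scores[i] for i in idx], resampled_dx))
    estimates.sort()
    lower = estimates[int((alpha / 2.0) * (n_boot - 1))]
    upper = estimates[int((1.0 - alpha / 2.0) * (n_boot - 1))]
    return lower, upper
```

Calling `bootstrap_gini_ci(scores, diagnoses)` with the default `n_boot=5000` mirrors the number of replicates reported above. Because the unit of resampling here is the consultation, repeat visits by the same person are treated as independent draws.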

We conducted these analyses using MATLAB R2022a.

Patient and public involvement

Given the retrospective nature of this study, patients and the public were not directly involved in research design, recruitment or conduct. However, the findings, particularly regarding the disparity of HIV/STIs as measured by Gini coefficients, contribute to the understanding of infection distribution, potentially informing public health policies and interventions. The results will be disseminated through academic publication, thereby contributing to the wider body of knowledge in the field.

Results

Demographic characteristics of study participants

For each of the four infections, gay, bisexual and other men who have sex with men (GBMSM) accounted for the largest share (43%–52%) of the clinic consultations, followed by women (29%–33%) and heterosexual men (16%–23%) (table 1). Younger age groups (18–34 years) accounted for over 70% of clinic consultations. More than 55% of consultations were from individuals born overseas, and the remainder were from individuals born in Australia or New Zealand. The sexual risk predictors for each of the four infections are displayed in table 1.

Table 1
Characteristics of clinic consultations in individual datasets

Infection positivity in study participants

As shown in table 2, the positivity (the positive percentage of the HIV/STI diagnosis) for chlamydia was 8.82% (95% CI 8.66% to 8.98%), for gonorrhoea was 7.78% (95% CI 7.61% to 7.95%), for syphilis was 1.94% (95% CI 1.85% to 2.03%) and HIV was 0.24% (95% CI 0.21% to 0.28%). The positivity was highest among GBMSM for all four infections, while women had the lowest positivity across all HIV/STIs.
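As a worked check on the positivity figures above, the sketch below computes a normal-approximation (Wald) 95% CI for a proportion. The text does not state which interval method was used, so the Wald interval is an assumption; it reproduces the reported chlamydia interval from the rounded positivity (8.82%) and the number of chlamydia consultations (115 845), but it may differ slightly for rarer infections such as HIV.

```python
import math


def wald_ci(proportion, n, z=1.96):
    """Normal-approximation confidence interval for a proportion.

    proportion: observed positivity as a fraction (for example 0.0882).
    n: number of consultations tested. z=1.96 gives an approximate 95% interval.
    """
    standard_error = math.sqrt(proportion * (1.0 - proportion) / n)
    return proportion - z * standard_error, proportion + z * standard_error


# Chlamydia: 8.82% positivity over 115 845 consultations (figures from the text).
ct_low, ct_high = wald_ci(0.0882, 115845)
print(f"{ct_low * 100:.2f}% to {ct_high * 100:.2f}%")  # prints "8.66% to 8.98%"
```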

Table 2
Positivity (the positive percentage of the HIV/STI diagnoses) with corresponding 95% CI per total consultations

The risk scores among STI-positive participants

Among STI-positive participants, the median risk scores and IQRs were 0.76 (0.48–0.88) for the syphilis dataset, 0.66 (0.54–0.77) for the gonorrhoea dataset and 0.56 (0.44–0.68) for the chlamydia dataset (see figure 1).

Figure 1

Risk scores’ distribution for STI-positive versus STI-negative consultations. ANOVA, analysis of variance; STI, sexually transmitted infection.

Gini coefficients for HIV/STIs, in all participants

The Lorenz curves for HIV, syphilis, gonorrhoea and chlamydia infections are shown in figure 2. The chlamydia curve was closest to the diagonal line, indicating that chlamydia diagnoses were more homogeneously distributed over the population regardless of their risk scores. In contrast, the syphilis curve was furthest away from the diagonal line, indicating that syphilis diagnoses were strongly associated with the individuals with higher risk scores. The Gini coefficients were lower for chlamydia (0.31, 95% CI 0.28 to 0.35) and gonorrhoea (0.38, 95% CI 0.36 to 0.42) and higher for HIV (0.57, 95% CI 0.52 to 0.62) and syphilis (0.60, 95% CI 0.57 to 0.64).

Figure 2

Lorenz curve and Gini coefficients for HIV/STIs. CT, chlamydia; NG, gonorrhoea; STIs, sexually transmitted infections.

Gini coefficients for HIV/STIs, by sexual orientation and age groups

Table 3 and figure 3 show the Gini coefficients in GBMSM, heterosexual men and women. The Gini coefficients for each infection were lowest for GBMSM, indicating that each infection was more homogeneously distributed in GBMSM populations than in heterosexual men or women. This difference in Gini coefficients between GBMSM and heterosexual men or women was more pronounced for HIV and syphilis than for gonorrhoea or chlamydia.

Figure 3

Lorenz curves showing the cumulative proportion of STI diagnoses among patients who visited the MSHC as a function of the cumulative proportion of all visits from lowest to highest risk score. In the figure, the diagonal line (black dashed line) denotes perfect equality, which means an equal dispersion of the infection across the population. Note: MSM, male and female refer to GBMSM, heterosexual men and women, respectively. GBMSM, gay, bisexual and other men who have sex with men; MSM, men who have sex with men; MSHC, Melbourne Sexual Health Centre.

Table 3
Gini coefficients with 95% bootstrap CIs for HIV, syphilis, gonorrhoea and chlamydia across different risk and age groups

The differences in Gini coefficients across age groups were less marked than the differences by risk group. Those aged 25–34 years had the lowest Gini coefficients for HIV (0.66, 95% CI 0.61 to 0.72), gonorrhoea (0.38, 95% CI 0.35 to 0.41) and chlamydia (0.31, 95% CI 0.28 to 0.35). For syphilis, those aged 45 years and above had the lowest Gini coefficient of 0.61 (95% CI 0.56 to 0.66) (see figure 3).

Discussion

We found substantial differences in the Gini coefficients across the four HIV/STIs and across the risk groups. The highest Gini coefficients were among heterosexual men and women for HIV and syphilis, suggesting these infections are concentrated among heterosexuals with higher infection risk scores. These risk scores were calculated using our machine learning-based risk assessment tool, MySTIRisk, which considers a variety of factors such as age, gender, country of birth, sexual behaviour and history of previous STIs. Therefore, a higher risk score suggests a higher probability of acquiring HIV and STIs. In contrast, the Gini coefficients were lower among GBMSM, particularly for chlamydia and gonorrhoea, suggesting these infections are widely distributed throughout the GBMSM population. The significance of these findings is that for HIV and syphilis, interventions will be most cost-effective if they are focused on small, high-risk groups of heterosexuals. In comparison, interventions for GBMSM should involve the entire GBMSM community, particularly for chlamydia and gonorrhoea, where risk scores appear to play a minimal role in identifying those most likely to be infected.

Our study demonstrated that Gini coefficients for syphilis and HIV were higher than those for chlamydia and gonorrhoea. In particular, the finding that syphilis demonstrated the highest disparity followed by gonorrhoea and chlamydia is consistent with another study conducted in the Netherlands.16 Specifically, higher Gini coefficients indicate that infections are more concentrated in a smaller group of individuals with higher risk scores. This finding implies that more targeted interventions are necessary for HIV and syphilis, while more widespread programmes would be suitable for chlamydia and gonorrhoea. Higher Gini coefficients for HIV and syphilis may be explained by the fact that these infections are more concentrated among certain subpopulations, such as men who have sex with men and sex workers, leading to a higher degree of inequality in disease prevalence.23–25 When a relatively small proportion of people is disproportionately affected by syphilis and HIV, the Gini coefficients are higher than for more evenly distributed STIs such as gonorrhoea and chlamydia. Additionally, HIV and syphilis are more severe and chronic conditions with a social and cultural stigma attached to them, which can prevent infected individuals from seeking testing and treatment and may potentially lead to higher Gini coefficients.26–28

In another study, Gsteiger et al15 used population data in the UK to calculate Gini coefficients to measure the distribution of chlamydia. The authors compared the Gini coefficients for STIs for two survey periods, Natsal-2 (1999–2001) and Natsal-3 (2010–2012), and found similar results for chlamydia among females: 0.30 (95% CI 0.12 to 0.50) in Natsal-2 and 0.33 (95% CI 0.18 to 0.49) in Natsal-3. However, our study found a slightly higher Gini coefficient of 0.40 (95% CI 0.35 to 0.45) for chlamydia in women, which several factors may explain. First, the difference in study population could have contributed to the discrepancy. The Natsal studies employed general population data, whereas our study focused on data from a sexual health clinic. It is well known that individuals attending such clinics may have a higher risk profile for STIs than individuals from the community, which can consequently influence the Gini coefficient. Second, our definition of the exposure variable differed from the Natsal studies. The latter used a single risk factor (the number of new opposite-sex partners in the previous year), while we adopted a broader approach. By employing machine-learning algorithms, we generated a composite risk score encompassing multiple exposure variables. This more comprehensive risk assessment likely impacted our Gini coefficient. Lastly, unlike the Natsal studies, which focused solely on heterosexual females, we included heterosexual women, bisexual women and women who have sex with women (WSW). It is also important to note that the Natsal studies provided limited subgroup analysis by age group and population group because they included only opposite-sex contacts in the sexual behaviour variables, owing to the limited proportion of individuals with same-sex partners in the dataset.

Our finding that the Gini coefficients were significantly lower in GBMSM is consistent with a previous study.16 van Wees et al16 developed a risk score calculator using multivariable logistic regression for HIV/STIs and used this to calculate the Gini coefficients for chlamydia, gonorrhoea and syphilis during the periods before (2009 to mid-2015) and after (mid-2015 to 2019) PrEP was introduced. This study examined the distribution of STIs among HIV-negative MSM in the Amsterdam Cohort Studies and found a similar pattern to our study, but with slightly higher Gini coefficients for gonorrhoea (0.46) and chlamydia (0.43) and a lower Gini coefficient for syphilis (0.50). These higher Gini coefficients for chlamydia and gonorrhoea in the Netherlands study may be explained by the lower positivity of chlamydia and gonorrhoea in their dataset compared with our study (4.6% vs 8.8% for chlamydia and 5.1% vs 7.8% for gonorrhoea). However, the Gini coefficient for syphilis was lower in the Netherlands study despite the lower positivity (0.7% vs 1.9%). This variation may partly be explained by the difference in sampling frame, as the Netherlands study only included HIV-negative MSM while our study included all GBMSM regardless of HIV status, and by the difference in the nature of the risk prediction tools used to calculate infection risk scores. While direct comparisons are not currently available, variations in healthcare systems, sexual health education and societal attitudes towards GBMSM populations between countries such as the Netherlands and Australia could potentially lead to differences in the distribution of STIs such as syphilis. Further research is required to confirm this.

Based on our findings, we propose the need for both targeted and widespread intervention strategies for the control of HIV/STIs. For HIV and syphilis, which show a high degree of concentration in individuals with higher risk scores, targeted interventions are essential. Such interventions can be integrated into existing services and may include initiatives focused on testing, treatment adherence and education about safe sex practices. For chlamydia and gonorrhoea, which have a more widespread distribution across the population, broader public health strategies are needed. These could include regular STI screening programmes, public awareness campaigns and improving access to treatment, which can be integrated into general healthcare services. The strategic integration of these interventions into existing public health programmes and policies could contribute significantly to the control of HIV/STIs.

To our knowledge, this is the first research from Australia to examine the distribution of four STIs across different risk populations (GBMSM, heterosexual men and women) and different age groups. The main strength of the study is the use of a composite risk score that was only available because of the extensive data on sexual risk from attendees at MSHC. This composite score was generated using a machine learning approach20 and is likely to be a better representation of overall risk than a single epidemiological measure.

This study has several limitations. First, the predictive criteria are based on the clients' self-reported information, which is subject to recall, non-response and social-desirability biases. However, there is no other way to collect risk information and we have previously shown that the self-interview method which we used is the least influenced by social-desirability bias.29 Second, the datasets only included MSHC clients, who are at higher risk than the general population. This may lead to an underestimate of the Gini index because our dataset likely under-represents lower-risk individuals. While our focus was on the disparities within the clinic population, we acknowledge that this may limit the generalisability of our findings to other populations or settings. Therefore, it is important for policymakers and public health officials to consider these limitations when applying our findings to their respective contexts. Caution must be applied when extrapolating these results to broader contexts, as different populations may present unique risk profiles. Further studies in diverse settings are necessary to validate and extend our findings. Third, our unit of analysis was client consultations, not individual clients. This means our study reflects the number of consultations rather than the number of unique individuals. This approach could potentially over-represent individuals who had multiple consultations, thereby skewing the positivity rates of STIs and creating potential bias in estimating Gini coefficients. Nevertheless, given our large sample size and our focus on internal population disparities, we believe this approach’s impact on our findings is minimal. Fourth, we only used data from 2015 to 2018 because we moved from culture to nucleic acid amplification testing for gonorrhoea in 2015,30 and this period was too brief to identify a changing trend in HIV/STIs in Australia. Fifth, although our dataset was large enough for machine learning training and testing, the number of HIV-positive and syphilis-positive cases was notably low, which may affect the MySTIRisk risk scores and the Gini coefficients. Sixth, as PrEP uptake status was not included as a predictor, we were unable to examine the changes in Gini coefficients before and after PrEP utilisation. Seventh, for gonorrhoea and syphilis, we did not include the anatomical site of infection as a predictor, so we could not identify the distributions of STIs at different anatomical sites.

Looking forward, we propose several directions for future research that could address our study’s limitations and enrich our understanding of HIV/STI disparities. Longitudinal studies are crucial for capturing shifts in HIV/STI distribution over time and evaluating the impact of interventions like PrEP. Additionally, exploring STI distribution across various anatomical sites could offer insights for targeted prevention strategies.

While our current study explores disparities in the distribution of HIV/STIs, we have not distinguished between high-risk and low-risk individuals based on explicit thresholds; a more nuanced approach is planned for future research. In anticipation of these future investigations, we intend to define these thresholds and risk subgroups, balancing factors such as disease burden, healthcare capacity and government funding. Therefore, the objective of our ongoing research is to refine our existing findings, thereby guiding the development of HIV/STIs intervention strategies and policy.

In conclusion, our study demonstrates that disparities exist in the distribution of HIV/STIs among different population groups, with implications for policy and interventions. The higher concentration of HIV and syphilis among heterosexual men and women indicates the need to identify and target high-risk subsets with focused testing and treatment. In contrast, the widespread distribution of chlamydia and gonorrhoea among GBMSM supports implementing broader screening and prevention. Estimating Gini coefficients enables tailored, data-driven approaches to early HIV/STI testing and treatment for those most affected. Our findings highlight the potential of Gini coefficients to inform resource allocation and policies aimed at HIV/STI control through precision public health strategies.