Introduction
Background and significance
The Orphan Drug Act defines a rare disease (RD) as a disease that impacts fewer than 200 000 people in the USA (https://rarediseases.info.nih.gov/about). However, with an estimated 7000–10 000 different RDs, 30 million Americans, or 1 in every 10, are collectively impacted by RDs.1–5 Accordingly, when all RDs are considered together, they have staggering implications for a large swath of the US and global population, for healthcare systems (HCS) and, most importantly, for patients with RD.6
Despite the number of Americans estimated to be impacted by RDs, many of these patients experience difficulty obtaining timely and accurate diagnoses. Reasons for this include unknown molecular mechanisms for diagnostics, a lack of US Food and Drug Administration (FDA)-approved treatments, difficulty in navigating patient data,7 small and dispersed patient populations, diffused RD-specific expertise, overlapping symptoms and primary care physicians not well versed in all 10 000 RDs. As a result, many patients experience misdiagnosis and failed therapy interventions.8 9 The path to diagnosis is often a prolonged journey that, according to the World Economic Forum,10 can last an average of 7–8 years. Unfortunately, this lag in diagnosis may also result in missed opportunities to stop or slow RD progression. Patients may unknowingly forgo disease-modifying therapy where it is available, or may receive inappropriate care if misdiagnosed. Identifying individuals with RDs earlier could alleviate the long-term sequelae and financial burdens associated with RDs.11 12 Patients with RD also grapple with limited treatment options, challenges finding a specialised physician or treatment centre, little or no research being conducted for their disease, high treatment costs and difficulty accessing medical, social or financial services or assistance.13
Machine learning (ML), which encompasses a set of methodologies designed to gain insights and understanding from complex datasets,14 offers an opportunity to better characterise RDs and potentially enable earlier diagnosis by identifying key features associated with RDs. However, ML approaches require large volumes of data to be most effective in making predictions. The successful application of such approaches therefore depends on access to, and use of, large national databases for RD research, as regional approaches will be limited in patient numbers for any given RD.
Objectives
The objective of this study is to comprehensively analyse and characterise the prevalence, patient characteristics and economic implications associated with RDs in the USA. The fragmented nature of the US HCS can lead to considerable variability in patient care and utilisation,15 particularly concerning RDs. To address this, we have undertaken a detailed investigation using national commercial claims data sourced from the Health Care Cost Institute (HCCI). This database covers the period from 2012 to 2020 and includes data on 55 million individuals per year.
Building upon the groundwork laid by the Impact of Rare Diseases on Patients and Healthcare Systems (IDeaS) pilot study5 conducted by the Division of Rare Disease Research Innovation within the National Institutes of Health National Centre for Advancing Translational Sciences (NCATS), our focus centres on 14 specific RDs. These conditions were the subject of the IDeaS pilot study, which encompassed diverse HCSs characterised by variations in geographic coverage, insurance representation, patient volume and duration of coverage. Our study aims to expand upon the insights garnered from that pilot by using records exclusively from the HCCI database spanning the years 2016 through 2020. The primary objectives of this investigation are to estimate the prevalence of RDs across the USA, categorise the distribution of RDs within urban and rural communities, delineate the economic costs associated with RDs, classify patients based on their demographic and clinical characteristics and explore the impacts of treatment modifications. Through this multifaceted analysis, we seek to provide a comprehensive overview of the landscape of RDs in the USA, thereby contributing valuable insights to the understanding of these conditions and their implications for patients and HCSs.