Materials and methods
Overall study setting and design
We conducted a retrospective cohort analysis of a prospectively collected EHR repository. The N3C consortium is a high-granularity EHR data repository of deidentified, patient-level data from medical centres across the USA. The overall N3C design, data ingestion and harmonisation, and sampling approach have been described previously.24 25 Additional details about N3C and ethical review are in online supplemental materials.
Our dataset included COVID-19 vaccines currently given Food and Drug Administration (FDA) authorisation, including two mRNA vaccines (Pfizer-BioNTech (BNT162b2) and Moderna (mRNA-1273)) and a viral vector vaccine (Johnson & Johnson/Janssen (JNJ-78436735)), as well as other vaccines (eg, AstraZeneca). We categorised initial full vaccination as completion of the recommended dosing regimen of any vaccine (ie, at least two doses for mRNA or other vaccines and one dose for the Janssen vaccine) and partial vaccination as one dose of mRNA vaccine. We categorised any additional dose given at least 90 days after initial dosing (eg, a third mRNA dose) as a booster vaccination and only accounted for the first booster vaccination in this analysis.26 The concepts and codes to define vaccinations can be found in online supplemental table 1.
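The categorisation rules above can be sketched in a few lines of Python. This is a minimal illustration only, not the study's code: the product labels, the `classify_vaccination` helper and the `incomplete_other` label are hypothetical, and the sketch assumes that the 90-day booster window is counted from the date the initial regimen was completed.

```python
from datetime import date

MRNA = {"pfizer", "moderna"}   # hypothetical product labels
SINGLE_DOSE = {"janssen"}      # one dose completes the initial regimen
BOOSTER_GAP_DAYS = 90          # assumed to count from the full-vaccination date

def classify_vaccination(doses):
    """Classify one person's doses, given as (date, product) tuples."""
    doses = sorted(doses)
    result = {"status": "unvaccinated", "full_date": None, "booster_date": None}
    if not doses:
        return result
    first_product = doses[0][1]
    if first_product in SINGLE_DOSE:
        full_index = 0
    elif len(doses) >= 2:
        full_index = 1
    else:
        # The text defines partial vaccination for mRNA products only; a single
        # dose of another two-dose product is given a placeholder label here.
        result["status"] = "partial" if first_product in MRNA else "incomplete_other"
        return result
    full_date = doses[full_index][0]
    result.update(status="full", full_date=full_date)
    # Only the first qualifying additional dose is counted as the booster.
    for dose_date, _ in doses[full_index + 1:]:
        if (dose_date - full_date).days >= BOOSTER_GAP_DAYS:
            result.update(status="boosted", booster_date=dose_date)
            break
    return result
```

For example, two Pfizer doses in March 2021 followed by a third dose in December 2021 would be classified as boosted, whereas a third dose given 30 days after the second would leave the status as full.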
Study period and analytical cohort selection
Our study observation period occurred from 10 December 2020 to 7 June 2022. We used N3C data version 84 (released 7 July 2022), noting that we truncated the observation period 1 month earlier to allow for adequate time for data reporting. We defined the beginning of the observation period, 10 December 2020, based on the date the FDA first approved COVID-19 vaccination.27 We chose the end date, 7 June 2022, to allow 90 days prior to when the bivalent booster became available (9 September 2022) so that this analysis amply precludes bivalent booster impact (separate analysis forthcoming).
We stratified the analysis by the variants predominant during the pandemic: pre-Delta (10 December 2020 to 20 June 2021), Delta (21 June 2021 to 25 December 2021) and Omicron (26 December 2021 to 7 June 2022), with predominance periods defined based on Centers for Disease Control and Prevention (CDC) estimates for the USA in general.27
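As an illustration, the predominance periods can be encoded as a simple date lookup. The sketch below uses the dates stated above; the names `VARIANT_PERIODS` and `variant_period` are hypothetical, and both period boundaries are assumed to be inclusive.

```python
from datetime import date

# Variant predominance periods as stated in the text (inclusive bounds assumed).
VARIANT_PERIODS = [
    ("pre-Delta", date(2020, 12, 10), date(2021, 6, 20)),
    ("Delta", date(2021, 6, 21), date(2021, 12, 25)),
    ("Omicron", date(2021, 12, 26), date(2022, 6, 7)),
]

def variant_period(d):
    """Return the predominance period containing date d, or None if outside the study."""
    for name, start, end in VARIANT_PERIODS:
        if start <= d <= end:
            return name
    return None
```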
Cohort construction and bias control
Figure 1 illustrates our cohort construction. We included data from 43 total contributing sites, excluding sites in the bottom quartile of vaccination rates. Next, we generated subsets of this cohort to identify pregnant people (cohort 1) and vaccinated females (cohort 2).
Figure 1 Flow chart of analytic cohort selection from the N3C cohort.
For the analysis of initial vaccinations with cohort 1, we included pregnant people aged 15–55 with COVID-19 vaccinations during or before their pregnancy on or after 10 December 2020 (ie, the earliest date that COVID-19 vaccination was approved by FDA). For unvaccinated pregnant people, we included those with pregnancy end dates after 1 March 2021. We used 1 March 2021, the first peak of the initial full vaccination date among vaccinated pregnant people, as the ‘proxy initial vaccination date’. Similarly, for the analysis of booster vaccinations with cohort 1, we included pregnant people with a booster vaccination during or before their pregnancy after 1 December 2021. For unvaccinated pregnant people, we included those with pregnancy end dates after 1 December 2021. We used 1 December 2021, the peak of booster vaccination date among vaccinated pregnant people, as the ‘proxy booster vaccination date’. We defined proxy vaccination dates to avoid the ascertainment bias that would arise if person-time at risk were longer in the unvaccinated group than in the vaccinated group.
For the analysis of initial and booster vaccinations with cohort 2, we included non-pregnant vaccinated females of reproductive age (ie, those with initial/booster vaccinations) to match with the pregnant vaccinated females (ie, those with initial/booster vaccinations).
Cohort 1 analysis (all pregnant people)
In the analysis for initial and booster vaccinations in cohort 1, we compared vaccinated versus unvaccinated pregnant people aged 15–55 to evaluate vaccine effectiveness in pregnancy against incident and severe COVID-19 infections. We categorised vaccinated pregnant people based on vaccination timing into two groups: those who received any initial or booster vaccination during pregnancy (vaccinated during pregnancy) and those who received all initial or booster vaccinations before pregnancy (vaccinated before pregnancy).
To further investigate the effectiveness of booster vaccinations, we compared pregnant people with full initial vaccinations and those with booster vaccinations via incident and severe COVID-19 infections.
Cohort 2 (all vaccinated females)
In the analysis for initial and booster vaccinations in cohort 2, we compared pregnant people vaccinated during or before pregnancy (same as cohort 1) versus non-pregnant females aged 15–55 to examine the impact of pregnancy on COVID-19 vaccine effectiveness.
For both cohorts 1 and 2, we excluded pregnant people with COVID-19 infections after their initial/booster vaccination (or proxy initial/booster vaccination dates) but before pregnancy. In the analysis for booster vaccinations, the included people were a subsample of the analysis cohort for initial vaccination.
Exposure definitions
In the analysis with cohort 1 (all pregnant people), vaccination status is the key exposure category. In the analysis with cohort 2 (all vaccinated females), pregnancy is the key exposure category.
Outcome definitions
Our first coprimary outcome was incident and breakthrough COVID-19 infections among unvaccinated and vaccinated people, respectively. Our definition of incident or breakthrough COVID-19 positivity included only positive results based on (1) PCR positivity, (2) antigen positivity or (3) ICD diagnosis condition, in that hierarchical order.24 Breakthrough infection was defined as incident COVID-19 positivity at least 14 days after the last vaccine dose received for initial/booster vaccination.2 22
Our second coprimary outcome was severe COVID-19 infections. We defined severe COVID-19 infections as incident or breakthrough COVID-19 infections with a hospitalisation record within 14 days before or 45 days after infection, excluding hospitalisations within 7 days before or after the recorded delivery or pregnancy end date (to avoid confounding by hospitalisation for delivery or other pregnancy outcomes).2 28 The concepts and codes to define COVID-19 infections and hospitalisations can be found in online supplemental table 1.
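The severe-infection rule can be expressed as a date-window check. The following Python sketch is illustrative only: the `is_severe` helper is hypothetical, it considers hospital admission dates rather than full stays, and it assumes that all window boundaries are inclusive.

```python
from datetime import date, timedelta

def is_severe(infection_date, admission_dates, pregnancy_end_date=None):
    """True if any admission falls in the -14/+45 day window around infection
    and is not within 7 days of the delivery or pregnancy end date."""
    window_start = infection_date - timedelta(days=14)
    window_end = infection_date + timedelta(days=45)
    for admission in admission_dates:
        if not (window_start <= admission <= window_end):
            continue  # outside the infection window
        if (pregnancy_end_date is not None
                and abs((admission - pregnancy_end_date).days) <= 7):
            continue  # likely a delivery-related hospitalisation
        return True
    return False
```

Under these assumptions, an admission two days after infection counts as severe unless the pregnancy ended within a week of that admission.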
Person-time at-risk for vaccinated people accrued from 14 days after their last date of vaccination (initial/booster) until the earliest date of COVID-19 incident or severe infections, death, transfer to hospice or the date of their last record in N3C. Person-time at-risk for unvaccinated pregnant people accrued similarly from 14 days after the proxy initial or booster vaccination date. More details on person-time at risk are illustrated in online supplemental figure 1. In the comparison of pregnant people with booster vaccinations versus those with full initial vaccinations, we used the full initial vaccination date plus 90 days as the start date to calculate person-time-at-risk for pregnant people without booster vaccinations to avoid ascertainment bias. The 90-day time window was based on the definition of booster vaccinations, as mentioned in the previous section.
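A minimal sketch of the person-time calculation described above follows. The helper name `person_days_at_risk` and its arguments are hypothetical; the index date stands for the last vaccination date or the proxy vaccination date, and the censoring events (infection, death, hospice transfer) are passed as optional dates.

```python
from datetime import date, timedelta

def person_days_at_risk(index_date, event_dates, last_record_date, lag_days=14):
    """Days at risk from index_date + lag_days until the earliest event
    or the last record date, whichever comes first."""
    start = index_date + timedelta(days=lag_days)
    # Missing events are passed as None and ignored.
    candidates = [d for d in event_dates if d is not None] + [last_record_date]
    end = min(candidates)
    # People whose follow-up ends before the start date contribute no time.
    return max((end - start).days, 0)
```

For the comparison of boosted versus fully vaccinated people, the same function could be called with `lag_days=90` and the full initial vaccination date as the index date for people without a booster, which mirrors the start date described in the text.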
Other outcomes included (1) ICU admission within 45 days after COVID-19 infection; (2) severe COVID-19 disease with invasive ventilation or extracorporeal membrane oxygenation (ECMO) treatment within 45 days after COVID-19 infection and (3) 30-day mortality after COVID-19 infection.2 29 Note that ICU admission, ventilation and ECMO outcomes are hospital based and, therefore, subsumed in the second coprimary outcome of severe COVID-19 infection.
Statistical analysis
We present summary characteristics at the time of the first initial COVID-19 vaccination date and proxy vaccination date for vaccinated and unvaccinated people, respectively. We used descriptive statistics only for outcomes with low counts in various groups.
We applied Cox proportional hazards models to estimate adjusted HRs (aHR) with 95% CIs. We clustered SEs by data partner site to account for heterogeneity across sites. We adjusted for individual-level sociodemographic and clinical characteristics, including age, race, ethnicity, specific clinical comorbidities, Charlson Comorbidity Index and whether people had any prior COVID-19 infections before initial/booster or proxy vaccinations.
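For reference, the generic textbook form of such a model is given below. The notation is an assumption introduced here rather than taken from the study: E_i denotes the key exposure (vaccination status in cohort 1, pregnancy in cohort 2), z_i the vector of adjustment covariates, and the SE is the cluster-robust SE by data partner site.

```latex
h_i(t \mid E_i, \mathbf{z}_i) = h_0(t)\,\exp\!\left(\beta E_i + \boldsymbol{\gamma}^{\top}\mathbf{z}_i\right),
\qquad
\mathrm{aHR} = \exp(\hat{\beta}),
\qquad
95\%\ \mathrm{CI} = \exp\!\left(\hat{\beta} \pm 1.96\,\widehat{\mathrm{SE}}(\hat{\beta})\right)
```

Under this form, an aHR below 1 for vaccination in cohort 1 would indicate a lower hazard of the outcome among vaccinated than unvaccinated pregnant people, holding the covariates fixed.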
We stratified the models by the variant predominance periods (ie, pre-Delta, Delta and Omicron) to allow for different HR estimations during our observation period, and we assessed model assumptions accordingly. Person-time at-risk was also calculated for each variant stratum (ie, one within each stratum). We further separated the analysis for initial and booster vaccinations.
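One way to compute person-time within each variant stratum is to clip each person's at-risk interval to the period boundaries. The sketch below is one possible reading of that step, not the study's implementation; `PERIODS` and `person_days_by_period` are hypothetical names, and the at-risk interval is treated as half-open (start inclusive, end exclusive).

```python
from datetime import date, timedelta

# Variant predominance periods from the text (inclusive bounds assumed).
PERIODS = {
    "pre-Delta": (date(2020, 12, 10), date(2021, 6, 20)),
    "Delta": (date(2021, 6, 21), date(2021, 12, 25)),
    "Omicron": (date(2021, 12, 26), date(2022, 6, 7)),
}

def person_days_by_period(start, end):
    """Split the at-risk interval [start, end) into days per variant period."""
    days = {}
    for name, (p_start, p_end) in PERIODS.items():
        lo = max(start, p_start)
        hi = min(end, p_end + timedelta(days=1))  # period end date is inclusive
        days[name] = max((hi - lo).days, 0)
    return days
```

For instance, an at-risk interval from 1 June 2021 to 1 July 2021 would contribute 20 days to the pre-Delta stratum and 10 days to the Delta stratum.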
Of note, given the timing of booster vaccinations occurring mainly during the Omicron period, we only analysed booster vaccinations during the Omicron period. Since we anticipated pregnant people to possibly undergo more frequent COVID-19 testing than non-pregnant people (eg, for the delivery hospitalisation), we conducted an a priori sensitivity analysis restricting non-pregnant people in cohort 2 to people with at least one recorded COVID-19 test. Due to the concern of different effectiveness of mRNA and non-mRNA vaccines,30 we additionally conducted sensitivity analyses only including people with mRNA vaccinations (ie, Pfizer and Moderna).
Details about covariates and modelling can be found in online supplemental materials. All analyses were conducted in the N3C Data Enclave using PySpark and SparkR/R.31
Patient and public involvement
This is a retrospective observational study; patients were not involved.