Development and internal validation of a model to predict type 2 diabetic complications after gestational diabetes


Data used for this study were de-identified, and ethical review and participant consent were waived by the institutional review board of the University of Montreal Hospital Center. All methods were performed in accordance with the Tri-Council Policy Statement: Ethical Conduct for Research Involving Human Subjects.

Study population

We conducted a retrospective cohort study of women who had hospital births in Quebec, Canada, from April 1989 to March 2016 (cohort entry); women were followed through 2018 to identify outcomes14. The cohort was constructed from the Maintenance and Use of Data for the Study of Hospital Clientele registry, which captures > 99% of deliveries in Quebec.

Individuals aged 18 to 45 years who had GDM in at least one pregnancy were included, with the cohort entry point (t0) at the first pregnancy affected by GDM. GDM was defined as abnormal maternal glucose tolerance first identified during pregnancy and was ascertained using diagnostic codes from the 9th and 10th revisions of the International Classification of Diseases (ICD) (Table S1). These codes have been previously validated and adequately capture GDM diagnoses, with specificity of > 90% and positive predictive values of > 80%15,16. Approaches to identifying GDM vary somewhat between centers (i.e., one-step versus two-step approaches); however, both approaches are endorsed by Diabetes Canada17.

Women who died in the first affected pregnancy and women with pre-existing diabetes or its complications were excluded (Fig. 1).

Figure 1. Cohort study development.


The primary outcome was hospitalization for type 2 diabetic complications within 10 years of delivery of the first GDM-affected pregnancy. Type 2 diabetic complications were defined as a diagnosis of type 2 diabetes with the development of one or more of the following complications: diabetic coma, acidosis, or renal, ophthalmic, neurological, circulatory, or other complications resulting from diabetes. These were identified by ICD-9 and ICD-10 codes that have been previously validated, with a specificity of 99% and positive predictive values of > 80% (Table S1).

The secondary endpoint was type 2 diabetic complications occurring any time (up to 29 years) after delivery of the first GDM-affected pregnancy.

Women were followed from entry into the cohort until the occurrence of any outcome, death, or the end of the study period (March 31, 2018).
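The follow-up rule above can be illustrated with a minimal Python sketch. It is not the authors' code (the analyses were run in R); the function and argument names are hypothetical, and only the study end date and the 10-year horizon come from the text.

```python
from datetime import date

STUDY_END = date(2018, 3, 31)  # end of the study period stated in the text


def add_years(d, years):
    """Shift a date by whole years, mapping 29 February to 28 February if needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def follow_up(entry, outcome=None, death=None, horizon_years=None):
    """Return (years_of_follow_up, event_observed) for one woman.

    entry: delivery date of the first GDM-affected pregnancy (t0).
    outcome: date of first hospitalization for a complication, if any.
    death: date of death, if any.
    horizon_years: 10 for the primary outcome, None for the secondary outcome.
    """
    stops = [STUDY_END]
    if death is not None:
        stops.append(death)
    if horizon_years is not None:
        stops.append(add_years(entry, horizon_years))
    censor = min(stops)
    if outcome is not None and outcome <= censor:
        return (outcome - entry).days / 365.25, True
    return (censor - entry).days / 365.25, False
```

An outcome that occurs after the horizon, after death, or after the study end is treated as censored at the earliest of those dates.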

Statistical analysis

We developed Cox proportional hazards regression models to predict type 2 diabetic complications, following previously described steps18,19, and reported the process according to the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) guidelines (Table S2)20.

Candidate predictors, variable selection and coding

We considered demographic, reproductive, and clinical factors known to be associated with an increased risk of type 2 diabetes as potential predictor variables5,21. These factors included maternal age, substance use, morbid obesity, socioeconomic deprivation (measured using a composite score of neighborhood income, education, and employment)22, pregnancy factors such as parity and multifetal pregnancy, and pregnancy complications such as hypertensive disorders of pregnancy (HDP), severe maternal morbidity (SMM)23, stillbirth, preterm birth, low birth weight, and admission to a neonatal intensive care unit (NICU) or adult intensive care unit (ICU). Candidate predictors were measured at the time of the index delivery (cohort entry).

Clinical variables that had a low incidence were combined with other similar variables (e.g., previous obstetric complications such as SMM, stillbirth, preterm delivery, low birth weight, NICU stay, or neonatal death were combined). Prior history of obstetric complications was further combined with parity as follows: prior obstetric complications (among multiparous women), no prior obstetric complications (among multiparous women), and no prior obstetric complications (among primiparous women). When there was collinearity (r > 0.5) between variables, the most clinically relevant variable was selected.
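As an illustration of this coding step, the sketch below builds the three-level parity and obstetric-history variable and flags collinear pairs. The function names and category labels are hypothetical, and applying the 0.5 threshold to the absolute Pearson correlation is an assumption; the text states only r > 0.5.

```python
from math import sqrt


def obstetric_history(previous_deliveries, prior_complication):
    """Combine parity with prior obstetric complications into three levels."""
    if previous_deliveries == 0:
        # Primiparous women cannot have a prior obstetric complication.
        return "primiparous, no prior complications"
    if prior_complication:
        return "multiparous, prior complications"
    return "multiparous, no prior complications"


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def is_collinear(x, y, threshold=0.5):
    """Flag a pair of candidate predictors (assumption: absolute r is used)."""
    return abs(pearson_r(x, y)) > threshold
```

When a pair is flagged, the choice of which variable to keep remains a clinical judgment and is not automated here.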

Continuous candidate predictor variables (e.g., maternal age) were modeled using restricted cubic splines with three knots19. We evaluated interaction terms and retained predictors that were statistically significant (alpha = 0.10)18. Final model variables were selected using least absolute shrinkage and selection operator (LASSO) regression18.
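For readers unfamiliar with restricted cubic splines, the sketch below computes the basis terms in the commonly used truncated-power form, scaled by the squared distance between the outer knots. It is a Python illustration rather than the study's R code; the knot positions in the usage note are hypothetical, since the text does not report where the knots were placed.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline basis for a single value x.

    With k knots the result has k - 1 terms: x itself plus k - 2
    nonlinear terms, so three knots give two terms. The fitted curve
    is constrained to be linear beyond the outer knots.
    """
    if len(knots) < 3:
        raise ValueError("at least three knots are required")
    t = sorted(knots)
    t_last, t_prev = t[-1], t[-2]
    scale = (t_last - t[0]) ** 2  # keeps nonlinear terms on the scale of x

    def cube_pos(u):
        """Truncated cubic: u**3 when u is positive, otherwise 0."""
        return u ** 3 if u > 0 else 0.0

    terms = [x]
    for t_j in t[:-2]:
        value = (
            cube_pos(x - t_j)
            - cube_pos(x - t_prev) * (t_last - t_j) / (t_last - t_prev)
            + cube_pos(x - t_last) * (t_prev - t_j) / (t_last - t_prev)
        )
        terms.append(value / scale)
    return terms
```

For example, `rcs_basis(30, [22, 30, 38])` returns two terms for a maternal age of 30 with hypothetical knots at 22, 30, and 38 years; both terms would then enter the regression model as covariates.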

Model performance and internal validation

The predictive performance of the model was evaluated based on discrimination, calibration, and risk stratification accuracy18. Discrimination was measured by the c-statistic, which is equivalent to the area under the receiver operating characteristic curve (AUROC)19. An AUROC ≥ 0.7 was interpreted as good discrimination and 0.6 to < 0.7 as modest, while 0.5 to < 0.6 was considered poor and < 0.5 as having no discriminative ability. Calibration performance was examined by plotting the mean of observed events against the mean of predicted risks by decile. Calibration slopes were interpreted as good (slope > 0.7), poor (0.5 < slope ≤ 0.7), or uninformative (slope ≤ 0.5)24.
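The sketch below illustrates these measures in Python. It simplifies the outcome to a binary event indicator at a fixed horizon, which is an assumption: for a Cox model the c-statistic would normally account for censoring. The function names are hypothetical; the interpretation bands are those stated in the text.

```python
def c_statistic(risks, events):
    """Share of (event, non-event) pairs in which the event has the higher
    predicted risk; ties count as half. Equals the AUROC for binary outcomes."""
    with_event = [r for r, e in zip(risks, events) if e]
    without_event = [r for r, e in zip(risks, events) if not e]
    concordant = 0.0
    for r_event in with_event:
        for r_none in without_event:
            if r_event > r_none:
                concordant += 1.0
            elif r_event == r_none:
                concordant += 0.5
    return concordant / (len(with_event) * len(without_event))


def interpret_auroc(auroc):
    """Discrimination bands as stated in the text."""
    if auroc >= 0.7:
        return "good"
    if auroc >= 0.6:
        return "modest"
    if auroc >= 0.5:
        return "poor"
    return "no discriminative ability"


def interpret_slope(slope):
    """Calibration-slope bands as stated in the text."""
    if slope > 0.7:
        return "good"
    if slope > 0.5:
        return "poor"
    return "uninformative"


def calibration_table(risks, events, n_groups=10):
    """Mean predicted risk and observed event proportion per risk group
    (deciles by default), ordered from lowest to highest predicted risk."""
    order = sorted(range(len(risks)), key=lambda i: risks[i])
    n = len(order)
    rows = []
    for g in range(n_groups):
        idx = order[g * n // n_groups:(g + 1) * n // n_groups]
        if not idx:
            continue  # fewer observations than groups
        mean_risk = sum(risks[i] for i in idx) / len(idx)
        observed = sum(events[i] for i in idx) / len(idx)
        rows.append((mean_risk, observed))
    return rows
```

Plotting the second column of `calibration_table` against the first gives the decile calibration plot; points close to the diagonal indicate good agreement between predicted and observed risk.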

Using a risk stratification table, we examined the model's ability to stratify the population into low-risk and high-risk categories. We divided the population into four risk groups, with the highest calculated risk group corresponding to the overall incidence rate of the outcome in the study population25. Likelihood ratios (LR) were calculated to assess classification accuracy within each group26. For clinical use, positive LRs (LR+) of > 5 or > 10 were interpreted as moderate or good rule-in tests, respectively, while negative LRs (LR-) of < 0.2 and < 0.1 were considered moderate or good rule-out tests, respectively24.
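One way to compute a likelihood ratio for each risk group is the stratum-specific form: the share of all events that fall in the group divided by the share of all non-events that fall in the group. The Python sketch below assumes that form, which the text does not confirm; the function names, the "limited" label, and the counts in the usage note are hypothetical.

```python
def stratum_likelihood_ratios(groups):
    """groups: list of (events, non_events) counts, one pair per risk group.

    Returns one likelihood ratio per group:
    P(group | event) / P(group | no event).
    """
    total_events = sum(e for e, _ in groups)
    total_non_events = sum(n for _, n in groups)
    ratios = []
    for events, non_events in groups:
        share_of_events = events / total_events
        share_of_non_events = non_events / total_non_events
        ratios.append(share_of_events / share_of_non_events)
    return ratios


def interpret_lr(lr):
    """Thresholds as stated in the text; 'limited' is a label added here
    for ratios that meet none of them."""
    if lr > 10:
        return "good rule-in"
    if lr > 5:
        return "moderate rule-in"
    if lr < 0.1:
        return "good rule-out"
    if lr < 0.2:
        return "moderate rule-out"
    return "limited"
```

With illustrative counts such as `[(10, 4000), (40, 3000), (150, 2000), (300, 1000)]`, the lowest-risk group has a ratio well below 0.1 and the highest-risk group a ratio above 5, so only the two extreme groups would shift clinical suspicion appreciably.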

The model was evaluated for internal validity using the bootstrap method with 200 iterations, and optimism (i.e., the degree to which a model is overfitted) was reported18.
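A common way to estimate optimism by bootstrap is to refit the model on each resample, score it on both the resample and the original data, and average the difference. The Python sketch below assumes that procedure, which the text does not spell out; `fit` and `score` are hypothetical callables supplied by the user, and only the 200 iterations come from the text.

```python
import random


def optimism_corrected(data, fit, score, n_boot=200, seed=0):
    """Bootstrap optimism correction for a performance measure.

    data: list of observations.
    fit: callable taking a list of observations and returning a model.
    score: callable taking (model, observations) and returning a number,
           for example a c-statistic.
    Returns (apparent, optimism, corrected).
    """
    rng = random.Random(seed)
    apparent = score(fit(data), data)
    differences = []
    for _ in range(n_boot):
        # Resample with replacement, same size as the original data.
        sample = [rng.choice(data) for _ in data]
        model = fit(sample)
        # Performance on the resample minus performance on the original data.
        differences.append(score(model, sample) - score(model, data))
    optimism = sum(differences) / n_boot
    return apparent, optimism, apparent - optimism
```

Every modeling step that used the outcome, including variable selection, would need to be repeated inside `fit` for the optimism estimate to be honest.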

Secondary analyses

Using the same selected final variables, we also developed a prediction model for type 2 diabetic complications up to 29 years postpartum and evaluated the discriminatory performance of the model.

Sample size

We estimated our sample size based on the rule of thumb of 10 to 20 events per degree of freedom19, to avoid overfitting the model. With a total of 1025 events during follow-up, we had a sufficient sample size to consider up to 50 degrees of freedom for candidate predictors.
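The arithmetic behind this limit can be checked with a short Python sketch; the function name is hypothetical, and the event count and the 10 to 20 range come from the text.

```python
def max_degrees_of_freedom(n_events, events_per_df):
    """Largest number of candidate degrees of freedom allowed by an
    events-per-degree-of-freedom rule of thumb."""
    return n_events // events_per_df


conservative = max_degrees_of_freedom(1025, 20)  # stricter end of the rule
liberal = max_degrees_of_freedom(1025, 10)       # looser end of the rule
```

At the stricter end of 20 events per degree of freedom, 1025 events allow 51 degrees of freedom, which is consistent with the limit of about 50 stated above; at 10 events per degree of freedom the allowance would be 102.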

Analyses were conducted using R version 3.5.1 (The R Project for Statistical Computing).
