Main findings
Laparoscopy was long regarded as the gold standard for the diagnosis of ectopic pregnancy (EP), from 1993 onwards [38]. However, EP is reportedly missed at the initial laparoscopic examination in 3.0% to 4.5% of EP patients [39, 40]. As a result, laparoscopy is no longer considered the gold standard for the diagnosis of ectopic pregnancy according to the 2016 Royal College of Obstetricians and Gynaecologists and Association of Early Pregnancy Units (RCOG/AEPU) joint guidelines on the diagnosis and management of ectopic pregnancy [41]. Effective means for the early diagnosis of EP are currently lacking, because EP is difficult to distinguish from threatened or inevitable abortion, intrauterine pregnancy with corpus luteum rupture, and concurrent intrauterine and extrauterine pregnancy. The American College of Obstetricians and Gynecologists (ACOG) guidelines and the RCOG/AEPU joint guidelines both recommend transvaginal ultrasound as the first choice for the diagnosis of EP [41, 42]. However, ultrasound examination is affected by the equipment, the scanning route, operator skill, maternal obesity, and coexisting uterine fibroids or ovarian tumors. In 8% to 31% of cases, the location of an early pregnancy cannot be determined at the first ultrasound examination [19].
PUL is a transient state with various possible outcomes. EP is the high-risk outcome of PUL, because it can cause internal hemorrhage and endanger the patient’s life [1]. PUL patients whose final outcome is EP therefore need to be identified before rupture and internal hemorrhage occur [43]. Precise prediction of PUL outcome can not only support timely and correct management of EP, but also reduce the medical burden and unnecessary medical intervention for IUP and FPUL patients.
The M4 model is currently one of the most widely used prediction models, especially in European countries such as the United Kingdom. In 2016, Van Calster et al. [26] proposed the M6 model, which uses progesterone together with hCG values at 0 h and 48 h as variables in anticipation of greater predictive accuracy, because low serum progesterone concentrations in patients with EP have been recognized since the late 1970s [44]. A 2019 meta-analysis by Bobdiwala et al. showed that the areas under the curve (AUC, 95% CI) of hCG cut-offs, the hCG ratio (0/48 h), progesterone cut-offs and the M4 model were 0.42 (0.00-0.99), 0.69 (0.57-0.78), 0.69 (0.54-0.81) and 0.87 (0.83-0.91), respectively. Protocols using hCG at both 0 h and 48 h were more accurate than those using a single hCG value, and protocols using a single progesterone value were more accurate than those using a single hCG value; our results show the same trend. Moreover, our systematic meta-analysis of all published protocols for predicting EP in PUL found that, consistent with previous studies [45], the M4 model has better prediction accuracy. The most recent protocol, M6, also showed a trend toward higher prediction accuracy, reaching an AUC of 0.94. Notably, among protocols that use a single biomarker or a single sampling point, the hCG ratio and progesterone cut-offs also showed good sensitivity and specificity, with AUCs of 0.82 and 0.72, respectively. Because they require little time and few tests, give clinicians rapid guidance (especially in developing countries and regions) and reduce costs for patients, they retain practical value. Most studies of PUL outcome prediction models have been conducted in European and North American countries and regions with abundant medical resources.
These settings often have well-established protocols for early pregnancy diagnosis, whereas few studies of PUL outcome prediction have been conducted in low-income countries and areas. In addition, when Barnhart et al. [7] used M4 to predict EP in PUL patients in the United Kingdom and the United States, they found that the sensitivity of the model differed between the two countries even after the diagnostic criteria had been adjusted to a consistent level. This finding suggests that the performance of PUL outcome prediction protocols depends on the dataset used, on local medical guidelines and on the level of healthcare organization.
The acceptability of prediction protocols has rarely been evaluated, so we propose a new evaluation method that uses average production utility to assess the cost performance of each protocol. Because medical charges, insurance policies, health policies and the level of medical development differ between countries and regions, we use the sum of the number of visits and the number of examinations as a proxy for medical cost, where the number of examinations is defined as the minimum number of tests a protocol needs to predict the outcome. It is reasonable to assume that the number of tests a protocol requires reflects its medical cost. In addition, more visits, more examinations and more complex protocols are likely to increase loss to follow-up. We therefore consider that the data loss arising from these causes reflects, to some extent, the acceptability of each protocol. Table 2 shows the relationship between the sum of visits and examinations and the loss to follow-up rate. We had expected that, as a protocol took longer and required more examinations and visits, more patients would be unable to complete it because of financial pressure or severe clinical symptoms, leading to loss to follow-up and delayed diagnosis and treatment of EP. Contrary to this expectation, the loss to follow-up rate did not increase significantly with the number of examinations and visits required. When the time required by the protocols was extended from one day to two days, the average loss to follow-up rate rose from 11.56% (95% CI 6.96%-16.16%) to 17.46% (95% CI 11.46%-23.46%); although this indicates an upward trend, the difference was not statistically significant.
This may be due to the following reasons. First, the 29 included studies span the period from 1991 to 2018. Over this period of more than 20 years, great changes may have taken place in medical policy, public medical knowledge and national economic level, which may affect the willingness of patients to attend follow-up and the ability of medical institutions to track and manage patients. Second, some protocols (such as P1 and M1) have been evaluated in only a few studies, which may introduce substantial bias. Indeed, after excluding protocols evaluated in two or fewer studies, the average loss to follow-up rate increased from 11.19% (95% CI 4.67-17.72) to 18.63% (95% CI 9.67-17.71) when the sum of visits and examinations rose from 3 to 5. Although the difference was not statistically significant, the trend matched our expectation.
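The significance statement for the one-day versus two-day comparison can be checked approximately from the reported intervals alone. The sketch below is illustrative and is not the analysis used in this study: it assumes that the reported 95% CIs are symmetric normal-approximation intervals and that the two groups are independent, recovers each standard error from the CI half-width, and applies a two-sided z-test. The function names `se_from_ci` and `two_sided_z_test` are hypothetical.

```python
import math

Z_95 = 1.96  # normal quantile assumed to underlie the reported 95% CIs


def se_from_ci(lower, upper, z=Z_95):
    """Recover a standard error from a symmetric CI (assumption: normal approximation)."""
    return (upper - lower) / (2.0 * z)


def two_sided_z_test(est1, ci1, est2, ci2):
    """Approximate z-test for the difference of two independent estimates."""
    se1 = se_from_ci(*ci1)
    se2 = se_from_ci(*ci2)
    z = (est2 - est1) / math.sqrt(se1 ** 2 + se2 ** 2)
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0)))
    p = 2.0 * (1.0 - cdf)
    return z, p


# Loss to follow-up rates (%) reported for one-day and two-day protocols.
z, p = two_sided_z_test(11.56, (6.96, 16.16), 17.46, (11.46, 23.46))
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

Under these assumptions the two-sided p-value is above 0.05, in line with the statement that the increase was not statistically significant.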
In addition, Table 3 shows that more complex protocols, which are designed to improve prediction accuracy, also incur higher costs. The average production utility of the M4 model, which requires at least 2 visits and 3 examinations, is higher than that of the M6 model, which requires at least 3 visits and 3 examinations, and lower than that of the hCG cut-off and progesterone cut-off models, which need only one visit and one examination. In clinical practice, this trend suggests that more complex prediction schemes may bring higher costs. Simple prediction protocols therefore retain application value in low-income countries and regions.
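The cost proxy used here (minimum number of visits plus minimum number of examinations) can be sketched as follows. The visit and examination counts are those stated in the text. The `utility_ratio` helper, which divides an accuracy measure by the cost proxy, is only one plausible reading of "average production utility" and is an assumption; the class and function names are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protocol:
    name: str
    visits: int        # minimum number of visits required
    examinations: int  # minimum number of examinations required

    @property
    def cost_proxy(self) -> int:
        """Sum of visits and examinations, used in place of medical cost."""
        return self.visits + self.examinations


def utility_ratio(accuracy: float, protocol: Protocol) -> float:
    """ASSUMPTION: accuracy obtained per unit of cost proxy.

    This is an illustrative definition, not necessarily the one used in the study.
    """
    return accuracy / protocol.cost_proxy


# Counts as stated in the text for each protocol.
PROTOCOLS = [
    Protocol("hCG cut-off", visits=1, examinations=1),
    Protocol("progesterone cut-off", visits=1, examinations=1),
    Protocol("M4", visits=2, examinations=3),
    Protocol("M6", visits=3, examinations=3),
]

for protocol in sorted(PROTOCOLS, key=lambda item: item.cost_proxy):
    print(f"{protocol.name}: cost proxy = {protocol.cost_proxy}")
```

With these counts the cost proxy is 2 for the single-visit cut-off protocols, 5 for M4 and 6 for M6, so a more complex protocol has to deliver a proportionally larger gain in accuracy to achieve the same ratio.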