A predictive model of thyroid malignancy using clinical, biochemical and sonographic parameters for patients in a multi-center setting

Background Thyroid nodules are highly prevalent, but a robust, feasible method for malignancy differentiation has not yet been well documented. This study aimed to establish a practical model for thyroid nodule discrimination. Methods Records for 2984 patients who underwent thyroidectomy were analyzed. Clinical, laboratory, and US variables were assessed retrospectively. Multivariate logistic regression analysis was performed and a mathematical model was established for malignancy prediction. Results The results showed that the malignant group was younger and had smaller nodules than the benign group (43.5 ± 11.6 vs. 48.5 ± 11.5 y, p < 0.001; 1.96 ± 1.16 vs. 2.75 ± 1.70 cm, p < 0.001, respectively). The serum thyrotropin (TSH) level (median = 1.63 mIU/L, IQR (0.89–2.66) vs. 1.19 (0.59–2.10), p < 0.001) was higher in the malignant group than in the benign group. Patients with malignancies tested positive for anti-thyroglobulin antibody (TGAb) and anti-thyroid peroxidase antibody (TPOAb) more frequently than those with benign nodules (TGAb, 30.3% vs. 15.0%, p < 0.001; TPOAb, 25.6% vs. 18.0%, p = 0.028). The prevalence of ultrasound (US) features (irregular shape, ill-defined margin, solid structure, hypoechogenicity, microcalcifications, macrocalcifications and central intranodular flow) was significantly higher in the malignant group. Multivariate logistic regression analysis confirmed that age (OR = 0.963, 95% CI = 0.934–0.993, p = 0.017), TGAb (OR = 4.435, 95% CI = 1.902–10.345, p = 0.001), hypoechogenicity (OR = 2.830, 95% CI = 1.113–7.195, p = 0.029), microcalcifications (OR = 4.624, 95% CI = 2.008–10.646, p < 0.001), and central intranodular flow (OR = 2.155, 95% CI = 1.011–4.594, p < 0.05) were independent predictors of thyroid malignancy. A predictive model including four variables (age, TGAb, hypoechogenicity and microcalcification) showed an optimal discriminatory accuracy (area under the curve, AUC) of 0.808 (95% CI = 0.761–0.855). 
The best cut-off value for prediction was 0.52, achieving sensitivity and specificity of 84.6% and 76.3%, respectively. Conclusion A predictive model of malignancy that combines clinical, laboratory and sonographic characteristics would aid clinicians in avoiding unnecessary procedures and making better clinical decisions. Electronic supplementary material The online version of this article (10.1186/s12902-018-0241-7) contains supplementary material, which is available to authorized users.


Background
Thyroid nodules are highly prevalent in the general adult population, with a detection rate of 19-67% during routine ultrasound examinations [1]. An epidemiological study showed that approximately 5-15% of these nodules are malignant [2]. Despite the high incidence of thyroid malignancy, most patients referred for suspected nodules have benign conditions. Overestimating malignancy leads to unnecessary procedures and burdens both society and patients. Therefore, benign and malignant thyroid nodules need to be distinguished preoperatively.
To date, the Thyroid Imaging Reporting and Data System (TIRADS) and the American Thyroid Association guidelines are considered the main criteria for determining malignancy and are generally followed by radiologists in practice [3]. However, these categorization systems were established based on fine needle aspiration (FNA) cytology results that included data from nodules > 1 cm. In addition, a few reports have presented serum thyrotropin (TSH) and positive thyroid autoantibodies as possible predictors of thyroid malignancy [4,5]. However, these guidelines and studies either used FNA cytology results for their final diagnoses, which are less reliable than those confirmed via surgical inspection, or included a relatively small number of patients. Additionally, most studies to date have focused on a single type of risk factor (clinical, biochemical or radiological), and only a few have analyzed these risk factors in combination. A robust predictive model involving easily accessible clinical, laboratory and radiological risk factors may serve as a pragmatic aid in making decisions regarding malignancy differentiation.
In the present study, we reviewed a large cohort of 2984 patients in China who underwent thyroid surgery and had final pathological data available. The purpose of our study was to verify the independent risk factors of clinical, laboratory and ultrasonographic (US) features in patients with thyroid carcinomas and to establish a predictive model for determining malignancy that can be used by clinical practitioners.

Patients
We retrospectively studied data from 3145 consecutive patients who underwent total or partial thyroid surgery between 2006 and 2009 at four tertiary hospitals in China; most of these patients had received routine neck ultrasound examinations. Patients with previous thyroid surgery or radiation ablation and patients who were taking thyroxine or antithyroid drugs were not included. Patients with medullary thyroid cancer, anaplastic cancer or lymphoma were considered TSH-nonresponsive and were excluded. After the exclusions, 2984 patients were included in the analysis. Their clinical, laboratory, and US variables were assessed retrospectively. This study had institutional review board approval.

US imaging analysis
US examinations at the four tertiary hospitals were performed using a GE LOGIQ9 US scanner (USA) equipped with a 5-12-MHz linear transducer for morphological examinations and a 4.7-MHz transducer for color Doppler evaluations. The examinations were conducted and recorded by two skilled sonographers from the respective hospitals according to a standard procedure, and the two observers reached agreement on each US finding. The following US parameters of the nodules were recorded: (1) number of nodules, (2) nodule size, (3) echoic texture, (4) echogenicity, (5) shape, (6) margin, (7) calcification (microcalcification, macrocalcification, or egg-shell calcification) and (8) intranodular central flow.

Laboratory variables
The levels of serum TSH, free triiodothyronine (FT3) and free thyroxine (FT4) were determined using a Roche Cobas E601 chemiluminescence analyzer (Switzerland) and the matched kit. The reference ranges were 0.35 to 5.5 mIU/L for TSH, 11.5 to 22.7 pmol/L for FT4 and 3.5 to 6.5 pmol/L for FT3. If the other laboratories had different normal ranges, the values were adjusted to reflect the same normal range. Anti-thyroid peroxidase antibody (TPOAb, reference value < 60 μIU/ml) and anti-thyroglobulin antibody (TGAb, reference value < 60 IU/ml) levels were measured using immunometric assays. Thyroid antibody levels above the upper reference limit were considered positive.

Pathology
FNA cytology was not generally performed or considered a routine preoperative assessment when the study was conducted. Postoperative histopathologic evaluations were performed by pathologists experienced in thyroid pathology. The histopathologic results of the operated patients were grouped as either malignant or benign.

Statistical analysis
Descriptive statistics are presented as means ± standard deviations for continuous variables and as the number of patients and percentages for categorical variables. Differences between independent groups for continuous variables were evaluated using Student's t-test or the Mann-Whitney U-test, where applicable. Categorical data were analyzed using Pearson's chi-square test. Univariate and multivariate logistic regression analyses were performed to evaluate the association between malignancy and risk factors. Receiver operating characteristic (ROC) curve analyses were performed to examine the predictive power of combinations of clinical, laboratory and sonographic features. The areas under the curves (AUCs) were derived from the ROC curves. The Youden index was used to define the optimal cut-off value [6]. Differences between AUCs were detected using DeLong's test [7]. All statistical analyses were performed using SPSS version 17.0 (SPSS, Inc., Chicago, IL). A p-value of < 0.05 was considered statistically significant.
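The cut-off selection by the Youden index can be sketched as follows. This is a minimal illustration in plain Python, not the SPSS procedure used in the study; the function name `youden_cutoff`, the convention that a score at or above the cut-off counts as positive, and the toy data are assumptions made for the example only.

```python
from typing import List, Tuple


def youden_cutoff(scores: List[float], labels: List[int]) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, specificity) that maximizes Youden's J.

    J = sensitivity + specificity - 1.
    labels: 1 = malignant, 0 = benign.
    Assumption: a case is called positive when its score is >= cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    best_j, best_cutoff, best_sens, best_spec = float("-inf"), 0.0, 0.0, 0.0
    # Every observed score is a candidate cut-off.
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        sens = tp / n_pos
        spec = tn / n_neg
        j = sens + spec - 1.0
        if j > best_j:  # on ties, the lowest cut-off is kept
            best_j, best_cutoff, best_sens, best_spec = j, cutoff, sens, spec
    return best_cutoff, best_sens, best_spec


# Toy data invented for illustration; these are NOT study data.
toy_scores = [0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.80, 0.90, 0.95]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1, 1]
cutoff, sens, spec = youden_cutoff(toy_scores, toy_labels)
print(cutoff, sens, spec)
```

In practice the scores would be the predicted probabilities from the fitted logistic model, and the labels the postoperative histopathologic diagnoses.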

Clinical characteristics
The study cohort consisted of 541 men and 2443 women. Overall, 2460 patients were diagnosed with pathologically benign nodules, and 524 patients were diagnosed with malignant nodules. The malignancy rate in our study was 17.6%. Most of the nodules were detected incidentally during routine check-ups; in total, 10.5% of the patients presented with clinical symptoms such as hoarseness, swallowing difficulty or thyroid enlargement, with the duration of symptoms varying from 7 days to 26 years. As shown in Table 1, there was no difference in the sex ratios between the patients with benign and malignant nodules. Patients with malignant nodules were younger than those with benign nodules (43.5 ± 11.6 years vs. 48.5 ± 11.5 years, p < 0.001) (Table 1).
The mean maximal diameter of malignant nodules was significantly smaller than that of benign nodules (1.96 ± 1.16 cm vs. 2.75 ± 1.70 cm, p < 0.001). The prevalence of solitary nodules in malignant cases was not different from that in benign cases (29.0% vs. 25.1%, p = 0.109).

Laboratory values
As shown in Table 2, there were no significant differences in FT3 and FT4 values between the two groups. The level of TSH was higher in the malignant group than in the benign group (median 1.63 mIU/L, IQR 0.89-2.66 vs. 1.19 mIU/L, IQR 0.59-2.10, p < 0.001). Subsequently, based on the cutoff values predetermined in population studies, TSH levels were divided into five categories: below normal (< 0.35 mIU/L), above normal (> 5.5 mIU/L), and within normal, with the latter divided into tertiles of similar size (0.35-0.99 mIU/L, 1.0-2.49 mIU/L, and 2.5-5.49 mIU/L). The prevalence of malignancy was 9.8% when TSH levels were less than 0.35 mIU/L, compared with 13.2% when TSH levels were 5.5 mIU/L or greater (p = 0.17). Within the normal range, a higher rate of malignancy was observed in patients with higher TSH levels. The prevalence of malignancy was 15.8% when TSH levels were between 1.0 and 2.49 mIU/L and 24.4% when TSH levels were between 2.50 and 5.49 mIU/L, compared with 12.6% when TSH levels were between 0.35 and 0.99 mIU/L (p = 0.09 and p < 0.001, respectively) (Fig. 1).
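The TSH grouping above can be expressed as a small lookup, sketched here in Python. The function name `tsh_category` and the dictionary `REPORTED_PREVALENCE` are hypothetical; the handling of boundary values is an assumption (half-open intervals, with exactly 5.5 mIU/L placed in the top group, following the phrase "5.5 mIU/L or greater"). The prevalence figures are those reported in the text.

```python
def tsh_category(tsh_miu_per_l: float) -> str:
    """Assign a serum TSH value (mIU/L) to one of the five groups in the text.

    Assumption: intervals are half-open, so values that fall between the
    printed bounds (e.g. 0.995) go to the lower group.
    """
    if tsh_miu_per_l < 0.35:
        return "<0.35"
    if tsh_miu_per_l < 1.0:
        return "0.35-0.99"
    if tsh_miu_per_l < 2.5:
        return "1.0-2.49"
    if tsh_miu_per_l < 5.5:
        return "2.5-5.49"
    return ">=5.5"


# Prevalence of malignancy (%) per TSH group, as reported in the text.
REPORTED_PREVALENCE = {
    "<0.35": 9.8,
    "0.35-0.99": 12.6,
    "1.0-2.49": 15.8,
    "2.5-5.49": 24.4,
    ">=5.5": 13.2,
}

# The median TSH of the malignant group (1.63 mIU/L) falls in the middle tertile.
group = tsh_category(1.63)
print(group, REPORTED_PREVALENCE[group])
```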

Clinical, biochemical and sonographic characteristics of microcarcinoma
Of the 524 malignant nodules, 104 nodules ≤1 cm in diameter were defined as microcarcinomas. Since microcarcinoma is considered "more silent", we analyzed its clinical, biochemical and sonographic parameters separately. As shown in Additional file 1: Table S1, age, a positive TGAb result, hypoechogenicity, microcalcification and intranodular central flow were also associated with an increased risk of malignancy in nodules ≤1 cm in diameter.

The performance of independent risk factors: a mathematical model to predict malignancy
To evaluate the predictive power of combinations of clinical characteristics, laboratory values and US features and to establish a mathematical model to calculate the risk of malignancy, a series of ROC curve analyses were performed, and AUCs were calculated. When age, TGAb, hypoechogenicity and microcalcification were combined, the AUC reached its optimal value of 0.808 (95% CI = 0.761-0.855), reflecting the discriminatory accuracy of the model (Fig. 2). By combining these four independent risk factors of malignancy, we established the following formula for a predictive model:

$$p = \frac{\exp(z)}{1 + \exp(z)}, \qquad z = -0.963 - 0.4 \times \text{age} + 1.108 \times \text{TGAb} + 1.441 \times \text{microcalcification} + 1.722 \times \text{hypoechogenicity}$$
The best cut-off value was calculated as 0.52, with a sensitivity of 84.6% and a specificity of 76.3%.
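The formula and cut-off above can be evaluated with a few lines of Python. This is a hedged sketch rather than the authors' implementation: the names `COEFFICIENTS`, `malignancy_probability` and `exceeds_cutoff` are hypothetical, age is assumed to be entered in years, the three binary predictors are assumed to be coded 1 (present/positive) and 0 (absent/negative), and a probability at or above the cut-off is assumed to count as a positive prediction. The coefficients are copied verbatim from the formula as printed; with age in years the age term dominates the linear predictor, so the printed age coefficient and the age scaling should be checked against the original publication before any use.

```python
import math

# Coefficients copied verbatim from the printed formula (not re-estimated).
COEFFICIENTS = {
    "intercept": -0.963,
    "age": -0.4,  # as printed; verify against the original publication
    "tgab": 1.108,
    "microcalcification": 1.441,
    "hypoechogenicity": 1.722,
}
CUTOFF = 0.52  # best cut-off reported in the text


def malignancy_probability(age, tgab, microcalcification, hypoechogenicity,
                           coef=COEFFICIENTS):
    """Return p = exp(z) / (1 + exp(z)) for the four-variable model.

    Assumptions: age in years; the other three inputs are booleans (or 0/1).
    """
    z = (coef["intercept"]
         + coef["age"] * age
         + coef["tgab"] * int(tgab)
         + coef["microcalcification"] * int(microcalcification)
         + coef["hypoechogenicity"] * int(hypoechogenicity))
    # 1 / (1 + exp(-z)) is algebraically identical to exp(z) / (1 + exp(z)).
    return 1.0 / (1.0 + math.exp(-z))


def exceeds_cutoff(p, cutoff=CUTOFF):
    """Assumption: p >= cutoff is treated as a prediction of malignancy."""
    return p >= cutoff


def youden_j(sensitivity, specificity):
    """Youden index J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


# Youden index at the reported operating point (84.6% / 76.3%).
j_reported = youden_j(0.846, 0.763)
```

At the reported sensitivity and specificity, the Youden index is 0.846 + 0.763 - 1 = 0.609.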

Discussion
In this study, we verified risk factors associated with thyroid malignancy after comprehensively evaluating clinical, laboratory and sonographic variables in a population of 2984 patients who underwent thyroidectomy. Subsequently, we developed a mathematical model for cancer prediction, thereby providing a practical tool for clinicians to distinguish thyroid nodules preoperatively.
In agreement with previous studies, we identified younger age as one of the independent risk factors for thyroid cancer [8]. Malignant nodules were smaller than benign nodules (1.96 ± 1.16 cm vs. 2.75 ± 1.70 cm, p < 0.001). However, our multivariate logistic analysis did not confirm a predictive role of nodule size. This discrepancy suggests that smaller nodules may not carry a higher risk of malignancy; rather, patients with larger nodules are more likely to undergo surgery for benign reasons, such as compressive symptoms, whereas patients with smaller nodules without any suspicious sonographic findings often choose conservative follow-up.
Higher TSH values, even within normal ranges, have been associated with a higher prevalence of thyroid malignancy in some studies [4,5,9,10]. The results of our study are in agreement with those of previous studies, except that TSH levels higher than 5.5 mIU/L were not associated with a further increase in the prevalence of malignancy. This difference may be due to selection bias because we excluded patients who were taking thyroxine drugs; therefore, the number of patients with TSH levels > 5.5 mIU/L would have been quite small. However, in our study TSH lost its predictive value once it was included in the multivariate logistic regression analysis, probably because its weak association with malignancy was masked by the other covariates. Elevated TGAb, but not TPOAb, levels were a significant predictor of thyroid cancer, which is consistent with the findings of other reports [11-14]. In line with this, our study confirmed that lymphocytic thyroiditis was more frequent in malignant nodules (Additional file 2: Table S2). Additionally, our data confirmed that patients with thyroiditis were positive for TGAb more frequently than patients without thyroiditis (63.9% vs. 13.0%, p < 0.001).
Numerous studies have investigated the role of US findings in the diagnosis of malignant nodules [1,15-17]. These studies state that hypoechogenicity, microcalcification, irregular margins, and intranodular vascularity are important features in determining the risk of malignancy. However, Cappelli et al. showed that an ill-defined margin was a nonspecific finding that could be seen in both benign and malignant nodules [18]. Consistent with these previous findings, we confirmed that microcalcifications, hypoechogenicity and intranodular central flow were associated with increased risks of malignancy. Our study did not find an association between egg-shell calcification and malignancy. Peripheral-rim or eggshell calcification has generally been considered to be an indicator of a benign nodule. However, recently published studies of thyroid nodules with eggshell calcifications reported that a peripheral halo and disruption of eggshell calcifications may be useful predictors of malignancy [19,20]. Further studies are needed to confirm this observation.
Previously, some researchers have reported several systems for malignancy assessment [21-25]. Stojadinovic et al. established a model based on the performance of electrical impedance scanning (EIS), which was not routinely scheduled in clinics [21]. Zahir et al. presented a complicated two-step predictive model that was less accessible to clinicians [22]. Koike [25]. In contrast to previous reports, in this study we enrolled 2984 patients from multiple tertiary medical centers, which greatly strengthens the evidence for diagnostic evaluations. Additionally, our mathematical model is derived from a combination of easily accessible clinical, biochemical and sonographic predictors, which improves its feasibility and practical appeal, thereby helping clinicians with decision making and reducing unnecessary invasive procedures.
In addition, we analyzed predictive variables based on postoperative pathological inspections instead of FNA cytology examinations. Although FNA is considered to be an accurate and cost-effective method for evaluating thyroid nodules with a high diagnostic sensitivity and specificity [26], there are some limitations to diagnostic FNAs. First, FNA is recommended for nodules > 1 cm at their greatest dimension with a highly or intermediately suspicious sonographic pattern and for nodules > 1.5 cm at their greatest dimension with a minimally suspicious sonographic pattern [3]. Nodules smaller than 1 cm are difficult to distinguish via FNA cytology. Second, the performance of FNA is largely affected by the experience of radiologists, and the quality of the FNA procedure may affect the results. Reflecting these limitations, a number of previous studies have analyzed risk stratification based on FNA diagnoses [4,26,27] and have shown that it is less reliable than postoperative pathological examinations, which were used in our study.
However, there are some limitations to this study. The US feature of a nodule being taller than it is wide is considered to be a reliable indicator of thyroid malignancy. Unfortunately, these data were not available for the majority of the patients; therefore, this parameter was not included in the analysis. An algorithm including this US feature might improve the diagnostic accuracy of the predictive model in our study. Although less convincing than operative confirmation, FNA cytology is a relatively effective and robust method for identifying malignancies. Unfortunately, because of limited operator skill in performing FNAs and a lack of compliance by patients, FNAs were not routinely performed on suspicious thyroid nodules in this study. Lastly, our study is retrospective, and prospective studies in a larger patient population are required to define and verify this model of risk prediction to improve clinical management.

Conclusion
In summary, we analyzed 2984 patients who underwent thyroidectomy from multiple tertiary medical centers and established a practical model for predicting malignancies using a combination of simple and accessible clinical, biochemical and sonographic predictors. Prospective studies are required to validate this predictive model in a larger population.