Metabolite biomarkers of type 2 diabetes mellitus and pre-diabetes: a systematic review and meta-analysis

Background We aimed to explore metabolite biomarkers that could be used to identify pre-diabetes and type 2 diabetes mellitus (T2DM) by means of a systematic review and meta-analysis. Methods Four databases (the Cochrane Library, EMBASE, PubMed and Scopus) were searched. A random effect model and a fixed effect model were applied in the forest plot analyses to determine the standardized mean difference (SMD) and 95% confidence interval (95% CI) for each metabolite. The SMD for every metabolite was then converted into an odds ratio to create a metabolite biomarker profile. Results Twenty-four independent studies reported data from 14,131 healthy individuals and 3499 patients with T2DM, and 14 included studies reported data from 4844 healthy controls and 2139 patients with pre-diabetes. In the serum and plasma of patients with T2DM, compared with the healthy participants, the concentrations of valine, leucine, isoleucine, proline, tyrosine, lysine and glutamate were higher and that of glycine was lower. In prediabetic patients, the concentrations of isoleucine, alanine, proline, glutamate, palmitic acid, 2-aminoadipic acid and lysine were higher and those of glycine, serine and citrulline were lower. Comparison of the metabolite biomarker profiles of T2DM and pre-diabetes showed that the levels of alanine, glutamate and palmitic acid (C16:0) differed significantly between T2DM and pre-diabetes. Conclusions Quantified multiple metabolite biomarkers may reflect the different status of pre-diabetes and T2DM, and could provide an important reference for the clinical diagnosis and treatment of pre-diabetes and T2DM. Supplementary Information The online version contains supplementary material available at 10.1186/s12902-020-00653-x.


Background
Type 2 diabetes mellitus (T2DM) is a highly prevalent chronic disease that is associated with complications including diabetic retinopathy, kidney disease and diabetic ketoacidosis [1,2], which represent serious threats to human health. Between 1980 and 2014, the number of adults with diabetes increased from 108 million to 422 million [3], with T2DM accounting for > 90% of these cases [4]. Recent studies have shown that, with its increasing global prevalence, diabetes has become one of the three major diseases in the world [5]. However, in the early stages of the disease, the symptoms of T2DM are not obvious or are only partially manifest. Therefore, early diagnosis and effective treatment of diabetes are particularly important.
In view of the high incidence of T2DM and its serious consequences, the identification of novel diagnostic markers for T2DM has become a subject of intense research. The currently recognized diagnostic biomarkers of T2DM are blood glucose (including fasting blood glucose and 2 h glucose in the oral glucose tolerance test) and hemoglobin A1c. The metabolomic approach aims to identify all the metabolites present in a biologic system, whether cells, tissues or living organisms, and to determine their physiologic or pathologic effects [6]. The development of metabolomics makes it possible to identify metabolites as biomarkers that may be useful for the diagnosis or treatment of diabetes. For example, amino acids have been proposed as useful diagnostic biomarkers because the metabolism of amino acids is considerably altered in prediabetes and continues to vary over the course of T2DM progression [7,8]. In particular, tryptophan and branched chain amino acids (BCAAs, including valine, leucine and isoleucine) could represent potentially useful biomarkers of T2DM because their serum concentrations are higher in T2DM patients [9]. Additionally, plasma phospholipids such as phosphatidylinositol and sphingomyelin were capable of discriminating between healthy individuals and T2DM patients [10].
It is critical to obtain data on the metabolic profile abnormalities that appear before the onset of pre-diabetes or T2DM, since these might predict, and allow the prevention of, progression to pre-diabetes or T2DM. However, there is no current consensus regarding the use of metabolites as diagnostic biomarkers of T2DM, and some of the published results came from single-center clinical studies or gave insufficient consideration to confounding factors such as differences in region and population [11]. Therefore, there is a need for an effective and comprehensive method of evaluating metabolites as diagnostic biomarkers of pre-diabetes or early T2DM. The study by Guasch-Ferré et al. showed that several amino acids were consistently associated with the risk of T2DM [12]. Since then, a number of original studies have emerged. We therefore undertook a systematic review and meta-analysis of the biomarkers of T2DM or pre-diabetes proposed in published metabolomics studies and constructed a profile of the metabolite biomarkers. The purpose of this study was to explore metabolite biomarkers by integrating biomarkers from different studies through systematic review and meta-analysis, which could provide further evidence for the early diagnosis of pre-diabetes and T2DM.

Methods
The systematic review was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [13].

Study selection and inclusion criteria
The titles, abstracts and full texts of the articles were evaluated after duplicate records were removed. Before literature screening, the inclusion criteria for the publications obtained were formulated by two authors (Long and Yang) as follows: (1) the study was conducted in humans; (2) the participants did not have gestational diabetes mellitus (GDM) or type 1 diabetes mellitus (T1DM) and were not under 18 years of age; (3) the study included a diabetic group or a prediabetic group and diagnosis was performed according to the international diagnostic guidelines [14]; (4) the article was not a review, conference abstract, editorial or note; (5) the biologic samples analyzed were collected in the fasting state and (6) the study did not involve dietary interventions and/or medications. The publications initially identified as relevant were screened independently by two investigators (Long and Yang) using Endnote X7 (Thomson ResearchSoft, Stanford, USA). Any disagreement regarding the selection or inclusion of a study was resolved by discussion or by the involvement of a third author (Yan). Studies of biomarkers of human pre-diabetes and T2DM identified using metabolomic technology were included. The prediabetic category included subjects who met the above inclusion criteria and had impaired glucose tolerance (IGT) or impaired fasting glucose (IFG) [15].

Quality assessment and data extraction
The Newcastle-Ottawa Scale (NOS) criteria [16] were used to assess each publication to improve the overall reliability of the extracted data. Three domains (selection of cases and controls, comparability of cases and controls, and exposure) were subdivided into eight risk assessment items. The comparability item could be awarded a maximum of two stars and each of the other items a maximum of one star. The total score was used to grade the risk of bias as low, moderate or high, with high NOS scores reflecting a low risk of bias and low NOS scores a high risk.
Two investigators (Long and Yang) independently extracted appropriate information, including the names of the authors and journal, year of publication, study design, population, sample sizes of the case and control groups, the biologic samples obtained, analytic method, determination method, covariates of statistical analysis in the study and the identity and concentrations of the metabolites detected [reported as mean ± standard deviation (SD) or standard error (SE)] in the case and control groups. For the publications that did not provide mean values, we extracted the hazard ratio or odds ratio (OR) and its 95% confidence interval (95% CI). We also extracted the median and interquartile range values from two publications regarding pre-diabetes.

Statistical analysis
Forest plots for each metabolite for which mean ± SD/SE values were available were produced using Review Manager 5.3 software. The raw data for each metabolite were described in the forest plots, which reflected the weighted contribution of each study. The heterogeneity of the pooled means generated using the forest plots was assessed using the I² statistic. For continuous variables, random effect models [17] were used to assess the pooled means when I² > 50%; otherwise, fixed effect models were used. The outcomes were considered to be statistically significant when P < 0.05.
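The model-selection rule described above can be sketched as follows. This is a minimal illustration and not the Review Manager implementation: it assumes inverse-variance weighting and, for the random effect case, the DerSimonian-Laird estimate of the between-study variance. The function name `pool_smd` and the input values are hypothetical.

```python
import math

def pool_smd(effects, variances, i2_threshold=50.0):
    """Pool per-study SMDs, choosing the model from I^2 (assumed rule: random if I^2 > 50%)."""
    weights = [1.0 / v for v in variances]            # inverse-variance weights
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum(weights)

    # Cochran's Q and the I^2 statistic (percentage of variation due to heterogeneity)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    if i2 <= i2_threshold:
        model, estimate, se = "fixed", fixed, math.sqrt(1.0 / sum(weights))
    else:
        # DerSimonian-Laird between-study variance (an assumption, see lead-in)
        c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
        tau2 = max(0.0, (q - df) / c)
        re_weights = [1.0 / (v + tau2) for v in variances]
        estimate = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
        model, se = "random", math.sqrt(1.0 / sum(re_weights))

    return {
        "model": model,
        "smd": estimate,
        "ci95": (estimate - 1.96 * se, estimate + 1.96 * se),
        "i2": i2,
    }

# Hypothetical inputs: three studies with very different SMDs and small variances
result = pool_smd([0.0, 1.0, 2.0], [0.01, 0.01, 0.01])
print(result["model"], round(result["i2"], 1))
```

Under a fixed effect model every study is assumed to estimate the same underlying SMD, whereas the random effect model adds the between-study variance to each study's own variance, which widens the confidence interval when the studies disagree.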
To clearly illustrate the relationships between metabolites, pre-diabetes and T2DM, the data provided in the publications were reprocessed. We calculated estimated means and SDs for each metabolite for which the median and interquartile range were reported in the publications [18,19]. Because the published data were presented in different forms, as means ± SD/SE or as ORs, the outcome indicators were unified to better express the results. The mean ± SD of each metabolite provided in the included studies was used to calculate the standardized mean difference (SMD), and the SMD was then converted to an OR using formula 1 [20,21].
The mean and SD of the ORs were obtained using SPSS 20.0 (IBM, Inc., Armonk, NY, USA), and converted ORs were treated as outliers and removed when they exceeded the mean plus five times the SD [22]. The ORs were used to construct scatter diagrams with GraphPad Prism 7.0 (GraphPad Software, Inc., San Diego, USA), ensuring that there were at least three sets of data for each metabolite.
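Formula 1 is not reproduced in the text. The sketch below shows one common way of carrying out the steps described, under assumptions that may differ from the methods of [18-21]: the quartile-based estimates (mean ≈ (q1 + median + q3)/3, SD ≈ IQR/1.35) are simple large-sample approximations, the SMD is computed as Hedges' adjusted g, and the SMD is converted with the logistic approximation ln(OR) = π·SMD/√3. All function names are hypothetical.

```python
import math

def sd_from_se(se, n):
    """Convert a standard error of the mean to a standard deviation."""
    return se * math.sqrt(n)

def mean_sd_from_quartiles(q1, median, q3):
    """Approximate mean and SD from median and interquartile range (assumed large-sample rule)."""
    return (q1 + median + q3) / 3.0, (q3 - q1) / 1.35

def hedges_g(mean_case, sd_case, n_case, mean_ctrl, sd_ctrl, n_ctrl):
    """Standardized mean difference with the small-sample (Hedges) correction."""
    pooled_sd = math.sqrt(
        ((n_case - 1) * sd_case ** 2 + (n_ctrl - 1) * sd_ctrl ** 2)
        / (n_case + n_ctrl - 2)
    )
    d = (mean_case - mean_ctrl) / pooled_sd
    return d * (1.0 - 3.0 / (4.0 * (n_case + n_ctrl) - 9.0))

def smd_to_or(smd):
    """Convert an SMD to an odds ratio, assuming ln(OR) = pi * SMD / sqrt(3)."""
    return math.exp(smd * math.pi / math.sqrt(3.0))

# Hypothetical metabolite concentrations (arbitrary units) in cases and controls
g = hedges_g(6.0, 1.0, 10, 5.0, 1.0, 10)
print(round(g, 3), round(smd_to_or(g), 2))
```

With this conversion an SMD of 0 maps to an OR of 1, positive SMDs (higher concentration in cases) map to ORs above 1 and negative SMDs to ORs below 1, so studies reporting means and studies reporting ORs can be placed on one scale.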
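The outlier rule and the minimum of three datasets per metabolite can be illustrated with a short sketch. The text does not state whether the mean and SD were computed per metabolite or over all converted ORs; the sketch assumes the latter, uses the sample SD, and the function name `summarize_ors` and the example records are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev

def summarize_ors(records, k=5.0, min_sets=3):
    """Drop ORs above mean + k*SD (computed over all records), then average per metabolite.

    `records` is a list of (metabolite, odds_ratio) pairs with at least two entries.
    Metabolites left with fewer than `min_sets` values are omitted from the result.
    """
    values = [odds_ratio for _, odds_ratio in records]
    cutoff = mean(values) + k * stdev(values)

    grouped = defaultdict(list)
    for metabolite, odds_ratio in records:
        if odds_ratio <= cutoff:          # keep only non-outlying ORs
            grouped[metabolite].append(odds_ratio)

    return {
        metabolite: mean(ors)
        for metabolite, ors in grouped.items()
        if len(ors) >= min_sets
    }

# Hypothetical records: one extreme OR for valine and a metabolite with only two datasets
example = (
    [("valine", 1.0)] * 20
    + [("glycine", 1.0)] * 20
    + [("valine", 100.0)]
    + [("rare", 1.0)] * 2
)
print(summarize_ors(example))
```

A cutoff of five SDs is deliberately permissive: with a sample SD it can only be exceeded when the pool of ORs is fairly large, so it removes only extreme conversion artefacts rather than ordinary between-study variation.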

Study selection
A total of 3072 publications were identified from the databases, and 1549 relevant articles remained after the removal of duplicate studies. A further 1408 publications were excluded after evaluation of their titles and abstracts. These comprised 971 studies unrelated to the research topic; 68 on inflammation or cardiovascular diseases; 25 on polycystic ovary syndrome; 41 on non-alcoholic fatty liver disease; 156 reviews, abstracts, editorials, conference papers or notes; and 147 studies performed on animals. Thus, 141 publications remained for assessment of the full text. After excluding studies of T1DM or GDM and qualitative research, 34 studies remained for inclusion in the meta-analysis, 20 of which were of T2DM, 10 were of pre-diabetes and 4 were of both T2DM and pre-diabetes. The PRISMA flow diagram for the meta-analysis is presented in Fig. 1.
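The screening counts reported above are internally consistent, as the following arithmetic check shows. The numbers are taken from the text; the variable names are illustrative only.

```python
# Counts reported in the study selection paragraph
identified = 3072
after_duplicates = 1549
excluded_by_reason = {
    "unrelated to the research topic": 971,
    "inflammation or cardiovascular diseases": 68,
    "polycystic ovary syndrome": 25,
    "non-alcoholic fatty liver disease": 41,
    "reviews, abstracts, editorials, conference papers or notes": 156,
    "animal studies": 147,
}
included = {"T2DM only": 20, "pre-diabetes only": 10, "both": 4}

duplicates_removed = identified - after_duplicates
excluded_on_title_abstract = sum(excluded_by_reason.values())
full_text_assessed = after_duplicates - excluded_on_title_abstract

# Studies contributing to each analysis (the "both" studies count in each)
t2dm_studies = included["T2DM only"] + included["both"]
prediabetes_studies = included["pre-diabetes only"] + included["both"]

print(duplicates_removed, excluded_on_title_abstract, full_text_assessed)
print(sum(included.values()), t2dm_studies, prediabetes_studies)
```

The totals of 24 T2DM studies and 14 pre-diabetes studies obtained this way match the numbers of studies given elsewhere in the article.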

Quality assessment
The scores for the studies included in this meta-analysis, generated using the NOS criteria, are shown in Table S1 and Table S2. The maximum score, awarded on the basis of eight risk assessment items [16], was nine stars. Studies with a score of five stars or more were regarded as being of medium-to-high quality; otherwise, they were to be categorized as poor-quality and excluded. However, the lowest score was six stars. This implies that all the included studies were of medium-to-high quality, meaning that the data extracted were suitable for inclusion in the meta-analysis.

Characteristics of the included studies
The characteristics of the included studies are shown in Table 1. They comprised 24 independent studies reporting data from 14,131 healthy participants and 3499 T2DM patients [10,11,…]. All these studies compared T2DM patients with healthy participants. Four of the studies were prospective [26,33,37,41] and four were cohort studies [30,31,35,43]. There were two cross-sectional studies [22,40] and four follow-up studies [32,34,36,42], and the rest were case-control studies. The results of most of the studies were presented as mean ± SD/SE, but some were presented as ORs.

Characteristics of the metabolites studied
Metabolites including amino acids, lipids, saccharides and others were analyzed in the 24 studies of T2DM.
The number of studies in which each metabolite was analyzed was counted, and metabolites quantified in three or more studies are shown as a bubble diagram (Fig. 2a). The four categories of metabolite are shown in pink, green, blue and purple, respectively. The ordinal numbers on the bubbles represent different metabolites and the size of each bubble indicates the number of studies in which the metabolite was analyzed. Eighteen amino acids, five lipids, three saccharides and three other metabolites were assayed. Thus, the most studied metabolites were amino acids, of which the four most commonly analyzed were isoleucine, valine, glycine and leucine, in 14, 13, 12 and 12 studies, respectively. Metabolites studied in fewer than three articles were excluded, as summarized in Table S3.
For pre-diabetes, fewer metabolites were studied in the 14 included studies than for T2DM, as shown in Fig. 2b. There were 14 amino acids, 2 lipids and 3 other metabolites. The three most commonly analyzed amino acids were leucine, isoleucine and valine, which were studied on 11, 10 and 8 occasions, respectively. Metabolites studied in fewer than three articles were excluded, as summarized in Table S4.

Analysis of metabolites associated with T2DM
On the basis of the data extracted as means ± SD/SE, forest plots for each metabolite were created using Review Manager 5.3. Because the dimensions and units used in the studies differed, SMDs were used for the forest plot outputs. For the T2DM studies, because the I² values for glycine and tyrosine were 29% and 43%, respectively, with a P value in the Q test > 0.1, fixed effect models were used to calculate the combined effect sizes. By contrast, the I² values for valine, leucine, isoleucine, proline, glutamate, lysine, phenylalanine, alanine, histidine and serine were > 90% (Table S5), and random effect models were therefore used for these metabolites [17,54]. As shown in Fig. 3, the concentrations of BCAAs and aromatic amino acids (AAAs) were significantly higher in the serum and plasma of patients with T2DM than in control participants (Fig. S1). Thus, valine, leucine, isoleucine, tyrosine, glycine, proline, glutamate and lysine could be considered biomarkers of T2DM according to their forest plots, and the first five of these are likely to be the most useful, given the associated P values.
[Figure caption fragment: … isoleucine (c), phenylalanine (d) and tyrosine (e) in serum and plasma of pre-diabetes and control groups. Studies with several populations comparing patients with pre-diabetes and controls are described by the author name followed by A or B to indicate, for example, subdivision according to sex.]

Analysis of metabolites associated with pre-diabetes
For the prediabetic studies, the I² values for isoleucine, proline, citrulline, 2-aminoadipic acid and lysine were less than 50%, with the P value for the Q test > 0.1, and fixed effect models were therefore used to calculate the combined effect sizes. The I² values for glycine, alanine, glutamate, serine, palmitic acid (C16:0), leucine, valine, tyrosine, phenylalanine, propionylcarnitine (C3), carnitine (C0), asparagine, tryptophan and myristate (C14:0) were > 50% (Table S6), and random effect models were therefore used for these metabolites. As shown in Fig. 4 and Table S6, […] 24, 0.49], P < 0.00001) were higher in the serum or plasma of prediabetic patients than in control participants (Fig. S2). Furthermore, there were statistically significant differences in the concentrations of serine, citrulline, 2-aminoadipic acid and palmitic acid (C16:0) in the serum or plasma between prediabetic and healthy participants, as shown in Fig. S3 and Table […]. This implies that isoleucine, glycine, proline, glutamate, lysine, serine, citrulline, 2-aminoadipic acid and palmitic acid (C16:0) may represent biomarkers of prediabetes.

Integrative analysis of the metabolite biomarkers
Forest plots were only constructed for metabolites analyzed in at least three of the studies included in the meta-analysis, but these may not represent the most widely applicable assays. For example, IFG and IGT were assessed in prediabetic patients in only one study [44], whereas more than three datasets for each metabolite can be integrated more reliably to reflect the features of pre-diabetes. Therefore, we conducted integrative profiling to combine all the data provided in the included studies.
The ORs for each metabolite provided in the included publications were analyzed to reflect the characteristics of the disease biomarkers, excluding publications containing outliers. As shown in Fig. 5a […], α-ketoglutarate (OR = 1.08) and trigonelline (OR = 0.85) were significant. Unlike for T2DM, no saccharides were analyzed. The mean ORs for all the metabolites were used to construct a metabolic profile characteristic of pre-diabetes.
As shown in Fig. 5, the amino acids alanine, citrulline, glutamate, glycine, isoleucine, leucine, lysine, phenylalanine, proline, serine, tyrosine and valine, together with LPC (C18:2) and palmitic acid (C16:0), were statistically similar between T2DM/pre-diabetes patients and healthy controls. The clear differences between pre-diabetes and T2DM indicate that these disease stages are associated with distinct and quantifiable metabolic biomarker profiles. In particular, the metabolic biomarkers alanine, glutamate and palmitic acid (C16:0) differed significantly between pre-diabetes and T2DM, which suggests that the quantified concentrations of these three metabolites have potential for use as integrative biomarkers for the differentiation of pre-diabetes and T2DM.

Discussion
The use of a single biomarker to diagnose a disease lacks specificity because multiple disease processes are likely to affect its concentration. Additionally, the main disadvantage of simply adding other biomarkers is that their discriminative ability typically overlaps, which also limits the use of this approach [55]. Some studies involving diet were excluded, because a higher intake of metabolites might falsely raise their measured levels in metabolomic analyses [56]. The use of single biomarkers is limited by the effects of external factors, such as diet. Furthermore, risk models containing biomarkers derived from the pathways directly affected by the disease itself may not demonstrate high predictive value. We believe that integrating data on a number of biomarkers would more accurately predict the occurrence of pre-diabetes/T2DM and map the patient's current state more precisely, which might help to prevent the further development of T2DM and of diabetic macro- and microangiopathies.
The present meta-analysis, which included 34 independent studies reporting data from 14,515 healthy participants, 3499 patients with T2DM and 2139 with pre-diabetes, was performed using both the original ORs and ORs converted from SMD values. The SMD reflects the original data of each study and reduces the deviations caused by the different methods used in the included studies. Therefore, the comparability and reliability of the meta-analysis are acceptable [21]. On the basis of the included studies, there were 23 metabolites concerning T2DM and 32 metabolites concerning pre-diabetes. As shown in Fig. 5, 12 amino acids, LPC (C18:2) and palmitic acid (C16:0) were statistically similar between T2DM/pre-diabetes patients and healthy controls. The metabolite biomarker profiles of T2DM and pre-diabetes showed that the levels of alanine, glutamate and palmitic acid (C16:0) differ significantly between T2DM and pre-diabetes. These findings could reflect the different status of prediabetes and T2DM, and could provide an important reference for the clinical diagnosis and treatment of pre-diabetes and early T2DM, which might prevent the further development of T2DM and reduce the incidence of diabetes complications.
Integrated profiling considers a set of biomarkers in the context of a network, instead of considering only single or isolated biomarkers. As shown in Fig. S4, the pathogenesis of T2DM is complex, involves many signaling pathways and has not yet been fully elucidated. Hence, integration of the existing metabolite biomarkers may be useful for the prevention and diagnosis of T2DM and pre-diabetes. This method of analysis is suitable for integrating several types of data: for instance, amino acid biomarkers, which form a concentrated dataset with strong regularity, and a wide range of other metabolic biomarkers, which form dispersed, isolated and irregular data. The goal of most studies is to improve the diagnosis rate of pre-diabetes and early T2DM, which could reduce the incidence of T2DM and diabetes complications through early intervention. Integrative profiling of metabolic biomarkers should be able to provide a reliable reference for the selection of biomarkers suitable for the prediction and diagnosis of T2DM and pre-diabetes in the future. Exploring metabolite biomarker profiles for the identification and diagnosis of pre-diabetes and T2DM may be of particular clinical value in countries with a high incidence of diabetes (such as China and India) [3]. Given the abnormal amino acid and lipid profiles (low levels of metabolites such as glycine, serine and LPC (C18:2)), is it possible to increase the levels of these metabolites through external intake and thereby reduce the incidence of pre-diabetes or T2DM? It would be worthwhile to design animal experiments to test this conjecture in future. In further research, clinical multi-center cohort studies or prospective observational trials will be necessary and important.
Although we have shown that quantified metabolic biomarkers could reflect T2DM and pre-diabetes, there were some limitations to the approach used. First, some relevant studies may not have been retrieved from the databases using the search terms described. Second, there were fewer studies of some of the metabolites and there is likely to be a publication bias in favor of positive findings, which may have introduced bias into our analysis. Third, all the information regarding the samples and data collected was derived from the included studies, so the potential confounding factors present in these studies, such as the ethnicity, region, education and physical health of the participants, might have affected the study results. Although the accuracy of the meta-analysis results was affected by the original research data from the included studies, the conclusions of this study were obtained from a meta-analysis conducted in strict compliance with the inclusion criteria and the PRISMA guidelines.

Conclusions
Quantified multiple metabolite biomarkers are a useful strategy for differentiating pre-diabetes and T2DM, and we believe that this approach has potential clinical value for the diagnosis of T2DM.