The mechanisms underlying the development and progression of chronic kidney disease (CKD) are not yet fully understood. Early diagnosis and risk prediction are important clinical strategies for improving patient outcomes. Diagnostic and prognostic risk prediction models for CKD are one route to clinical translation: they support the assessment of patient prognosis and inform clinical decisions.
This study used bioinformatics analysis, WGCNA (Weighted Gene Co-expression Network Analysis), and a random forest algorithm to identify, at the transcriptional level, low-risk and high-risk molecules closely related to the onset and progression of CKD in the human dataset GSE137570. Three traditional predictive modeling methods (Cox regression, LASSO regression, and logistic regression) were then used to evaluate the diagnostic or predictive performance of three different gene sets. The key biomarkers were narrowed down to three genes: CCL2, SUCLG1, and ACADM. The Medium z-score, which is closely related to CKD progression, was successfully validated. Finally, a mouse UUO (unilateral ureteral obstruction) model confirmed the reliability of the bioinformatics analysis of the human dataset and demonstrated the feasibility of in-depth mechanistic studies in the UUO model.
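As an illustration of how a z-score-based gene-set score can be evaluated, the following minimal Python sketch standardizes each gene across samples, flips the sign of genes treated as low-risk, takes the per-sample median, and computes an AUC against case/control labels. It rests on several assumptions that the text above does not confirm: that the score is a median of per-gene z-scores, that CCL2 is a high-risk gene while SUCLG1 and ACADM are low-risk genes, and that AUC is the performance metric. The function names and the toy expression values are hypothetical and are not taken from GSE137570.

```python
from statistics import mean, median, pstdev


def zscores(values):
    """Standardize one gene's expression values across samples."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        # A gene with no variation carries no information.
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def composite_score(expression, direction):
    """Median of direction-adjusted z-scores for each sample.

    expression: dict gene -> list of values (same sample order for all genes)
    direction:  dict gene -> +1 (assumed high-risk) or -1 (assumed low-risk)
    """
    adjusted = [
        [direction[gene] * z for z in zscores(values)]
        for gene, values in expression.items()
    ]
    # zip(*adjusted) regroups the per-gene lists into per-sample tuples.
    return [median(sample) for sample in zip(*adjusted)]


def auc(scores, labels):
    """Probability that a case outscores a control (ties count as 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Toy values invented for illustration only (not real data).
expression = {
    "CCL2": [1.0, 1.2, 3.1, 3.4],
    "SUCLG1": [5.0, 4.8, 2.2, 2.0],
    "ACADM": [6.1, 5.9, 3.0, 3.3],
}
# Assumed risk directions; the source text does not state them.
direction = {"CCL2": 1, "SUCLG1": -1, "ACADM": -1}
labels = [0, 0, 1, 1]  # 0 = control, 1 = CKD (hypothetical)

scores = composite_score(expression, direction)
print(scores, auc(scores, labels))
```

Using the median rather than the mean makes the per-sample score less sensitive to a single outlying gene, which matters when the gene set is as small as three genes.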
The significance of this study lies in identifying, through transcriptomic analysis and risk model construction, a core set of nine genes and a minimal functional set of three genes, all closely related to the onset and progression of CKD. The study also explored the feasibility of investigating these biomarkers in a mouse UUO model.
Machine learning screening methods have been widely applied in the diagnosis and prediction of kidney diseases (PMID: 36786976). Although this study successfully constructed CKD prediction models based on different regression methods, their diagnostic performance still needs improvement. Improving it may require integrating information from different omics layers to build multi-omics molecular diagnostic and prognostic tools (PMID: 35008760), or combining different machine learning algorithms to reduce algorithm-specific selection bias. Some studies have constructed prognostic models of CKD progression by applying machine learning screening to laboratory data (PMID: 35967110), but the evaluation metrics of these models are relatively limited, and risk prediction models based on traditional laboratory data cannot explain the potential risks and mechanisms of CKD at the molecular level. Transcriptomics-based risk models have inherent limitations of their own: few CKD datasets include prognostic information, the number of sequenced patients is small, and different types of CKD are biologically highly heterogeneous. A study that used patient kidney transcriptomics and urinary proteomics for risk stratification (PMID: 38286178) suggests that individual omics have significant application value for precision treatment. In the future, comprehensive risk models that combine multiple data sources, such as current medical history, laboratory indicators, and multi-omics datasets, with machine learning algorithms are expected to be developed and applied in practice to clinical decisions on personalized treatment for CKD patients.
This study identified three core genes, CCL2, SUCLG1, and ACADM, and verified that human CKD and the mouse UUO model show similar transcriptomic changes, suggesting that the UUO model can serve as a platform for exploring the molecular mechanisms of CKD and for functional studies of its biomarkers. Studies have shown that high expression of CCL2 in infiltrating macrophages and neutrophils in CKD is closely associated with renal fibrosis (PMID: 33290282), and that the plasma concentration of CCL2 is significantly increased in human CKD patients (PMID: 34251390). These studies suggest that CCL2 plays an important biological role in the onset and progression of CKD, although further investigation using cell-specific knockout and other gene-edited mice is needed to elucidate the specific mechanisms. Additionally, SUCLG1 has been shown to promote mitochondrial biogenesis through POLRMT succinylation, leading to leukemia progression (PMID: 38649537), indicating that SUCLG1 has important functions in post-translational modification. However, its role in CKD remains unexplored. As for ACADM, genome-wide association studies have identified a correlation between ACADM and plasma metabolic markers in CKD patients, but its biological function in CKD requires further research.
This study provides a transcriptomics-based risk prediction model for the onset and progression of CKD, offering potential target genes for research into the molecular mechanisms of CKD and an effective tool for exploring omics-based prognostic risk prediction in CKD patients.