We selected GSE137570 from the GEO database, a dataset containing two subsets related to the occurrence and progression of CKD. Using WGCNA and random forest algorithms, we screened candidate genes and constructed three core gene sets of different sizes from the whole-genome transcriptome and differential gene expression profiles. The diagnostic performance of the gene set scores was externally validated by ROC curve analysis in GSE66494 and GSE180394, and their performance in predicting CKD progression was evaluated in GSE60861. Cox, LASSO, and logistic regression were used to build diagnostic and progression risk prediction models. Finally, the reliability of the human CKD transcriptomic analysis and the feasibility of functional studies were validated in a mouse UUO model.
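As an illustration of the validation step, the sketch below scores each sample on a gene set and computes the area under the ROC curve. It is a minimal sketch under stated assumptions, not the scoring method used in this study: the text does not specify how the gene set scores were computed, so the mean of per-gene z-scores is assumed here, and the gene names, expression values, and labels are hypothetical toy data.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize one gene's expression across samples (population SD)."""
    mu = mean(values)
    sd = pstdev(values)
    # A gene with no variance carries no information; map it to zeros.
    return [(v - mu) / sd if sd > 0 else 0.0 for v in values]


def gene_set_scores(expr, gene_set):
    """Per-sample score: mean z-score over the genes of the set.

    ASSUMPTION: mean z-score is an illustrative choice, not the
    scoring method reported by the study.
    expr maps gene name -> list of expression values, one per sample.
    """
    z_rows = [zscore(expr[g]) for g in gene_set if g in expr]
    if not z_rows:
        raise ValueError("no gene of the set is present in the expression data")
    n_samples = len(z_rows[0])
    return [mean(row[i] for row in z_rows) for i in range(n_samples)]


def roc_auc(scores, labels):
    """AUC as the probability that a case outranks a control (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("both classes are required to compute an AUC")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: 3 controls (0) followed by 3 CKD samples (1).
toy_expr = {
    "GENE_A": [1.0, 1.2, 0.9, 2.1, 2.4, 1.1],
    "GENE_B": [0.5, 0.7, 0.6, 1.5, 0.4, 1.6],
}
toy_labels = [0, 0, 0, 1, 1, 1]

toy_scores = gene_set_scores(toy_expr, ["GENE_A", "GENE_B"])
toy_auc = roc_auc(toy_scores, toy_labels)
print(f"toy AUC = {toy_auc:.3f}")
```

In an external validation setting, the gene set would be fixed on the discovery cohort and the same scoring applied unchanged to each validation cohort, so the AUC reflects transferability rather than refitting.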