Identification of metabolic core gene in colon cancer based on machine learning algorithms and its functional mechanisms

    Abstract:
    Objective To screen for metabolic core genes in colon cancer using machine learning algorithms and to analyze their functional mechanisms.
    Methods Data were obtained from The Cancer Genome Atlas (TCGA) database and the Gene Expression Omnibus (GEO) database. The TCGA cohort included 375 tumor samples and 32 adjacent normal tissue samples, and the GSE39582 cohort comprised 419 tumor samples. Univariate Cox regression analysis was combined with three machine learning algorithms, namely random forest, support vector machine recursive feature elimination (SVM-RFE), and least absolute shrinkage and selection operator (LASSO) regression, to screen for metabolic core genes. Receiver operating characteristic (ROC) curves were plotted, and the area under the curve (AUC) was used to evaluate the predictive performance of the core genes. Quantitative real-time polymerase chain reaction (qRT-PCR) and immunohistochemistry (IHC) were used to measure the expression of the core genes. The core genes were knocked down to explore their roles in colon cancer.
    Results Three core genes, CPT2, SCP2, and NR3C2, were identified by the machine learning algorithms. Comparison of the AUCs of the ROC curves showed that NR3C2 had the best predictive performance. qRT-PCR showed that NR3C2 mRNA was expressed at low levels in colon cancer cell lines, and IHC showed that NR3C2 was expressed at low levels in colon cancer tissues. Knockdown of NR3C2 significantly promoted the proliferation and migration of colon cancer cells.
    Conclusion Cross-application of three machine learning algorithms identified NR3C2 as a core metabolic inhibitory gene in colon cancer, which may provide a new strategy for metabolism-targeted therapy.
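
    The screening step described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes scikit-learn, runs on synthetic data rather than the TCGA or GSE39582 cohorts, omits the univariate Cox pre-filter, and fits LASSO against a binary tumor/normal label as a simple stand-in. The function names (make_toy_data, select_core_features, single_feature_auc) and all parameter values are hypothetical. The idea shown is only the intersection of features kept by random forest, SVM-RFE, and LASSO, followed by a per-gene AUC.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import Lasso
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_toy_data(n_samples=300, n_genes=30, n_signal=3, seed=0):
    """Synthetic expression matrix (assumption, not real cohort data).

    The first n_signal columns are shifted downward in the positive class,
    mimicking genes with lower expression in tumor samples.
    """
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n_samples)
    X = rng.normal(size=(n_samples, n_genes))
    X[:, :n_signal] -= 1.5 * y[:, None]
    return X, y


def select_core_features(X, y, n_keep=10, seed=0):
    """Return column indices kept by all three selectors."""
    Xs = StandardScaler().fit_transform(X)

    # Random forest: keep the n_keep features with highest importance.
    rf = RandomForestClassifier(n_estimators=200, random_state=seed).fit(Xs, y)
    rf_top = set(np.argsort(rf.feature_importances_)[::-1][:n_keep])

    # SVM-RFE: recursively drop the weakest feature of a linear SVM.
    rfe = RFE(SVC(kernel="linear"), n_features_to_select=n_keep).fit(Xs, y)
    svm_top = set(np.flatnonzero(rfe.support_))

    # LASSO: keep features with a non-zero coefficient
    # (alpha is an arbitrary illustrative value).
    lasso = Lasso(alpha=0.05).fit(Xs, y)
    lasso_top = set(np.flatnonzero(lasso.coef_))

    return sorted(int(i) for i in rf_top & svm_top & lasso_top)


def single_feature_auc(x, y):
    """AUC of one feature as a classifier, direction-agnostic."""
    auc = roc_auc_score(y, x)
    return max(auc, 1.0 - auc)


X, y = make_toy_data()
selected = select_core_features(X, y)
aucs = {j: single_feature_auc(X[:, j], y) for j in selected}
best = max(aucs, key=aucs.get)  # feature with the highest AUC
```

    Taking the intersection rather than the union keeps only features that every method agrees on, which is the sense in which the three algorithms are cross-applied; the candidate with the highest single-feature AUC then plays the role that NR3C2 plays in the Results.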

     
