J. M. Sánchez Santos, A. Berral-Gonzalez, S. Bueno-Fortes, M. Martin-Merino, J. De Las Rivas
We obtained a signature of differentially expressed genes in human colon cell lines lacking the gene CDKN1A and validated it in a cohort of 1273 tumor samples from colorectal cancer (CRC) patients. The cohort comprises normalized transcriptomic data together with phenotypic and survival data. We classified patients into the four Consensus Molecular Subtypes (CMS) of CRC using two independent algorithms to achieve robust results: CMScaller (which uses the Nearest Template Prediction (NTP) algorithm) and CMSclassifier (which uses Random Forest). This approach yielded 854 samples on which both classifiers agreed. We performed a risk analysis of the samples classified as CMS1 and CMS4 using machine learning. Finally, we used the validated gene signature from the cell lines lacking CDKN1A to obtain a meaningful separation of the Kaplan-Meier curves. We also classified patients according to their risk, obtaining a signature of the genes that most influence this separation.
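The consensus step described above can be illustrated with a minimal sketch. It assumes that each classifier's output has already been exported as a mapping from sample identifier to CMS label, with None for unclassified samples. The function name, sample identifiers, and toy labels are hypothetical and are not taken from the study; the actual analysis may have handled ties or missing calls differently.

```python
from collections import Counter


def consensus_labels(calls_a, calls_b):
    """Keep only the samples to which both classifiers assign the same subtype.

    calls_a, calls_b: dicts mapping sample ID -> CMS label (or None if unclassified).
    Returns a dict with the samples on which both label sets agree.
    """
    consensus = {}
    for sample_id, label_a in calls_a.items():
        label_b = calls_b.get(sample_id)
        # A sample counts as "in consensus" only if both calls exist and match.
        if label_a is not None and label_a == label_b:
            consensus[sample_id] = label_a
    return consensus


# Hypothetical toy calls standing in for the NTP-based and Random Forest outputs.
ntp_calls = {"S1": "CMS1", "S2": "CMS4", "S3": "CMS2", "S4": None, "S5": "CMS4"}
rf_calls = {"S1": "CMS1", "S2": "CMS4", "S3": "CMS3", "S4": "CMS2", "S5": "CMS4"}

agreed = consensus_labels(ntp_calls, rf_calls)

# Subset carried forward to the risk analysis (CMS1 and CMS4 only).
risk_analysis_subset = {s: c for s, c in agreed.items() if c in {"CMS1", "CMS4"}}

# Per-subtype counts among the consensus samples.
counts = Counter(agreed.values())
print(agreed)
print(counts)
```

Requiring exact agreement discards samples that either classifier leaves unassigned or labels differently, which trades cohort size for label reliability.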
Keywords: machine learning, random forest, survival analysis, classification, colorectal cancer, transcriptomic data
Scheduled
Posters II
June 7, 2022 4:50 PM
Faculty Hall