Using biological constraints to improve prediction in precision oncology
Many gene signatures have been derived by applying machine learning (ML) to omics data, but their clinical usefulness is often limited by poor interpretability and inconsistent performance. In this study, we highlight the importance of incorporating prior biological knowledge into the decision rules generated by ML methods to create more reliable classifiers. We tested this approach by applying various ML algorithms to gene expression data to predict three challenging cancer outcomes: bladder cancer progression to muscle-invasive disease, response to neoadjuvant chemotherapy in triple-negative breast cancer, and metastatic progression in prostate cancer. We developed two types of classifiers: mechanistic, which restricted training to features related to specific biological mechanisms, and agnostic, which used no prior biological knowledge. Mechanistic models performed as well as or better than their agnostic counterparts, with the added benefit of improved interpretability. Our results emphasize the value of incorporating biological constraints to develop robust gene signatures with strong translational potential.
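
The distinction between mechanistic and agnostic training can be illustrated with a minimal sketch. It is not the pipeline used in this study: the nearest-centroid classifier is a stand-in for the ML algorithms mentioned above, the expression data are synthetic, and the gene names and the `mechanism_genes` set are hypothetical. The only point is that the mechanistic model sees a biologically chosen subset of features, while the agnostic model sees all of them.

```python
import random
from statistics import mean

def fit_centroids(X, y, features):
    """Compute one centroid per class, using only the given feature indices."""
    centroids = {}
    for label in sorted(set(y)):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(row[j] for row in rows) for j in features]
    return centroids

def predict(centroids, x, features):
    """Assign x to the class with the nearest centroid (squared Euclidean)."""
    def sq_dist(label):
        return sum((x[j] - c) ** 2 for j, c in zip(features, centroids[label]))
    return min(centroids, key=sq_dist)

def accuracy(centroids, X, y, features):
    hits = sum(predict(centroids, x, features) == lab for x, lab in zip(X, y))
    return hits / len(y)

def make_toy_data(n_samples, n_genes, informative, rng):
    """Synthetic expression matrix: only the `informative` genes track the label."""
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]  # pure noise
        for j in informative:
            row[j] = 5.0 * label + rng.uniform(-1.0, 1.0)     # label signal
        X.append(row)
        y.append(label)
    return X, y

rng = random.Random(0)
gene_names = [f"GENE_{i}" for i in range(50)]   # hypothetical gene identifiers
mechanism_genes = {"GENE_3", "GENE_17"}         # hypothetical prior-knowledge set

# Agnostic: every gene is a candidate feature.
all_features = list(range(len(gene_names)))
# Mechanistic: only genes tied to the chosen biological mechanism.
mech_features = [j for j, g in enumerate(gene_names) if g in mechanism_genes]

X_train, y_train = make_toy_data(40, len(gene_names), mech_features, rng)
X_test, y_test = make_toy_data(20, len(gene_names), mech_features, rng)

agnostic = fit_centroids(X_train, y_train, all_features)
mechanistic = fit_centroids(X_train, y_train, mech_features)

agnostic_acc = accuracy(agnostic, X_test, y_test, all_features)
mechanistic_acc = accuracy(mechanistic, X_test, y_test, mech_features)
print(f"agnostic: {len(all_features)} features, accuracy {agnostic_acc:.2f}")
print(f"mechanistic: {len(mech_features)} features, accuracy {mechanistic_acc:.2f}")
```

Because the toy data place all of the signal in the mechanism genes by construction, this sketch says nothing about real-world performance; it only shows where the biological constraint enters the training step and why the resulting model is easier to interpret (two named genes rather than fifty).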
