Title: Optimizing Colon Cancer Stage Classification with Machine Learning and Deep Learning Models
Authors: Aya Farid Abdel Rahman Ezzedine and Samy S. Abu-Naser
Volume: 8
Issue: 11
Pages: 1-1
Publication Date: 2024/11/28
Abstract:
Accurate classification of colon cancer stages is crucial for effective treatment planning and patient management. This study compares the performance of machine learning and deep learning models for classifying colon cancer stages on a dataset of 1,560 samples obtained from Kaggle. The dataset contains nine features, including age, gender, location, and Dukes Stage. Applying the Abunaser technique to balance the dataset increased the sample size to 2,400, with 600 samples in each stage (I, II, III, and IV). The dataset was split into 60% training, 20% validation, and 20% testing. Thirteen machine learning models were evaluated, including Bagging Classifier, AdaBoost Classifier, Gradient Boosting Classifier, XGBoost Classifier, Logistic Regression, Decision Tree Classifier, Random Forest Classifier, SVM, KNeighbors Classifier, and Gaussian Process Classifier. In addition, a custom deep learning model was developed and trained for 70 epochs. Model performance was evaluated using accuracy, precision, recall, and F1-score. The best-performing machine learning model was the Bagging Classifier, which scored 95.20% on all four metrics. The proposed deep learning model outperformed all other models, with an accuracy of 98.35% and a precision, recall, and F1-score of 98.30% each. On this dataset, the deep learning model therefore improved accuracy by 3.15 percentage points over the best traditional machine learning model. The study provides a framework for future work in the field and suggests that deep learning has the potential to enhance cancer diagnostics and treatment.
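
The split sizes and evaluation metrics described in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' code: the helper names `stratified_split` and `macro_metrics` are hypothetical, the labels are synthetic placeholders for the balanced 2,400-sample dataset, and both the per-stage (stratified) split and the macro averaging of precision, recall, and F1-score are assumptions, since the abstract does not state how the split was drawn or how the metrics were averaged.

```python
import random
from collections import defaultdict

STAGES = ["I", "II", "III", "IV"]


def stratified_split(labels, fractions=(0.6, 0.2, 0.2), seed=0):
    """Return (train, val, test) index lists that keep class proportions.

    Assumption: the split is stratified by stage; the abstract only
    gives the 60/20/20 proportions.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


def macro_metrics(y_true, y_pred, classes=STAGES):
    """Accuracy plus macro-averaged precision, recall, and F1-score.

    Assumption: macro averaging (unweighted mean over classes).
    """
    precisions, recalls, f1s = [], [], []
    pairs = list(zip(y_true, y_pred))
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(classes)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Synthetic stand-in for the balanced dataset: 600 labels per stage.
labels = [stage for stage in STAGES for _ in range(600)]
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))  # 1440 480 480
```

Under these assumptions, each of the validation and test sets holds 120 samples per stage. With a perfectly balanced test set, macro-averaged recall equals accuracy, which may explain why the reported accuracy and recall are so close.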