Title: Leveraging Machine Learning and Deep Learning for Enhanced Prediction of Thyroid Cancer Recurrence
Authors: Khaled Waleed Lulu, Samy S. Abu-Naser
Volume: 8
Issue: 12
Pages: 1-10
Publication Date: 2024/12/28
Abstract:
Thyroid cancer is one of the most prevalent malignancies, and accurate prediction of recurrence is crucial for effective patient management and treatment planning. This study classifies thyroid cancer recurrence using a range of machine learning and deep learning techniques. We used a Kaggle dataset of 383 samples with 17 features, including demographic and clinical variables. The target variable, recurrence status, was imbalanced: 108 samples indicated recurrence (yes) and 275 indicated no recurrence (no). To address this imbalance, we applied the Abunaser technique, augmenting the dataset to 1,000 samples with equal representation of both classes. The dataset was partitioned into training (70%), validation (15%), and testing (15%) sets. We evaluated 13 machine learning models: XGBoost Classifier, Logistic Regression, Decision Tree Classifier, Random Forest Classifier, SVM, KNeighbors Classifier, Gaussian Process Classifier, BernoulliNB, GaussianNB, Bagging Classifier, AdaBoost Classifier, Gradient Boosting Classifier, and Gradient Boosting Regressor. In addition, we developed a deep learning model trained for 60 epochs. Evaluation metrics were accuracy, F1-score, recall, and precision. Among the machine learning models, the XGBoost Classifier performed best, with an accuracy of 97.25%, recall of 97.20%, precision of 97.20%, and an F1-score of 97.20%. The proposed deep learning model outperformed all of the machine learning approaches, achieving an accuracy of 98.75%, recall of 98.70%, precision of 98.70%, and an F1-score of 98.70%. These findings demonstrate the potential of machine learning and deep learning techniques to improve the prediction of thyroid cancer recurrence and to support clinical decision-making and personalized treatment strategies.
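
The 70/15/15 partition and the four evaluation metrics named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' code: the helper names `stratified_split` and `binary_metrics` are hypothetical, the labels are synthetic placeholders standing in for the balanced 1,000-sample dataset, and stratifying the split by class is an assumption, since the abstract does not say how the partition was drawn.

```python
import random


def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Return (train, val, test) index lists that keep class proportions.

    Hypothetical helper: the abstract does not state whether the split
    was stratified, so stratification here is an assumption.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Synthetic placeholder labels: 500 "recurrence" (1) and 500 "no recurrence" (0),
# mirroring only the class balance described in the abstract.
labels = [1] * 500 + [0] * 500
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))  # 700 150 150

# Toy predictions, used only to show the metric calculation.
toy_metrics = binary_metrics([1, 1, 0, 0], [1, 0, 0, 0])
```

In practice the predictions passed to `binary_metrics` would come from a trained classifier evaluated on the held-out test indices; the toy lists above only demonstrate the arithmetic.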