International Journal of Academic and Applied Research (IJAAR)

Title: Detection and classification of cervical cancer disease among women using machine learning technique Model in Western Kenya.

Authors: JF Murere, S. Wangila, J. Koech

Volume: 9

Issue: 6

Pages: 34-40

Publication Date: 2025/06/28

Abstract:
Cervical cancer is the leading cause of cancer-related deaths among Kenyan women, claiming the lives of approximately 3,200 women annually. This is primarily due to low screening uptake (16%) and late diagnosis. The aim of this study was to develop a machine learning-based model to enhance early detection of cervical cancer in Western Kenya, a region with limited healthcare resources. Demographic, reproductive, and clinical data were collected from 968 women at health facilities in Western Kenya (MTRH and Kakamega Referral Hospital) using a cross-sectional study design. The dataset was divided into a training set (70%) and a testing set (30%). The training set was used to develop five machine learning models: Logistic Regression, Random Forest, Decision Tree, Support Vector Machine (SVM), and Artificial Neural Network (ANN). The testing set was used to evaluate the models. The models were trained to classify cervical cancer cases, with class imbalance addressed by class weighting for the SVM, Decision Tree, Random Forest, and Logistic Regression models and by the synthetic minority oversampling technique (SMOTE) for the ANN. The Random Forest model performed best among the five models on accuracy (94.33%) and specificity (98.37%), making it highly effective at correctly identifying negative cases. However, its sensitivity was only 20%, indicating that it struggled to detect positive cases. The Logistic Regression model achieved the highest sensitivity (70%), making it suitable for initial screening. The ANN model showed the lowest precision (10%). The findings suggest that a two-step approach, combining Logistic Regression for screening with Random Forest for confirmation of cervical cancer cases, could improve early detection and reduce cervical cancer mortality in resource-constrained settings such as Western Kenya.
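The abstract reports accuracy, sensitivity, specificity, and precision side by side, and the gap between a high accuracy and a low sensitivity is easier to see with a small worked example. The sketch below computes the four metrics from confusion-matrix counts using their standard definitions. The function name `screening_metrics` and the counts passed to it are hypothetical and chosen only for illustration; they are not the study's confusion matrix and this is not the authors' code.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def screening_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp: true positives, fp: false positives,
    tn: true negatives, fn: false negatives.
    """
    return {
        # Share of all cases that were classified correctly.
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        # Share of actual positive cases that were detected.
        "sensitivity": _ratio(tp, tp + fn),
        # Share of actual negative cases that were correctly cleared.
        "specificity": _ratio(tn, tn + fp),
        # Share of predicted positives that were actually positive.
        "precision": _ratio(tp, tp + fp),
    }


# Hypothetical, illustrative counts for an imbalanced test set:
# 50 actual positives and 950 actual negatives (NOT data from the study).
example = screening_metrics(tp=10, fp=20, tn=930, fn=40)
for name, value in example.items():
    print(f"{name}: {value:.2%}")
```

With these illustrative counts the classifier reaches 94.00% accuracy and 97.89% specificity while detecting only 20.00% of the positive cases. When positive cases are rare, accuracy is dominated by the negative class, which is why sensitivity needs to be read alongside it when judging a screening model.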
