Title: Explainable Hybrid Machine Learning and Attention Based Deep Learning Framework for miRNA Biomarker Discovery in Pancreatic Cancer
Authors: Salahaldin Shady Aldaya, Yasmeen Ali Abu Dib and Samy S. Abu-Naser
Volume: 10
Issue: 5
Pages: 43-54
Publication Date: 2026/05/28
Abstract:
Background: Pancreatic cancer (PC) remains one of the most lethal malignancies globally, with a five-year survival rate below 12%, largely due to late-stage diagnosis and the absence of reliable early detection biomarkers. MicroRNAs (miRNAs) have emerged as highly stable, noninvasive molecular biomarkers detectable in blood and other biofluids, making them ideal candidates for early diagnostic screening. However, the high-dimensional, noisy nature of miRNA expression data presents major challenges for classical statistical and machine learning (ML) approaches.
Objectives: This study proposes a novel Explainable Hybrid Machine Learning and Attention-Based Deep Learning (XHMLAB) Framework for the systematic discovery, selection, and clinical interpretation of miRNA biomarkers in pancreatic cancer. The framework integrates classical ensemble ML methods with attention-enhanced deep learning architectures and post-hoc explainability tools.
Methods: We utilized publicly available miRNA expression datasets from the Gene Expression Omnibus (GEO) repository (GSE41372, GSE60978, GSE74877) and The Cancer Genome Atlas (TCGA-PAAD). The pipeline consists of: (1) preprocessing and normalization of miRNA expression profiles; (2) hybrid feature selection combining LASSO regularization, SVM-RFE, and Random Forest importance scoring; (3) an Attention-Based Bidirectional LSTM (Att-BiLSTM) deep learning model for classification; (4) a Gradient Boosting ensemble classifier for validation; and (5) SHAP, LIME, and attention weight visualization for model interpretability.
Results: The XHMLAB framework achieved an AUC of 0.972 (95% CI: 0.961-0.983), accuracy of 94.3%, sensitivity of 93.8%, and specificity of 94.9% in distinguishing pancreatic cancer patients from healthy controls. A panel of 12 candidate miRNA biomarkers was identified, including miR-196a-5p, miR-217, miR-196b-5p, let-7i-5p, miR-130a-3p, and miR-221-3p, with SHAP analysis confirming their biological relevance.
Conclusions: The proposed XHMLAB framework demonstrates superior performance compared to single-model approaches and provides clinically interpretable explanations that can support physician decision-making. The integration of explainability mechanisms bridges the gap between AI-driven predictions and clinical trust, offering a pathway toward translatable early-detection tools for pancreatic cancer.
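
The abstract names three feature selectors (LASSO, SVM-RFE, Random Forest importance) but does not say how their outputs are merged. The sketch below shows one plausible combination rule, assumed here purely for illustration: each selector votes for its top-k features, and a feature is kept when enough selectors agree. The function names, the `min_votes` threshold, and the toy scores are all hypothetical and are not taken from the study.

```python
from collections import Counter


def top_k(scores, k):
    """Return the k feature names with the largest absolute score."""
    ranked = sorted(scores, key=lambda name: abs(scores[name]), reverse=True)
    return set(ranked[:k])


def consensus_features(score_tables, k, min_votes=2):
    """Keep features that appear in the top-k of at least `min_votes` selectors.

    `score_tables` maps a selector name to a {feature: score} dict in which a
    larger absolute score means a more important feature (for SVM-RFE this
    could be an inverted elimination rank; that mapping is an assumption).
    """
    votes = Counter()
    for scores in score_tables.values():
        votes.update(top_k(scores, k))
    return sorted(name for name, n in votes.items() if n >= min_votes)


# Toy, made-up scores for four hypothetical features (not data from the study).
toy_scores = {
    "lasso": {"f1": 0.9, "f2": 0.0, "f3": -0.4, "f4": 0.1},
    "svm_rfe": {"f1": 0.7, "f2": 0.6, "f3": 0.1, "f4": 0.2},
    "random_forest": {"f1": 0.5, "f2": 0.1, "f3": 0.3, "f4": 0.05},
}

selected = consensus_features(toy_scores, k=2, min_votes=2)
print(selected)
```

Majority voting is only one option; a union, an intersection, or a weighted rank aggregation would fit the same description, and the choice trades panel size against stability across selectors.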
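
For readers less familiar with the reported metrics, the following minimal sketch computes sensitivity, specificity, accuracy, and a rank-based AUC for a binary cancer-versus-control classifier. The labels, scores, and the 0.5 decision threshold are toy values chosen for illustration, not results or settings from the study, and the abstract does not state how its confidence interval was derived.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for labels coded 1 (case) and 0 (control)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def auc_rank(y_true, scores):
    """AUC as the probability that a random case scores above a random control.

    Ties between a case and a control count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and predicted probabilities (illustrative only).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
sensitivity = tp / (tp + fn)          # fraction of cases correctly flagged
specificity = tn / (tn + fp)          # fraction of controls correctly cleared
accuracy = (tp + tn) / len(y_true)    # fraction of all samples classified correctly
auc = auc_rank(y_true, scores)

print(sensitivity, specificity, accuracy, auc)
```

Unlike the other three metrics, AUC does not depend on the decision threshold, which is why it is usually reported alongside them.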