HyMaC-Net: a hybrid lightweight mamba-CNN framework with patch embedding for medical image classification

Citation

Abstract

Designing medical image classifiers that generalize across heterogeneous datasets while staying computationally lean remains a challenge. Recent work explores hybrids that mix local convolutional priors with global sequence modeling (e.g., Transformers or state-space models), yet there is limited empirical guidance on which design choices actually matter under strict compute budgets. This study presents a systematic ablation across 12 datasets (MedMNIST-2D subsets plus CPN X-ray and Kvasir) that probes key design knobs: depth, tokenization granularity, normalization/regularization, pooling, and the use of Mamba state-space blocks for long-range dependency modeling. Rather than chasing single-dataset SOTA, our goal is to map the accuracy-compute frontier with transparent metrics (ACC/AUC, parameters, GMACs, and inference time) and seed-robust statistics. The study yields two practical profiles (Small and Base) of the proposed HyMaC-Net, a hybrid Mamba-CNN architecture, both of which deliver competitive performance across datasets while staying within modest compute envelopes. In particular, on the Kvasir dataset, both the Base (84.67%) and Small (83.42%) models achieved 4-5% higher precision than existing models while using fewer parameters and GMACs. On the OrganAMNIST dataset, both the Base (97.8%) and Small (95.3%) models surpassed the existing model by 2% in accuracy. Results are consistent with the premise that Mamba SSM blocks can replace attention to capture global context efficiently and that careful architectural pruning (fewer blocks, fused pooling, auxiliary guidance) preserves accuracy at substantially reduced cost. Furthermore, we include an interpretability check using Grad-CAM to verify that model predictions are based on clinically relevant features. We release per-dataset metrics (OA, AUC, precision, recall, specificity, F1, Cohen’s κ, Dice) and ablation logs to support reproducibility and fair comparison.
The resulting guidelines are intended to help physicians build deployable medical classifiers without exhaustive tuning.

Description

Cataloged from PDF version of thesis.
Includes bibliographical references (pages 61-64).
This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2025.

Publisher Link

Type

Thesis