Fusion-based multimodal deep learning to improve detection of diabetic retinopathy and macular edema: integrating retinal imaging, clinical data and systemic biomarkers

Abstract

Diabetic Retinopathy (DR), a silent threat to vision, is one of the major causes of vision impairment worldwide; it damages the retina before noticeable symptoms appear. Diabetic Macular Edema (DME) frequently develops alongside DR, and the two are overlapping ocular conditions that threaten visual acuity and can be diagnosed effectively by analyzing retinal images. However, relying on a single modality has proven inadequate for distinguishing between DME and DR. Traditional diagnostic methods rely primarily on fundus imaging, OCT (Optical Coherence Tomography), or OCTA (Optical Coherence Tomography Angiography), and any one of these modalities alone fails to provide the complete contextual understanding necessary for precise classification. This work proposes to offset that limitation by developing deep learning architectures that leverage several image modalities to improve classification performance and yield context-aware outputs. Specifically, the work proposes customized Convolutional Neural Networks (CNNs) built around advanced fusion methods such as Multi-Head Self-Attention (MSA) Fusion, Gated Fusion, and Feature-wise Linear Modulation (FiLM) Fusion, with model interpretability at each step. The proposed multimodal DR and DME classification architecture fuses two forms of image data or biomarkers so that the model can capture both structural and context-specific differences. It achieves an accuracy of 95.52% and an F1-score of 0.975, outperforming the existing benchmark. Furthermore, this accuracy is achieved with lower parameter counts of 1.75 million and 2.57 million and faster inference times of 19.289 ms and 19.843 ms for the two architectures, respectively, setting a state-of-the-art benchmark in the medical field.
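
For readers unfamiliar with the fusion methods named in the abstract, the following is a minimal sketch of Gated Fusion and FiLM Fusion in their commonly used textbook forms. It is not the thesis implementation: the abstract does not specify layer shapes, gating formulation, or which modality conditions which, so the function names, the 4-dimensional embeddings, the random weights, and the choice of fundus and OCT as the two inputs are all assumptions made for illustration.

```python
import math
import random


def linear(weights, bias, x):
    """Affine map; weights is a list of rows, x a flat feature vector."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gated_fusion(feat_a, feat_b, w_gate, b_gate):
    """A per-dimension gate g in (0, 1), computed from both inputs,
    mixes the two modality embeddings: g * a + (1 - g) * b."""
    gate = [sigmoid(z) for z in linear(w_gate, b_gate, feat_a + feat_b)]
    return [g * a + (1.0 - g) * b for g, a, b in zip(gate, feat_a, feat_b)]


def film_fusion(feat, cond, w_gamma, b_gamma, w_beta, b_beta):
    """FiLM: the conditioning vector predicts a scale (gamma) and a
    shift (beta) that modulate feat element-wise."""
    gamma = linear(w_gamma, b_gamma, cond)
    beta = linear(w_beta, b_beta, cond)
    return [g * x + b for g, x, b in zip(gamma, feat, beta)]


def random_matrix(rows, cols, rng):
    """Stand-in for learned weights (assumption: untrained, random)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
dim = 4  # hypothetical embedding size

# Hypothetical embeddings, standing in for the outputs of two CNN branches.
fundus_feat = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
oct_feat = [rng.uniform(-1.0, 1.0) for _ in range(dim)]

fused_gated = gated_fusion(
    fundus_feat, oct_feat, random_matrix(dim, 2 * dim, rng), [0.0] * dim
)
fused_film = film_fusion(
    fundus_feat, oct_feat,
    random_matrix(dim, dim, rng), [1.0] * dim,  # gamma biased toward 1 (near identity)
    random_matrix(dim, dim, rng), [0.0] * dim,  # beta biased toward 0
)
```

In this formulation the gated output is a per-dimension convex combination of the two embeddings, so each fused value lies between the corresponding inputs, whereas FiLM lets one modality rescale and shift the other. In a trained model the weight matrices would be learned jointly with the CNN branches rather than sampled at random.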

Description

Cataloged from PDF version of thesis.
Includes bibliographical references (pages 99-108).
This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2025.

Type

Thesis