What type of study is this?

This is a quantitative study.

September 28, 2025 (Open Access)

NeuroFusion-X: A Scalable Multi-Modal Deep Learning Framework With Attention-Based Fusion Across Healthcare, Finance, and Cybersecurity

Key Points

Reached approximately 97.8% mean accuracy and approximately 0.976 macro-F1 across 18 tasks on a synthetic benchmark of 500k multimodal samples.
Used CNNs for images, a compact transformer for text, and a bidirectional time-series encoder with temporal attention.
Introduced a cross-modal fusion-attention block that learns instance-wise interactions and down-weights noisy or missing channels.
Reduced median per-sample inference latency by approximately 35% compared with a strong baseline.
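
Because the benchmark includes class imbalance, macro-F1 is reported alongside accuracy. The following is a minimal sketch of how macro-F1 is conventionally computed (the unweighted mean of per-class F1 scores). The function name `macro_f1` and the toy labels are illustrative assumptions, not taken from the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for c in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Toy imbalanced example (hypothetical data): 8 negatives, 2 positives,
# and a classifier that always predicts the majority class.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
print(f"accuracy={accuracy:.3f} macro_f1={score:.3f}")
```

In this toy case accuracy is 0.8 while macro-F1 is about 0.444, which illustrates why macro-F1 is the more demanding metric when classes are imbalanced.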

Abstract

NeuroFusion-X is a unified, modular framework for end-to-end prediction from heterogeneous real-world data. Many decisions require joint reasoning over time-series, images, and text, yet production systems remain siloed, and naïve early/late fusion misses cross-modal dependencies and temporal alignment. NeuroFusion-X addresses this via: (1) modality-specialized encoders (CNNs for images, a compact transformer for text, and a bidirectional time-series encoder with temporal attention); (2) a cross-modal fusion-attention block that learns instance-wise interactions and down-weights noisy or missing channels; and (3) parameter-efficient bottlenecks and inference-oriented kernels to cut latency without sacrificing accuracy. To evaluate realism and scale while avoiding privacy constraints, we construct a controlled synthetic benchmark of 500k multimodal samples across healthcare, finance, and cybersecurity. Each sample includes a 48-step, 30-variable time-series, a 128×128 image, and a 60–160-token note, with class imbalance, inference-time modality masks, and induced distribution shifts. Across 18 tasks, NeuroFusion-X reaches approximately 97.8% mean accuracy and approximately 0.976 macro-F1, reducing median per-sample inference latency by approximately 35% versus a strong baseline. Robustness holds with ≤1.6% macro-F1 drop under 20% modality dropout and ≤2.2% under light adversarial perturbations. Ablations show fusion-attention, modality-dropout, and domain-adaptive normalization drive reliability. We outline deployment pathways for safety-critical contexts and integration with multimodal LLMs for rationale-grounded predictions.
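
The abstract does not give the internals of the fusion-attention block. As a rough illustration of the general idea (attention weights over modality embeddings, with missing modalities masked out at inference), here is a minimal sketch using scaled dot-product scoring. Everything in it is an assumption for illustration: the function name `fuse_modalities`, the `query` vector (which would be learned in a real model), and the toy two-dimensional embeddings. It is not the paper's implementation.

```python
import math


def fuse_modalities(embeddings, query, present):
    """Attention-weighted sum of modality embeddings, skipping missing ones.

    embeddings: dict of modality name -> list[float], all the same length
    query:      list[float]; a stand-in for a learned query vector (assumption)
    present:    dict of modality name -> bool, the inference-time modality mask
    """
    available = [name for name in embeddings if present.get(name, False)]
    if not available:
        raise ValueError("at least one modality must be present")

    scale = math.sqrt(len(query))
    scores = {
        name: sum(q * e for q, e in zip(query, embeddings[name])) / scale
        for name in available
    }

    # Numerically stable softmax over the available modalities only, so a
    # masked modality receives no weight at all.
    top = max(scores.values())
    exps = {name: math.exp(s - top) for name, s in scores.items()}
    total = sum(exps.values())
    weights = {name: v / total for name, v in exps.items()}

    dim = len(query)
    fused = [
        sum(weights[name] * embeddings[name][i] for name in available)
        for i in range(dim)
    ]
    return fused, weights


# Hypothetical encoder outputs; the text modality is missing for this sample.
embeddings = {
    "timeseries": [1.0, 0.0],
    "image": [0.0, 1.0],
    "text": [5.0, 5.0],
}
present = {"timeseries": True, "image": True, "text": False}
fused, weights = fuse_modalities(embeddings, [1.0, 1.0], present)
print(fused, weights)
```

Restricting the softmax to available modalities means the remaining weights still sum to one, so the fused vector keeps a comparable scale when a channel is dropped. A trained block would additionally learn to assign low scores to noisy channels, which this sketch does not model.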

Cite This Study

Ashutosh Agarwal studied this question.

synapsesocial.com/papers/68d90bc941e1c178a14f724b
https://doi.org/10.12732/ijam.v38i4s.301
