What question did this study set out to answer?

This research aims to improve valve opening prediction during fluororubber production using a novel multi-modal approach.

February 14, 2026 | Open Access

MECFN: A Multi-Modal Temporal Fusion Network for Valve Opening Prediction in Fluororubber Material Level Control

Key Points

Developed the MECFN model for data-driven valve opening prediction.
Utilized multimodal data from visual image sequences and height sensor signals.
Implemented a Multi-Feature Extraction module to enhance image representations.
Employed two Transformer encoders to capture temporal dependencies within data modalities.
Introduced a context-aware cross-attention mechanism for interaction and adaptive fusion between the image and sensor modalities.
MECFN achieved a mean absolute error of 2.36 and a root mean squared error of 3.73.
An R² of 0.92 indicates that the model explains about 92% of the variance in valve opening.
Consistently outperformed traditional machine learning and single-modality deep learning models.
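
For readers unfamiliar with the three reported metrics, the sketch below shows how mean absolute error, root mean squared error, and R² are conventionally computed for a regression task. The function name `regression_metrics` and the toy valve-opening values are illustrative assumptions, not data or code from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]

    # Mean absolute error: average magnitude of the prediction error.
    mae = sum(abs(e) for e in errors) / n

    # Root mean squared error: penalizes large errors more strongly than MAE.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # R^2: one minus the ratio of residual to total sum of squares.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all targets are identical")
    r2 = 1.0 - ss_res / ss_tot

    return mae, rmse, r2


# Toy values (hypothetical, for illustration only).
y_true = [40.0, 50.0, 60.0, 70.0]
y_pred = [42.0, 49.0, 63.0, 68.0]
mae, rmse, r2 = regression_metrics(y_true, y_pred)
```

Because RMSE weights large errors more heavily, it is never smaller than MAE; a gap between the two, such as the reported 3.73 versus 2.36, is consistent with some predictions having larger errors than the typical one.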

Abstract

During fluororubber production, strong material agitation and agglomeration induce severe dynamic fluctuations, irregular surface morphology, and pronounced variations in apparent material level. Under such operating conditions, conventional single-modality monitoring approaches, such as point-based height sensors or manual visual inspection, often fail to reliably capture the true process state. This information deficiency leads to inaccurate valve opening adjustment and degrades material level control performance. To address this issue, valve opening prediction is formulated as a data-driven, control-oriented regression task for material level regulation, and an end-to-end multimodal temporal regression framework, termed MECFN (Multi-Modal Enhanced Cross-Fusion Network), is proposed. The model performs deep fusion of visual image sequences and height sensor signals. A customized Multi-Feature Extraction (MFE) module is designed to enhance visual feature representation under complex surface conditions, while two independent Transformer encoders are employed to capture long-range temporal dependencies within each modality. Furthermore, a context-aware cross-attention mechanism is introduced to enable effective interaction and adaptive fusion between heterogeneous modalities. Experimental validation on a real-world industrial fluororubber production dataset demonstrates that MECFN consistently outperforms traditional machine learning approaches and single-modality deep learning models in valve opening prediction. Quantitative results show that MECFN achieves a mean absolute error of 2.36, a root mean squared error of 3.73, and an R² of 0.92. These results indicate that the proposed framework provides a robust and practical data-driven solution for supporting valve control and achieving stable material level regulation in industrial production environments.
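
To make the cross-modal fusion step more concrete, the following is a minimal sketch of generic scaled dot-product cross-attention, in which tokens from one modality query tokens from the other. It is not the authors' implementation: the abstract does not specify the layer's details, so the absence of learned projections, the single attention head, the choice of sensor tokens as queries, and all names (`softmax`, `cross_attention`, `sensor_tokens`, `image_tokens`) are assumptions made purely for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys.

    queries: list of vectors from one modality (assumed: sensor tokens)
    keys, values: lists of vectors from the other modality (assumed: image tokens)
    Returns one fused vector per query.
    """
    dim = len(keys[0])
    fused = []
    for q in queries:
        # Similarity between this query and every key, scaled by sqrt(dim).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        fused.append(
            [
                sum(w * v[j] for w, v in zip(weights, values))
                for j in range(len(values[0]))
            ]
        )
    return fused


# Hypothetical toy tokens: 2 sensor time steps, 3 image time steps, dimension 2.
sensor_tokens = [[1.0, 0.0], [0.0, 1.0]]
image_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = cross_attention(sensor_tokens, image_tokens, image_tokens)
```

Each output vector is a convex combination of the image tokens, weighted toward those most similar to the querying sensor token. A practical layer would typically add learned query, key, and value projections and multiple heads.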
