Ensuring navigational safety is one of the most critical challenges in autonomous maritime navigation research, requiring accurate real-time assessment of collision risks and prompt navigational decisions based on those assessments. Traditional rule-based systems that use radar and Automatic Identification Systems (AIS) are fundamentally limited in their ability to analyze discrete objects such as vessels and buoys together with continuous environmental boundaries such as coastlines and bridges. To address these limitations, recent research has incorporated artificial intelligence approaches, but most of these studies have focused primarily on object detection. This study proposes a structured tag-based multimodal navigation safety framework that performs inference on maritime scenes by integrating YOLO-based object detection with the LLaVA vision–language model, generating outputs that include risk level assessment, navigation action recommendations, reasoning explanations, and object information. The proposed method achieved 86.1% accuracy in risk level assessment and 76.3% accuracy in navigation action recommendations. Through a hierarchical early stopping system using delimiter-based tags, the system reduced output token generation by 95.36% for essential inference results and 43.98% for detailed inference results compared to natural language outputs.
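The hierarchical early stopping idea can be illustrated with a minimal sketch. The abstract does not give the actual delimiters or output schema, so the tag names below (`risk`, `action`, `reason`, `objects`), the two stop tags, and the helper functions are all hypothetical; the streamed chunks stand in for model output and no real model is called.

```python
from __future__ import annotations

import re
from typing import Iterable

# Hypothetical delimiters: the paper's real tag names are not stated in the abstract.
ESSENTIAL_END = "</action>"   # stop here when only risk level and action are needed
DETAILED_END = "</objects>"   # stop here when reasoning and object info are also needed


def generate_until(chunks: Iterable[str], stop_tag: str) -> str:
    """Consume streamed text chunks and stop as soon as stop_tag has appeared."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        end = buffer.find(stop_tag)
        if end != -1:
            # Cut everything after the closing tag; later chunks are never consumed.
            return buffer[: end + len(stop_tag)]
    return buffer  # stop tag never appeared; return whatever was produced


def parse_tags(text: str, names: tuple[str, ...]) -> dict[str, str]:
    """Extract the content of each <name>...</name> pair that is present in text."""
    fields: dict[str, str] = {}
    for name in names:
        match = re.search(rf"<{name}>(.*?)</{name}>", text, re.DOTALL)
        if match:
            fields[name] = match.group(1).strip()
    return fields


# Simulated model output, split into chunks the way a token stream might arrive.
stream = [
    "<risk>high</risk>",
    "<action>turn ",
    "starboard</action>",
    "<reason>vessel crossing ahead</reason>",
    "<objects>vessel, buoy</objects>",
    " trailing text",
]

essential = parse_tags(generate_until(iter(stream), ESSENTIAL_END), ("risk", "action"))
detailed = parse_tags(
    generate_until(iter(stream), DETAILED_END),
    ("risk", "action", "reason", "objects"),
)
```

Placing the essential fields first is what makes the early stop useful in this sketch: generation can end at the first stop tag, and the detailed level simply lets it run to a later one.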
Dong-Hyun Kim
Ju-Yeon Yoo
Small and Medium Business Administration
Journal of Marine Science and Engineering
Korea Institute of Materials Science
Kim et al. studied this question.
synapsesocial.com/papers/69401d682d562116f28f9083 — DOI: https://doi.org/10.3390/jmse13122339