What question did this study set out to answer?

To develop effective analysis methods for identifying outcomes in Coast Guard search and rescue reports.

April 24, 2026

Evaluating analysis methods for coast guard reports freeform text: a case study on resource-constrained natural language processing with search and rescue reports

Key Points

Aimed to develop effective analysis methods for identifying outcomes in Coast Guard search and rescue reports.
Explored classical statistical methods for training a classification model.
Developed a text cleaning pipeline to handle messy report data.
Developed the Iterative Token Elimination Algorithm (iTEA) to increase vocabulary differences between classes.
Achieved 0.762 recall and precision, and 0.959 accuracy with the XGBoost model.
Demonstrated the importance of text cleaning and feature augmentation in classification performance.
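
The gap between the reported accuracy (0.959) and the reported recall and precision (0.762) follows from the class imbalance. The sketch below is a back-of-the-envelope check, not part of the study: it reads 0.762 as the value of both recall and precision, treats the rounded figures as exact, and derives the share of positive cases those three numbers would imply. The function names are illustrative.

```python
def implied_positive_rate(precision: float, recall: float, accuracy: float) -> float:
    """Share of test cases that must be positive for the three metrics to agree."""
    # Per unit of test data, with positive rate pi:
    #   false negatives = (1 - recall) * pi
    #   false positives = recall * pi * (1 - precision) / precision
    # and accuracy = 1 - false negatives - false positives.
    errors_per_positive = (1 - recall) + recall * (1 - precision) / precision
    return (1 - accuracy) / errors_per_positive


def always_negative_accuracy(positive_rate: float) -> float:
    """Accuracy of a trivial model that never predicts the positive class."""
    return 1 - positive_rate


# Figures as reported; they are rounded, so the results are approximate.
rate = implied_positive_rate(precision=0.762, recall=0.762, accuracy=0.959)
baseline = always_negative_accuracy(rate)
print(f"implied positive rate: {rate:.3f}")
print(f"always-negative baseline accuracy: {baseline:.3f}")
```

Under these assumptions the positive class makes up roughly 9% of the test cases, so a model that never flags a report would already score about 0.91 accuracy. This is why recall and precision are the more informative figures here.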

Abstract

After a U.S. Coast Guard (USCG) search and rescue (SAR) case, USCG personnel create an after-action report containing a textual narrative of the situation and the Coast Guard response efforts. Data analysts explored how to identify reports involving cases with a verified person in the water. Because restricted access to compute resources and limiting policy ruled out large language models (LLMs), statistical (‘classical’, non-neural) methods were considered for training a classification model to identify SAR case outcomes from report texts. The dataset was severely imbalanced toward the negative class, and the texts were extremely messy, with many typos and abbreviations. Therefore, an extensive text cleaning pipeline was developed and tested for improving classification performance. The Iterative Token Elimination Algorithm (iTEA) was developed to increase differences in vocabulary between classes. Model improvement was further explored through augmentation of the feature space using non-text data. The best model was an XGBoost model, achieving 0.762 recall and precision (and 0.959 accuracy). Errors from the test set are analyzed to guide future improvements until LLMs can be used; LLMs are expected to improve performance and reduce text cleaning requirements.
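
The abstract does not list the steps of the text cleaning pipeline, so the following is only a minimal sketch of what cleaning typo-laden, abbreviation-heavy narratives can look like with standard-library Python. The abbreviation map, the typo table, the example sentence, and the function name `clean_report` are all hypothetical and are not taken from the study.

```python
import re

# Hypothetical abbreviation expansions, for illustration only.
ABBREVIATIONS = {
    "piw": "person in water",
    "pob": "persons on board",
    "vsl": "vessel",
}

# Hypothetical typo corrections, for illustration only.
TYPO_FIXES = {
    "resuced": "rescued",
    "vessle": "vessel",
}


def clean_report(text: str) -> str:
    """Lowercase, strip punctuation, fix known typos, expand known abbreviations."""
    text = text.lower()
    # Replace every character that is not a letter, digit, or whitespace with a space.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    tokens = text.split()
    # Correct typos first so that a corrected token can still be expanded.
    tokens = [TYPO_FIXES.get(token, token) for token in tokens]
    tokens = [ABBREVIATIONS.get(token, token) for token in tokens]
    return " ".join(tokens)


example = "VSL capsized; 2 POB. One PIW resuced by CG crew."
print(clean_report(example))
# prints: vessel capsized 2 persons on board one person in water rescued by cg crew
```

Normalizing tokens this way shrinks the vocabulary a bag-of-words style classifier has to learn, which matters most when the positive class is rare and each variant spelling appears only a handful of times.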

Authors

Zachary Kudlak

United States Coast Guard Academy

Justin Sherman

United States Coast Guard Academy

Journals

The Journal of Defense Modeling and Simulation: Applications, Methodology, Technology
