April 1, 2026 | Open Access

STMK24 NTCIR18 U4 Table QA Submission

Key Points

The objective is to enhance Table QA performance by exploring multiple modalities for table representation.
Transformed tables into image, text, and layout modalities.
Trained a model to infer cell IDs for better understanding of table structures.
Used rule-based conversion for automatic extraction of cell values.
Investigated the performance impact of each modality.
Achieved high accuracy in inferring cell IDs when using all modalities.
Demonstrated improved Table QA performance through multi-modal approaches.

Abstract

This paper reports the methods, results, and analysis of STMK24 for the NTCIR-18 U4 Table QA (TQA) task. STMK24 approaches TQA as a Visual Document Understanding task and transforms each table into three modalities: image, text, and layout of the content. To capture table structure in a simple way, our model is trained to infer the cell IDs of the tables, and the cell values are then extracted automatically through rule-based conversion. We investigated the impact of each modality on Table QA performance and confirmed that the model achieves high cell ID inference accuracy when it uses all modalities.
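The split between cell ID inference and rule-based value extraction can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: the `r{row}c{col}` cell ID format, the helper names `build_cell_index` and `extract_values`, and the toy table are all hypothetical. The model is assumed to emit one or more cell IDs as text, and a deterministic lookup then turns those IDs into cell values.

```python
import re
from typing import Dict, List

# Hypothetical cell ID scheme "r{row}c{col}" (1-indexed).
# The actual ID format used by STMK24 is not stated in this abstract.
CELL_ID_PATTERN = re.compile(r"r(\d+)c(\d+)")


def build_cell_index(table: List[List[str]]) -> Dict[str, str]:
    """Map every cell ID to the text value stored in that cell."""
    return {
        f"r{r}c{c}": value
        for r, row in enumerate(table, start=1)
        for c, value in enumerate(row, start=1)
    }


def extract_values(model_output: str, cell_index: Dict[str, str]) -> List[str]:
    """Rule-based conversion: find cell IDs in the model output and
    look up their values, skipping IDs that are not in the table."""
    values = []
    for match in CELL_ID_PATTERN.finditer(model_output):
        cell_id = match.group(0)
        if cell_id in cell_index:
            values.append(cell_index[cell_id])
    return values


# Toy table with made-up numbers, for illustration only.
table = [
    ["Item", "FY2022", "FY2023"],
    ["Net sales", "1,200", "1,350"],
    ["Operating income", "110", "140"],
]
index = build_cell_index(table)

# Pretend the model answered a question by predicting a cell ID.
answer = extract_values("r2c3", index)
```

Under these assumptions, the model only has to point at a cell, and the value itself is copied from the table rather than generated, so the extracted text cannot differ from the table content.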
