Long documents are crucial to knowledge transfer and typically employ a structured organization to facilitate comprehension. While LLMs can process long texts, it is unclear to what extent they recognize and utilize this structural information. Previous research has shown that document structure can improve downstream task performance in pre-trained language models, but its effect on LLMs remains underexplored. In this thesis, we systematically investigate the ability of LLMs to understand document structure and the impact of explicit structural information on downstream task performance. To this end, we evaluated Llama 3.1 8B Instruct, Llama 3.1 70B Instruct, and GPT-4o mini by presenting documents in different input formats (plain text, HTML, Markdown, LaTeX) and analyzing model performance on structure understanding and downstream tasks. Our experimental results showed that LLMs can develop a structural intuition without explicit structural information; however, structured inputs significantly improve model accuracy on structure understanding tasks. The impact of incorporating explicit structure in documents differed across downstream tasks: it provided a clear advantage in evidence selection, while its benefits were more limited in question answering and summarization.
Berfin Demir studied this question.