Quantifying linguistic differences across disciplines and audiences is crucial to expanding the accessibility of scholarly material. In this study, we investigated the linguistic features of lay and technical research paper summaries from the biological, physical, and social sciences using word-level co-occurrence graph networks and 14 conventional indices of syntactic complexity. Our findings showed that while technical summaries exhibited similar characteristics across disciplines, lay summaries varied notably: physical science summaries favored compact, nominal-heavy structures, whereas social science summaries relied more on clausal elaboration and subordination. To assess graph-theoretic textual features, we used Pointwise Mutual Information (PMI) to construct Word Adjacency Graphs for both technical and lay summaries and computed six normalized graph-theoretic indices, including lexical diversity (connected nodes per word), semantic connectivity (average nodal degree), cohesion (density), modularity (average clustering), and conceptual integration (size of the largest connected component divided by the number of nodes). As with the conventional measures of syntactic complexity, technical summaries were generally homogeneous across disciplines, whereas lay summaries differed significantly in their graph-theoretic metrics. These findings show that scientific domains differ in their linguistic structures and that these differences are audience-dependent. Our approach offers a quantitative framework for evaluating semantic complexity in science writing and has implications for both automated readability assessment and cross-disciplinary science education.
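The graph-based measures named above can be illustrated with a minimal sketch. It is not the authors' implementation: the co-occurrence window (immediately adjacent words), the PMI cutoff of 0, the choice of all distinct word types as the node set, and the function names `build_pmi_graph` and `graph_indices` are assumptions made for illustration only.

```python
import math
from collections import Counter


def build_pmi_graph(tokens, threshold=0.0):
    """Link adjacent word types whose PMI exceeds `threshold` (assumed cutoff)."""
    unigrams = Counter(tokens)
    # Unordered pairs of immediately adjacent, distinct words (assumed window).
    pairs = Counter(
        tuple(sorted((a, b))) for a, b in zip(tokens, tokens[1:]) if a != b
    )
    n_tokens = len(tokens)
    n_pairs = sum(pairs.values())
    graph = {word: set() for word in unigrams}
    for (a, b), count in pairs.items():
        p_ab = count / n_pairs
        p_a = unigrams[a] / n_tokens
        p_b = unigrams[b] / n_tokens
        if math.log2(p_ab / (p_a * p_b)) > threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def largest_component_size(graph):
    """Size of the largest connected component, found by depth-first search."""
    seen, best = set(), 0
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        best = max(best, size)
    return best


def local_clustering(graph, node):
    """Fraction of a node's neighbor pairs that are themselves linked."""
    neighbors = list(graph[node])
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if neighbors[j] in graph[neighbors[i]]
    )
    return 2.0 * links / (k * (k - 1))


def graph_indices(graph, n_tokens):
    """Compute the five indices listed in the text for one summary."""
    n = len(graph)
    if n == 0 or n_tokens == 0:
        return {}
    degrees = [len(neighbors) for neighbors in graph.values()]
    n_edges = sum(degrees) // 2
    return {
        "lexical_diversity": sum(1 for d in degrees if d > 0) / n_tokens,
        "semantic_connectivity": sum(degrees) / n,
        "cohesion": 2 * n_edges / (n * (n - 1)) if n > 1 else 0.0,
        "modularity": sum(local_clustering(graph, v) for v in graph) / n,
        "conceptual_integration": largest_component_size(graph) / n,
    }
```

Under these assumptions, calling `graph_indices(build_pmi_graph(tokens), len(tokens))` on a tokenized summary returns one value per index, which could then be compared between lay and technical summaries within each discipline.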
Ranjan et al. (Sun,) studied this question.