This work presents a systematic analysis of attention memory patterns in transformer-based large language models. It examines head specialization, attention sink phenomena, information density gradients across layers, and key-value redundancy patterns that inform cache compression strategies.
Oleh Ivchenko
Odessa National Polytechnic University
Oleh Ivchenko studied this question.
synapsesocial.com/papers/69be38126e48c4981c6783ab — DOI: https://doi.org/10.5281/zenodo.19116551