Research article: Disaggregated Prefill and Decode Architectures
Oleh Ivchenko (Sun,) studied this question.