DiLLeMa: An extensible and scalable framework for distributed large language models (LLMs) inference on multi-GPU clusters
March 3, 2026
Open Access
Robby Ulung Pambudi
Sepuluh Nopember Institute of Technology
Ary Mazharuddin Shiddiqi
Sepuluh Nopember Institute of Technology
Royyana Muslim Ijtihadie
Sepuluh Nopember Institute of Technology
Key Points
The framework supports distributed computing for large language models, enhancing inference speed and efficiency.
Performance tests indicate a notable improvement in processing time for large language models across multiple GPUs.
The analysis focuses on the framework's scalability and extensibility, which keep it usable across varied computing environments.
The study highlights the significance of improved inference mechanisms, which offer considerable advantages for artificial intelligence applications.
Cite This Study
Pambudi et al. studied this question.
synapsesocial.com/papers/69a75d67c6e9836116a276c8
https://doi.org/10.1016/j.softx.2026.102537