What question did this study set out to answer?

The aim is to improve identification of IoT devices in challenging network environments to enhance privacy and safety.

June 3, 2026

What's on My Network? Using Large Language Models to Identify Real-World IoT Devices at Scale

Key Points

Constructed high-fidelity vendor labels for the IoT Inspector dataset using large language models.
Instruction-tuned a quantized LLaMA 3.1 8B model on the dataset using curriculum learning.
Evaluated model robustness against missing data and adversarial manipulation.
Achieved 98.69% top-1 accuracy and 90.73% macro accuracy across 2,015 vendors.
Demonstrated robustness in the presence of protocol drift and adversarial attacks.
Validated on an independent IoT testbed dataset, confirming model effectiveness.
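
The gap between the two reported figures (98.69% top-1 versus 90.73% macro accuracy) is easier to see with a small worked example. The sketch below assumes that macro accuracy means the unweighted mean of per-vendor accuracy, which is the usual reading but is not spelled out on this page; the function names and the toy vendor labels are hypothetical.

```python
from collections import defaultdict


def top1_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted vendor matches the true vendor."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_accuracy(y_true, y_pred):
    """Unweighted mean of per-vendor accuracy (assumed definition).

    Each vendor counts equally, so rare vendors weigh as much as common ones.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    totals = defaultdict(int)
    hits = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return sum(hits[v] / totals[v] for v in totals) / len(totals)


# Toy data: one common vendor and one rare vendor (hypothetical names).
y_true = ["acme"] * 8 + ["globex"] * 2
y_pred = ["acme"] * 8 + ["acme", "globex"]

print(top1_accuracy(y_true, y_pred))   # 9 of 10 samples correct
print(macro_accuracy(y_true, y_pred))  # mean of 8/8 and 1/2
```

In this toy case a single error on the rare vendor barely moves top-1 accuracy but pulls macro accuracy down sharply, which is why a long-tail vendor distribution typically produces a lower macro figure.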

Abstract

The growth of IoT devices in shared environments has outpaced our ability to identify them, posing urgent risks to privacy, safety, and accountability. This challenge is especially pronounced in open-world environments, where network traffic metadata is often sparse, noisy, or adversarial. To address this problem, we introduce a semantic inference pipeline that reframes device identification as a language modeling task over real-world network metadata. As this approach depends on reliable supervision, we first construct high-fidelity vendor labels for the IoT Inspector dataset—the largest real-world corpus of its kind—using an ensemble of large language models guided by mutual-information and entropy-based stability scores. We then instruction-tune a quantized LLaMA 3.1 8B model on this dataset using curriculum learning to support generalization under sparsity and long-tail vendor distributions. Our model achieves 98.69% top-1 and 90.73% macro accuracy across 2,015 vendors, while remaining robust to missing fields, protocol drift, and adversarial manipulation. We also evaluate the model on an independent IoT testbed dataset, assess explanation quality, and conduct adversarial tests to probe robustness under spoofed and obfuscated input. These results position instruction-tuned LLMs as a scalable, interpretable foundation for trustworthy device identification at scale.
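
The abstract mentions entropy-based stability scores for the ensemble labeling step without giving a formula. As a rough illustration only, the sketch below scores how strongly several labelers agree on one device by taking one minus the normalized Shannon entropy of their votes. This formula, the function names, and the example vendor strings are assumptions, not the authors' method, and the mutual-information component is not covered.

```python
import math
from collections import Counter


def stability_score(votes):
    """Return 1 - normalized Shannon entropy of the votes.

    1.0 means every labeler agreed; 0.0 means every labeler disagreed.
    This is an illustrative definition, not the one used in the study.
    """
    if not votes:
        raise ValueError("votes must be non-empty")
    counts = Counter(votes)
    if len(counts) == 1:
        return 1.0  # unanimous (also covers the single-vote case)
    n = len(votes)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    max_entropy = math.log2(n)  # reached when all n votes differ
    return 1.0 - entropy / max_entropy


def majority_label(votes):
    """Most frequent vote; ties resolve to the label seen first."""
    return Counter(votes).most_common(1)[0][0]


# Hypothetical votes from four labelers for a single device.
votes = ["acme", "acme", "acme", "globex"]
print(majority_label(votes), round(stability_score(votes), 3))
```

A labeling pipeline along these lines could keep the majority label only when the score clears a chosen threshold and route low-scoring devices to further review; the threshold itself would have to be tuned on held-out data.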


Cite This Study

Mahmood et al. studied this question.

synapsesocial.com/papers/6a1fc58bdee9eb8c0dce6ffd https://doi.org/10.1145/3808674
