March 1, 2026 | Open Access

CleanNav: Deep learning and reinforcement strategies for smart robot exploration

Key Points

The aim is to develop an autonomous cleaning robot that adapts its navigation and interacts intelligently with objects.
Integrated PPO-based reinforcement learning for adaptive navigation.
Used a robotic arm for object manipulation during cleaning.
Employed ultrasonic sensors for obstacle avoidance and navigation.
Utilized YOLOv8 and MobileNetV3 for vision-based object detection with weighted voting.
Implemented a lost & found feature for tracing items based on logged detections.
Achieved 96.4% mAP@50 in object detection using YOLOv8.
Obtained 99.2% classification accuracy with MobileNetV3.
The robotic arm successfully picked up lightweight objects in 90% of attempts.
Weighted voting approach corrected critical misclassifications during operation.
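
As an illustration of the lost & found idea listed above, the sketch below logs each detection with a location and a timestamp and reports the most recent sighting of a requested item. The `Detection` record, the `DetectionLog` class, and the use of room names as locations are hypothetical; the source does not describe how detections are actually stored or queried.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Detection:
    """One logged sighting (hypothetical schema, not taken from the paper)."""
    label: str        # detected object class, e.g. "keys"
    location: str     # coarse location, assumed here to be a room name
    timestamp: float  # seconds since the start of the cleaning run


class DetectionLog:
    """Keeps only the most recent sighting of each object label."""

    def __init__(self) -> None:
        self._last_seen: Dict[str, Detection] = {}

    def record(self, detection: Detection) -> None:
        # Replace the stored sighting only if this one is at least as recent.
        previous = self._last_seen.get(detection.label)
        if previous is None or detection.timestamp >= previous.timestamp:
            self._last_seen[detection.label] = detection

    def last_seen(self, label: str) -> Optional[Detection]:
        return self._last_seen.get(label)

    def hint(self, label: str) -> str:
        # Turn the stored sighting into a short contextual hint for the user.
        sighting = self.last_seen(label)
        if sighting is None:
            return f"No logged sighting of '{label}'."
        return f"'{label}' was last seen in the {sighting.location}."


log = DetectionLog()
log.record(Detection("keys", "kitchen", 12.0))
log.record(Detection("keys", "living room", 48.5))
log.record(Detection("sock", "bedroom", 30.0))
print(log.hint("keys"))
```

Keeping only the latest sighting per label keeps the lookup trivial; a fuller log could retain the whole history if users need to retrace an item's path.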

Abstract

The growing trend toward autonomous home cleaning has highlighted several challenges: navigating spaces efficiently, avoiding obstacles, and interacting intelligently with objects. Conventional vacuum cleaners follow fixed cleaning paths and therefore adapt poorly to changing environments. CleanNav integrates PPO-based reinforcement learning for adaptive navigation with a robotic arm for direct interaction with the environment, moving beyond vision-only or preplanned-path approaches to enable end-to-end autonomy in dynamic homes. This paper proposes a novel hardware-based intelligent cleaning system that combines object interaction, through a robotic arm mounted on the vacuum cleaner base, with reinforcement learning for navigation and deep learning for object detection. The system navigates and explores its surroundings autonomously, avoiding obstacles using ultrasonic sensor data fed to a reinforcement learning model trained with the Proximal Policy Optimization (PPO) algorithm. The vision-based object detection module, enhanced by a weighted voting approach that fuses the YOLOv8 and MobileNetV3 models, ensures accurate real-time object detection. The robotic arm handles object manipulation so that cleaning can continue uninterrupted. Complementing this, a lost & found feature helps users trace misplaced items from logged detections by providing contextual hints about where each item was last seen. By integrating robotic manipulation with intelligent exploration and navigation, the system improves cleaning efficiency, making it a dependable and comprehensive home cleaning solution.
Experimental results show that the system achieves 96.4% mAP@50 in object detection using YOLOv8 and 99.2% classification accuracy with MobileNetV3. The robotic arm successfully picks up lightweight objects in 90% of attempts, and the weighted voting approach corrects critical misclassifications during real-time operation. Overall, the implementation shows effective coordination between PPO-based navigation decisions and robotic arm actions, reflecting how reinforcement learning and mechanical control can operate together within a unified cleaning framework. With reinforcement learning providing safe obstacle avoidance and exploration, the integration of vision, navigation, and manipulation demonstrates robust system performance.
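
The abstract does not spell out the weighted voting rule, so the following is only a minimal sketch of one common form: each model votes for its predicted label, the vote is scaled by the model's confidence and a per-model weight, and the label with the highest total wins. The weights, the example labels, and the function name `weighted_vote` are illustrative assumptions, not values or names from the paper.

```python
from collections import defaultdict

# Hypothetical per-model weights; the paper does not report the values it uses.
MODEL_WEIGHTS = {"yolov8": 0.6, "mobilenetv3": 0.4}


def weighted_vote(predictions, weights=MODEL_WEIGHTS):
    """Fuse per-model (label, confidence) predictions into a single label.

    `predictions` maps a model name to a (label, confidence) pair.
    Returns the winning label and its share of the total weighted score.
    """
    scores = defaultdict(float)
    for model, (label, confidence) in predictions.items():
        # Each vote counts in proportion to model weight times confidence.
        scores[label] += weights[model] * confidence

    total = sum(scores.values())
    if total == 0:
        # No usable votes (empty input or all-zero confidences).
        return None, 0.0

    best = max(scores, key=scores.get)
    return best, scores[best] / total


# The detector is unsure, the classifier is confident about a different label.
label, score = weighted_vote({
    "yolov8": ("sock", 0.55),
    "mobilenetv3": ("cable", 0.95),
})
print(label, round(score, 3))
```

In this example the classifier's confident vote outweighs the detector's less certain one despite the detector's larger weight, which is the kind of correction the abstract attributes to weighted voting.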
