Multi-object navigation (MON) tasks require an agent to locate multiple targets sequentially in an unknown environment, which demands global, long-term planning under incomplete information. The agent must dynamically balance immediate actions against long-term rewards, combining local adaptability with global foresight. However, current methods focus too heavily on local path optimization, which slows convergence in sparse-reward settings and increases the risk of deadlocks and trap states. The core challenge of MON lies in the deformation of the shared decision space: optimizing each target independently produces redundant and overlapping paths. Path planning therefore requires dynamic, cross-task optimization rather than simple aggregation of subtasks, and to minimize overall effort the optimization process should adaptively balance each task's contribution through weight adjustment. To this end, we propose the Goal-oriented Dynamic Weight Optimization (GDWO) algorithm. GDWO integrates target-specific value loss functions into a unified optimization framework and dynamically adjusts their weights through gradient-based updates. To prevent over-optimization, the weights are normalized during training according to navigation success rates, prioritizing more challenging targets. This adaptive mechanism addresses the challenge of sparse rewards and improves convergence efficiency. By leveraging it, GDWO brings multiple objectives into a single decision space, optimizing efficiently while balancing short-term gains with long-term goals. We additionally introduce two auxiliary modules, prior knowledge-based navigation and frontier-aware exploration, to further enhance GDWO's performance. Experimental results on the Gibson and Matterport3D datasets demonstrate that GDWO improves key metrics for MON tasks: it optimizes path planning, reduces exploration costs, and enhances navigation efficiency, enabling the agent to perform tasks more effectively in complex environments.
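The weighting mechanism described above can be illustrated with a minimal sketch. The abstract does not give the exact update or normalization rule, so everything here is an assumption made for illustration: the softmax over failure rates, the projected gradient step, and all function and parameter names (`normalize_weights_by_success`, `gradient_step_on_weights`, `temperature`, `lr`) are hypothetical and are not the authors' implementation.

```python
import math


def normalize_weights_by_success(success_rates, temperature=1.0):
    """Map per-target success rates to weights that sum to 1.

    Assumption: a softmax over (1 - success_rate), so targets the agent
    reaches less often (harder targets) receive larger weights.
    """
    difficulties = [(1.0 - s) / temperature for s in success_rates]
    peak = max(difficulties)  # subtract the max for numerical stability
    exps = [math.exp(d - peak) for d in difficulties]
    total = sum(exps)
    return [e / total for e in exps]


def gradient_step_on_weights(weights, losses, lr=0.1):
    """One gradient-based weight update followed by renormalization.

    Assumption: the combined objective is sum_i w_i * L_i, whose gradient
    with respect to w_i is L_i. Stepping along it raises the weight of
    targets whose value loss is currently high.
    """
    updated = [max(w + lr * loss, 1e-8) for w, loss in zip(weights, losses)]
    total = sum(updated)
    return [u / total for u in updated]


def combined_loss(weights, losses):
    """Weighted sum of target-specific value losses."""
    return sum(w * loss for w, loss in zip(weights, losses))


# Illustrative values only: three targets with decreasing success rates.
success_rates = [0.9, 0.5, 0.2]
value_losses = [0.1, 0.4, 0.8]

weights = normalize_weights_by_success(success_rates)
stepped = gradient_step_on_weights(weights, value_losses)
total_loss = combined_loss(stepped, value_losses)
```

In this sketch the normalization keeps any single target from dominating the objective, while the gradient step shifts weight toward whichever target currently has the largest value loss. A real training loop would recompute the success rates from recent episodes and backpropagate `total_loss` through the value network.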
Zeng et al. studied this question.