Maximum entropy adversarial inverse reinforcement learning (ME-AIRL) has attracted wide attention for its ability to learn rewards and optimize policies from expert demonstrations. In complex multi-task environments, however, applying meta-learning with ME-AIRL to acquire rewards requires a large volume of homogeneous expert demonstrations for every task, which is often impractical in real-world scenarios. Interference between tasks further increases computational complexity. To address these challenges, this paper proposes TMMF-MTAIRL, a distributed multi-task meta ME-AIRL framework based on theory of mind and mean-field theory. Theory of mind is used to capture the relationships and representational information among tasks. Mean-field theory reduces the interactions among many tasks to interactions between the main task and the average of the remaining tasks. In addition, latent variables are introduced to improve adaptation to novel tasks. We evaluate TMMF-MTAIRL on point-maze benchmarks and a real-world rolling bearing fault diagnosis dataset, using classification accuracy, mean reward, and cumulative reward as metrics. TMMF-MTAIRL achieves the best performance across all tasks, with an average improvement of 0.16 in fault classification accuracy over the strongest baseline.
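The mean-field step described above, in which each task interacts with the average of the remaining tasks rather than with every other task individually, can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `mean_of_remaining` and the representation of each task as a plain feature vector are assumptions made purely for illustration.

```python
from typing import List


def mean_of_remaining(task_vectors: List[List[float]]) -> List[List[float]]:
    """For each task i, return the element-wise mean of all tasks j != i.

    Hypothetical helper: each task is assumed to be summarised by a
    fixed-length feature vector, which the abstract does not specify.
    """
    num_tasks = len(task_vectors)
    if num_tasks < 2:
        raise ValueError("at least two tasks are needed to form an average")
    dim = len(task_vectors[0])
    # Sum over all tasks once, then subtract the task's own vector,
    # so the cost is linear in the number of tasks.
    totals = [sum(vec[d] for vec in task_vectors) for d in range(dim)]
    return [
        [(totals[d] - vec[d]) / (num_tasks - 1) for d in range(dim)]
        for vec in task_vectors
    ]


# Three hypothetical tasks with two-dimensional representations.
tasks = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
mean_field = mean_of_remaining(tasks)
for main_task, neighbour_average in zip(tasks, mean_field):
    print(main_task, neighbour_average)
```

With T tasks, this replaces T(T-1) ordered pairwise interactions with T interactions of the form (main task, average of the rest), which is the source of the computational saving the abstract refers to.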
Song et al. (Sun,) studied this question.