What question did this study set out to answer?

This study aims to improve 3D human motion editing by providing fine-grained control through learned temporal soft masks.

June 12, 2026

TM‐Edit: Text Guided Diffusion‐Transformer Based Motion Editing With Temporal Soft Masks Learning

Key Points

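Proposes TM-Edit, a text-guided, Diffusion-Transformer-based framework for motion editing.
Introduces learned temporal soft masks that encode frame-wise editing intensity from the source motion and the text instruction.
Uses the mask to modulate source motion features within a conditional diffusion process via an uncertainty-aware gating mechanism.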
Achieves state-of-the-art performance on the MotionFix benchmark dataset.

Abstract

Text‐driven 3D human motion editing aims to modify an existing motion sequence according to natural language instructions, a crucial task for character animation, virtual agents, and motion authoring. Recent diffusion‐based methods have shown remarkable success in text‐to‐motion generation. Editing an existing motion, however, requires precise spatiotemporal control to localize modifications while preserving context, and current diffusion‐based motion editing methods lack explicit fine‐grained control over when and how strongly to edit. To address this, we propose TM‐Edit, a text‐guided, Diffusion‐Transformer‐based motion editing framework that introduces learned temporal soft masks to provide explicit frame‐wise editing guidance. The proposed model predicts an editing intensity mask that encodes high‐level intent from both the source motion and the text instruction. This mask is then used to modulate source motion features within a conditional diffusion process via an uncertainty‐aware gating mechanism, ensuring robust training and inference. Additionally, a feature semantic alignment loss, computed with a pre‐trained motion retrieval model, enhances cross‐modal consistency. Extensive experiments on the MotionFix benchmark dataset demonstrate that our approach achieves state‐of‐the‐art performance. Code will be made publicly available.
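
To make the idea of frame‐wise mask modulation concrete, the following is a minimal sketch in plain Python. It is not the TM‐Edit implementation: the abstract does not specify the gating formula, so the rule `keep = 1 - confidence * mask`, the per‐frame `confidence` input, and the function names `soft_mask` and `gate_source_features` are all assumptions made for illustration. The sketch only shows how a soft mask in (0, 1) could attenuate source features more strongly on frames marked for editing, and how a confidence term could fall back to leaving the source untouched when the mask is not trusted.

```python
import math


def soft_mask(logits):
    """Map per-frame logits to editing intensities in (0, 1) with a sigmoid."""
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


def gate_source_features(source, mask, confidence):
    """Attenuate source motion features frame by frame.

    Hypothetical gating rule (an assumption, not taken from the paper):
        keep_t = 1 - confidence_t * mask_t
    A frame is suppressed only when its mask value is high AND trusted;
    with confidence 0 the source features pass through unchanged.
    """
    if not (len(source) == len(mask) == len(confidence)):
        raise ValueError("source, mask and confidence must have one entry per frame")
    gated = []
    for frame, m, c in zip(source, mask, confidence):
        keep = 1.0 - c * m
        gated.append([keep * x for x in frame])
    return gated


# Three frames with two feature dimensions each (toy values).
source = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]
mask = soft_mask([-20.0, 0.0, 20.0])   # roughly: keep, half-edit, edit
confidence = [1.0, 1.0, 0.0]           # the last frame's mask is not trusted
gated = gate_source_features(source, mask, confidence)
```

In this toy run the first frame is left almost unchanged because its mask is near zero, the second frame is scaled by one half, and the third frame is passed through unchanged despite a mask near one because its confidence is zero. In a real model the mask and confidence would be predicted by the network rather than set by hand.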
