Sitcom-Crafter: A Plot-Driven Human Motion Generation System in 3D Scenes

1Beihang University    2The Chinese University of Hong Kong, Shenzhen    3University of Technology Sydney    4UiT The Arctic University of Norway    5Sun Yat-sen University
Teaser image

Sitcom-Crafter supports several types of human motion generation within a 3D scene, indicated by differently colored tori: Human Locomotion, Human-Scene Interaction, and Human-Human Interaction. Motion generation is guided by a long plot provided by the user.

Abstract

Recent advancements in human motion synthesis have focused on specific types of motion, such as human-scene interaction, locomotion, or human-human interaction. However, there is no unified system capable of generating a diverse combination of motion types. In response, we introduce Sitcom-Crafter, a comprehensive and extendable system for human motion generation in 3D scenes, which can be guided by extensive plot contexts to enhance workflow efficiency for anime and game designers. The system comprises eight modules: three are dedicated to motion generation, while the remaining five are augmentation modules that ensure consistent fusion of motion sequences and system functionality. Central to the generation modules is our novel 3D scene-aware human-human interaction module, which addresses collision issues by synthesizing implicit 3D Signed Distance Function (SDF) points around motion spaces, thereby minimizing human-scene collisions without additional data collection costs. Complementing this, our locomotion and human-scene interaction modules leverage existing methods to enrich the system's motion generation capabilities. The augmentation modules cover plot comprehension for command generation, motion synchronization for seamless integration of different motion types, hand pose retrieval to enhance motion realism, motion collision revision to prevent collisions between characters, and 3D retargeting to ensure visual fidelity. Experimental evaluations validate the system's ability to generate high-quality, diverse, and physically realistic motions, underscoring its potential for advancing creative workflows.
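To make the role of a Signed Distance Function in collision avoidance concrete, here is a minimal sketch. It is not the paper's method: the abstract describes synthesizing implicit SDF points around motion spaces, whereas this example assumes a single axis-aligned box as a stand-in obstacle and uses its analytic SDF. The names `box_sdf` and `collision_penalty`, and the joint coordinates, are hypothetical and chosen for illustration only.

```python
import math

def box_sdf(point, center, half_extents):
    """Signed distance from a 3D point to an axis-aligned box.

    Negative inside the box, zero on its surface, positive outside.
    The box is an assumed stand-in for scene geometry.
    """
    q = [abs(p - c) - h for p, c, h in zip(point, center, half_extents)]
    outside = math.sqrt(sum(max(v, 0.0) ** 2 for v in q))
    inside = min(max(q), 0.0)
    return outside + inside

def collision_penalty(joints, center, half_extents):
    """Sum of penetration depths over all joints that lie inside the obstacle."""
    return sum(max(-box_sdf(j, center, half_extents), 0.0) for j in joints)

# Hypothetical obstacle: a 2 m cube centered at the origin.
obstacle_center = (0.0, 0.0, 0.0)
obstacle_half = (1.0, 1.0, 1.0)

# Hypothetical joint positions for two poses.
free_joints = [(2.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
colliding_joints = [(0.5, 0.0, 0.0), (2.0, 0.0, 0.0)]

free_penalty = collision_penalty(free_joints, obstacle_center, obstacle_half)
hit_penalty = collision_penalty(colliding_joints, obstacle_center, obstacle_half)
```

A penalty of this kind is zero when every joint stays outside the obstacle and grows with penetration depth, which is the property that lets an SDF-based term discourage human-scene collisions during generation.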

System Overview

framework image

The Sitcom-Crafter system consists of eight modules in total: three for motion generation and five for function enhancement. The arrows between modules indicate the workflow direction. The system supports generation guided by 3D scene structure and long plot context. The plot comprehension module translates the guiding context into recognizable commands and distributes them to the generation modules. The three generation modules synthesize different motion types: human-scene interaction, human locomotion, and human-human interaction. The motion synchronization module ensures motion consistency across the outputs of the different generation modules. The hand pose retrieval module augments the motion results with hand motion. The collision revision module corrects frames where characters collide with each other. Finally, the motion retargeting module converts the plain parametric model into detailed 3D digital human assets.
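The workflow described above can be sketched as a simple dispatch pipeline. This is only an illustration of the module ordering, not the released implementation: every name here (`Command`, `Clip`, `comprehend_plot`, `GENERATORS`, `AUGMENTATIONS`, `run_pipeline`) is hypothetical, the plot parser is a placeholder that expects one `type: description` command per line, and the generators and augmentation stages are stubs that only record what would run.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Command:
    motion_type: str  # "locomotion", "human_scene", or "human_human"
    text: str

@dataclass
class Clip:
    motion_type: str
    text: str
    stages: List[str] = field(default_factory=list)  # augmentation stages applied

def comprehend_plot(plot: str) -> List[Command]:
    """Placeholder plot comprehension: parse 'type: description' lines."""
    commands = []
    for line in plot.strip().splitlines():
        motion_type, _, text = line.partition(":")
        commands.append(Command(motion_type.strip(), text.strip()))
    return commands

def make_stub_generator(motion_type: str) -> Callable[[Command], Clip]:
    """Stub generator that wraps a command in an empty clip."""
    return lambda cmd: Clip(motion_type, cmd.text)

# One stub per generation module named in the overview.
GENERATORS: Dict[str, Callable[[Command], Clip]] = {
    name: make_stub_generator(name)
    for name in ("locomotion", "human_scene", "human_human")
}

# Augmentation stages in the order the overview lists them.
AUGMENTATIONS = [
    "synchronization",
    "hand_pose_retrieval",
    "collision_revision",
    "retargeting",
]

def run_pipeline(plot: str) -> List[Clip]:
    """Dispatch each command to its generator, then apply every stage in order."""
    clips = [GENERATORS[cmd.motion_type](cmd) for cmd in comprehend_plot(plot)]
    for stage in AUGMENTATIONS:
        for clip in clips:
            clip.stages.append(stage)
    return clips

example_plot = """
locomotion: A walks to the sofa
human_scene: A sits on the sofa
human_human: A and B shake hands
"""
clips = run_pipeline(example_plot)
```

Keeping the generators in a dictionary keyed by motion type mirrors the extendable design the overview describes: under this sketch, supporting a new motion type would mean registering one more entry rather than changing the pipeline itself.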

Motion Comparisons

Long Plot Generations

Ablations

BibTeX

@article{chen2024sitcomcrafterplotdrivenhumanmotion,
  title={Sitcom-Crafter: A Plot-Driven Human Motion Generation System in 3D Scenes},
  author={Chen, Jianqi and Hu, Panwen and Chang, Xiaojun and Shi, Zhenwei and Kampffmeyer, Michael and Liang, Xiaodan},
  journal={arXiv preprint arXiv:2410.10790},
  year={2024}
}