TSMC: Time-varying 4D Scene Mesh Compression

Chen, Guodong; Váša, Libor; Mazumdar, Amrita; Dasari, Mallesham

Fast-forward Video

Thanks to Jason Balayev for helping make this video.

Abstract

Time-varying scene meshes are widely used for volumetric video, enabling immersive six degrees of freedom (6DoF) interaction. However, their large data size compared to 2D video poses significant challenges for efficient delivery and real-time streaming, and existing mesh compression methods are not well-suited for dynamic, full-scene content.

To the best of our knowledge, TSMC is the first method to exploit temporal redundancy for inter-frame coding of large, unbounded scene meshes. Unlike prior approaches limited to static or object-level meshes, TSMC supports full-scene compression with complex spatial and temporal variations, and can also benefit hybrid 3DGS frameworks that combine meshes with Gaussian splats.

TSMC first identifies dynamic regions using a SAM3-based segmentation approach. For these regions, it constructs a volume-tracked reference mesh to handle self-contact and computes displacement fields by tracking vertex motion across frames. Static backgrounds are encoded once per group of frames, while dynamic regions are represented by a reference mesh plus displacement fields that are compressed using the Karhunen-Loève Transform (KLT) and Laplacian coordinates, achieving substantial size reduction with high visual fidelity.
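To make the displacement-field step more concrete, here is a minimal sketch of KLT-style compression in Python, assuming the displacements are stored as a dense array of per-frame vertex offsets from the reference mesh. The function names, the array layout, and the use of an SVD to obtain the basis are illustrative assumptions, not the TSMC implementation; the Laplacian-coordinate step, quantization, and entropy coding are omitted.

```python
import numpy as np

def klt_compress(displacements, num_basis):
    """Project per-frame displacements onto the leading KLT (PCA) basis vectors.

    displacements: (F, V, 3) array of vertex offsets from a reference mesh
                   (hypothetical layout chosen for this sketch).
    num_basis:     number of basis vectors to keep (the rate/quality knob).
    """
    num_frames = displacements.shape[0]
    X = displacements.reshape(num_frames, -1)   # one row per frame
    mean = X.mean(axis=0)
    Xc = X - mean                               # center across frames
    # Rows of Vt are orthonormal directions ordered by explained variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    basis = Vt[:num_basis]                      # (num_basis, 3V)
    coeffs = Xc @ basis.T                       # (F, num_basis)
    return mean, basis, coeffs

def klt_decompress(mean, basis, coeffs):
    """Rebuild approximate (F, V, 3) displacements from the kept coefficients."""
    X = coeffs @ basis + mean
    return X.reshape(coeffs.shape[0], -1, 3)
```

Keeping fewer basis vectors lowers the amount of data to store (one mean vector, `num_basis` basis vectors, and `num_basis` coefficients per frame) at the cost of a larger reconstruction error.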

Self-contact Problem

Methods that rely solely on geometric information can introduce significant distortions when self-contact regions are present in the reference mesh, such as when arms rest on a desk or when drinking from a bottle. In these scenarios, the areas of self-contact may remain unnaturally stuck together after deformation.

Winding Number Issue

Directly applying volume tracking to the scene mesh fails, as the reference centers cannot correctly represent the region enclosed by the mesh surface due to limitations in the in-out test based on winding number, which is unreliable for complex or open geometries often found in large scenes.
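The sensitivity of the winding-number test to open geometry can be illustrated with a small Python sketch. It assumes a triangle mesh given as NumPy arrays and sums the signed solid angle of each triangle (the Van Oosterom and Strackee formula); the function name and the tetrahedron example are illustrative and not taken from the TSMC code.

```python
import numpy as np

def winding_number(point, vertices, faces):
    """Generalized winding number of a triangle mesh around a query point.

    vertices: (V, 3) float array, faces: (T, 3) int array with outward orientation.
    Returns about 1 inside a closed mesh, about 0 outside, and a fractional
    value when the surface is open.
    """
    a = vertices[faces[:, 0]] - point
    b = vertices[faces[:, 1]] - point
    c = vertices[faces[:, 2]] - point
    la = np.linalg.norm(a, axis=1)
    lb = np.linalg.norm(b, axis=1)
    lc = np.linalg.norm(c, axis=1)
    # Signed solid angle of each triangle as seen from the query point.
    numerator = np.einsum("ij,ij->i", a, np.cross(b, c))
    denominator = (la * lb * lc
                   + np.einsum("ij,ij->i", a, b) * lc
                   + np.einsum("ij,ij->i", b, c) * la
                   + np.einsum("ij,ij->i", c, a) * lb)
    solid_angles = 2.0 * np.arctan2(numerator, denominator)
    return solid_angles.sum() / (4.0 * np.pi)

# Closed tetrahedron with outward-facing triangles.
tet_vertices = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                         [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
tet_faces = np.array([[0, 2, 1], [0, 1, 3], [0, 3, 2], [1, 2, 3]])
# Same surface with one face removed, i.e. an open mesh.
open_faces = tet_faces[:3]
```

For a point inside the closed tetrahedron the value is close to 1 and a threshold at 0.5 classifies it reliably. With a face removed, the same point yields a value strictly between 0 and 1, so the in-out decision depends on how much of the surface is missing, which is the situation in large scenes with open geometry.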

Evaluation

Objective Visual Comparisons

Overall compression efficiency of TSMC compared to several baselines across multiple scenes and operating bitrates, demonstrating consistently higher visual quality at equal or lower data rates. Colors correspond to normalized vertex normals mapped to [0,1], making geometric distortions and self-contact artifacts clearly visible on non-textured meshes. Compared to TVMC*, Draco, KLT, and NeCGS, TSMC preserves fine motion and surface details with fewer artifacts, especially in self-contact regions, while achieving comparable or lower bitrates.
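The normal-based coloring used in these comparisons can be sketched as follows. This assumes the common convention of remapping each component of a unit normal from [-1, 1] to [0, 1]; the exact mapping used for the figures is not stated here, and the function name is hypothetical.

```python
import numpy as np

def normals_to_rgb(normals, eps=1e-12):
    """Map vertex normals to RGB colors in [0, 1].

    normals: (V, 3) array, not necessarily unit length.
    Zero-length normals map to mid-gray (0.5, 0.5, 0.5).
    """
    normals = np.asarray(normals, dtype=float)
    lengths = np.linalg.norm(normals, axis=-1, keepdims=True)
    unit = normals / np.maximum(lengths, eps)   # normalize, guarding against zero length
    return 0.5 * (unit + 1.0)                   # [-1, 1] -> [0, 1] per component
```

Because the color depends only on surface orientation, small geometric distortions show up as visible color shifts even on meshes without textures.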

Rate-distortion (RD) Performance

Rate-distortion (RD) performance evaluation on Answering, Drinking, Arena, and Sitting scene sequences. We vary the number of basis vectors used for TSMC, $qp$ level during encoding for TVMC* and Draco, and TSDF resolution and block size for KLT and NeCGS to reach a similar range of bitrates. Note that below 0.7 SSIM, the quality is not perceivable.

BibTeX

@inproceedings{chen2026tsmc,
  title={TSMC: Time-varying 4D Scene Mesh Compression},
  author={Chen, Guodong and Váša, Libor and Mazumdar, Amrita and Dasari, Mallesham},
  booktitle={Proceedings of the Special Interest Group on Computer Graphics and Interactive Techniques Conference Conference Papers},
  pages={1--12},
  year={2026}
}