ANIM: Accurate Neural Implicit Model for Human Reconstruction from a Single RGB-D Image

Marco Pesavento; Yuanlu Xu; Nikolaos Sarafianos; Robert Maier; Ziyan Wang; Chun-Han Yao; Marco Volino; Edmond Boyer; Adrian Hilton; Tony Tung

doi:10.1109/CVPR52733.2024.00521

Conference paper

Open access

2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024), pp.5448-5458

Institute of Electrical and Electronics Engineers (IEEE)

06/2025

DOI: https://doi.org/10.1109/CVPR52733.2024.00521

Keywords: 3D Digital Avatars; 3D Human reconstruction; Accuracy; Neural Implicit Model; Optical distortion; Protocols; Shape; Solid modeling; Surface reconstruction; Three-dimensional displays; Computer Vision

Abstract

Recent progress in human shape learning shows that neural implicit models are effective in generating 3D human surfaces from a limited number of views, and even from a single RGB image. However, existing monocular approaches still struggle to recover fine geometric details such as face, hands or cloth wrinkles. They are also prone to depth ambiguities that result in distorted geometries along the camera optical axis. In this paper, we explore the benefits of incorporating depth observations in the reconstruction process by introducing ANIM, a novel method that reconstructs arbitrary 3D human shapes from single-view RGB-D images with an unprecedented level of accuracy. Our model learns geometric details from both multi-resolution pixel-aligned and voxel-aligned features to leverage depth information and enable spatial relationships, mitigating depth ambiguities. We further enhance the quality of the reconstructed shape by introducing a depth-supervision strategy, which improves the accuracy of the signed distance field estimation of points that lie on the reconstructed surface. Experiments demonstrate that ANIM outperforms state-of-the-art works that use RGB, surface normals, point cloud or RGB-D data as input. In addition, we introduce ANIM-Real, a new multi-modal dataset comprising high-quality scans paired with consumer-grade RGB-D camera captures, and our protocol to fine-tune ANIM, enabling high-quality reconstruction from real-world human capture. https://marcopesavento.github.io/Anim/

Files and links (2)

pdf

Pesavento_ANIM_Accurate_Neural_Implicit_Model_for_Human_Reconstruction_from_a_CVPR_2024_paper (2.12 MB)

Published (Version of record) Open Access CC BY V4.0

url

https://cvpr.thecvf.com/Conferences/2024

Event Website Conference website

Details

Title: ANIM: Accurate Neural Implicit Model for Human Reconstruction from a Single RGB-D Image
Creators: Marco Pesavento - University of Surrey, School of Computer Science & Electronic Engineering
Yuanlu Xu - Meta Reality Labs
Nikolaos Sarafianos - Meta Reality Labs
Robert Maier - Meta Reality Labs
Ziyan Wang - Meta Reality Labs
Chun-Han Yao - UC Merced
Marco Volino - University of Surrey, School of Computer Science & Electronic Engineering
Edmond Boyer - Meta Reality Labs
Adrian Hilton - University of Surrey, School of Computer Science & Electronic Engineering
Tony Tung - Meta Reality Labs
Publication Details: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024), pp.5448-5458
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Number of pages: 11
First online publication date: 16/09/2024
Publication Date: 06/2025
Grants: BBC Prosperity Partnership: Future Personalised Object-Based Media Experiences Delivered at Scale Anywhere, EP/V038087/1, Engineering and Physical Sciences Research Council (United Kingdom, Swindon) - EPSRC
Grant note: This work was supported by Meta, UKRI EPSRC and BBC Prosperity Partnership AI4ME: Future Personalised Object-Based Media Experiences Delivered at Scale Anywhere EP/V038087.
Identifiers: 991123795302346; WOS:001322555905080
Academic Unit: School of Computer Science & Electronic Engineering
Language: English
Resource Type: Conference paper
