Abstract
To address the shortage of real-world multi-view multiple-people datasets, we introduce Multi-View 3D Humans (MV3DHumans), a large-scale synthetic image dataset for multi-view multiple-people detection, labelling and segmentation. The dataset contains 1200 scenes, each with 4, 6, 8 or 10 people and each captured by 16 cameras with overlapping fields of view. RGB images are provided at a resolution of 640 × 480. The dataset also provides ground-truth annotations (bounding boxes, instance masks and multi-view correspondences) and camera calibrations.