Abstract
This work contributes to recent research on crowd management in high-density settings, specifically large gatherings. Video analysis and visual monitoring have become essential for enhancing the safety and security of pilgrimages worldwide. Multi-person pose estimation is crucial for many computer vision applications and has progressed significantly in recent years. Nonetheless, only a few methods have tackled pose estimation in congested scenes, which remains difficult and is unavoidable in many scenarios. Moreover, current approaches do not provide adequate evaluation criteria for such situations. This study introduces a novel and effective method for pose estimation in large crowds, accompanied by a new dataset for improved algorithm assessment. Our methodology combines several computer vision techniques, supported by a Mask R-CNN model, to precisely separate and evaluate multi-person poses, facilitating the automatic detection of behavioural patterns in large crowds. With a ResNet101 backbone, the proposed method achieves 70.0 mAP on our HAJJ-Crowd video dataset. The dataset can be used for assessment and testing, as it includes instance segmentation and prediction outcomes for several common methods.