Pedestrian crossing intention prediction based on computer vision plays a pivotal role in enhancing the safety of autonomous driving and advanced driver assistance systems. In this paper, we present a novel multi-modal pedestrian crossing intention prediction framework leveraging the transformer model. By integrating diverse sources of information and leveraging the transformer’s sequential modeling and parallelization capabilities, our system accurately predicts pedestrian crossing intentions. We introduce a novel representation of traffic environment data and incorporate lifted 3D human pose and head orientation data to enhance the model’s understanding of pedestrian behavior. Experimental results demonstrate the state-of-the-art accuracy of our proposed system on benchmark datasets.
Companion
APSIPA Transactions on Signal and Information Processing Special Issue - Invited Papers from APSIPA ASC 2023
See the other articles that are part of this special issue.