Leveraging Driver Attention for an End-to-End Explainable Decision-Making From Frontal Images
- Araluce, Javier; Bergasa, Luis M.; Ocaña, Manuel; Llamazares, Ángel; López-Guillén, Elena
- Year: 2024
- Type of Publication: Article
- Journal: IEEE Transactions on Intelligent Transportation Systems
- Pages: 1-12
- ISSN: 1558-0016
- DOI: 10.1109/TITS.2024.3350337
- Abstract:
- Explaining the decisions made by end-to-end autonomous driving systems is a difficult task. These approaches take raw sensor data and compute the decision as a black box with large deep learning models. Understanding the output of deep learning is a complex challenge: as data passes through the network, it becomes untraceable, which makes the result difficult to interpret. Explainability increases confidence in the decision by making the black box that drives the vehicle transparent to the user inside. Achieving a Level 5 autonomous vehicle requires solving this challenging task. In this work, we propose a model that leverages the driver’s attention to obtain explainable decisions based on an attention map and the scene context. Our novel architecture addresses the task of obtaining a decision and its explanation from a single RGB sequence of the driving scene ahead. We base this architecture on the Transformer architecture, with some efficiency tricks so that it can run at a reasonable frame rate. Moreover, we integrate into this proposal our previous ARAGAN model, which obtains SOTA attention maps, to improve the performance of the model by interpreting the sequence as a human does. We train and validate our proposal on the BDD-OIA dataset, achieving results on par with, or better than, other state-of-the-art methods. Additionally, we present a simulation-based proof of concept demonstrating the model’s performance as a copilot in a closed-loop vehicle-to-driver interaction.