OCTA Conference, 2025
Multi-label emotion recognition (MLER) addresses the challenge of identifying multiple co-occurring emotions within a single text, a task inherently more complex than single-label classification. Early research relied on traditional machine learning with hand-crafted features, but these approaches struggled to capture emotional ambiguity and inter-label dependencies. The rise of deep learning introduced more powerful architectures, such as convolutional and recurrent neural networks, as well as attention mechanisms and hybrid models, which significantly improved performance by learning contextual representations. More recently, Transformer-based architectures, particularly encoder-only models such as BERT and RoBERTa, have set new benchmarks by leveraging bidirectional self-attention to model nuanced emotional signals. In this position paper, we argue that Transformers represent the most promising direction for advancing MLER, while also highlighting critical challenges (including interpretability, computational demands, and cross-lingual generalization) that must be addressed before these systems can be reliable and impactful in real-world applications. This paper contributes a conceptual synthesis of the evolution of MLER and offers a critical perspective on the future role of Transformer-based architectures, emphasizing interpretability, cross-lingual generalization, and computational efficiency as key directions for robust and ethical emotion recognition systems.
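What distinguishes MLER from single-label classification can be sketched in a few lines: instead of a softmax over mutually exclusive classes, the classification head on top of an encoder applies an independent sigmoid to each label's logit, so several emotions can be predicted for the same text. The label names and logit values below are purely illustrative assumptions, not outputs of any system described in this paper.

```python
import math

# Illustrative emotion label set (hypothetical, not from the paper).
LABELS = ["joy", "sadness", "anger", "fear", "surprise"]

# Hypothetical logits from an encoder's classification head for one text,
# one logit per label.
logits = [2.1, -0.4, 1.3, -2.0, 0.2]

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

# Multi-label decision rule: an independent sigmoid per label,
# thresholded at 0.5, so co-occurring emotions are all retained.
probs = [sigmoid(z) for z in logits]
predicted = [lab for lab, p in zip(LABELS, probs) if p >= 0.5]
print(predicted)  # ['joy', 'anger', 'surprise']
```

In practice such a head is trained with a per-label binary cross-entropy loss rather than categorical cross-entropy, which is what lets the model learn that, e.g., anger and sadness can legitimately co-occur in one sentence.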



Sirine Khlifi
Mouna Belhaj
Lamjed Ben Said