J.-C. Levesque, C. Gagne and L.-P. Morency. Sequential Emotion Recognition using Latent-Dynamic Conditional Neural Fields. In Proceedings of the IEEE Conference on Automatic Face and Gesture Recognition (FG), 2013
Psychologists regard facial expressions and verbal messages as primary channels of human communication. In recent years, automatic emotion recognition has received considerable attention; the field is advancing rapidly, but substantial challenges remain open for further research.
Early research focused mostly on emotion analysis from single static facial images captured under constrained conditions, which differs markedly from recognition in the real world. Because human emotions unfold dynamically over time, the field is shifting toward recognition from video or image sequences. In our work, we develop multimodal machine learning methods for both static and temporal emotion and affect recognition.
Typical techniques for sequence modeling rely on well-segmented sequences that have been edited to remove noisy or irrelevant parts; such methods therefore do not transfer easily to the noisy sequences expected in real-world applications.
We study sequence modeling through the combination of RNNs, which capture temporal dependencies, and an attention mechanism, which localizes the salient observations relevant to the final decision and ignores the irrelevant (noisy) parts of the input sequence.
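The RNN-plus-attention idea above can be sketched in a few lines. The following is a minimal NumPy illustration, not the authors' actual model: a simple tanh RNN produces one hidden state per time step, a learned scoring vector assigns each state a relevance score, and a softmax over the scores yields attention weights used to pool the states into a single summary vector. All names (`rnn_attention`, `w_att`) and the specific scoring scheme (dot product with a single vector) are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def rnn_attention(seq, Wx, Wh, w_att):
    """Run a simple tanh RNN over `seq` (shape: T x input_dim), then
    attend over the hidden states.

    Illustrative sketch: Wx, Wh are the RNN input/recurrent weights,
    w_att is a learned scoring vector (all hypothetical names).
    Returns the attention-pooled context vector and the weights,
    which show how much each time step contributed to the decision.
    """
    h = np.zeros(Wh.shape[0])
    states = []
    for x in seq:
        # Standard Elman-style recurrence capturing temporal dependencies.
        h = np.tanh(Wx @ x + Wh @ h)
        states.append(h)
    H = np.stack(states)        # (T, hidden): one state per time step
    scores = H @ w_att          # one relevance score per time step
    alpha = softmax(scores)     # attention weights, sum to 1
    context = alpha @ H         # weighted pooling of hidden states
    return context, alpha
```

In a full model the context vector would feed a classifier over emotion labels; time steps judged irrelevant (noisy) receive attention weights near zero, which is the mechanism by which the noisy parts of an unsegmented sequence are effectively ignored.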