Y. Song, L.-P. Morency and R. Davis. Action Recognition by Hierarchical Sequence Summarization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013
Typical sequence modeling techniques rely on well-segmented sequences that have been edited to remove noisy or irrelevant parts. As a result, such methods cannot easily be applied to the noisy sequences expected in real-world applications.
We study sequence modeling through the combination of RNNs, which capture temporal dependencies, and an attention mechanism, which localizes the salient observations relevant to the final decision and ignores the irrelevant (noisy) parts of the input sequence.
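
To make the combination concrete, the following is a minimal NumPy sketch of an RNN followed by attention pooling. It is an illustration under stated assumptions, not the model studied here: the text does not specify the recurrent cell or the attention scoring function, so the sketch assumes a plain tanh recurrence and a single learned scoring vector (`v_att`). All names, dimensions, and the random (untrained) weights are hypothetical.

```python
import numpy as np


def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    exp = np.exp(scores - np.max(scores))
    return exp / exp.sum()


def rnn_with_attention(inputs, W_xh, W_hh, b_h, v_att):
    """Run a tanh RNN over `inputs` (T, d_in) and pool its states with attention.

    Assumed form (not taken from the source):
      h_t     = tanh(W_xh x_t + W_hh h_{t-1} + b_h)
      alpha   = softmax(h_t . v_att over t)
      context = sum_t alpha_t h_t
    """
    hidden = np.zeros(W_hh.shape[0])
    states = []
    for x_t in inputs:
        # The recurrence carries information forward (temporal dependencies).
        hidden = np.tanh(W_xh @ x_t + W_hh @ hidden + b_h)
        states.append(hidden)
    states = np.stack(states)        # shape (T, d_h)

    # One relevance score per time step; softmax turns scores into weights.
    weights = softmax(states @ v_att)  # shape (T,)

    # Time steps with small weights contribute little to the summary, which
    # is how noisy or irrelevant observations can be down-weighted.
    context = weights @ states       # shape (d_h,)
    return context, weights


# Hypothetical sizes and random parameters, for illustration only.
rng = np.random.default_rng(0)
T, d_in, d_h = 6, 4, 8
inputs = rng.normal(size=(T, d_in))
W_xh = rng.normal(scale=0.5, size=(d_h, d_in))
W_hh = rng.normal(scale=0.5, size=(d_h, d_h))
b_h = np.zeros(d_h)
v_att = rng.normal(size=d_h)

context, weights = rnn_with_attention(inputs, W_xh, W_hh, b_h, v_att)
```

In a trained model, `context` would feed a classifier that produces the final decision, and `weights` could be inspected to see which observations the model treated as salient.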