September 20, 2017
Professor Morency presented a tutorial on Multimodal Machine Learning at ACL’17, based on the paper “Multimodal Machine Learning: A Survey and Taxonomy” and the CMU course Advanced Multimodal Machine Learning.
Multimodal machine learning aims to build models that can process and relate information from multiple modalities. It is a vibrant multi-disciplinary field of increasing importance and with extraordinary potential. Instead of focusing on specific multimodal applications, this tutorial surveys recent advances in multimodal machine learning itself and presents them in a common taxonomy. We go beyond the typical early and late fusion categorization and identify broader challenges that are faced by multimodal machine learning, namely: representation, translation, alignment, fusion, and co-learning. This new taxonomy will enable researchers to better understand the state of the field and identify directions for future research.