This reading group focuses on recent papers on machine learning methods, particularly deep neural networks, for representing and integrating multimodal data. We read recently published papers from venues such as NIPS, ICLR, CVPR, ACL, ICML, and ICCV. Below is the list of papers and corresponding meeting dates.
Tied Transformers: Neural Machine Translation with Shared Encoder and Decoder [Presentation Slides]
MEAL: Multi-Model Ensemble via Adversarial Learning [Presentation Slides]
Neural Motifs: Scene Graph Parsing with Global Context [Presentation Slides]
Do Neural Network Cross-Modal Mappings Really Bridge Modalities? [Presentation Slides]
A Probabilistic Framework for Multi-View Feature Learning with Many-to-Many Associations via Neural Networks [Presentation Slides]
The Multi-Entity Variational Autoencoder [Presentation Slides]
Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input [Presentation Slides]
Unsupervised Learning of Spoken Language with Visual Context [Presentation Slides]
Grounded Video Description [Presentation Slides]
Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions [Presentation Slides]
Neural Ordinary Differential Equations [Presentation Slides]
Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval With Generative Models [Presentation Slides]
Branch-Activated Multi-Domain Convolutional Neural Network for Visual Tracking [Presentation Slides]
Universal Transformers [Presentation Slides]