LTI-11776: Multimodal Affective Computing

Humans are highly social creatures and have evolved complex mechanisms for signaling information about their thoughts, feelings, and intentions (both deliberately and reflexively). In turn, humans have also evolved complex mechanisms for receiving these signals and inferring the thoughts, feelings, and intentions of others. Proper understanding of human behavior, in all its nuance, requires careful consideration and integration of verbal, vocal, and visual information. These communication dynamics have long been studied in psychology and other social sciences. More recently, the field of multimodal affective computing has sought to enhance these studies using techniques from computer science and artificial intelligence. Common topics of study in this field include affective states, cognitive states, personality, psychopathology, social processes, and communication. As such, multimodal affective computing has broad applicability in both scientific and applied settings ranging from medicine and education to robotics and marketing.

The objectives of this course are:

  1. To give an overview of the components of human behavior (verbal, vocal, and visual) and the computer science areas that measure them (NLP, speech processing, and computer vision)
  2. To provide foundational knowledge of psychological constructs commonly studied in multimodal affective computing (e.g., emotion, personality, and psychopathology)
  3. To provide practical instruction on using statistical tools to study research hypotheses
  4. To introduce computational predictive models that integrate information from the verbal, vocal, and visual modalities
  5. To give students practical experience in the computational study of human behavior and psychological constructs through an in-depth course project


Course Topics

Week 1: Introduction and communication models

  • Human communication dynamics
  • Signals and communicative messages
  • Communication models (Brunswik’s model)

Week 2: Measuring psychological constructs

  • Links between constructs and measurement
  • Self-report and observational measurement
  • Trustworthiness and measurement validation

Week 3: Theories behind psychological constructs

  • Theories of affect and emotion
  • Theories of personality
  • Theories of psychopathology

Week 4: Visual communicative messages

  • Facial expressions and gaze
  • Proxemics and group formation
  • Gestures and body language

Week 5: Acoustic communicative messages

  • Fundamentals of speech production
  • Prosodic cues and voice quality
  • Nonverbal vocal expressions

Week 6: Verbal communicative messages

  • Speech and dialogue acts
  • Boundaries, fillers, and disfluencies
  • Turn-taking and backchanneling

Week 7: Statistical foundations

  • Sampling and sampling error
  • Point and interval estimation
  • Statistical hypothesis testing

Week 8: Linear statistical modeling

  • Understanding the linear model
  • Generalizing the linear model
  • Prediction and regularization

Week 10: Probabilistic predictive modeling

  • Probabilistic graphical models
  • Bayesian networks and naive Bayes classifier
  • Dynamic Bayesian networks and HMMs

Week 11: Discriminative predictive modeling

  • Markov random fields
  • Factor graph representation
  • Discriminative graphical models

Week 12: Neural network predictive modeling

  • Multi-layer perceptron
  • Deep neural network
  • Convolutional neural network

Week 13: Multimodal deep learning

  • Multimodal representations
  • Attention and modality alignment
  • Temporal and multimodal fusion

Week 14: Multimodal behavior generation

  • Functions of nonverbal behaviors
  • Scripting, rule-based, and data-driven generation
  • Applications for nonverbal behavior generation

Week 15: Multimodal applications

  • Multimodal applications in healthcare
  • Multimodal applications in education
  • Knowledge generation through modality comparison