Resources

LTI-11776: Multimodal Affective Computing

Humans are highly social creatures and have evolved complex mechanisms for signaling information about their thoughts, feelings, and intentions (both deliberately and reflexively). In turn, humans have also evolved complex mechanisms for receiving these signals and inferring the thoughts, feelings, and intentions of others. Proper understanding of human behavior, in all its nuance, requires careful consideration and integration of verbal, vocal, and visual information. These communication dynamics have long been studied in psychology and other social sciences. More recently, the field of multimodal affective computing has sought to enhance these studies using techniques from computer science and artificial intelligence. Common topics of study in this field include affective states, cognitive states, personality, psychopathology, social processes, and communication. As such, multimodal affective computing has broad applicability in both scientific and applied settings ranging from medicine and education to robotics and marketing.

The objectives of this course are:

  1. To give an overview of the components of human behavior (verbal, vocal, and visual) and the computer science areas that measure them (NLP, speech processing, and computer vision)
  2. To provide foundational knowledge of psychological constructs commonly studied in multimodal affective computing (e.g., emotion, personality, and psychopathology)
  3. To provide practical instruction on using statistical tools to study research hypotheses
  4. To provide information about computational predictive models that integrate multimodal information from the verbal, vocal, and visual modalities
  5. To give students practical experience in the computational study of human behavior and psychological constructs through an in-depth course project

Resources: https://piazza.com/cmu/spring2019/11776/resources

Course Topics

Week 1: Introduction and communication models

  • Human communication dynamics
  • Signals and communicative messages
  • Communication models (Brunswik’s lens model)

Week 2: Measuring psychological constructs

  • Links between constructs and measurement
  • Self-report and observational measurement
  • Trustworthiness and measurement validation

Week 3: Theories behind psychological constructs

  • Theories of affect and emotion
  • Theories of personality
  • Theories of psychopathology

Week 4: Visual communicative messages

  • Facial expressions and eye gaze
  • Proxemics and group formation
  • Gestures and body language

Week 5: Acoustic communicative messages

  • Fundamentals of speech production
  • Prosodic cues and voice quality
  • Nonverbal vocal expressions

Week 6: Verbal communicative messages

  • Speech and dialogue acts
  • Boundaries, fillers, and disfluencies
  • Turn-taking and backchanneling

Week 7: Statistical foundations

  • Sampling and sampling error
  • Point and interval estimation
  • Statistical hypothesis testing
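As a preview of the estimation tools this week covers, here is a minimal Python sketch of point and interval estimation on a hypothetical sample (the data and the 95% normal-approximation interval are illustrative, not course material):

```python
import math
import statistics

# Hypothetical sample of smile-intensity scores (illustrative data only).
sample = [0.62, 0.71, 0.58, 0.66, 0.74, 0.69, 0.61, 0.65]

n = len(sample)
mean = statistics.mean(sample)                  # point estimate of the population mean
sem = statistics.stdev(sample) / math.sqrt(n)   # standard error of the mean

# 95% interval estimate using the normal approximation (z = 1.96).
ci = (mean - 1.96 * sem, mean + 1.96 * sem)
print(f"mean = {mean:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

With a small sample like this, a t-based interval would be wider; the z-approximation is used here only to keep the sketch short.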

Week 8: Linear statistical modeling

  • Understanding the linear model
  • Generalizing the linear model
  • Prediction and regularization
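To illustrate the link between fitting a linear model and regularizing it, here is a minimal sketch of a one-feature, no-intercept model with an L2 (ridge) penalty, using made-up data; for this simple case the regularized slope has a closed form:

```python
# Illustrative data only, roughly following y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

def ridge_slope(xs, ys, lam):
    # Closed-form L2-regularized slope for a no-intercept linear model:
    #   beta = sum(x * y) / (sum(x^2) + lam)
    # lam = 0 recovers ordinary least squares.
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

ols = ridge_slope(xs, ys, 0.0)    # unregularized least-squares estimate
reg = ridge_slope(xs, ys, 10.0)   # the penalty shrinks the estimate toward zero
```

The shrinkage effect (`reg < ols`) is the essential trade-off of regularization: a small increase in bias in exchange for lower variance in prediction.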

Week 10: Probabilistic predictive modeling

  • Probabilistic graphical models
  • Bayesian networks and naive Bayes classifier
  • Dynamic Bayesian networks and HMMs
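As a concrete instance of the probabilistic models this week introduces, here is a minimal naive Bayes text classifier in plain Python, trained on a tiny hypothetical sentiment dataset (all data invented for illustration):

```python
import math
from collections import Counter, defaultdict

# Toy training data (hypothetical, for illustration only).
train = [("happy great smile", "pos"), ("sad awful frown", "neg"),
         ("great wonderful happy", "pos"), ("awful terrible sad", "neg")]

# Count word frequencies per class and class frequencies overall.
word_counts = defaultdict(Counter)
class_counts = Counter()
for text, label in train:
    class_counts[label] += 1
    word_counts[label].update(text.split())

vocab = {w for counts in word_counts.values() for w in counts}

def predict(text):
    # Choose the class maximizing log P(c) + sum_w log P(w | c),
    # with Laplace (add-one) smoothing for unseen words.
    def score(c):
        total = sum(word_counts[c].values())
        logp = math.log(class_counts[c] / sum(class_counts.values()))
        for w in text.split():
            logp += math.log((word_counts[c][w] + 1) / (total + len(vocab)))
        return logp
    return max(class_counts, key=score)
```

Naive Bayes is the simplest Bayesian network for classification (class node with conditionally independent feature nodes); HMMs extend the same machinery with a chain of latent states over time.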

Week 11: Discriminative predictive modeling

  • Markov random fields
  • Factor graph representation
  • Discriminative graphical models

Week 12: Neural network predictive modeling

  • Multi-layer perceptron
  • Deep neural network
  • Convolutional neural network
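To make the multi-layer perceptron concrete, here is a minimal forward pass in plain Python: one hidden layer with ReLU activations feeding a linear output unit. The weights are arbitrary hypothetical values, not a trained model:

```python
def relu(v):
    # Elementwise rectified linear unit: max(0, x).
    return [max(0.0, x) for x in v]

def dense(x, W, b):
    # One fully connected layer: y_i = sum_j W[i][j] * x[j] + b[i]
    return [sum(w * xi for w, xi in zip(row, x)) + bi
            for row, bi in zip(W, b)]

# Hypothetical weights for a 3-input, 2-hidden, 1-output MLP (illustration only).
W1 = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
b1 = [0.0, 0.1]
W2 = [[1.0, -1.0]]
b2 = [0.0]

def mlp(x):
    h = relu(dense(x, W1, b1))   # hidden layer with nonlinear activation
    return dense(h, W2, b2)[0]   # linear output unit
```

Stacking more such hidden layers gives a deep neural network; replacing `dense` with weight-sharing local filters gives the convolutional case.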

Week 13: Multimodal deep learning

  • Multimodal representations
  • Attention and modality alignment
  • Temporal and multimodal fusion
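The simplest form of multimodal fusion is late fusion: each modality produces its own prediction, and the predictions are combined afterwards. A minimal sketch with fixed weights (all scores and weights hypothetical):

```python
# Late fusion: weighted average of per-modality confidence scores.
def late_fusion(scores, weights):
    # scores and weights are dicts keyed by modality name.
    total = sum(weights.values())
    return sum(scores[m] * weights[m] for m in scores) / total

# Hypothetical per-modality confidences for one prediction (illustration only).
scores = {"verbal": 0.80, "vocal": 0.60, "visual": 0.70}
weights = {"verbal": 0.5, "vocal": 0.2, "visual": 0.3}
fused = late_fusion(scores, weights)
```

Early fusion instead concatenates modality features before modeling, and the deep-learning methods covered this week learn the combination (e.g., via attention over aligned modalities) rather than fixing it by hand.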

Week 14: Multimodal behavior generation

  • Functions of nonverbal behaviors
  • Scripting, rule-based, and data-driven generation
  • Applications for nonverbal behavior generation

Week 15: Multimodal applications

  • Multimodal applications in healthcare
  • Multimodal applications in education
  • Knowledge generation through modality comparison