LTI-11776: Multimodal Affective Computing
Humans are highly social creatures and have evolved complex mechanisms for signaling information about their thoughts, feelings, and intentions (both deliberately and reflexively). In turn, humans have also evolved complex mechanisms for receiving these signals and inferring the thoughts, feelings, and intentions of others. Proper understanding of human behavior, in all its nuance, requires careful consideration and integration of verbal, vocal, and visual information. These communication dynamics have long been studied in psychology and other social sciences. More recently, the field of multimodal affective computing has sought to enhance these studies using techniques from computer science and artificial intelligence. Common topics of study in this field include affective states, cognitive states, personality, psychopathology, social processes, and communication. As such, multimodal affective computing has broad applicability in both scientific and applied settings ranging from medicine and education to robotics and marketing.
The objectives of this course are:
- To give an overview of the components of human behavior (verbal, vocal, and visual) and the computer science areas that measure them (NLP, speech processing, and computer vision)
- To provide foundational knowledge of psychological constructs commonly studied in multimodal affective computing (e.g., emotion, personality, and psychopathology)
- To provide practical instruction on using statistical tools to study research hypotheses
- To provide information about computational predictive models that integrate multimodal information from the verbal, vocal, and visual modalities
- To give students practical experience in the computational study of human behavior and psychological constructs through an in-depth course project
Resources: https://piazza.com/cmu/spring2019/11776/resources
Course Topics
Week 1: Introduction and communication models
- Human communication dynamics
- Signals and communicative messages
- Communication models (Brunswik’s lens model)
Week 2: Measuring psychological constructs
- Links between constructs and measurement
- Self-report and observational measurement
- Trustworthiness and measurement validation
Week 3: Theories behind psychological constructs
- Theories of affect and emotion
- Theories of personality
- Theories of psychopathology
Week 4: Visual communicative messages
- Facial expressions and eye gaze
- Proxemics and group formation
- Gestures and body language
Week 5: Acoustic communicative messages
- Fundamentals of speech production
- Prosodic cues and voice quality
- Nonverbal vocal expressions
Week 6: Verbal communicative messages
- Speech and dialogue acts
- Boundaries, fillers, and disfluencies
- Turn-taking and backchanneling
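As a taste of the verbal-message topics above, fillers and disfluencies can be flagged with simple lexical matching; the filler list below is a hypothetical toy example, not a standard resource from the course.

```python
import re

# Hypothetical toy filler inventory (illustrative only).
FILLERS = {"um", "uh", "er", "like", "you know"}

def find_fillers(utterance: str) -> list:
    """Return filler tokens found in a lowercased utterance."""
    tokens = re.findall(r"[a-z']+", utterance.lower())
    found = [t for t in tokens if t in FILLERS]
    # Also catch the two-word filler "you know".
    for a, b in zip(tokens, tokens[1:]):
        if f"{a} {b}" in FILLERS:
            found.append(f"{a} {b}")
    return found

print(find_fillers("Um, I was, uh, thinking that, you know, we could start."))
```

Real disfluency detection uses richer context (e.g., repairs and restarts), but lexical lookup like this is a common first baseline.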
Week 7: Statistical foundations
- Sampling and sampling error
- Point and interval estimation
- Statistical hypothesis testing
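The interval-estimation topic above can be sketched in a few lines; the per-participant counts here are made up for illustration, and the normal approximation (z = 1.96) is used instead of a t critical value to keep the sketch short.

```python
import math
import statistics

# Hypothetical toy data: per-participant smile counts from a small sample.
sample = [4, 7, 5, 6, 8, 5, 6, 7, 4, 6]

mean = statistics.mean(sample)
# Standard error of the mean: sample stdev divided by sqrt(n).
sem = statistics.stdev(sample) / math.sqrt(len(sample))

# 95% interval under the normal approximation; a t critical value
# would be more appropriate at n = 10.
lo, hi = mean - 1.96 * sem, mean + 1.96 * sem
print(f"mean = {mean:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```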
Week 8: Linear statistical modeling
- Understanding the linear model
- Generalizing the linear model
- Prediction and regularization
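The link between the linear model and regularization listed above can be shown with one-dimensional ridge regression, which has a closed form; the data points are invented for illustration.

```python
# Toy data (illustrative only): y grows roughly linearly with x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

def ridge_slope(xs, ys, lam=0.0):
    """Slope of y = w * x minimizing squared error plus lam * w**2."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

w_ols = ridge_slope(xs, ys)           # ordinary least squares (lam = 0)
w_reg = ridge_slope(xs, ys, lam=5.0)  # penalty shrinks the slope toward zero
print(w_ols, w_reg)
```

Increasing `lam` trades a little fit on the training data for lower variance, which is the core idea behind regularized prediction.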
Week 10: Probabilistic predictive modeling
- Probabilistic graphical models
- Bayesian networks and naive Bayes classifier
- Dynamic Bayesian networks and HMMs
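The naive Bayes classifier named above is small enough to implement directly; the tiny sentiment training set below is invented purely for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training data: bag-of-words texts with labels.
train = [
    ("happy great smile", "pos"),
    ("great wonderful happy", "pos"),
    ("sad terrible frown", "neg"),
    ("terrible awful sad", "neg"),
]

word_counts = defaultdict(Counter)
label_counts = Counter()
vocab = set()
for text, label in train:
    label_counts[label] += 1
    for w in text.split():
        word_counts[label][w] += 1
        vocab.add(w)

def predict(text):
    """argmax over labels of log P(label) + sum log P(word | label),
    with add-one (Laplace) smoothing."""
    scores = {}
    for label in label_counts:
        total = sum(word_counts[label].values())
        score = math.log(label_counts[label] / len(train))
        for w in text.split():
            score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

print(predict("happy wonderful"))
```

The "naive" conditional-independence assumption is what lets the joint probability factor into per-word terms.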
Week 11: Discriminative predictive modeling
- Markov random fields
- Factor graph representation
- Discriminative graphical models
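The factor-graph representation above can be illustrated with a three-step chain of binary labels scored by unary and pairwise potentials; the potential values are invented, and brute-force enumeration stands in for proper inference at this toy size.

```python
from itertools import product

labels = (0, 1)
# unary[t][y]: hypothetical local evidence for label y at step t.
unary = [
    {0: 0.2, 1: 0.8},
    {0: 0.6, 1: 0.4},
    {0: 0.3, 1: 0.7},
]
# Pairwise potential favoring smooth (equal) neighboring labels.
pairwise = {(0, 0): 0.7, (1, 1): 0.7, (0, 1): 0.3, (1, 0): 0.3}

def score(seq):
    """Product of all unary and pairwise factors along the chain."""
    s = 1.0
    for t, y in enumerate(seq):
        s *= unary[t][y]
    for a, b in zip(seq, seq[1:]):
        s *= pairwise[(a, b)]
    return s

best = max(product(labels, repeat=3), key=score)
print(best)
```

Note how the smoothness factor overrides the weak middle-step evidence for label 0, which is exactly the kind of structured effect these models capture.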
Week 12: Neural network predictive modeling
- Multi-layer perceptron
- Deep neural network
- Convolutional neural network
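The multi-layer perceptron listed above is built from single units like the one below, here trained with the classic perceptron update rule on the AND function; the learning rate and epoch count are illustrative choices.

```python
# Toy training data: the AND function over binary inputs.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w = [0.0, 0.0]
b = 0.0

def predict(x):
    """Threshold unit: fire iff the weighted sum exceeds zero."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

for _ in range(10):  # a few epochs suffice for this toy problem
    for x, y in data:
        err = y - predict(x)  # perceptron update rule
        w[0] += 0.1 * err * x[0]
        w[1] += 0.1 * err * x[1]
        b += 0.1 * err

print([predict(x) for x, _ in data])
```

Stacking such units into layers with nonlinear activations, and training by backpropagation instead of the perceptron rule, gives the deep networks covered this week.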
Week 13: Multimodal deep learning
- Multimodal representations
- Attention and modality alignment
- Temporal and multimodal fusion
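A minimal sketch of the fusion topic above is late fusion, where per-modality predictions are combined after each modality is classified separately; the modality scores and weights below are hypothetical.

```python
# Hypothetical per-modality emotion scores (e.g., from three classifiers).
scores = {
    "verbal": {"happy": 0.6, "sad": 0.4},
    "vocal":  {"happy": 0.7, "sad": 0.3},
    "visual": {"happy": 0.5, "sad": 0.5},
}
weights = {"verbal": 0.4, "vocal": 0.3, "visual": 0.3}  # illustrative weights

def fuse(scores, weights):
    """Weighted average of per-modality scores for each label."""
    labels = next(iter(scores.values())).keys()
    return {l: sum(weights[m] * scores[m][l] for m in scores) for l in labels}

fused = fuse(scores, weights)
print(max(fused, key=fused.get))
```

Early fusion (concatenating features before classification) and learned attention over modalities are the more powerful alternatives discussed in lecture.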
Week 14: Multimodal behavior generation
- Functions of nonverbal behaviors
- Scripting, rule-based, and data-driven generation
- Applications for nonverbal behavior generation
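The rule-based generation approach above can be sketched as an ordered rule list mapping utterance cues to behaviors; the rules and behavior names here are hypothetical, not from an actual system.

```python
# Hypothetical priority-ordered rules: (condition, behavior to trigger).
RULES = [
    (lambda u: u.endswith("?"), "raise_eyebrows"),
    (lambda u: "you" in u.lower().split(), "deictic_point"),
    (lambda u: True, "idle_posture"),  # default fallback
]

def generate(utterance):
    """Return the behavior of the first rule that fires."""
    for rule, behavior in RULES:
        if rule(utterance):
            return behavior

print(generate("Are you coming?"))
print(generate("Hello there."))
```

Scripted and rule-based generators like this are transparent but brittle, which motivates the data-driven approaches also covered this week.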
Week 15: Multimodal applications
- Multimodal applications in healthcare
- Multimodal applications in education
- Knowledge generation through modality comparison