Week 1: Sep 27 |
[Lecture] Course Introduction.
|
Recommended reading: None |
Deadline:
- Sign-up for Presentation and Scribe
|
Week 2: Oct 2 |
[Lecture] Human preferences models.
|
Recommended reading:
- Train. Qualitative Choice Analysis: Theory, Econometrics, and an Application to Automobile Demand. MIT Press. 1985.
- McFadden, Train. Mixed MNL Models for Discrete Response. Journal of Applied Econometrics. 2000.
- Luce. Individual Choice Behavior: A Theoretical Analysis. Wiley. 1959.
Additional reading:
- Ben-Akiva, Lerman. Discrete Choice Analysis: Theory and Application to Travel Demand. Transportation Studies. 1985.
- Park, Simar, Zelenyuk. Nonparametric Estimation of Dynamic Discrete Choice Models for Time Series Data. Computational Statistics & Data Analysis. 2017.
- Rafailov, Sharma, Mitchell, Ermon, Manning, Finn. Direct Preference Optimization: Your Language Model Is Secretly a Reward Model. Preprint. 2023.
| Deadline:
- Pre-class survey
|
Week 2: Oct 4 |
[Student Presentation] Interaction models
|
Recommended reading:
- Cattelan. Models for Paired Comparison Data: A Review with Emphasis on Dependent Data. Statistical Science. 2012.
- Bhatia, Pananjady, Bartlett, Dragan, Wainwright. Preference Learning Along Multiple Criteria: A Game-Theoretic Perspective. NeurIPS. 2020.
- Shah, Gundotra, Abbeel, Dragan. On the Feasibility of Learning, Rather Than Assuming, Human Biases for Reward Inference. ICML. 2019.
- Ghosal, Zurek, Brown, Dragan. The Effect of Modeling Human Rationality Level on Learning Rewards from Multiple Feedback Types. AAAI. 2023.
|
Deadline:
- Presentation slide and Presentation feedback for "Interaction models".
|
Week 3: Oct 9 |
[Fireside chat] Psychology and Marketing Perspectives: Noah Goodman, Jonathan Levav, S. Christian Wheeler
|
Additional reading:
- Evangelidis, Levav, Simonson. The Upscaling Effect: How the Decision Context Influences Tradeoffs Between Desirability and Feasibility. Journal of Consumer Research. 2023.
- Evangelidis, Levav, Simonson. A Reexamination of the Impact of Decision Conflict on Choice Deferral. Management Science. 2023.
- Shennib, Catapano, Levav. Preference Reversals Between Digital and Physical Goods. ACR North American Advances. 2019.
- Tamkin, Handa, Shrestha, Goodman. Task Ambiguity in Humans and Language Models. arXiv. 2022.
- Hawkins, Berdahl, Pentland, Tenenbaum, Goodman, Krafft. Flexible Social Inference Facilitates Targeted Social Learning When Rewards Are Not Observable. Nature Human Behaviour. 2023.
- Yu, Goodman, Mu. Characterizing Tradeoffs Between Teaching via Language and Demonstrations in Multi-Agent Systems. arXiv. 2023.
|
Deadline: None |
Week 3: Oct 11 |
[Student Presentation] Human biases and Reward models
|
Recommended reading:
- The Decision Lab. Biases Index. 2023.
- Slovic. The Construction of Preference. Shaping Entrepreneurship Research. 2020.
- Hogarth. Insights in Decision Making: A Tribute to Hillel J. Einhorn. University of Chicago Press. 1990.
- Cooke. Experts in Uncertainty: Opinion and Subjective Probability in Science. Oxford University Press. 1991.
- Chan, Critch, Dragan. Human Irrationality: Both Bad and Good for Reward Inference. arXiv. 2021.
- Bobu, Scobee, Fisac, Sastry, Dragan. Less is More: Rethinking Probabilistic Models of Human Behavior. ACM/IEEE International Conference on Human-Robot Interaction. 2020.
|
Deadline:
- Presentation slide and Presentation feedback for "Human biases and Reward models"
|
Week 4: Oct 16 |
[Student Presentation] Metric elicitation
|
Recommended reading:
- Hiranandani, Boodaghians, Mehta, Koyejo. Performance Metric Elicitation from Pairwise Classifier Comparisons. AISTATS. 2019.
- Hiranandani, Boodaghians, Mehta, Koyejo. Multiclass Performance Metric Elicitation. NeurIPS. 2019.
- Hiranandani, Narasimhan, Koyejo. Fair Performance Metric Elicitation. NeurIPS. 2020.
- Hiranandani, Mathur, Narasimhan, Koyejo. Quadratic Metric Elicitation with Application to Fairness. UAI. 2022.
Additional reading:
- Ali, Upadhyay, Hiranandani, Glassman, Koyejo. Metric Elicitation: Moving from Theory to Practice. NeurIPS Workshop on Human-Centered AI (HCAI). 2022.
- Riabacke, Danielson, Ekenberg. State-of-the-Art Prescriptive Criteria Weight Elicitation. Advances in Decision Sciences. 2012.
|
Deadline:
- Presentation slide and Presentation feedback for "Metric elicitation"
- Scribe for "Human preferences models"
|
Week 4: Oct 18 |
[Student Presentation] Active learning
|
Recommended reading:
- Cohn, Ghahramani, Jordan. Active Learning with Statistical Models. JAIR. 1996.
- Biyik, Sadigh. Batch Active Preference-Based Learning of Reward Functions. CoRL. 2018.
- Sadigh, Dragan, Sastry, Seshia. Active Preference-Based Learning of Reward Functions. UC Berkeley. 2017.
- Jamieson, Nowak. Active Ranking Using Pairwise Comparisons. NeurIPS. 2011.
- Holladay, Javdani, Dragan, Srinivasa. Active Comparison Based Learning Incorporating User Uncertainty and Noise. RSS Workshop on Model Learning for Human-Robot Communication. 2016.
Additional reading:
- Settles. Active Learning Literature Survey. University of Wisconsin-Madison. 2009.
|
Deadline:
- Presentation slide and Presentation feedback for "Active learning".
- Scribe for "Interaction models".
|
Week 5: Oct 23 |
[Student Presentation] Bandits and Probabilistic Methods
|
Recommended reading:
- Agarwal, Hsu, Kale, Langford, Li, Schapire. Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits. ICML. 2014.
- Bouneffouf, Rish, Aggarwal. Survey on Applications of Multi-Armed and Contextual Bandits. IEEE Congress on Evolutionary Computation (CEC). 2020.
- Sui, Zoghi, Hofmann, Yue. Advancements in Dueling Bandits. IJCAI. 2018.
- Yue, Broder, Kleinberg, Joachims. The K-Armed Dueling Bandits Problem. Journal of Computer and System Sciences. 2012.
| Deadline:
- Proposal deadline
- Presentation slide and Presentation feedback for "Bandits and Probabilistic Methods".
- Scribe feedback for "Human preferences models"
|
Week 5: Oct 25 |
[Student Presentation] Multimodal rewards; Meta reward learning
|
Recommended reading:
- Hejna, Sadigh. Few-Shot Preference Learning for Human-in-the-Loop RL. CoRL. 2023.
- Zhou, Jang, Kappler, Herzog, Khansari, Wohlhart, Bai, Kalakrishnan, Levine, Finn. Watch, Try, Learn: Meta-Learning from Demonstrations and Reward. arXiv. 2019.
- Myers, Bıyık, Anari, Sadigh. Learning Multimodal Rewards from Rankings. arXiv. 2021.
|
Deadline:
- Presentation slide for "Multimodal rewards; Meta reward learning"
- Scribe for "Human biases and Reward models"
- Scribe feedback for "Interaction models"
|
Week 6: Oct 30 |
[Guest lecture] Pat Langley (Institute for the Study of Learning and Expertise): Human computing |
Recommended reading: None
|
Deadline:
- Scribe for "Metric elicitation"
- Scribe rebuttal for "Human preferences models"
|
Week 6: Nov 1 |
[Student Presentation] Alignment; Expert and non-expert stakeholders |
Recommended reading:
- Brown, Schneider, Dragan, Niekum. Value Alignment Verification. ICML. 2021.
- Bobu, Bajcsy, Fisac, Dragan. Learning under Misspecified Objective Spaces. CoRL. 2018.
- Jeon, Milli, Dragan. Reward-Rational (Implicit) Choice: A Unifying Formalism for Reward Learning. NeurIPS. 2020.
- Bobu, Peng, Agrawal, Shah, Dragan. Aligning Robot and Human Representations. arXiv. 2023.
|
Deadline:
- Presentation slide and Presentation feedback for "Alignment; Expert and non-expert stakeholders"
- Scribe for "Active learning"
- Scribe feedback for "Human biases and Reward models"
- Scribe rebuttal for "Interaction models"
|
Week 7: Nov 6 |
[Guest lecture] Meredith Ringel Morris (Google DeepMind): HCI considerations in learning from humans (Virtual)
|
Recommended reading: None
|
Deadline:
- Scribe for Pat Langley
- Scribe for "Bandits and Probabilistic Methods"
- Scribe feedback for "Metric elicitation"
|
Week 7: Nov 8 |
[Guest lecture] Vasilis Syrgkanis (Stanford): Truthfulness and mechanism design |
Recommended reading:
- Balcan, Sandholm, Vitercik. Tutorial on Mechanism Design. 2023.
- Roughgarden. Lectures 1 & 2 on the General Mechanism Design Problem and the Idea of Incentive Compatibility.
- Linstone, Turoff. The Delphi Method. Addison-Wesley. 1975.
- Prelec. A Bayesian Truth Serum for Subjective Data. Science. 2004.
|
Deadline:
- Scribe for "Multimodal rewards; Meta reward learning"
- Scribe feedback for "Active learning"
- Scribe rebuttal for "Human biases and Reward models"
|
Week 8: Nov 13 |
[Guest lecture] Jason Hartline (Northwestern): Truthfulness and mechanism design
|
Recommended reading:
- Schenk, Guittard. Crowdsourcing: What Can Be Outsourced to the Crowd, and Why? HAL Open Science. 2009.
- Quinn, Bederson. Human Computation: A Survey and Taxonomy of a Growing Field. SIGCHI Conference on Human Factors in Computing Systems. 2011.
- Kong. Dominantly Truthful Multi-Task Peer Prediction with a Constant Number of Tasks. ACM-SIAM Symposium on Discrete Algorithms. 2020.
- Kong, Schoenebeck. An Information Theoretic Framework for Designing Information Elicitation Mechanisms That Reward Truth-Telling. ACM Transactions on Economics and Computation. 2019.
|
Deadline:
- Scribe for Meredith Ringel Morris
- Scribe feedback for Pat Langley
- Scribe feedback for "Bandits and Probabilistic Methods"
- Scribe rebuttal for "Metric elicitation"
|
Week 8: Nov 15 |
[Guest lecture] Dorsa Sadigh (Stanford): Inverse reinforcement learning from human feedback for robotics |
Recommended reading:
- Ng, Russell. Algorithms for Inverse Reinforcement Learning. ICML. 2000.
- Hadfield-Menell, Russell, Abbeel, Dragan. Cooperative Inverse Reinforcement Learning. NeurIPS. 2016.
- Arora, Doshi. A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress. Artificial Intelligence. 2021.
- Hadfield-Menell, Milli, Abbeel, Russell, Dragan. Inverse Reward Design. NeurIPS. 2017.
- Shin, Dragan, Brown. Benchmarks and Algorithms for Offline Preference-Based Reward Learning. arXiv. 2023.
- Ghosal, Zurek, Brown, Dragan. The Effect of Modeling Human Rationality Level on Learning Rewards from Multiple Feedback Types. AAAI. 2023.
- Bıyık, Losey, Palan, Landolfi, Shevchuk, Sadigh. Learning Reward Functions from Diverse Sources of Human Feedback: Optimally Integrating Demonstrations and Preferences. The International Journal of Robotics Research. 2022.
|
Deadline:
- Scribe for "Alignment; Expert and non-expert stakeholders"
- Scribe for Vasilis Syrgkanis
- Scribe feedback for "Multimodal rewards; Meta reward learning"
- Scribe rebuttal for Pat Langley
- Scribe rebuttal for "Active learning"
|
Week 9: Nov 20 |
Thanksgiving Recess (no classes) |
|
|
Week 9: Nov 22 |
Thanksgiving Recess (no classes) |
|
|
Week 10: Nov 27 |
[Guest lecture] Diyi Yang (Stanford): Ethics and HCI
|
Recommended reading:
- Busarovs. Ethical Aspects of Crowdsourcing, or Is It a Modern Form of Exploitation. International Journal of Economics & Business Administration. 2013.
- Denton, Díaz, Kivlichan, Prabhakaran, Rosen. Whose Ground Truth? Accounting for Individual and Collective Identities Underlying Dataset Annotation. arXiv. 2021.
|
Deadline:
- Project deadline
- Scribe for Jason Hartline
- Scribe feedback for Meredith Ringel Morris
- Scribe rebuttal for "Bandits and Probabilistic Methods"
|
Week 10: Nov 29 |
[Guest lecture] Nathan Lambert (HuggingFace): Reinforcement learning from human feedback for language models
|
Recommended reading:
- Bansal, Dang, Grover. Peering Through Preferences: Unraveling Feedback Acquisition for Aligning Large Language Models. arXiv. 2023.
- Christiano, Leike, Brown, Martic, Legg, Amodei. Deep Reinforcement Learning from Human Preferences. NeurIPS. 2017.
- Ziegler, Stiennon, Wu, Brown, Radford, Amodei, Christiano, Irving. Fine-Tuning Language Models from Human Preferences. arXiv. 2019.
|
Deadline:
- Scribe for Dorsa Sadigh
- Scribe feedback for "Alignment; Expert and non-expert stakeholders"
- Scribe feedback for Vasilis Syrgkanis
- Scribe rebuttal for "Multimodal rewards; Meta reward learning"
- Scribe rebuttal for Meredith Ringel Morris
|
Week 11: Dec 4 |
[Lecture] Open Questions & Frontiers
|
Recommended reading:
- Wirth, Akrour, Neumann, Fürnkranz. A Survey of Preference-Based Reinforcement Learning Methods. JMLR. 2017.
- Casper et al. Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback. arXiv. 2023.
|
Deadline:
- Scribe for Diyi Yang
- Scribe feedback for Jason Hartline
|
Week 11: Dec 6 |
Poster session
|
Recommended reading: None |
Deadline:
- Scribe for Nathan Lambert
- Scribe feedback for Dorsa Sadigh
- Scribe rebuttal for "Alignment; Expert and non-expert stakeholders"
- Scribe rebuttal for Jason Hartline
- Scribe rebuttal for Vasilis Syrgkanis
|
Week 12: Dec 11 |
Final week: No class |
|
Deadline:
- Scribe rebuttal for Dorsa Sadigh
- Scribe feedback for Diyi Yang
- Scribe feedback for Nathan Lambert
|
Week 12: Dec 13 |
Final week: No class |
|
Deadline:
- Scribe rebuttal for Diyi Yang
- Scribe rebuttal for Nathan Lambert
|