Dynamic Programming and Reinforcement Learning (動態規劃和強化學習)
Course objective: This course introduces dynamic decision-making under uncertainty, with an emphasis on dynamic programming and reinforcement learning. Drawing on applications in business and engineering, students will learn key theories and algorithms for solving multi-stage decision problems, both with and without explicit models of the environment. Assignments and a final project provide practical experience in problem formulation, algorithm evaluation, and Python-based implementation.
- Prerequisite subjects (先修科目):
- (required) Linear algebra, probability, (Python) programming, any undergraduate optimization course
- (helpful) Machine learning
- Please make sure you are comfortable with the introductory material in math and Python, which is free once you log in.
- Some good (and free) courses on machine learning: Andrew Ng et al.
- Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong, Mathematics for Machine Learning, Cambridge University Press, 2020.
- Tentative evaluation method:
- Homework 25%: You are welcome to discuss the problems with your classmates, but please write your own solutions without using LLM tools. For every homework assignment, please upload your own version to CYCU ilearning before the deadline.
- Midterm 20%
- Final project (1-2 persons per group) 35% (Form for you to fill in)
- Class preview questions, attendance, and class participation 20%
- David P. Woodruff, 15-451/651: Algorithm Design and Analysis, CMU, Spring 2025.
For all problems, whether SOLO or GROUP, you must not attempt to find the solution online, in a book, in a journal, by asking an AI tool, or searching anywhere else not explicitly permitted.
- Final project timeline:
- Please refer to my file "DPRL final project.pdf" for some ideas and papers.
- May 8 (4 points): Submit your project question after your group meets with the instructor in class to discuss project ideas. Novelty: your project should propose something new (a new application, method, or perspective).
- May 15 (3 points): Submit the list of papers you are reading (upload a one-page summary in conference-paper format to the school's ilearning system, explaining your choices).
- May 29 (5 points): Each group will give a short presentation in class about a paper related to their project.
- June 12 (3 points): PowerPoint review during class time (about 1 minute per slide). Please upload your files before class (ppt and pdf, named group name-topic, e.g. 1-Federated-learning).
- June 26 (10 points): Formal presentations (15-20 min). Please upload your files before the class (ppt and pdf).
- July 3 (10 points): Submit reports to the school's ilearning system (including the revised presentation file, a written report of 3-10 pages in conference-paper format, and code).
- Main references:
- Emma Brunskill, CS234 Reinforcement Learning, Stanford University (2024, videos)
- DeepMind x UCL, Deep Learning Lecture Series 2021 (slides, videos)
- Amir-massoud Farahmand, Introduction to Reinforcement Learning, Polytechnique Montréal (videos)
- Sergey Levine, CS285 Deep Reinforcement Learning, UC Berkeley, 2023 (with videos), 2026
- Richard S. Sutton and Andrew G. Barto, Reinforcement Learning: An Introduction, MIT Press, Second Edition, 2018. (ACM Turing Award winners in 2024) (Code, slides)
- Flipped classroom: Before each class, please watch the videos listed in my notes in my Google Drive folder and read the notes for further explanation. In class, you will present the lecture, and we will discuss the subtle points and work through the homework problems.
- Each assignment will include selected exercises from the document "DPRL-Homework.pdf" under my Google Drive folder. Please be sure to download the most recent version before starting your homework, as some questions may be revised.
- (Tentative) Schedule:
- Introduction
- Tabular MDP Planning (Homework 1, due 3/18, 11:59pm: Questions under Lecture 1 to 3)
- Examples in optimization, inventory, and scheduling
- Policy Evaluation
- Q-learning and Function Approximation (Homework 2, due 3/18, 11:59pm: Questions under Lecture 1 to 3)
- Policy Search 1
- Policy Search 2
- Policy Search 3
- Midterm
- Offline RL 1
- Offline RL 2
- Exploration 1
- Exploration 2
- Exploration 3
- Multi-Agent Game Playing
- Final Project
- Transfer Learning & Meta-Learning
- Multi-Agent Reinforcement Learning
- Further references:
- Courses:
- Warren B. Powell, ORF 544 – Stochastic Optimization and Learning, Princeton ORFE, 2019 (book, slides)
- David Silver, Introduction to Reinforcement Learning, DeepMind x UCL, 2015 (code)
- Ernest K. Ryu, Reinforcement Learning of Large Language Models, UCLA Math, Spring 2025 (Lecture slides, Lecture videos)
- Bao Wang, Math 5750/6880 Mathematics of Data Science, Utah. (Self-attention Mechanism and Transformers, Diffusion Models)
- Shiyu Zhao, Mathematical Foundations of Reinforcement Learning, Springer, 2025 (GitHub with slides, code, and videos; designed for senior undergraduate and graduate students)
- Textbooks (with videos and more): You could learn a lot about writing succinctly and clearly from the following terrific and amazing authors.
- Applications and More: