Steven L. Brunton and J. Nathan Kutz, Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control, 2nd edition, Cambridge University Press (pdf, amazing videos, connection with dynamical systems in engineering)
10/01/2025
9/05/2025
Use LLMs (e.g., ChatGPT) to learn ***
Dimitris Bertsimas and Georgios Margaritis, Robust and Adaptive Optimization under a Large Language Model Lens, arXiv:2501.00568. (new)
Be careful
- 陳曉莉, MIT study finds that habitually writing with AI dulls the brain, iThome, 2025-06-20 (Nataliya Kosmyna, et al., Your Brain on ChatGPT, arXiv:2506.08872)
- Hao-Ping (Hank) Lee, et al., The Impact of Generative AI on Critical Thinking: Self-Reported Reductions in Cognitive Effort and Confidence Effects From a Survey of Knowledge Workers, CHI '25: Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems, 2025, Article No. 1121, pp. 1-22. https://doi.org/10.1145/3706598.371377
Manny Li, Google releases a universal formula for AI prompts! Master the '21-word golden rule': lock in the 80-point fundamentals first, then aim higher, bnext, 2025.08.05
Barbara Oakley, Accelerate Your Learning with ChatGPT, Coursera
Producing and presenting your final project and thesis (tips for the final project and your thesis)
- (at the bottom) common pitfalls to avoid; the final written report (for your PPT content)
- (For undergraduates) Important tasks
- On life's difficulties:
- Tal Ben-Shahar, Happier: Learn the Secrets to Daily Joy and Lasting Fulfillment, McGraw Hill, 2007. (Chinese translation by 譚家瑜: 更快樂:哈佛最受歡迎的一堂課, 天下雜誌, 2012)
8/13/2025
Research (研究)
- C.-H. Hsu and T.-Y. Liao, Enhanced holistic regression for multicollinearity detection and feature selection, Available at SSRN, 2025. (code)
- 廖庭煜, 維琪, 許志華, and 饒忻, Enhancing Decision Making under Uncertainty: A Robust Optimization Framework Combining TRIZ and Machine Learning, 2025 Systematic Innovation Conference and Project Competition, First Prize in the Paper Competition (code)
- C.-H. Hsu and H.-C. Yang, Suboptimal Explainable Scheme for Machining Outcome Estimation, IEEE Robotics and Automation Letters, Vol. 7, No. 3, July 2022, pp. 7834-7841.
6/30/2025
Some books and information on machine learning and AI
- 簡禎富, Industry 3.5: A Strategy for Taiwanese Enterprises to Move Toward Intelligent Manufacturing and Digital Decision-Making, 天下雜誌, 2019
- Alex J. Gutman and Jordan Goldmeier, Becoming a Data Head: How to Think, Speak, and Understand Data Science, Statistics, and Machine Learning, Wiley, 2021.
6/11/2025
Guides for students in Business Analytics Laboratory (商業分析實驗室學生指引)
Knowledge to master for a better foundation (and future)
Tools and general:
- Antonio Torralba, Phillip Isola, and William Freeman, Foundations of Computer Vision, The MIT Press, 2024. (On Research, Writing and Speaking) (new)
- Takeo Kanade, Think Like an Amateur, Do As an Expert (new)
- AI in education: Geoffrey Hinton’s and Yann LeCun’s vision of the future, The Buzz Business, 2023.
- The role of AI in education extends to fostering creativity and critical thinking.
- Another significant impact of AI in education is its potential to democratize access to quality learning.
5/15/2025
Reinforcement learning is enough to reach general AI
David Silver, Satinder Singh, Doina Precup, and Richard S. Sutton, Reward is enough, Artificial Intelligence, Volume 299, October 2021, 103535.
In this article we hypothesise that intelligence, and its associated abilities, can be understood as subserving the maximisation of reward. Accordingly, reward is enough to drive behaviour that exhibits abilities studied in natural and artificial intelligence, including knowledge, learning, perception, social intelligence, language, generalisation and imitation. This is in contrast to the view that specialised problem formulations are needed for each ability, based on other signals or objectives. Furthermore, we suggest that agents that learn through trial and error experience to maximise reward could learn behaviour that exhibits most if not all of these abilities, and therefore that powerful reinforcement learning agents could constitute a solution to artificial general intelligence.
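As a toy illustration of learning behaviour from reward alone, the following sketch uses tabular Q-learning, a basic trial-and-error reinforcement learning method. The five-state chain environment is invented for illustration, not an example from the paper:

```python
import numpy as np

# Hypothetical toy environment: a 5-state chain; taking "right" in the
# last state yields reward 1 and restarts, every other transition yields 0.
N_STATES, N_ACTIONS = 5, 2  # actions: 0 = left, 1 = right

def step(state, action):
    if action == 1 and state == N_STATES - 1:
        return 0, 1.0  # goal reached: reward 1, episode restarts
    next_state = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    return next_state, 0.0

rng = np.random.default_rng(0)
Q = np.zeros((N_STATES, N_ACTIONS))
alpha, gamma, eps = 0.1, 0.9, 0.1  # step size, discount, exploration rate

state = 0
for _ in range(10_000):  # trial-and-error experience, driven only by reward
    action = int(rng.integers(N_ACTIONS)) if rng.random() < eps else int(Q[state].argmax())
    next_state, reward = step(state, action)
    # Q-learning update: move Q toward reward plus discounted best future value
    Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
    state = next_state

print(Q.argmax(axis=1))  # learned policy: move right everywhere -> [1 1 1 1 1]
```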
3/13/2025
2024 ACM A.M. Turing Award
ACM A.M. Turing Award Honors Two Researchers Who Led the Development of Cornerstone AI Technology: Andrew Barto and Richard Sutton Recognized as Pioneers of Reinforcement Learning, New York, NY, March 5, 2025
8/01/2024
AI achieves silver-medal standard solving International Mathematical Olympiad problems
AlphaProof and AlphaGeometry teams, AI achieves silver-medal standard solving International Mathematical Olympiad problems, July 25, 2024.
2/18/2024
An Exact Solution to Wordle
Dimitris Bertsimas, Alex Paskov (2024) An Exact Solution to Wordle. Operations Research.
1/24/2024
Applications of Operations Research (作業研究) (including Optimization)
To strengthen your motivation to learn, the following related information is provided to help you find a direction. It is also closely tied to the algorithms inside the decision support systems you will encounter in summer internships and future jobs. Much of the content below belongs to master's- and doctoral-level courses, so it may also strengthen your motivation to attend graduate school:
- Journals:
- INFORMS Journal on Applied Analytics
- INFORMS is the leading international association for Operations Research & Analytics professionals.
- The mission of INFORMS Journal on Applied Analytics is to publish manuscripts focusing on the practice of operations research and management science and the impact this practice has on organizations throughout the world.
- Good topics to be explored for the final project
- Ramayya Krishnan and Pascal Van Hentenryck, editors, Advances in Integrating AI & O.R., INFORMS EC2021, Volume 16, April 19, 2021.
11/01/2023
Learning an Inventory Control Policy with General Inventory Arrival Dynamics
S Andaz, C Eisenach, D Madeka, K Torkkola, R Jia, D Foster, S Kakade, Learning an Inventory Control Policy with General Inventory Arrival Dynamics, 2023, arXiv preprint arXiv:2310.17168. (Amazon)
In this paper we address the problem of learning and backtesting inventory control policies in the presence of general arrival dynamics -- which we term as a quantity-over-time arrivals model (QOT). We also allow for order quantities to be modified as a post-processing step to meet vendor constraints such as order minimum and batch size constraints -- a common practice in real supply chains. To the best of our knowledge this is the first work to handle either arbitrary arrival dynamics or an arbitrary downstream post-processing of order quantities. Building upon recent work (Madeka et al., 2022) we similarly formulate the periodic review inventory control problem as an exogenous decision process, where most of the state is outside the control of the agent. Madeka et al. (2022) show how to construct a simulator that replays historic data to solve this class of problem. In our case, we incorporate a deep generative model for the arrivals process as part of the history replay. By formulating the problem as an exogenous decision process, we can apply results from Madeka et al. (2022) to obtain a reduction to supervised learning. Finally, we show via simulation studies that this approach yields statistically significant improvements in profitability over production baselines. Using data from an ongoing real-world A/B test, we show that Gen-QOT generalizes well to off-policy data.
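A minimal sketch of the exogenous-decision-process idea (the numbers, arrival shares, and base-stock policy below are invented for illustration; the paper learns its Gen-QOT arrivals model from data): historic demand is replayed unchanged while only the ordering decisions vary, so candidate policies can be backtested on the same path.

```python
import numpy as np

rng = np.random.default_rng(1)
demand_history = rng.poisson(10, size=52)  # hypothetical exogenous weekly demand
ARRIVAL_SHARES = [0.0, 0.6, 0.4]           # invented QOT-style arrivals: an order
                                           # lands 60% after one week, 40% after two

def replay(order_up_to, demand, price=5.0, holding=0.1):
    """Replay historic demand against a base-stock policy; demand is exogenous,
    so only the ordering decisions (and hence arrivals) change across policies."""
    on_hand, profit = 20.0, 0.0
    arriving = [0.0] * len(ARRIVAL_SHARES)  # units scheduled to arrive k weeks out
    for d in demand:
        on_hand += arriving.pop(0)          # receive this week's scheduled arrivals
        arriving.append(0.0)
        position = on_hand + sum(arriving)
        order = max(order_up_to - position, 0.0)
        for k, share in enumerate(ARRIVAL_SHARES):
            arriving[k] += share * order    # spread the order's arrival over time
        sales = min(on_hand, d)             # unmet demand is lost
        on_hand -= sales
        profit += price * sales - holding * on_hand
    return profit

for s in (20, 30, 40):  # compare candidate base-stock levels on the same history
    print(s, round(replay(s, demand_history), 1))
```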
9/02/2023
Champion-level drone racing using deep reinforcement learning
Kaufmann, E., Bauersfeld, L., Loquercio, A. et al. Champion-level drone racing using deep reinforcement learning. Nature 620, 982–987 (2023). https://doi.org/10.1038/s41586-023-06419-4
First-person view (FPV) drone racing is a televised sport in which professional competitors pilot high-speed aircraft through a 3D circuit. Each pilot sees the environment from the perspective of their drone by means of video streamed from an onboard camera. Reaching the level of professional pilots with an autonomous drone is challenging because the robot needs to fly at its physical limits while estimating its speed and location in the circuit exclusively from onboard sensors. Here we introduce Swift, an autonomous system that can race physical vehicles at the level of the human world champions. The system combines deep reinforcement learning (RL) in simulation with data collected in the physical world. Swift competed against three human champions, including the world champions of two international leagues, in real-world head-to-head races. Swift won several races against each of the human champions and demonstrated the fastest recorded race time. This work represents a milestone for mobile robotics and machine intelligence, which may inspire the deployment of hybrid learning-based solutions in other physical systems.
4/14/2023
Using AI to Accelerate Scientific Discovery
Demis Hassabis, Using AI to Accelerate Scientific Discovery, Institute for Ethics in AI Oxford, 2022.
12/03/2022
Interpretable Machine Learning
Cynthia Rudin, Chaofan Chen, Zhi Chen, Haiyang Huang, Lesia Semenova, and Chudi Zhong, Interpretable Machine Learning: Fundamental Principles and 10 Grand Challenges. Statistics Surveys, 2022.
Interpretability in machine learning (ML) is crucial for high stakes decisions and troubleshooting. In this work, we provide fundamental principles for interpretable ML, and dispel common misunderstandings that dilute the importance of this crucial topic. We also identify 10 technical challenge areas in interpretable machine learning and provide history and background on each problem. Some of these problems are classically important, and some are recent problems that have arisen in the last few years. These problems are: (1) Optimizing sparse logical models such as decision trees; (2) Optimization of scoring systems; (3) Placing constraints into generalized additive models to encourage sparsity and better interpretability; (4) Modern case-based reasoning, including neural networks and matching for causal inference; (5) Complete supervised disentanglement of neural networks; (6) Complete or even partial unsupervised disentanglement of neural networks; (7) Dimensionality reduction for data visualization; (8) Machine learning models that can incorporate physics and other generative or causal constraints; (9) Characterization of the “Rashomon set” of good models; and (10) Interpretable reinforcement learning. This survey is suitable as a starting point for statisticians and computer scientists interested in working in interpretable machine learning.
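Challenge (1), sparse logical models, can be previewed with a short sketch using scikit-learn's greedy CART learner as a stand-in (the survey's challenge concerns provably optimal sparse trees, which greedy splitting does not guarantee): capping the depth and leaf count keeps the whole model small enough that it is its own explanation.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Sparsity via hard caps on depth and leaf count: the model stays readable.
tree = DecisionTreeClassifier(max_depth=3, max_leaf_nodes=5, random_state=0)
tree.fit(X_tr, y_tr)

print(export_text(tree, feature_names=list(X.columns)))  # the model is the explanation
print("test accuracy:", tree.score(X_te, y_te))
```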
10/14/2022
Discovering faster matrix multiplication algorithms with reinforcement learning
Fawzi, A., Balog, M., Huang, A. et al. Discovering faster matrix multiplication algorithms with reinforcement learning. Nature 610, 47–53 (2022). https://doi.org/10.1038/s41586-022-05172-4. (data and code)
Improving the efficiency of algorithms for fundamental computations can have a widespread impact, as it can affect the overall speed of a large amount of computations. Matrix multiplication is one such primitive task, occurring in many systems—from neural networks to scientific computing routines. The automatic discovery of algorithms using machine learning offers the prospect of reaching beyond human intuition and outperforming the current best human-designed algorithms. However, automating the algorithm discovery procedure is intricate, as the space of possible algorithms is enormous. Here we report a deep reinforcement learning approach based on AlphaZero for discovering efficient and provably correct algorithms for the multiplication of arbitrary matrices. Our agent, AlphaTensor, is trained to play a single-player game where the objective is finding tensor decompositions within a finite factor space. AlphaTensor discovered algorithms that outperform the state-of-the-art complexity for many matrix sizes. Particularly relevant is the case of 4 × 4 matrices in a finite field, where AlphaTensor’s algorithm improves on Strassen’s two-level algorithm for the first time, to our knowledge, since its discovery 50 years ago. We further showcase the flexibility of AlphaTensor through different use-cases: algorithms with state-of-the-art complexity for structured matrix multiplication and improved practical efficiency by optimizing matrix multiplication for runtime on specific hardware. Our results highlight AlphaTensor’s ability to accelerate the process of algorithmic discovery on a range of problems, and to optimize for different criteria.
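To make the object of AlphaTensor's search concrete, Strassen's classical scheme multiplies 2 × 2 block matrices with 7 block multiplications instead of 8; here is a NumPy sketch of that scheme (AlphaTensor's own discovered decompositions are not reproduced here):

```python
import numpy as np

def strassen_2x2_blocks(A, B):
    """One level of Strassen: 7 block multiplications instead of 8."""
    n = A.shape[0] // 2
    A11, A12, A21, A22 = A[:n, :n], A[:n, n:], A[n:, :n], A[n:, n:]
    B11, B12, B21, B22 = B[:n, :n], B[:n, n:], B[n:, :n], B[n:, n:]
    M1 = (A11 + A22) @ (B11 + B22)
    M2 = (A21 + A22) @ B11
    M3 = A11 @ (B12 - B22)
    M4 = A22 @ (B21 - B11)
    M5 = (A11 + A12) @ B22
    M6 = (A21 - A11) @ (B11 + B12)
    M7 = (A12 - A22) @ (B21 + B22)
    return np.block([[M1 + M4 - M5 + M7, M3 + M5],
                     [M2 + M4,           M1 - M2 + M3 + M6]])

A, B = np.random.rand(4, 4), np.random.rand(4, 4)
assert np.allclose(strassen_2x2_blocks(A, B), A @ B)  # matches the naive product
```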
10/08/2022
Deep reinforcement learning for inventory control
R.N. Boute, J. Gijsbrechts, W. van Jaarsveld, and N. Vanvuchelen, Deep reinforcement learning for inventory control: A roadmap, European Journal of Operational Research, Volume 298, Issue 2, 16 April 2022, Pages 401-412.
Deep reinforcement learning (DRL) has shown great potential for sequential decision-making, including early developments in inventory control. Yet, the abundance of choices that come with designing a DRL algorithm, combined with the intense computational effort to tune and evaluate each choice, may hamper their application in practice. This paper describes the key design choices of DRL algorithms to facilitate their implementation in inventory control. We also shed light on possible future research avenues that may elevate the current state-of-the-art of DRL applications for inventory control and broaden their scope by leveraging and improving on the structural policy insights within inventory research. Our discussion and roadmap may also spur future research in other domains within operations management.
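A minimal sketch of the kind of design choices the roadmap discusses (the newsvendor setting, prices, and hyperparameters below are invented for illustration, not taken from the paper): a Gaussian policy over the order quantity, trained by REINFORCE with a running-average baseline for variance reduction.

```python
import numpy as np

rng = np.random.default_rng(0)
price, cost = 5.0, 3.0  # illustrative sell price and unit cost

# Design choices made explicit: Gaussian policy, REINFORCE gradient,
# and a running-average baseline to reduce gradient variance.
mu, sigma, lr = 5.0, 2.0, 0.02
baseline = 0.0
for _ in range(5000):
    q = rng.normal(mu, sigma)                    # sample an order quantity
    demand = rng.poisson(10)
    reward = price * min(q, demand) - cost * q   # single-period newsvendor profit
    baseline += 0.05 * (reward - baseline)       # running average of rewards
    grad_log = (q - mu) / sigma**2               # d/dmu of log N(q; mu, sigma)
    mu += lr * (reward - baseline) * grad_log    # policy-gradient ascent step

print(round(mu, 2))  # drifts toward the optimal newsvendor quantity (about 9 here)
```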
6/19/2022
Online Network Revenue Management Using Thompson Sampling
Kris Johnson Ferreira, David Simchi-Levi, and He Wang. (2018). “Online network revenue management using Thompson sampling.” Operations Research, 66(6), 1586-1602. (Supplemental Material, code, MSOM Society 2021 Operations Research Best OM Paper Award)
We consider a price-based network revenue management problem in which a retailer aims to maximize revenue from multiple products with limited inventory over a finite selling season. As is common in practice, we assume the demand function contains unknown parameters that must be learned from sales data. In the presence of these unknown demand parameters, the retailer faces a trade-off commonly referred to as the “exploration-exploitation trade-off.” Toward the beginning of the selling season, the retailer may offer several different prices to try to learn demand at each price (“exploration” objective). Over time, the retailer can use this knowledge to set a price that maximizes revenue throughout the remainder of the selling season (“exploitation” objective). We propose a class of dynamic pricing algorithms that builds on the simple, yet powerful, machine learning technique known as “Thompson sampling” to address the challenge of balancing the exploration-exploitation trade-off under the presence of inventory constraints. Our algorithms have both strong theoretical performance guarantees and promising numerical performance results when compared with other algorithms developed for similar settings. Moreover, we show how our algorithms can be extended for use in general multiarmed bandit problems with resource constraints as well as in applications in other revenue management settings and beyond.
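A stripped-down sketch of the core idea (single product, Bernoulli purchases, and Beta priors are invented for illustration; the paper's algorithms additionally solve a linear program each period to respect the inventory constraints): sample demand rates from the posterior, price as if the sample were true, and update with the observed sale.

```python
import numpy as np

rng = np.random.default_rng(0)
prices = np.array([4.0, 6.0, 8.0])
true_prob = np.array([0.8, 0.5, 0.3])  # hypothetical purchase probabilities
alpha, beta = np.ones(3), np.ones(3)   # Beta(1, 1) prior on demand at each price

inventory, revenue = 100, 0.0
for _ in range(400):                     # finite selling season
    if inventory == 0:
        break
    theta = rng.beta(alpha, beta)        # Thompson step: sample a demand model
    k = int(np.argmax(prices * theta))   # price that is best under the sample
    sale = rng.random() < true_prob[k]   # observe one (simulated) customer
    alpha[k] += sale                     # posterior update from the outcome
    beta[k] += 1 - sale
    if sale:
        inventory -= 1
        revenue += prices[k]

print("revenue:", revenue, "inventory left:", inventory)
```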
6/03/2022
OR-Gym
Christian D. Hubbs, Hector D. Perez, Owais Sarwar, Nikolaos V. Sahinidis, Ignacio E. Grossmann, John M. Wassick, OR-Gym: A Reinforcement Learning Library for Operations Research Problems, arXiv:2008.06319v2. (Python)
Reinforcement learning (RL) has been widely applied to game-playing and surpassed the best human-level performance in many domains, yet there are few use-cases in industrial or commercial settings. We introduce OR-Gym, an open-source library for developing reinforcement learning algorithms to address operations research problems. In this paper, we apply reinforcement learning to the knapsack, multi-dimensional bin packing, multi-echelon supply chain, and multi-period asset allocation model problems, as well as benchmark the RL solutions against MILP and heuristic models. These problems are used in logistics, finance, engineering, and are common in many business operation settings. We develop environments based on prototypical models in the literature and implement various optimization and heuristic models in order to benchmark the RL results. By re-framing a series of classic optimization problems as RL tasks, we seek to provide a new tool for the operations research community, while also opening those in the RL community to many of the problems and challenges in the OR field.
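The re-framing the abstract describes can be sketched with a hand-rolled gym-style knapsack environment (an illustration only, not OR-Gym's actual API): at step t the agent accepts or skips item t, and a value-density heuristic stands in for the kind of benchmark policy the paper compares RL against.

```python
class KnapsackEnv:
    """Toy gym-style knapsack: at step t, accept (1) or skip (0) item t."""
    def __init__(self, values, weights, capacity):
        self.values, self.weights, self.capacity = values, weights, capacity

    def reset(self):
        self.t, self.load = 0, 0
        return (self.t, self.load)

    def step(self, action):
        reward = 0.0
        if action == 1 and self.load + self.weights[self.t] <= self.capacity:
            self.load += self.weights[self.t]
            reward = self.values[self.t]          # reward = value of accepted item
        self.t += 1
        done = self.t == len(self.values)
        return (self.t, self.load), reward, done

# Benchmark policy: greedy by value density, as a stand-in heuristic baseline.
values, weights = [10, 7, 4, 3], [5, 4, 3, 1]
env, total = KnapsackEnv(values, weights, capacity=8), 0.0
obs, done = env.reset(), False
while not done:
    act = int(values[env.t] / weights[env.t] >= 2.0)  # density-threshold rule
    obs, r, done = env.step(act)
    total += r
print("greedy value:", total)
```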
5/07/2022
Outracing champion Gran Turismo drivers with deep reinforcement learning
Wurman, P.R., Barrett, S., Kawamoto, K. et al. Outracing champion Gran Turismo drivers with deep reinforcement learning. Nature 602, 223–228 (2022). https://doi.org/10.1038/s41586-021-04357-7.
Many potential applications of artificial intelligence involve making real-time decisions in physical systems while interacting with humans. Automobile racing represents an extreme example of these conditions; drivers must execute complex tactical manoeuvres to pass or block opponents while operating their vehicles at their traction limits. Racing simulations, such as the PlayStation game Gran Turismo, faithfully reproduce the non-linear control challenges of real race cars while also encapsulating the complex multi-agent interactions. Here we describe how we trained agents for Gran Turismo that can compete with the world’s best e-sports drivers. We combine state-of-the-art, model-free, deep reinforcement learning algorithms with mixed-scenario training to learn an integrated control policy that combines exceptional speed with impressive tactics. In addition, we construct a reward function that enables the agent to be competitive while adhering to racing’s important, but under-specified, sportsmanship rules. We demonstrate the capabilities of our agent, Gran Turismo Sophy, by winning a head-to-head competition against four of the world’s best Gran Turismo drivers. By describing how we trained championship-level racers, we demonstrate the possibilities and challenges of using these techniques to control complex dynamical systems in domains where agents must respect imprecisely defined human norms.
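The reward-design point (competitiveness plus under-specified sportsmanship norms) can be caricatured in a one-function sketch; the terms and weights below are invented for illustration, not Gran Turismo Sophy's actual reward.

```python
def race_reward(progress_m, collision, off_track, w_collision=5.0, w_off=2.0):
    """Illustrative composite reward: course progress minus penalties that
    softly encode sportsmanship/safety norms (terms and weights invented)."""
    return progress_m - w_collision * float(collision) - w_off * float(off_track)

print(race_reward(12.5, collision=False, off_track=False))  # 12.5
print(race_reward(12.5, collision=True,  off_track=False))  # 7.5
```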