Shelley Rigger, Why Taiwan Matters: Small Island, Global Powerhouse
Reading and Writing
7/22/2025
7/21/2025
7/13/2025
Statistical Modeling: The Two Cultures
Cynthia Rudin, “Leo Breiman, the Rashomon Effect, and the Occam Dilemma,” arXiv:2507.03884, 2025.
In the famous “Two Cultures” paper, Leo Breiman provided a visionary perspective on the cultures of “data models” (modeling with consideration of data generation) versus “algorithmic models” (vanilla machine learning models). I provide a modern perspective on these two approaches. One of Breiman’s key arguments against data models is what he called the “Rashomon Effect,” which is the existence of many different-but-equally-good models. The Rashomon Effect implies that data modelers would not be able to determine which model generated the data. Conversely, one of his core advantages in favor of data models is simplicity, as he claimed there exists an “Occam Dilemma,” i.e., an accuracy-simplicity tradeoff, where algorithmic models must be complex in order to be accurate. After 25 years of more powerful computers, it has become clear that this claim is not generally true, in that algorithmic models do not need to be complex to be accurate; however, there are nuances that help explain Breiman’s logic, specifically, that by “simple,” he appears to consider only linear models or unoptimized decision trees. Interestingly, the Rashomon Effect is a key tool in proving the nullification of the Occam Dilemma. To his credit though, Breiman did not have the benefit of modern computers, with which my observations are much easier to make.
7/11/2025
The Batch by A. Ng
- Large-scale system: The system aggregates data generated by 240 million customers and 2 million store personnel, feeding applications that streamline operations among 100,000 suppliers, 150 distributors, and 10,000 retail venues in 19 countries.
7/10/2025
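What Is a “University Student”
(25/6/30) Tonight was truly eye-opening. It let me experience another high point in life.
I can't sleep, so I am thinking over the method I discussed with the graduate student tonight; Chapter 16, Sampling Plans, is indeed what we need. He is a “university student.”
(25/7/1) 2:30 am and still not sleepy, so I am writing a blog post.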