Cynthia Rudin, “Leo Breiman, the Rashomon Effect, and the Occam Dilemma,” arXiv:2507.03884, 2025.
In the famous “Two Cultures” paper, Leo Breiman provided a visionary perspective on the
cultures of “data models” (modeling with consideration of data generation) versus “algorithmic models” (vanilla machine learning models). I provide a modern perspective on these two approaches. One of
Breiman’s key arguments against data models is what he called the “Rashomon Effect,” which is the existence of many different-but-equally-good models. The Rashomon Effect implies that data modelers would
not be able to determine which model generated the data. Conversely, one of his core arguments in favor
of data models is simplicity: he claimed there exists an “Occam Dilemma,” i.e., an accuracy-simplicity
tradeoff, under which algorithmic models must be complex in order to be accurate. After 25 years of increasingly powerful computers, it has become clear that this claim is not generally true, in that algorithmic models do
not need to be complex to be accurate; however, there are nuances that help explain Breiman’s logic.
Specifically, by “simple,” he appears to have considered only linear models or unoptimized decision trees.
Interestingly, the Rashomon Effect is a key tool in proving that the Occam Dilemma does not hold. In
fairness to Breiman, he did not have the benefit of modern computers, with which my observations are
much easier to make.