Samuel G. Finlayson et al., The Clinician and Dataset Shift in Artificial Intelligence, New England Journal of Medicine, 2021; 385:283-286.
A major driver of AI system malfunction is known as “dataset shift.” Most clinical AI systems today use machine learning, algorithms that leverage statistical methods to learn key patterns from clinical data. Dataset shift occurs when a machine-learning system underperforms because of a mismatch between the data set with which it was developed and the data on which it is deployed. For example, the University of Michigan Hospital implemented the widely used sepsis-alerting model developed by Epic Systems; in April 2020, the model had to be deactivated because of spurious alerting owing to changes in patients’ demographic characteristics associated with the coronavirus disease 2019 pandemic. This was a case in which dataset shift fundamentally altered the relationship between fevers and bacterial sepsis, leading the hospital’s clinical AI governing committee (which one of the authors of this letter chairs) to decommission its use. This is an extreme example; many causes of dataset shift are more subtle. In Table 1, we present common causes of dataset shift, which we group into changes in technology (e.g., software vendors), changes in population and setting (e.g., new demographics), and changes in behavior (e.g., new reimbursement incentives); the list is not meant to be exhaustive.
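As a rough illustration of the mismatch the excerpt describes, the sketch below compares one input feature between a development sample and a deployment sample using a hand-written two-sample Kolmogorov-Smirnov statistic. The function name `ks_statistic`, the synthetic body-temperature values, and the sample sizes are all assumptions made for illustration; none of them come from the cited papers. A check like this only flags a change in a feature's distribution. It does not detect the kind of shift in the excerpt's sepsis example, where the relationship between the feature and the outcome changed.

```python
import random
from bisect import bisect_right


def ks_statistic(reference, deployed):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the two empirical CDFs,
    a value between 0 (identical samples) and 1 (no overlap).
    """
    ref = sorted(reference)
    dep = sorted(deployed)
    gap = 0.0
    # The largest gap is always attained at one of the observed values.
    for x in ref + dep:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_dep = bisect_right(dep, x) / len(dep)
        gap = max(gap, abs(cdf_ref - cdf_dep))
    return gap


rng = random.Random(0)

# Synthetic body temperatures in degrees Celsius (made-up numbers, not clinical data).
development = [rng.gauss(37.0, 0.5) for _ in range(2000)]
same_population = [rng.gauss(37.0, 0.5) for _ in range(2000)]
shifted_population = [rng.gauss(37.6, 0.7) for _ in range(2000)]

print("no shift:", round(ks_statistic(development, same_population), 3))
print("shift:   ", round(ks_statistic(development, shifted_population), 3))
```

The statistic for the shifted sample should come out clearly larger than for the sample drawn from the same distribution. In practice one would pick an alert threshold or a significance test per feature, which this sketch leaves out.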
Deb Raji, There’s more to data than distributions, Mar 31, 2022.
Jose G. Moreno-Torres et al., A unifying view on dataset shift in classification, Pattern Recognition, Volume 45, Issue 1, January 2012, Pages 521-530.