4/09/2022

Efficient and targeted COVID-19 border testing via reinforcement learning

Bastani, H., Drakopoulos, K., Gupta, V. et al. Efficient and targeted COVID-19 border testing via reinforcement learning. Nature 599, 108–113 (2021). https://doi.org/10.1038/s41586-021-04014-z (EVA Public Dataset, Off-Policy and Counterfactual Analysis, Open-Source code for Project Eva)

Throughout the coronavirus disease 2019 (COVID-19) pandemic, countries have relied on a variety of ad hoc border control protocols to allow for non-essential travel while safeguarding public health, from quarantining all travellers to restricting entry from select nations on the basis of population-level epidemiological metrics such as cases, deaths or testing positivity rates. Here we report the design and performance of a reinforcement learning system, nicknamed Eva. In the summer of 2020, Eva was deployed across all Greek borders to limit the influx of asymptomatic travellers infected with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), and to inform border policies through real-time estimates of COVID-19 prevalence. In contrast to country-wide protocols, Eva allocated Greece’s limited testing resources on the basis of incoming travellers’ demographic information and testing results from previous travellers. By comparing Eva’s performance against modelled counterfactual scenarios, we show that Eva identified 1.85 times as many asymptomatic, infected travellers as random surveillance testing, with up to 2–4 times as many during peak travel, and 1.25–1.45 times as many asymptomatic, infected travellers as testing policies that utilize only epidemiological metrics. We demonstrate that this latter benefit arises, at least partially, because population-level epidemiological metrics had limited predictive value for the actual prevalence of SARS-CoV-2 among asymptomatic travellers and exhibited strong country-specific idiosyncrasies in the summer of 2020. Our results raise serious concerns on the effectiveness of country-agnostic internationally proposed border control policies that are based on population-level epidemiological metrics. Instead, our work represents a successful example of the potential of reinforcement learning and real-time data for safeguarding public health.

Allocating scarce tests

The testing allocation decision is entirely algorithmic and balances two objectives. First, given current information, Eva seeks to maximize the number of infected asymptomatic travellers identified (exploitation). Second, Eva strategically allocates some tests to traveller types for which it does not currently have precise estimates to better learn their prevalence (exploration). This is a crucial feedback step. Today’s allocations will determine the available data in the prevalence estimation step above when determining future prevalence estimates. Hence, if Eva simply (greedily) sought to allocate tests to types that currently had high prevalence, then, in a few days, it would not have any recent testing data about many other types that had moderate prevalence. Since COVID-19 prevalence can spike quickly and unexpectedly, this would leave a ‘blind spot’ for the algorithm and pose a serious public health risk. Such allocation problems can be formulated as multi-armed bandits—which are widely studied within the body of literature on reinforcement learning—and have been used in numerous applications such as mobile health, clinical trial design, online advertising and recommender systems.
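The exploitation/exploration trade-off described above can be illustrated with a small bandit sketch. This is not Eva's actual allocation rule; it swaps in plain Thompson sampling over a Beta-Bernoulli model as a generic stand-in, and it ignores details such as delayed test results and per-type arrival limits. The function names, the traveller-type labels and the counts in `history` are all hypothetical, made up for illustration.

```python
import random


def empirical_rate(positives, negatives):
    """Observed positivity rate; 0.0 when a type has no test results yet."""
    tested = positives + negatives
    return positives / tested if tested else 0.0


def greedy_allocation(counts, budget):
    """Pure exploitation: every test goes to the type with the highest observed rate."""
    best = max(counts, key=lambda t: empirical_rate(*counts[t]))
    return {t: (budget if t == best else 0) for t in counts}


def thompson_allocation(counts, budget, rng):
    """Thompson sampling (an assumed stand-in, not Eva's rule).

    For each test, draw a plausible prevalence for every type from its
    Beta(1 + positives, 1 + negatives) posterior and give the test to the
    type with the largest draw. Types with little data have wide posteriors,
    so they still receive some tests (exploration).
    """
    allocation = {t: 0 for t in counts}
    for _ in range(budget):
        draws = {
            t: rng.betavariate(1 + pos, 1 + neg)
            for t, (pos, neg) in counts.items()
        }
        chosen = max(draws, key=draws.get)
        allocation[chosen] += 1
    return allocation


# Hypothetical history: (positives, negatives) per traveller type.
history = {
    "type_A": (20, 180),  # well-measured, high observed prevalence
    "type_B": (2, 198),   # well-measured, low observed prevalence
    "type_C": (0, 3),     # barely tested: a potential blind spot
}

greedy = greedy_allocation(history, budget=50)
sampled = thompson_allocation(history, budget=50, rng=random.Random(0))
print("greedy:  ", greedy)
print("thompson:", sampled)
```

In this sketch the greedy rule sends the whole budget to type_A and never revisits type_C, which is the blind spot the passage warns about, while the sampling rule keeps spending part of the budget on the poorly measured type.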
