Understanding expectations for evidence synthesis when using AI compared to current best practice

The rapidly growing evidence base and the increasing complexity of methods make timely, high-quality evidence synthesis ever more challenging. To keep pace with the demands and expectations of funders and users of evidence synthesis, we need to make better use of automation and AI.

To do this, the evidence synthesis field needs community-based best practices that facilitate the use of automation and AI. One area of uncertainty is how 'correct' evidence synthesis should be. We do not currently have consensus on how correct it should be under current best practice (i.e., humans only), how this changes if we add AI, what an acceptable impact of errors might be, and whether this differs across types of evidence synthesis.

The Wellcome-funded DESTinY consortium (https://destiny-evidence.github.io/website/) has therefore launched a survey to better understand community expectations, which will inform future work and how the next generation of evidence synthesis tools driven by AI are built and evaluated.

The potential benefits of using AI include improved efficiency, consistency, scalability and cost-effectiveness. Potential trade-offs include reduced accuracy, which could lead to misleading treatment recommendations, misguided policy decisions, ineffective funding decisions, wasted resources, and loss of trust.

We welcome feedback from anyone interested in this topic. We believe the survey may raise more questions than it answers, and we therefore encourage you to be part of the discussion.

We ask that you consider how the benefits and trade-offs of using AI may differ across types of evidence synthesis. For each type, please consider whether errors would be critical, with severe consequences, or whether some level of error could be tolerated (i.e., where AI may not be perfect but is still effective enough to provide value).

The survey is open until 2 July 2025 and takes approximately 35 minutes to complete.

SURVEY