The overnight explosion of AI tools like ChatGPT, Google’s Bard, and Microsoft’s Bing has dazzled the public with human-like conversational ability and seemingly vast knowledge. For scientific and medical researchers, AI language models hold great potential to simplify time-consuming writing tasks, enhancing productivity and efficiency. But many researchers are wondering: what are the guidelines?

Artificial intelligence in scientific research and medical applications is nothing new. Rudimentary AI in the form of rule-based algorithms was used to diagnose diseases based on patient symptoms in the 1970s and 80s. Over the last few decades, the evolution of AI has greatly accelerated, with increasingly sophisticated systems being developed to make sense of the oceans of data generated by scientific experiments and medical imaging. At Amsterdam UMC, Cancer Center Amsterdam was a pioneer in the use of AI and paved the way for the university-wide implementation of AI in research and care.

At Your Fingertips

But the introduction of AI tools like ChatGPT is different. “It’s not just specialized tools being used by highly trained experts anymore; virtually anyone connected to the internet now has access to AI,” says Dr. Henri van de Vrugt, Chief Scientific Officer of New Haven Biosciences Consulting and communications consultant at Cancer Center Amsterdam.

For scientific researchers, ChatGPT and similar tools have enormous potential to ease tedious, time-consuming writing tasks. Besides obvious advantages like enhanced productivity, there are additional perks such as stellar editing and translation capabilities. “These tools can be used to edit your writing and suggest improvements in grammar, syntax, and word choice. This can help researchers produce higher-quality writing and improve the clarity of their scientific communications,” says Henri.

The Wild West

The swift evolution and implementation of AI language tools have also prompted fears that they could be used to perpetuate bias and amplify misinformation. For science, most of the debate has centered on the use of language models to write scientific papers. A handful of papers have already been released, and indexed in PubMed, with ChatGPT listed as an author.

“It is like the Wild West at the moment,” says Henri van de Vrugt. “There are no general regulations in place. But various publishers have started to take a stand. For example, the journal Science has announced that it will not accept papers or abstracts written with the help of AI tools, saying that submissions must be original. Nature also updated its policy to reject papers that list AI language model tools as an author, saying authors must be accountable for the work. We expect most publishers will adopt similar policies.”

Neural Hallucinations

There are important limitations to consider when using ChatGPT in research, according to Henri. “It’s not always accurate. And sometimes it just makes things up, including fabricating references. This can be hard to spot, because its answers seem very knowledgeable. Obviously, for researchers this means double-checking everything AI language models spit out to separate fact from fiction.”

Also, ChatGPT’s training only includes information published up to 2021, and it cannot actively search the internet. More recent models like GPT-4 and Microsoft’s Bing do have these capabilities, but their access is currently restricted. “In scientific research, having access to the most up-to-date information and data is critical for staying at the forefront of research,” says Henri. “So, it is important to keep that in mind.”

Mindfulness Required

There are also potential privacy concerns associated with the use of ChatGPT and similar tools in scientific research or patient care. “There is a risk that personal and sensitive information could be processed or stored on servers or shared with third parties,” says Henri. “One of the key challenges here is balancing data protection with the benefits of utilizing the technology. Proper policies and procedures need to be identified and put in place to mitigate these risks.”

The key right now is mindfulness. Henri: “ChatGPT is not inherently good or bad. It is just a tool that can be used for positive or negative purposes. The Amsterdam UMC code of conduct for scientific integrity does not specifically address the use of ChatGPT or similar AI tools at this time. But the general principles of transparency, accuracy, and honesty that are central to the code apply to the use of any tool or technique in scientific research. So if you use AI language tools, you should clearly and accurately describe how the tool was used in your research.”

Cautious Optimism

Published in BJS Open (British Journal of Surgery Open), a recent article by Boris Janssen, Geert Kazemier, and Marc Besselink examined the potential use of AI language tools in surgical science. With cautious optimism, the authors (assisted by ChatGPT) write: “Ultimately, with responsible and thoughtful implementation, these models have the potential to be a valuable tool in surgical science and clinical care by augmenting, not replacing, human expertise.”

Read the publication here: Janssen, B.V., Kazemier, G., Besselink, M.G. (2023) The use of ChatGPT and other large language models in surgical science. BJS Open. 7:zrad032. doi: 10.1093/bjsopen/zrad032.

For more information, contact Henri van de Vrugt.

Text by Laura Roy (written without the assistance of ChatGPT or other generative AI language tools).

This article was created for Cancer Center Amsterdam.

© 2023 NHBC. All rights reserved.