Can the notes that general practitioners (GPs) make in their patients' files contribute to earlier cancer detection? Two professors from Amsterdam UMC have started a project in which artificial intelligence sifts through these texts. In some cases of lung and colorectal cancer, the system sounded the alarm three to four months earlier than a GP.

General practitioners play an important role in detecting cancer. More than 90 percent of patients diagnosed with cancer in hospitals were referred by a general practitioner who considered further investigation advisable. Any delay in diagnosis has an impact. In lung cancer, for example, starting treatment four weeks later leads to a 6 percent increase in mortality. Conversely, earlier diagnosis is likely to lower that death rate.

“However, a GP will only refer if there are sufficient indications of cancer,” says Henk van Weert, professor of General Practice. “Sometimes this takes a few consultations, because patients often initially only come to the GP with vague symptoms, which are not sufficiently worrying.”

With clearer symptoms that may indicate cancer, most patients see a specialist within two weeks. In that respect, according to Van Weert, general practitioners do their job quite well. “With the current state of science, there seems to be little time to gain, and therefore little health gain, in the part of the process that lies with the general practitioner.”

Free text in patient records

Nevertheless, Van Weert and his colleague professor Niek de Wit of UMC Utrecht wondered in 2016 whether general practitioners could still contribute to detecting cancer even earlier with the help of the patient files they keep. “General practitioners often have contact with a patient for ten years or longer, sometimes also with their family and children,” explains Van Weert. “During that period, they record everything in their patient files: the reason for a consultation, diagnoses, test results, medicines, social information, work. Some information is coded, but the GP also notes a lot in free text. Such a file is therefore a rich source of data that may contain precursors, in one form or another, of the type of cancer the patient has developed.”

Searching for clues in the files

Van Weert and De Wit conceived the idea of using artificial intelligence to examine the files of a large number of cancer patients for certain terms and (parts of) sentences to detect similarities. They approached Professor Ameen Abu-Hanna of the Department of Clinical Informatics. He was willing to take on the challenge of developing an algorithm that signals whether a patient may have cancer, even before the patient shows clear symptoms that could indicate it.

“We had no idea beforehand whether those clues could actually be found in the files,” says Abu-Hanna. The project was christened 'AI-DOC': Artificial Intelligence for earlier Detection Of Cancer.

To begin with, a fruitful analysis required a large number of patient files. Together with AI-DOC partners UMC Utrecht and UMC Groningen, Amsterdam UMC had access to 1.5 million patient files via a secure network. These were anonymized files, made available by general practitioners for research purposes. Selecting only patients over forty years old left 550,000 files.

The researchers supplemented these files with data from the National Cancer Registry, so that they knew for sure which patients actually had cancer and when their diagnosis was made through tissue examination. “In doing so, we prevented misleading information in patient records from contaminating our analyses,” explains Van Weert. “For example, a note about the father's colon cancer, while the patient has never had colon cancer at all.”

After enriching the data, the researchers were able to perform their analyses on all files of patients with the same type of cancer. The files of patients without that form of cancer served as a control group. They looked at lung, colon, ovarian and pancreatic cancer, forms of cancer for which much can be gained from earlier detection.
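The labeling step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the AI-DOC code: the `PatientFile` class, the `split_cases_and_controls` function, the one-entry-per-patient registry and all sample data are hypothetical. It shows why a free-text mention of a relative's cancer does not turn a patient into a case when the registry decides the label.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Tuple


@dataclass
class PatientFile:
    """Hypothetical, simplified GP file: an ID plus free-text notes."""
    patient_id: str
    notes: List[str]


# Hypothetical registry extract: patient ID -> (cancer type, diagnosis date).
# Assumption: at most one registered cancer per patient.
Registry = Dict[str, Tuple[str, date]]


def split_cases_and_controls(
    files: List[PatientFile], registry: Registry, cancer_type: str
) -> Tuple[List[str], List[str]]:
    """Label patients from the registry only, never from the note text."""
    cases, controls = [], []
    for patient_file in files:
        entry = registry.get(patient_file.patient_id)
        if entry is not None and entry[0] == cancer_type:
            cases.append(patient_file.patient_id)
        else:
            # Patients without this form of cancer serve as controls.
            controls.append(patient_file.patient_id)
    return cases, controls


# Made-up sample data for illustration.
files = [
    PatientFile("p1", ["abdominal pain", "fatigue"]),
    PatientFile("p2", ["father had colon cancer"]),  # misleading free text
]
registry = {"p1": ("colorectal", date(2015, 3, 1))}

cases, controls = split_cases_and_controls(files, registry, "colorectal")
print(cases, controls)
```

Because the note text is never consulted for labeling, patient `p2` ends up in the control group despite the mention of colon cancer in the file.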

Colon cancer

The first results have been presented at the Intelligence Health Congress 2021, but have not yet been published in a scientific journal. The developed algorithm turned out to be able to detect cancer three to four months before referral by a general practitioner for lung cancer and colorectal cancer.

However, only one in sixteen patients flagged by the algorithm actually had cancer, and the algorithm missed one in three cancer patients. “That means, for example, that the algorithm appears to be slightly better than the fecal immunochemical test that is currently used in the population screening for colorectal cancer,” says Van Weert.
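The two ratios quoted above correspond to standard screening measures: the positive predictive value and the sensitivity. The short Python sketch below is a back-of-the-envelope illustration, not part of the AI-DOC project; the function names are hypothetical and the inputs are the rounded ratios from the text.

```python
from fractions import Fraction


def positive_predictive_value(true_cases: int, flagged: int) -> Fraction:
    """Share of flagged patients who actually have cancer."""
    return Fraction(true_cases, flagged)


def sensitivity(missed: int, total_cases: int) -> Fraction:
    """Share of cancer patients that the algorithm does flag."""
    return 1 - Fraction(missed, total_cases)


# Rounded ratios from the article: 1 in 16 flagged patients had cancer,
# and 1 in 3 cancer patients was missed.
ppv = positive_predictive_value(1, 16)
sens = sensitivity(1, 3)

# How many flagged patients without cancer there are per true case.
false_alarms_per_true_case = (1 - ppv) / ppv

print(f"Positive predictive value: {float(ppv):.2%}")
print(f"Sensitivity: {float(sens):.1%}")
print(f"False alarms per true case: {false_alarms_per_true_case}")
```

Read this way, each correctly flagged patient comes with fifteen flagged patients who do not have cancer, while about two out of three cancer patients are picked up.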

In ovarian and pancreatic cancer, it was not possible to find early indications in patient files. Van Weert: “That may be because these cancers do not occur often enough, but we don't know that. We would have to run the algorithm on even larger amounts of patient data to get a definitive answer.”

Black box

Continuation of the AI-DOC project is dependent on further funding. “There are still many steps to be taken before we have a software program that can be used in general practice,” says Abu-Hanna. “The current algorithm is a black box. This means that we don't know exactly which words and phrases and combinations thereof are relevant as early indications of cancer. We will continue to investigate that.”

According to Abu-Hanna, this knowledge is important for the trust that general practitioners place in such a software program. “They still want to have an idea of what happens in the black box.”

Putting into practice

Whether and how GPs will eventually use the program in their practice is still an open question. Abu-Hanna: “You could build it into a system that gives a warning or reminder when a patient comes for consultation. For example, with the message: pay attention, this patient has an increased chance of X, consider Y or Z. But that is all in the future.” In addition to further development of the algorithm, additional ethical and social research is required.

Text by Frank van Kolfschooten

This article first appeared in a longer version in the popular research magazine Janus.