Artificial Intelligence in Medicine – Part 7: AI: As Good or Better Than Radiologists at Spotting Breast Cancer
Roughly 1 out of 8 women will be diagnosed with invasive breast cancer (BCa). Screening mammograms are able to detect very early lesions before symptoms, such as a palpable lump, occur. Early detection facilitates a diagnosis during a period when the probability of treatment success is highest.
Standard 2D mammograms are the most widely used method of BCa screening, though a newer method called digital breast tomosynthesis (DBT, or 3D mammography) may improve a radiologist’s ability to diagnose BCa, especially in dense breast tissue. However, reading large numbers of mammograms is labor intensive, and reader accuracy varies.
There is much interest in the use of Artificial Intelligence (AI) to implement computer-aided detection (CAD) of cancer-suspicious abnormalities. More than one study has concluded that deep learning-driven CAD can achieve “…a cancer detection accuracy comparable to an average breast radiologist…”[i]
CAD as assistant
One application of CAD would be to assist busy radiologists by highlighting suspicious areas so that abnormalities are easier to single out. For instance, a recent study explored using a deep-learning CAD algorithm with multiparametric MRI (mpMRI) of the breast as a diagnostic assistant to a radiologist. The authors found that the additional AI information could “…improve diagnostic performance by reducing the false-positive rate and improving the positive predictive value in breast imaging interpretation,”[ii] essentially acting as a radiology team member.
CAD as independent reader
Does CAD have the potential to be “…an independent reader making an overall assessment of the whole examination without radiologist intervention”? That question is explored in a meticulous analytic comparison of three commercial AI algorithms (AI-1, AI-2, and AI-3) and two trained radiology readers, in combination and as independent mammography readers (Salim, et al. 2020).[iii]
The authors drew upon a Swedish mammography screening database, from which they included women aged 40 to 74 years who were diagnosed with BCa between 2008 and 2015. All women had a complete screening exam prior to diagnosis, had no previous BCa, and did not have implants. The final study sample included 739 women who were diagnosed with BCa within 12 months of the mammogram, either through screening (618 women) or through clinical testing (121 women). The authors also included a random sample of 8066 controls who were negative for BCa. Swedish mammographic reading is labor intensive: the national protocol, which was observed in all of the above cases, mandates the following:
- A 1st and a 2nd reader each interpret every scan, with 2nd readers often having more experience.
- Each reader assigns “normal” or “abnormal.”
- Any abnormal reading requires a consensus reading (the 2 readers agree on either “normal” or “recall”).
Thus, it is easy to see how many hundreds of radiologist hours spent interpreting and discussing images might be saved if accurate AI software reviewed the images first. By passing along only the images with suspected abnormalities, such software could potentially cut hundreds of hours of reading time to minutes.
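For readers who think in code, the double-reading protocol described above can be sketched in a few lines of Python. This is only an illustration of the decision flow as summarized in this article; the names Reading, needs_consensus and final_decision are hypothetical and do not come from the study or from any screening software.

```python
from enum import Enum
from typing import Optional


class Reading(Enum):
    NORMAL = "normal"
    ABNORMAL = "abnormal"


def needs_consensus(first: Reading, second: Reading) -> bool:
    """A consensus reading is required if either reader flags the exam."""
    return Reading.ABNORMAL in (first, second)


def final_decision(first: Reading, second: Reading,
                   consensus_recall: Optional[bool] = None) -> str:
    """Return "normal" or "recall" for one screening exam.

    consensus_recall is the outcome of the consensus reading (True means
    recall). It is only consulted when at least one reading is abnormal.
    """
    if not needs_consensus(first, second):
        return "normal"
    if consensus_recall is None:
        raise ValueError("an abnormal reading requires a consensus outcome")
    return "recall" if consensus_recall else "normal"


# Both readers agree the exam is normal: no consensus reading is needed.
print(final_decision(Reading.NORMAL, Reading.NORMAL))
# One reader flags the exam and the consensus reading decides to recall.
print(final_decision(Reading.NORMAL, Reading.ABNORMAL, consensus_recall=True))
```

Every exam passes through two human readings before this logic even starts, which is where the workload comes from.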
For the study, the findings of each of the three commercial AI algorithms were compared against the actual clinical records of the 739 BCa patients. The findings of each reader, plus the consensus findings, were compared in the same way. All findings were statistically analyzed for sensitivity, specificity and several other factors. So, did the Salim study find evidence of AI accuracy that would justify considering computers for the preliminary reading and filtering of cases?
The answer is yes, especially for the AI-1 algorithm, which significantly outperformed the other two (there was no significant difference between AI-2 and AI-3). When each of the three AI algorithms was combined with each of the 2 radiologists, the highest sensitivity was achieved by the combination of AI-1 plus the first radiologist reading. According to the authors, “No other examined combination of AI algorithms and radiologists surpassed this sensitivity level.” Put another way, combining the best algorithm with first readers “…identified more cases positive for cancer than combining first readers with second readers.”
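To make these measures concrete, here is a minimal Python sketch of how sensitivity and specificity are calculated, and of how pairing a reader with an algorithm can change them. Two things are assumptions made purely for illustration: the rule that a combined reading is positive when either party flags the exam, and the tiny data set, which is invented and is not taken from the Salim study.

```python
from typing import List, Sequence


def sensitivity(flags: Sequence[bool], has_cancer: Sequence[bool]) -> float:
    """Share of cancer cases that were flagged (true-positive rate)."""
    hits = [f for f, c in zip(flags, has_cancer) if c]
    return sum(hits) / len(hits)


def specificity(flags: Sequence[bool], has_cancer: Sequence[bool]) -> float:
    """Share of cancer-free cases that were not flagged (true-negative rate)."""
    cleared = [not f for f, c in zip(flags, has_cancer) if not c]
    return sum(cleared) / len(cleared)


def combine_either(a: Sequence[bool], b: Sequence[bool]) -> List[bool]:
    """Assumed combination rule: positive if either party flags the exam."""
    return [x or y for x, y in zip(a, b)]


# Invented toy data: 4 exams with cancer followed by 6 without.
has_cancer = [True] * 4 + [False] * 6
reader = [True, True, False, False] + [False] * 5 + [True]
ai = [True, False, True, False] + [True] + [False] * 5
combined = combine_either(reader, ai)

for name, flags in [("reader", reader), ("AI", ai), ("combined", combined)]:
    print(name, sensitivity(flags, has_cancer), specificity(flags, has_cancer))
```

Under an either-flags rule, the combination can never detect fewer cancers than either party alone, but it can flag more healthy women, so a gain in sensitivity may come at some cost in specificity.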
These results suggest that teamwork between high-performing AI and trained radiologists could not only save time but also improve screening accuracy. While the authors did not suggest that their top-ranking algorithm is ready to be the sole interpreter of women’s screening mammograms, its performance in this study earned their recommendation that it be put to the test as an independent reader in prospective clinical trials.
It increasingly appears that the future of AI to evaluate screening mammograms is now.
NOTE: This content is solely for purposes of information and does not substitute for diagnostic or medical advice. Talk to your doctor if you are experiencing pelvic pain, or have any other health concerns or questions of a personal medical nature.
[i] Rodriguez-Ruiz A, Lång K, Gubern-Merida A, Broeders M et al. Stand-Alone Artificial Intelligence for Breast Cancer Detection in Mammography: Comparison With 101 Radiologists. JNCI. 2019 Sep;111(9):916-922.
[ii] Hu Q, Whitney HM, Giger ML. A deep learning methodology for improved breast cancer diagnosis using multiparametric MRI. Sci Rep. 2020 Jun 29;10(1):10536.
[iii] Salim M, Wåhlin E, Dembrower K, Azavedo E et al. External Evaluation of 3 Commercial Artificial Intelligence Algorithms for Independent Assessment of Screening Mammograms. JAMA Oncol. Published online August 27, 2020. doi:10.1001/jamaoncol.2020.3321