Applications of ChatGPT and Large Language Models in Medicine and Health Care: Benefits and Pitfalls
The development of [artificial intelligence] is as fundamental as the creation of the microprocessor, the personal computer, the Internet, and the mobile phone. It will change the way people work, learn, travel, get health care, and communicate with each other.
Bill Gates 1
As the world emerges from the pandemic and the health care system faces new challenges, technology has become an increasingly important tool for health care professionals (HCPs). One such technology is the large language model (LLM), which has the potential to revolutionize the health care industry. ChatGPT, a popular LLM developed by OpenAI, has gained particular attention in the medical community for its ability to pass the United States Medical Licensing Examination.2 This article will explore the benefits and potential pitfalls of using LLMs like ChatGPT in medicine and health care.
Benefits
HCP burnout is a serious issue that can lead to lower productivity, increased medical errors, and decreased patient satisfaction.3 LLMs can alleviate some administrative burdens on HCPs, allowing them to focus on patient care. By assisting with billing, coding, insurance claims, and organizing schedules, LLMs like ChatGPT can free up time for HCPs to focus on what they do best: providing quality patient care.4 ChatGPT can also assist with diagnoses by providing accurate and reliable information based on a vast amount of clinical data. By learning the relationships between different medical conditions, symptoms, and treatment options, ChatGPT can suggest an appropriate differential diagnosis (Figure 1).
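As a concrete illustration of how such a diagnostic query might be assembled, the sketch below builds a structured differential-diagnosis prompt from a symptom list. The template, function name, and example inputs are hypothetical assumptions for illustration only; the assembled prompt would be sent to an LLM API, and any output would still require review by an HCP.

```python
def build_differential_prompt(symptoms, history=None):
    """Assemble a structured differential-diagnosis prompt for an LLM.

    Illustrative template only; real clinical use would require validated
    prompts, safety guardrails, and HCP review of every output.
    """
    lines = [
        "You are assisting a licensed clinician.",
        "Patient symptoms: " + "; ".join(symptoms) + ".",
    ]
    if history:
        lines.append("Relevant history: " + "; ".join(history) + ".")
    lines.append(
        "List the most likely differential diagnoses, ranked, "
        "with one-sentence rationales. Flag any red-flag findings."
    )
    return "\n".join(lines)

# Hypothetical example; the resulting string would be sent to an LLM API.
prompt = build_differential_prompt(
    ["fever", "productive cough", "pleuritic chest pain"],
    history=["smoker, 30 pack-years"],
)
```

Keeping the prompt construction in ordinary code, rather than free-typed chat input, makes the clinical query reproducible and auditable.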
Imaging medical specialists like radiologists, pathologists, dermatologists, and others can benefit from combining computer vision diagnostics with ChatGPT report creation abilities to streamline the diagnostic workflow and improve diagnostic accuracy (Figure 2).
Although using ChatGPT and other LLMs in mental health care has potential benefits, it is essential to note that they are not a substitute for human interaction and personalized care. While ChatGPT can remember information from previous conversations, it cannot provide the same level of personalized, high-quality care that a professional therapist or HCP can. However, by augmenting the work of HCPs, ChatGPT and other LLMs have the potential to make mental health care more accessible and efficient. In addition to providing effective screening in underserved areas, ChatGPT technology may improve the competence of physician assistants and nurse practitioners in delivering mental health care. With the increased incidence of mental health problems in veterans, the pertinence of a ChatGPT-like feature will only increase with time.9
ChatGPT can also be integrated into health care organizations’ websites and mobile apps, providing patients instant access to medical information, self-care advice, symptom checkers, scheduling appointments, and arranging transportation. These features can reduce the burden on health care staff and help patients stay informed and motivated to take an active role in their health. Additionally, health care organizations can use ChatGPT to engage patients by providing reminders for medication renewals and assistance with self-care.4,6,10,11
The potential of artificial intelligence (AI) in the field of medical education and research is immense. According to a study by Gilson and colleagues, ChatGPT has shown promising results as a medical education tool.12 ChatGPT can simulate clinical scenarios, provide real-time feedback, and improve diagnostic skills. It also offers new interactive and personalized learning opportunities for medical students and HCPs.13 ChatGPT can help researchers by streamlining the process of data analysis. It can also administer surveys or questionnaires, facilitate data collection on preferences and experiences, and help in writing scientific publications.14 Nevertheless, to fully unlock the potential of these AI models, additional models that perform checks for factual accuracy, plagiarism, and copyright infringement must be developed.15,16
AI Bill of Rights
To protect the American public from the harmful effects of AI models, the White House Office of Science and Technology Policy (OSTP) has released a Blueprint for an AI Bill of Rights that emphasizes 5 principles: safe and effective systems; algorithmic discrimination protections; data privacy; notice and explanation; and human alternatives, consideration, and fallback (Figure 3).17
One of the biggest challenges with LLMs like ChatGPT is the prevalence of inaccurate information, or so-called hallucinations.16 These inaccuracies stem from the inability of LLMs to distinguish between real and fake information. To prevent hallucinations, researchers have proposed several methods, including training models on more diverse data, using adversarial training methods, and adopting human-in-the-loop approaches.21 In addition, medicine-specific models like GatorTron, Med-PaLM, and Almanac have been developed, increasing the accuracy of factual results.22-24 Unfortunately, only the GatorTron model is available to the public through the NVIDIA developers’ program.25
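The knowledge-grounding idea behind models like Almanac can be sketched in miniature: retrieve vetted reference text first, then instruct the model to answer only from those sources. The toy corpus and word-overlap scoring below are simplifying assumptions for illustration, not how Almanac or any production retriever actually works.

```python
def ground_prompt(question, corpus, top_k=2):
    """Select the reference snippets that best match the question (by word
    overlap, a deliberately simple stand-in for a real retriever) and wrap
    them in a prompt that forbids answering beyond the provided sources."""
    q_words = set(question.lower().split())
    scored = sorted(
        corpus,
        key=lambda snippet: len(q_words & set(snippet.lower().split())),
        reverse=True,
    )
    context = "\n".join(f"- {s}" for s in scored[:top_k])
    return (
        "Answer using ONLY the sources below; if they are insufficient, "
        "say so rather than guessing.\n"
        f"Sources:\n{context}\n"
        f"Question: {question}"
    )

# Toy corpus of vetted statements (hypothetical examples).
corpus = [
    "Metformin is a first-line therapy for type 2 diabetes.",
    "Warfarin requires INR monitoring.",
    "Amoxicillin is a beta-lactam antibiotic.",
]
grounded = ground_prompt("What is first-line therapy for type 2 diabetes?", corpus)
```

Because the model is told to refuse when the sources are insufficient, ungrounded answers become easier to detect, which is the core of the human-in-the-loop and grounding strategies cited above.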
Despite these shortcomings, the future of LLMs in health care is promising. Although these models will not replace HCPs, they can help reduce the unnecessary burden on them, prevent burnout, and enable HCPs and patients to spend more time together. Establishing an official hospital AI oversight governing body that would promote best practices could ensure the trustworthy implementation of these new technologies.26
Conclusions
The use of ChatGPT and other LLMs in health care has the potential to revolutionize the industry. By assisting HCPs with administrative tasks, improving the accuracy and reliability of diagnoses, and engaging patients, ChatGPT can help health care organizations provide better care. While LLMs are not a substitute for human interaction and personalized care, they can augment the work of HCPs, making health care more accessible and efficient. As the health care industry continues to evolve, it will be exciting to see how ChatGPT and other LLMs are used to improve patient outcomes and quality of care. In addition, AI technologies like ChatGPT offer enormous potential in medical education and research. To ensure that the benefits outweigh the risks, it is essential to develop trustworthy AI health care products and to establish governing bodies to oversee their implementation. By doing so, we can help HCPs focus on what matters most: providing high-quality care to patients.
Acknowledgments
This material is the result of work supported by resources and the use of facilities at the James A. Haley Veterans’ Hospital.
1. Gates B. The age of AI has begun. GatesNotes. March 21, 2023. Accessed May 10, 2023. https://www.gatesnotes.com/the-age-of-ai-has-begun
2. Kung TH, Cheatham M, Medenilla A, et al. Performance of ChatGPT on USMLE: Potential for AI-assisted medical education using large language models. PLOS Digit Health. 2023;2(2):e0000198. Published 2023 Feb 9. doi:10.1371/journal.pdig.0000198
3. Shanafelt TD, West CP, Sinsky C, et al. Changes in burnout and satisfaction with work-life integration in physicians and the general US working population between 2011 and 2020. Mayo Clin Proc. 2022;97(3):491-506. doi:10.1016/j.mayocp.2021.11.021
4. Goodman RS, Patrinely JR Jr, Osterman T, Wheless L, Johnson DB. On the cusp: considering the impact of artificial intelligence language models in healthcare. Med. 2023;4(3):139-140. doi:10.1016/j.medj.2023.02.008
5. Will ChatGPT transform healthcare? Nat Med. 2023;29(3):505-506. doi:10.1038/s41591-023-02289-5
6. Hopkins AM, Logan JM, Kichenadasse G, Sorich MJ. Artificial intelligence chatbots will revolutionize how cancer patients access information: ChatGPT represents a paradigm-shift. JNCI Cancer Spectr. 2023;7(2):pkad010. doi:10.1093/jncics/pkad010
7. Babar Z, van Laarhoven T, Zanzotto FM, Marchiori E. Evaluating diagnostic content of AI-generated radiology reports of chest X-rays. Artif Intell Med. 2021;116:102075. doi:10.1016/j.artmed.2021.102075
8. Lecler A, Duron L, Soyer P. Revolutionizing radiology with GPT-based models: current applications, future possibilities and limitations of ChatGPT. Diagn Interv Imaging. 2023;S2211-5684(23)00027-X. doi:10.1016/j.diii.2023.02.003
9. Germain JM. Is ChatGPT smart enough to practice mental health therapy? March 23, 2023. Accessed May 11, 2023. https://www.technewsworld.com/story/is-chatgpt-smart-enough-to-practice-mental-health-therapy-178064.html
10. Cascella M, Montomoli J, Bellini V, Bignami E. Evaluating the feasibility of ChatGPT in healthcare: an analysis of multiple clinical and research scenarios. J Med Syst. 2023;47(1):33. Published 2023 Mar 4. doi:10.1007/s10916-023-01925-4
11. Jungwirth D, Haluza D. Artificial intelligence and public health: an exploratory study. Int J Environ Res Public Health. 2023;20(5):4541. Published 2023 Mar 3. doi:10.3390/ijerph20054541
12. Gilson A, Safranek CW, Huang T, et al. How does ChatGPT perform on the United States Medical Licensing Examination? The implications of large language models for medical education and knowledge assessment. JMIR Med Educ. 2023;9:e45312. Published 2023 Feb 8. doi:10.2196/45312
13. Eysenbach G. The role of ChatGPT, generative language models, and artificial intelligence in medical education: a conversation with ChatGPT and a call for papers. JMIR Med Educ. 2023;9:e46885. Published 2023 Mar 6. doi:10.2196/46885
14. Macdonald C, Adeloye D, Sheikh A, Rudan I. Can ChatGPT draft a research article? An example of population-level vaccine effectiveness analysis. J Glob Health. 2023;13:01003. Published 2023 Feb 17. doi:10.7189/jogh.13.01003
15. Masters K. Ethical use of artificial intelligence in health professions education: AMEE Guide No.158. Med Teach. 2023;1-11. doi:10.1080/0142159X.2023.2186203
16. Smith CS. Hallucinations could blunt ChatGPT’s success. IEEE Spectrum. March 13, 2023. Accessed May 11, 2023. https://spectrum.ieee.org/ai-hallucination
17. Executive Office of the President, Office of Science and Technology Policy. Blueprint for an AI Bill of Rights. Accessed May 11, 2023. https://www.whitehouse.gov/ostp/ai-bill-of-rights
18. Executive Office of the President. Executive Order 13960: promoting the use of trustworthy artificial intelligence in the federal government. Fed Regist. 2020;85(236):78939-78943.
19. US Department of Commerce, National Institute of Standards and Technology. Artificial Intelligence Risk Management Framework (AI RMF 1.0). Published January 2023. doi:10.6028/NIST.AI.100-1
20. Microsoft. Azure Cognitive Search—Cloud Search Service. Accessed May 11, 2023. https://azure.microsoft.com/en-us/products/search
21. Aiyappa R, An J, Kwak H, Ahn YY. Can we trust the evaluation on ChatGPT? March 22, 2023. Accessed May 11, 2023. https://arxiv.org/abs/2303.12767v1
22. Yang X, Chen A, Pournejatian N, et al. GatorTron: a large clinical language model to unlock patient information from unstructured electronic health records. Updated December 16, 2022. Accessed May 11, 2023. https://arxiv.org/abs/2203.03540v3
23. Singhal K, Azizi S, Tu T, et al. Large language models encode clinical knowledge. December 26, 2022. Accessed May 11, 2023. https://arxiv.org/abs/2212.13138v1
24. Zakka C, Chaurasia A, Shad R, Hiesinger W. Almanac: knowledge-grounded language models for clinical medicine. March 1, 2023. Accessed May 11, 2023. https://arxiv.org/abs/2303.01229v1
25. NVIDIA. GatorTron-OG. Accessed May 11, 2023. https://catalog.ngc.nvidia.com/orgs/nvidia/teams/clara/models/gatortron_og
26. Borkowski AA, Jakey CE, Thomas LB, Viswanadhan N, Mastorides SM. Establishing a hospital artificial intelligence committee to improve patient care. Fed Pract. 2022;39(8):334-336. doi:10.12788/fp.0299
18. Executive Office of the President. Executive Order 13960: promoting the use of trustworthy artificial intelligence in the federal government. Fed Regist. 2020;85(236):78939-78943.
19. US Department of Commerce, National Institute of Standards and Technology. Artificial Intelligence Risk Management Framework (AI RMF 1.0). Published January 2023. doi:10.6028/NIST.AI.100-1
20. Microsoft. Azure Cognitive Search—Cloud Search Service. Accessed May 11, 2023. https://azure.microsoft.com/en-us/products/search
21. Aiyappa R, An J, Kwak H, Ahn YY. Can we trust the evaluation on ChatGPT? March 22, 2023. Accessed May 11, 2023. https://arxiv.org/abs/2303.12767v1
22. Yang X, Chen A, Pournejatian N, et al. GatorTron: a large clinical language model to unlock patient information from unstructured electronic health records. Updated December 16, 2022. Accessed May 11, 2023. https://arxiv.org/abs/2203.03540v3
23. Singhal K, Azizi S, Tu T, et al. Large language models encode clinical knowledge. December 26, 2022. Accessed May 11, 2023. https://arxiv.org/abs/2212.13138v1
24. Zakka C, Chaurasia A, Shad R, Hiesinger W. Almanac: knowledge-grounded language models for clinical medicine. March 1, 2023. Accessed May 11, 2023. https://arxiv.org/abs/2303.01229v1
25. NVIDIA. GatorTron-OG. Accessed May 11, 2023. https://catalog.ngc.nvidia.com/orgs/nvidia/teams/clara/models/gatortron_og
26. Borkowski AA, Jakey CE, Thomas LB, Viswanadhan N, Mastorides SM. Establishing a hospital artificial intelligence committee to improve patient care. Fed Pract. 2022;39(8):334-336. doi:10.12788/fp.0299
Establishing a Hospital Artificial Intelligence Committee to Improve Patient Care
In the past 10 years, artificial intelligence (AI) applications have exploded in numerous fields, including medicine. Myriad publications report that the use of AI in health care is increasing, and AI has shown utility in many medical specialties, eg, pathology, radiology, and oncology.1,2
In cancer pathology, AI was able not only to detect various cancers, but also to subtype and grade them. In addition, AI could predict survival, the success of therapeutic response, and underlying mutations from histopathologic images.3 In other medical fields, AI applications are as notable. For example, in imaging specialties like radiology, ophthalmology, dermatology, and gastroenterology, AI is being used for image recognition, enhancement, and segmentation. In addition, AI is beneficial for predicting disease progression, survival, and response to therapy in other medical specialties. Finally, AI may help with administrative tasks like scheduling.
However, many obstacles to successfully implementing AI programs in the clinical setting exist, including clinical data limitations and ethical use of data, trust in the AI models, regulatory barriers, and lack of clinical buy-in due to insufficient basic AI understanding.2 To address these barriers, we created a formal governing body at James A. Haley Veterans’ Hospital in Tampa, Florida. The hospital AI committee charter was officially approved on July 22, 2021. Our model could be used by both US Department of Veterans Affairs (VA) and non-VA hospitals throughout the country.
AI Committee
The vision of the AI committee is to improve outcomes and experiences for our veterans by developing trustworthy AI capabilities to support the VA mission. The mission is to build robust capacity in AI to create and apply innovative AI solutions and transform the VA by facilitating a learning environment that supports the delivery of world-class benefits and services to our veterans. Our vision and mission are aligned with the VA National AI Institute.4
The AI Committee comprises 7 subcommittees: ethics, AI clinical product evaluation, education, data sharing and acquisition, research, 3D printing, and improvement and innovation. The role of the ethics subcommittee is to ensure the ethical and equitable implementation of clinical AI. We created the ethics subcommittee guidelines based on the World Health Organization ethics and governance of AI for health documents.5 They include 6 basic principles: protecting human autonomy; promoting human well-being and safety and the public interest; ensuring transparency, explainability, and intelligibility; fostering responsibility and accountability; ensuring inclusiveness and equity; and promoting AI that is responsive and sustainable (Table 1).
As the name indicates, the role of the AI clinical product evaluation subcommittee is to evaluate commercially available clinical AI products. More than 400 US Food and Drug Administration–approved AI medical applications exist, and the list is growing rapidly. Most AI applications are in medical imaging like radiology, dermatology, ophthalmology, and pathology.6,7 Each clinical product is evaluated according to 6 principles: relevance, usability, risks, regulatory, technical requirements, and financial (Table 2).8 We are in the process of evaluating a few commercial AI algorithms for pathology and radiology, using these 6 principles.
Implementations
After a comprehensive evaluation, we implemented 2 ClearRead (Riverain Technologies) AI radiology solutions. ClearRead CT Vessel Suppress produces a secondary series of computed tomography (CT) images, suppressing vessels and other normal structures within the lungs to improve nodule detectability. ClearRead Xray Bone Suppress increases the visibility of soft tissue in standard chest X-rays by suppressing bone on the digital image without the need for 2 exposures.
The role of the education subcommittee is to educate the staff about AI and how it can improve patient care. Every Friday, we email an AI article of the week to our practitioners. In addition, we publish a newsletter, and we organize an annual AI conference. The first conference in 2022 included speakers from the National AI Institute, Moffitt Cancer Center, the University of South Florida, and our facility.
As the name indicates, the data sharing and acquisition subcommittee oversees preparing data for our clinical and research projects. The role of the research subcommittee is to coordinate and promote AI research with the ultimate goal of improving patient care.
Other Technologies
Although 3D printing does not fall under the umbrella of AI, we have decided to include it in our future-oriented AI committee. We created an online 3D printing course to promote the technology throughout the VA. We 3D print organ models to help surgeons prepare for complicated operations. In addition, together with our colleagues from the University of Florida, we used 3D printing to address the shortage of swabs for COVID-19 testing. The VA Sunshine Healthcare Network (Veterans Integrated Services Network 8) has an active Innovation and Improvement Committee.9 Our improvement and innovation subcommittee serves as a coordinating body with the network committee.
Conclusions
Through the hospital AI committee, we believe that we may overcome many obstacles to successfully implementing AI applications in the clinical setting, including the ethical use of data, trust in the AI models, regulatory barriers, and lack of clinical buy-in due to insufficient basic AI knowledge.
Acknowledgments
This material is the result of work supported with resources and the use of facilities at the James A. Haley Veterans’ Hospital.
Artificial Intelligence: Review of Current and Future Applications in Medicine
Artificial Intelligence (AI) was first described in 1956 and refers to machines having the ability to learn as they receive and process information, resulting in the ability to “think” like humans.1 AI’s impact in medicine is increasing; currently, at least 29 AI medical devices and algorithms are approved by the US Food and Drug Administration (FDA) in a variety of areas, including radiograph interpretation, managing glucose levels in patients with diabetes mellitus, analyzing electrocardiograms (ECGs), and diagnosing sleep disorders, among others.2 Significantly, in 2020, the Centers for Medicare and Medicaid Services (CMS) announced the first reimbursement to hospitals for an AI platform, a model for early detection of strokes.3 AI is rapidly becoming an integral part of health care, and its role will only increase in the future (Table).
As knowledge in medicine is expanding exponentially, AI has great potential to assist with handling complex patient care data. The concept of exponential growth is not a natural one. As Bini described, with exponential growth the volume of knowledge amassed over the past 10 years will now occur in perhaps only 1 year.1 Likewise, equivalent advances over the past year may take just a few months. This phenomenon is partly due to the law of accelerating returns, which states that advances feed on themselves, continually increasing the rate of further advances.4 The volume of medical data doubles every 2 to 5 years.5 Fortunately, the field of AI is growing exponentially as well and can help health care practitioners (HCPs) keep pace, allowing the continued delivery of effective health care.
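The doubling claim above can be made concrete with a quick compound-growth calculation (a simple sketch of the arithmetic, not a citation of any specific dataset):

```python
# If a quantity doubles every `doubling_time` years, after `years` years it
# has grown by a factor of 2 ** (years / doubling_time).
def growth_factor(years: float, doubling_time: float) -> float:
    return 2 ** (years / doubling_time)

# At a 2-year doubling time, a decade multiplies the data volume 32-fold;
# at a 5-year doubling time, the same decade yields a 4-fold increase.
print(growth_factor(10, 2))  # → 32.0
print(growth_factor(10, 5))  # → 4.0
```

Either way, the growth far outpaces what any individual clinician can read, which is the gap AI tooling aims to fill.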
In this report, we review common terminology, principles, and general applications of AI, followed by current and potential applications of AI for selected medical specialties. Finally, we discuss AI’s future in health care, along with potential risks and pitfalls.
AI Overview
AI refers to machine programs that can “learn” or think based on past experiences. This functionality contrasts with simple rules-based programming available to health care for years. An example of rules-based programming is the warfarindosing.org website developed by Barnes-Jewish Hospital at Washington University Medical Center, which guides initial warfarin dosing.6,7 The prescriber inputs detailed patient information, including age, sex, height, weight, tobacco history, medications, laboratory results, and genotype if available. The application then calculates recommended warfarin dosing regimens to avoid over- or underanticoagulation. While the dosing algorithm may be complex, it depends entirely on preprogrammed rules. The program does not learn to reach its conclusions and recommendations from patient data.
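The distinction can be seen in a few lines of code. The sketch below uses hypothetical thresholds purely to illustrate rules-based logic; it is not the warfarindosing.org algorithm and is not for clinical use:

```python
# Illustrative rules-based dosing logic (hypothetical thresholds; NOT the
# actual warfarindosing.org algorithm and not for clinical use).
def initial_dose_mg(age: int, weight_kg: float, takes_amiodarone: bool) -> float:
    """Return a starting dose from fixed, preprogrammed rules."""
    dose = 5.0                 # nominal starting dose (illustrative)
    if age >= 65:
        dose -= 1.0            # fixed rule: reduce for older patients
    if weight_kg < 60:
        dose -= 0.5            # fixed rule: reduce for low body weight
    if takes_amiodarone:
        dose *= 0.7            # fixed rule: drug-interaction adjustment
    return round(dose, 1)

print(initial_dose_mg(70, 55, False))  # → 3.5
```

However complex the rule set, the program applies the same preprogrammed thresholds forever; nothing in it improves with experience. That unchanging behavior is exactly what separates rules-based software from the machine learning described next.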
In contrast, one of the most common subsets of AI is machine learning (ML). ML describes a program that “learns from experience and improves its performance as it learns.”1 With ML, the computer is initially provided with a training data set—data with known outcomes or labels. Because the initial data are input from known samples, this type of AI is known as supervised learning.8-10 As an example, we recently reported using ML to diagnose various types of cancer from pathology slides.11 In one experiment, we captured images of colon adenocarcinoma and normal colon (these 2 groups represent the training data set). Unlike traditional programming, we did not define characteristics that would differentiate colon cancer from normal; rather, the machine learned these characteristics independently by assessing the labeled images provided. A second data set (the validation data set) was used to evaluate the program and fine-tune the ML training model’s parameters. Finally, the program was presented with new images of cancer and normal cases for final assessment of accuracy (test data set). Our program learned to recognize differences from the images provided and was able to differentiate normal and cancer images with > 95% accuracy.
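The train/evaluate cycle described above can be sketched with a deliberately tiny classifier. This is a nearest-centroid model on made-up 2-feature vectors (standing in, say, for texture and nuclear-density scores), not the authors' pathology pipeline:

```python
# Minimal supervised-learning sketch: learn one centroid per label from a
# labeled training set, then score accuracy on a held-out test set.
def centroid(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def train(train_set):
    """train_set: list of (features, label) pairs with known answers."""
    by_label = {}
    for x, y in train_set:
        by_label.setdefault(y, []).append(x)
    return {y: centroid(xs) for y, xs in by_label.items()}

def predict(model, x):
    dist = lambda a, b: sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(model, key=lambda y: dist(model[y], x))

# Hypothetical feature vectors; real work uses image-derived features.
train_set = [([0.9, 0.8], "cancer"), ([0.8, 0.9], "cancer"),
             ([0.1, 0.2], "normal"), ([0.2, 0.1], "normal")]
test_set  = [([0.85, 0.75], "cancer"), ([0.15, 0.25], "normal")]

model = train(train_set)
accuracy = sum(predict(model, x) == y for x, y in test_set) / len(test_set)
print(accuracy)  # → 1.0
```

The key point mirrors the paragraph above: no rule distinguishing "cancer" from "normal" is ever written down; the model derives its decision boundary entirely from the labeled examples, and the held-out test set measures how well that learning generalizes.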
Advances in computer processing have allowed for the development of artificial neural networks (ANNs). While there are several types of ANNs, the most common types used for image classification and segmentation are known as convolutional neural networks (CNNs).9,12-14 The programs are designed to work similar to the human brain, specifically the visual cortex.15,16 As data are acquired, they are processed by various layers in the program. Much like neurons in the brain, one layer decides whether to advance information to the next.13,14 CNNs can be many layers deep, leading to the term deep learning: “computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction.”1,13,17
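A single convolutional layer, the building block CNNs stack many layers deep, can be shown in miniature. The kernel weights below are hand-set to detect a vertical edge; in a real CNN those weights are learned:

```python
# Bare-bones sketch of one convolutional layer: slide a 3x3 kernel over a
# toy 4x4 "image" and apply a ReLU activation, which decides what signal
# advances to the next layer.
def conv2d(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(len(image) - kh + 1):
        row = []
        for c in range(len(image[0]) - kw + 1):
            s = sum(image[r + i][c + j] * kernel[i][j]
                    for i in range(kh) for j in range(kw))
            row.append(max(0, s))  # ReLU: negative responses are dropped
        out.append(row)
    return out

image  = [[0, 0, 1, 1],
          [0, 0, 1, 1],
          [0, 0, 1, 1],
          [0, 0, 1, 1]]
edge_k = [[-1, 0, 1],              # hand-set vertical-edge detector
          [-1, 0, 1],
          [-1, 0, 1]]
print(conv2d(image, edge_k))       # → [[3, 3], [3, 3]]
```

Every window here straddles the dark-to-bright boundary, so the layer responds strongly everywhere; deep networks chain many such layers, each feeding filtered responses to the next, much as the paragraph above describes.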
ANNs can process larger volumes of data. This advance has led to the development of unstructured or unsupervised learning. With this type of learning, imputing defined features (ie, predetermined answers) of the training data set described above is no longer required.1,8,10,14 The advantage of unsupervised learning is that the program can be presented raw data and extract meaningful interpretation without human input, often with less bias than may exist with supervised learning.1,18 If shown enough data, the program can extract relevant features to make conclusions independently without predefined definitions, potentially uncovering markers not previously known. For example, several studies have used unsupervised learning to search patient data to assess readmission risks of patients with congestive heart failure.10,19,20 AI compiled features independently and not previously defined, predicting patients at greater risk for readmission superior to traditional methods.
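Unsupervised grouping of the kind used in those readmission studies can be illustrated with k-means clustering on made-up patient features (here, a prior-admissions count and a comorbidity score); no labels are ever supplied, and the algorithm finds the two groups on its own:

```python
# Sketch of unsupervised learning: k-means discovers clusters in unlabeled
# data. Feature values are hypothetical, purely for illustration.
import random

def kmeans(points, k, iters=20, seed=0):
    random.seed(seed)
    centers = random.sample(points, k)          # initial guesses from the data
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:                        # assign each point to its
            i = min(range(k), key=lambda c:     # nearest center
                    sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[i].append(p)
        centers = [                             # move centers to cluster means
            [sum(xs) / len(c) for xs in zip(*c)] if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

patients = [(1, 0.2), (2, 0.3), (1, 0.1),      # low-utilization group
            (8, 0.9), (9, 0.8), (7, 0.7)]      # high-utilization group
centers, clusters = kmeans(patients, k=2)
print(sorted(len(c) for c in clusters))        # → [3, 3]
```

The two recovered clusters correspond to the low- and high-utilization patients even though the program was never told those groups exist, which is the essence of the unsupervised approach described above.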
A more detailed description of the various terminologies and techniques of AI is beyond the scope of this review.9,10,17,21 However, in this basic overview, we describe 4 general areas that AI impacts health care (Figure).
Health Care Applications
Image analysis has seen the most AI health care applications.8,15 AI has shown potential in interpreting many types of medical images, including pathology slides, radiographs of various types, retina and other eye scans, and photographs of skin lesions. Many studies have demonstrated that AI can interpret these images as accurately as or even better than experienced clinicians.9,13,22-29 Studies have suggested AI interpretation of radiographs may better distinguish patients infected with COVID-19 from other causes of pneumonia, and AI interpretation of pathology slides may detect specific genetic mutations not previously identified without additional molecular tests.11,14,23,24,30-32
The second area in which AI can impact health care is improving workflow and efficiency. AI has improved surgery scheduling, saving significant revenue, and decreased patient wait times for appointments.1 AI can screen and triage radiographs, allowing attention to be directed to critical patients. This use would be valuable in many busy clinical settings, such as the recent COVID-19 pandemic.8,23 Similarly, AI can screen retina images to prioritize urgent conditions.25 AI has improved pathologists’ efficiency when used to detect breast metastases.33 Finally, AI may reduce medical errors, thereby ensuring patient safety.8,9,34
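Screening and triage of this kind reduces, at its core, to ordering a worklist by a model-assigned urgency score. The sketch below uses a priority queue with hypothetical scores and study IDs:

```python
# Sketch of AI-assisted triage: a priority queue surfaces the most urgent
# studies first. Scores are hypothetical stand-ins for a model's output.
import heapq

def triage(studies):
    """studies: list of (urgency_score, study_id); higher = more urgent."""
    heap = [(-score, study_id) for score, study_id in studies]
    heapq.heapify(heap)                      # max-priority via negated scores
    while heap:
        neg_score, study_id = heapq.heappop(heap)
        yield study_id, -neg_score

worklist = [(0.12, "CXR-1041"), (0.97, "CXR-1042"),
            (0.55, "CXR-1043"), (0.88, "CXR-1044")]
print([sid for sid, _ in triage(worklist)])
# → ['CXR-1042', 'CXR-1044', 'CXR-1043', 'CXR-1041']
```

Whatever model produces the scores, the workflow gain comes from this reordering: the probable pneumothorax is read first rather than waiting its turn in arrival order.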
A third health care benefit of AI is in public health and epidemiology. AI can assist with clinical decision-making and diagnoses in low-income countries and areas with limited health care resources and personnel.25,29 AI can improve identification of infectious outbreaks, such as tuberculosis, malaria, dengue fever, and influenza.29,35-40 AI has been used to predict transmission patterns of the Zika virus and the current COVID-19 pandemic.41,42 Applications can stratify the risk of outbreaks based on multiple factors, including age, income, race, atypical geographic clusters, and seasonal factors like rainfall and temperature.35,36,38,43 AI has been used to assess morbidity and mortality, such as predicting disease severity with malaria and identifying treatment failures in tuberculosis.29
Finally, AI can dramatically impact health care due to processing large data sets or disconnected volumes of patient information—so-called big data.44-46 An example is the widespread use of electronic health records (EHRs) such as the Computerized Patient Record System used in Veteran Affairs medical centers (VAMCs). Much of patient information exists in written text: HCP notes, laboratory and radiology reports, medication records, etc. Natural language processing (NLP) allows platforms to sort through extensive volumes of data on complex patients at rates much faster than human capability, which has great potential to assist with diagnosis and treatment decisions.9
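At its simplest, extracting findings from free-text notes looks like the toy matcher below; the finding vocabulary and the note are invented for illustration, and real clinical NLP goes far beyond keyword matching:

```python
# Toy sketch of pattern extraction over free-text notes: flag records that
# mention target findings. Real clinical NLP must also handle negation,
# abbreviations, misspellings, and context.
import re

FINDINGS = {"pneumonia", "effusion", "cardiomegaly"}  # hypothetical vocabulary

def flag_findings(note: str) -> set:
    tokens = set(re.findall(r"[a-z]+", note.lower()))
    return FINDINGS & tokens

note = "CXR shows right lower lobe pneumonia; no pleural effusion."
print(sorted(flag_findings(note)))  # → ['effusion', 'pneumonia']
```

Note that the naive matcher flags "effusion" even though the note negates it ("no pleural effusion"), which is precisely why production clinical NLP systems invest heavily in negation and context handling.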
Medical literature is being produced at rates that exceed our ability to digest. More than 200,000 cancer-related articles were published in 2019 alone.14 NLP capabilities of AI have the potential to rapidly sort through this extensive medical literature and relate specific verbiage in patient records guiding therapy.46 IBM Watson, a supercomputer based on ML and NLP, demonstrates this concept with many potential applications, only some of which relate to health care.1,9 Watson has an oncology component to assimilate multiple aspects of patient care, including clinical notes, pathology results, radiograph findings, staging, and a tumor’s genetic profile. It coordinates these inputs from the EHR and mines medical literature and research databases to recommend treatment options.1,46 AI can assess and compile far greater patient data and therapeutic options than would be feasible by individual clinicians, thus providing customized patient care.47 Watson has partnered with numerous medical centers, including MD Anderson Cancer Center and Memorial Sloan Kettering Cancer Center, with variable success.44,47-49 While the full potential of Watson appears not yet realized, these AI-driven approaches will likely play an important role in leveraging the hidden value in the expanding volume of health care information.
Medical Specialty Applications
Radiology
Currently > 70% of FDA-approved AI medical devices are in the field of radiology.2 Most radiology departments have used AI-friendly digital imaging for years, such as the picture archiving and communication systems used by numerous health care systems, including VAMCs.2,15 Gray-scale images common in radiology lend themselves to standardization, although AI is not limited to black-and-white image interpretation.15
An abundance of literature describes plain radiograph interpretation using AI. One FDA-approved platform improved X-ray diagnosis of wrist fractures when used by emergency medicine clinicians.2,50 AI has been applied to chest X-ray (CXR) interpretation of many conditions, including pneumonia, tuberculosis, malignant lung lesions, and COVID-19.23,25,28,44,51-53 For example, Nam and colleagues suggested AI is better at diagnosing malignant pulmonary nodules from CXRs than are trained radiologists.28
In addition to plain radiographs, AI has been applied to many other imaging technologies, including ultrasounds, positron emission tomography, mammograms, computed tomography (CT), and magnetic resonance imaging (MRI).15,26,44,48,54-56 A large study demonstrated that ML platforms significantly reduced the time to diagnose intracranial hemorrhages on CT and identified subtle hemorrhages missed by radiologists.55 Other studies have claimed that AI programs may be better than radiologists in detecting cancer in screening mammograms, and 3 FDA-approved devices focus on mammogram interpretation.2,15,54,57 There is also great interest in MRI applications to detect and predict prognosis for breast cancer based on imaging findings.21,56
Aside from providing accurate diagnoses, other studies focus on AI radiograph interpretation to assist with patient screening, triage, improving time to final diagnosis, providing a rapid “second opinion,” and even monitoring disease progression and offering insights into prognosis.8,21,23,52,55,56,58 These features help in busy urban centers but may play an even greater role in areas with limited access to health care or trained specialists such as radiologists.52
Cardiology
Cardiology has the second highest number of FDA-approved AI applications.2 Many cardiology AI platforms involve image analysis, as described in several recent reviews.45,59,60 AI has been applied to echocardiography to measure ejection fractions, detect valvular disease, and assess heart failure from hypertrophic and restrictive cardiomyopathy and amyloidosis.45,48,59 Applications for cardiac CT scans and CT angiography have successfully quantified calcified and noncalcified coronary artery plaque, assessed the vessel lumen and myocardial perfusion, and performed coronary artery calcium scoring.45,59,60 Likewise, AI applications for cardiac MRI have been used to quantitate ejection fraction, assess large vessel flow, and measure cardiac scar burden.45,59
For years ECG devices have provided interpretation with limited accuracy using preprogrammed parameters.48 However, the application of AI allows ECG interpretation on par with trained cardiologists. Numerous such AI applications exist, and 2 FDA-approved devices perform ECG interpretation.2,61-64 One of these devices incorporates an AI-powered stethoscope to detect atrial fibrillation and heart murmurs.65
Pathology
The advancement of whole slide imaging, wherein entire slides can be scanned and digitized at high speed and resolution, creates great potential for AI applications in pathology.12,24,32,33,66 A landmark study demonstrating the potential of AI for assessing whole slide imaging examined sentinel lymph node metastases in patients with breast cancer.22 Multiple algorithms in the study demonstrated that AI was equivalent to or better than pathologists in detecting metastases, especially when the pathologists were time-constrained, as they are in a normal working environment. Significantly, the most accurate and efficient diagnoses were achieved when the pathologist and AI interpretations were used together.22,33
AI has shown promise in diagnosing many other entities, including cancers of the prostate (including Gleason scoring), lung, colon, breast, and skin.11,12,24,27,32,67 In addition, AI has shown great potential in scoring biomarkers important for prognosis and treatment, such as immunohistochemistry (IHC) labeling of Ki-67 and PD-L1.32 Pathologists can have difficulty classifying certain tumors or determining the site of origin for metastases, often having to rely on IHC with limited success. The unique features of image analysis with AI have the potential to assist in classifying difficult tumors and identifying sites of origin for metastatic disease based on morphology alone.11
Oncology depends heavily on molecular pathology testing to dictate treatment options and determine prognosis. Preliminary studies suggest that AI interpretation alone has the potential to delineate whether certain molecular mutations are present in tumors from various sites.11,14,24,32 One study combined histology and genomic results for AI interpretation that improved prognostic predictions.68 In addition, AI analysis may have potential in predicting tumor recurrence or prognosis based on cellular features, as demonstrated for lung cancer and melanoma.67,69,70
Ophthalmology
AI applications for ophthalmology have focused on diabetic retinopathy, age-related macular degeneration, glaucoma, retinopathy of prematurity, age-related and congenital cataracts, and retinal vein occlusion.71-73 Diabetic retinopathy is a leading cause of blindness and has been studied by numerous platforms with good success, most having used color fundus photography.71,72 One study showed AI could diagnose diabetic retinopathy and diabetic macular edema with specificities similar to ophthalmologists.74 In 2018, the FDA approved the AI platform IDx-DR. This diagnostic system classifies retinal images and recommends referral for patients determined to have “more than mild diabetic retinopathy” and reexamination within a year for other patients.8,75 Significantly, the platform recommendations do not require confirmation by a clinician.8
AI has been applied to other modalities in ophthalmology such as optical coherence tomography (OCT) to diagnose retinal disease and to predict appropriate management of congenital cataracts.25,73,76 For example, an AI application using OCT has been demonstrated to match or exceed the accuracy of retinal experts in diagnosing and triaging patients with a variety of retinal pathologies, including patients needing urgent referrals.77
Dermatology
Multiple studies demonstrate AI performs at least equal to experienced dermatologists in differentiating selected skin lesions.78-81 For example, Esteva and colleagues demonstrated AI could differentiate keratinocyte carcinomas from benign seborrheic keratoses and malignant melanomas from benign nevi with accuracy equal to 21 board-certified dermatologists.78
AI is applicable to various imaging procedures common to dermatology, such as dermoscopy, very high-frequency ultrasound, and reflectance confocal microscopy.82 Several studies have demonstrated that AI interpretation compared favorably to dermatologists evaluating dermoscopy to assess melanocytic lesions.78-81,83
A limitation of these studies is that they differentiate among only a few diagnoses.82 Furthermore, dermatologists have sensory input, such as touch, and can examine lesions visually under various conditions, something AI has yet to replicate.15,34,84 Also, most AI devices use little or no clinical information.81 Dermatologists can recognize rarer conditions for which AI models may have had limited or no training.34 Nevertheless, a recent study assessed AI for the diagnosis of 134 separate skin disorders with promising results, including providing diagnoses with accuracy comparable to that of dermatologists and providing accurate treatment strategies.84 As Topol points out, most skin lesions are diagnosed in the primary care setting, where AI can have a greater impact when used in conjunction with the clinical impression, especially where specialists are in limited supply.48,78
Finally, dermatology lends itself to using portable or smartphone applications (apps) wherein the user can photograph a lesion for analysis by AI algorithms to assess the need for further evaluation or make treatment recommendations.34,84,85 Although results from currently available apps are not encouraging, they may play a greater role as the technology advances.34,85
Oncology
Applications of AI in oncology include predicting prognosis for patients with cancer based on histologic and/or genetic information.14,68,86 Programs can predict the risk of complications before and recurrence risks after surgery for malignancies.44,87-89 AI can also assist in treatment planning and predict treatment failure with radiation therapy.90,91
AI has great potential in processing the large volumes of patient data in cancer genomics. Next-generation sequencing has allowed for the identification of millions of DNA sequences in a single tumor to detect genetic anomalies.92 Thousands of mutations can be found in individual tumor samples, and processing this information and determining its significance can be beyond human capability.14 We know little about the effects of various mutation combinations, and most tumors have a heterogeneous molecular profile among different cell populations.14,93 The presence or absence of various mutations can have diagnostic, prognostic, and therapeutic implications.93 AI has great potential to sort through these complex data and identify actionable findings.
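The sorting task described above can be pictured as matching a tumor’s variant list against a curated knowledge base of actionable findings. The sketch below is purely illustrative: the gene names and the tiny "actionable" table are invented stand-ins, not clinical guidance, and real systems mine far larger, continually updated databases.

```python
# Hypothetical sketch of triaging a tumor's mutation list against a small
# knowledge base. The ACTIONABLE table and its therapy linkages are
# illustrative placeholders only, not clinical guidance.

ACTIONABLE = {
    ("EGFR", "L858R"): "EGFR inhibitor",   # made-up linkage for illustration
    ("BRAF", "V600E"): "BRAF inhibitor",
}

def triage(variants):
    """Split a variant list into actionable hits and variants needing review."""
    hits, unknown = [], []
    for gene, change in variants:
        therapy = ACTIONABLE.get((gene, change))
        if therapy:
            hits.append((gene, change, therapy))
        else:
            unknown.append((gene, change))
    return hits, unknown

tumor = [("TP53", "R175H"), ("EGFR", "L858R"), ("KRAS", "G12D")]
hits, unknown = triage(tumor)
print(len(hits), len(unknown))  # 1 actionable finding, 2 left for human review
```

With thousands of mutations per tumor, the value of automating this lookup is less the matching itself than keeping the knowledge base current against the expanding literature.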
More than 200,000 cancer-related articles were published in 2019, and publications in the field of cancer genomics are increasing exponentially.14,92,93 Patel and colleagues assessed the utility of IBM Watson for Genomics against results from a molecular tumor board.93 Watson for Genomics identified potentially significant mutations not identified by the tumor board in 32% of patients. Most mutations were related to new clinical trials not yet added to the tumor board watch list, demonstrating the role AI will have in processing the large volume of genetic data required to deliver personalized medicine moving forward.
Gastroenterology
AI has shown promise in predicting risk or outcomes based on clinical parameters in various common gastroenterology problems, including gastric reflux, acute pancreatitis, gastrointestinal bleeding, celiac disease, and inflammatory bowel disease.94,95 AI endoscopic analysis has demonstrated potential in assessing Barrett’s esophagus, gastric Helicobacter pylori infections, gastric atrophy, and gastric intestinal metaplasia.95 Applications have been used to assess esophageal, gastric, and colonic malignancies, including depth of invasion based on endoscopic images.95 Finally, studies have evaluated AI to assess small colon polyps during colonoscopy, including differentiating benign and premalignant polyps with success comparable to gastroenterologists.94,95 AI has been shown to increase the speed and accuracy of gastroenterologists in detecting small polyps during colonoscopy.48 In a prospective randomized study, colonoscopies performed using an AI device identified significantly more small adenomatous polyps than colonoscopies without AI.96
Neurology
It has been suggested that AI technologies are well suited for application in neurology due to the subtle presentation of many neurologic diseases.16 Viz LVO, the first CMS-approved AI reimbursement for the diagnosis of strokes, analyzes CTs to detect early ischemic strokes and alerts the medical team, thus shortening time to treatment.3,97 Many other AI platforms are in use or development that use CT and MRI for the early detection of strokes as well as for treatment and prognosis.9,97
AI technologies have been applied to neurodegenerative diseases, such as Alzheimer and Parkinson diseases.16,98 For example, several studies have evaluated patient movements in Parkinson disease for both early diagnosis and to assess response to treatment.98 These evaluations included assessment with both external cameras as well as wearable devices and smartphone apps.
AI has also been applied to seizure disorders, attempting to determine seizure type, localize the area of seizure onset, and address the challenges of identifying seizures in neonates.99,100 Other potential applications range from early detection and prognosis predictions for cases of multiple sclerosis to restoring movement in paralysis from a variety of conditions such as spinal cord injury.9,101,102
Mental Health
Due to the interactive nature of mental health care, the field has been slower to develop AI applications.18 With heavy reliance on textual information (eg, clinic notes, mood rating scales, and documentation of conversations), successful AI applications in this field will likely rely heavily on NLP.18 However, studies investigating the application of AI to mental health have also incorporated data such as brain imaging, smartphone monitoring, and social media platforms, such as Facebook and Twitter.18,103,104
The risk of suicide is higher in veteran patients, and ML algorithms have had limited success in predicting suicide risk in both veteran and nonveteran populations.104-106 While early models have low positive predictive values and low sensitivities, they still promise to be a useful tool in conjunction with traditional risk assessments.106 Kessler and colleagues suggest that combining multiple rather than single ML algorithms might lead to greater success.105,106
AI may assist in diagnosing other mental health disorders, including major depressive disorder, attention deficit hyperactivity disorder (ADHD), schizophrenia, posttraumatic stress disorder, and Alzheimer disease.103,104,107 These investigations are in the early stages with limited clinical applicability. However, 2 AI applications awaiting FDA approval relate to ADHD and opioid use.2 Furthermore, potential exists for AI to not only assist with prevention and diagnosis of ADHD, but also to identify optimal treatment options.2,103
General and Personalized Medicine
Additional AI applications include diagnosing patients with suspected sepsis, measuring liver iron concentrations, predicting hospital mortality at the time of admission, and more.2,108,109 AI can also guide end-of-life decisions, such as resuscitation status or whether to initiate mechanical ventilation.48
AI-driven smartphone apps can be beneficial to both patients and clinicians. Examples include predicting nonadherence to anticoagulation therapy, monitoring heart rhythms for atrial fibrillation or signs of hyperkalemia in patients with renal failure, and improving outcomes for patients with diabetes mellitus by decreasing glycemic variability and reducing hypoglycemia.8,48,110,111 The potential for AI applications to health care and personalized medicine are almost limitless.
Discussion
With ever-increasing expectations for all health care sectors to deliver timely, fiscally responsible, high-quality care, AI has the potential for numerous impacts. AI can improve diagnostic accuracy, limit errors, and improve patient safety, such as by assisting with prescription delivery.8,9,34 It can screen and triage patients, alerting clinicians to those needing more urgent evaluation.8,23,77,97 AI also may increase a clinician’s efficiency and speed in rendering a diagnosis.12,13,55,97 AI can provide a rapid second opinion, an ability especially beneficial in underserved areas with shortages of specialists.23,25,26,29,34 Similarly, AI may decrease the inter- and intraobserver variability common in many medical specialties.12,27,45 AI applications can also monitor disease progression, identify patients at greatest risk, and provide information for prognosis.21,23,56,58 Finally, as described with applications using IBM Watson, AI can allow for an integrated approach to health care that is currently lacking.
We have described many reports suggesting AI can render diagnoses as well as or better than experienced clinicians, and speculation exists that AI will replace many roles currently performed by health care practitioners.9,26 However, most studies demonstrate that AI’s diagnostic benefits are best realized when used to supplement a clinician’s impression.8,22,30,33,52,54,56,69,84 AI is not likely to replace humans in health care in the foreseeable future. The technology can be likened to the impact CT scans had on neurology after their development in the 1970s. Before such detailed imaging was available, neurologists spent extensive time performing detailed physical examinations to render diagnoses and locate lesions before surgery. There was mistrust of this new technology and concern that CT scans would eliminate the need for neurologists.112 On the contrary, neurology is alive and well, frequently augmented by the technologies once speculated to replace it.
Commercial AI health care platforms represented a $2 billion industry in 2018 and are growing rapidly each year.13,32 Many AI products are offered ready for implementation for various tasks, including diagnostics, patient management, and improved efficiency. Others will likely be provided as templates suitable for modification to meet the specific needs of the facility, practice, or specialty for its patient population.
AI Risks and Limitations
AI has several risks and limitations. Although there is progress in explainable AI, at times we still struggle to understand how the output of a machine learning algorithm was created.44,48 The many layers of a deep learning network self-determine the criteria used to reach their conclusions, and these criteria can continually evolve. The parameters of deep learning are not preprogrammed, and there are too many individual data points to be extrapolated or deconvoluted for evaluation at our current level of knowledge.26,51 This apparent lack of constraints raises concerns for patient safety and suggests that greater validation and continued scrutiny of validity are required.8,48 Efforts are underway to create explainable AI programs that make their processes more transparent, but such clarification is presently limited.14,26,48,77
Another challenge of AI is determining the amount of training data required to function optimally. Also, if the output describes multiple variables or diagnoses, is each equally valid?113 Furthermore, many AI applications look for a specific process, such as cancer diagnoses on CXRs. However, it must be considered how coexisting conditions seen on CXRs, such as cardiomegaly, emphysema, or pneumonia, will affect the diagnosis.51,52 Zech and colleagues provide the example that diagnoses of pneumothorax are frequently rendered on CXRs with chest tubes in place.51 They suggest that CNNs may develop a bias toward diagnosing pneumothorax when chest tubes are present. Many current studies approach an issue in isolation, a situation that is not realistic in real-world clinical practice.26
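The chest-tube confound described by Zech and colleagues can be reproduced in miniature: if pneumothorax cases in retrospective training data usually already carry a chest tube, a model can score well by detecting the tube rather than the disease. The data and the one-feature "classifier" below are entirely synthetic, meant only to show how the shortcut inflates retrospective accuracy and collapses on untreated patients.

```python
import random

random.seed(0)

# Synthetic illustration of a treatment-artifact confound: pneumothorax cases
# in retrospective data usually already have a chest tube, so a shortcut model
# that detects only the tube looks accurate until it meets untreated patients.

def make_case(has_pneumothorax, tube_rate):
    tube = random.random() < (tube_rate if has_pneumothorax else 0.02)
    return {"pneumothorax": has_pneumothorax, "chest_tube": tube}

def tube_detector(case):
    # shortcut model: predicts the disease from the treatment artifact alone
    return case["chest_tube"]

def accuracy(cases):
    return sum(tube_detector(c) == c["pneumothorax"] for c in cases) / len(cases)

# Retrospective data: treated patients, tubes usually present.
train = [make_case(i % 2 == 0, tube_rate=0.9) for i in range(1000)]
print(round(accuracy(train), 2))      # deceptively high

# Newly presenting patients have no tube yet, and the shortcut collapses.
untreated = [make_case(i % 2 == 0, tube_rate=0.0) for i in range(1000)]
print(round(accuracy(untreated), 2))  # near chance
```

The failure is invisible in retrospective validation because the confound is present in both the training and test splits; it surfaces only when the model meets the population it was actually intended for.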
Most studies on AI have been retrospective, and frequently the data used to train the program are preselected.13,26 The data are typically validated on available databases rather than on actual patients in the clinical setting, limiting confidence in the validity of the AI output when applied to real-world situations. Currently, fewer than 12 prospective trials have been published comparing AI with traditional clinical care.13,114 Randomized prospective clinical trials are even fewer, with none currently reported from the United States.13,114 The results from several studies have been shown to diminish when repeated prospectively.114
The FDA has created a new category known as Software as a Medical Device and has a Digital Health Innovation Action Plan to regulate AI platforms. Still, the process of AI regulation is of necessity different from traditional approval processes and is continually evolving.8 The FDA approval process cannot account for the fact that the program’s parameters may continually evolve or adapt.2
Guidelines for investigating and reporting AI research with its unique attributes are being developed. Examples include the TRIPOD-ML statement and others.49,115 In September 2020, 2 publications addressed the paucity of gold-standard randomized clinical trials in clinical AI applications.116,117 The SPIRIT-AI statement expands on the original SPIRIT statement published in 2013 to guide minimal reporting standards for AI clinical trial protocols to promote transparency of design and methodology.116 Similarly, the CONSORT-AI extension, stemming from the original CONSORT statement in 1996, aims to ensure quality reporting of randomized controlled trials in AI.117
Another risk with AI is that while an individual physician’s mistake may adversely affect 1 patient, a single mistake in an AI algorithm could potentially affect thousands of patients.48 Also, AI programs developed for the patient population at one facility may not translate to another. Referred to as overfitting, this phenomenon relates to selection bias in training data sets.15,34,49,51,52 Studies have shown that programs that underrepresent certain group characteristics, such as age, sex, or race, may be less effective when applied to a population in which these characteristics have differing representations.8,48,49 This problem of underrepresentation has been demonstrated in programs interpreting pathology slides, radiographs, and skin lesions.15,32,51
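The facility-to-facility failure described above can be shown with a minimal sketch: a decision threshold tuned on one site’s patients transfers poorly to a second site where the same measurement runs higher in healthy patients (different demographics, equipment, or assays). All values below are synthetic and the sites are hypothetical.

```python
import statistics

# Minimal sketch of a model overfit to one facility's population: a threshold
# tuned on Site A misfires at Site B. All values are synthetic.

site_a_healthy = [10, 11, 12, 11, 10, 12]
site_a_disease = [18, 19, 20, 19, 18, 20]

# Midpoint threshold fit to Site A's two groups.
threshold = (statistics.mean(site_a_healthy) + statistics.mean(site_a_disease)) / 2

def predict_disease(value):
    return value > threshold

# Perfect separation on the population the model was built for...
assert not any(predict_disease(v) for v in site_a_healthy)
assert all(predict_disease(v) for v in site_a_disease)

# ...but Site B's healthy baseline sits closer to the Site A threshold.
site_b_healthy = [14, 15, 16, 15, 14, 16]
false_positives = sum(predict_disease(v) for v in site_b_healthy)
print(false_positives)  # healthy Site B patients misclassified as diseased
```

The same shift happens silently in deployed models, which is why local revalidation before cross-site deployment is repeatedly urged in the cited reviews.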
Admittedly, most of these challenges are not specific to AI and existed in health care previously. Physicians make mistakes, treatments are sometimes used without adequate prospective studies, and medications are given without understanding their mechanism of action, much like AI-facilitated processes reach a conclusion that cannot be fully explained.48
Conclusions
The view that AI will dramatically impact health care in the coming years will likely prove true. However, much work is needed, especially because of the paucity of prospective clinical trials as has been historically required in medical research. Any concern that AI will replace HCPs seems unwarranted. Early studies suggest that even AI programs that appear to exceed human interpretation perform best when working in cooperation with and oversight from clinicians. AI’s greatest potential appears to be its ability to augment care from health professionals, improving efficiency and accuracy, and should be anticipated with enthusiasm as the field moves forward at an exponential rate.
Acknowledgments
The authors thank Makenna G. Thomas for proofreading and review of the manuscript. This material is the result of work supported with resources and the use of facilities at the James A. Haley Veterans’ Hospital. This research has been approved by the James A. Haley Veterans’ Hospital Office of Communications and Media.
1. Bini SA. Artificial intelligence, machine learning, deep learning, and cognitive computing: what do these terms mean and how will they impact health care? J Arthroplasty. 2018;33(8):2358-2361. doi:10.1016/j.arth.2018.02.067
2. Benjamens S, Dhunnoo P, Meskó B. The state of artificial intelligence-based FDA-approved medical devices and algorithms: an online database. NPJ Digit Med. 2020;3:118. doi:10.1038/s41746-020-00324-0
3. Viz. AI powered synchronized stroke care. Accessed September 15, 2021. https://www.viz.ai/ischemic-stroke
4. Buchanan M. The law of accelerating returns. Nat Phys. 2008;4(7):507. doi:10.1038/nphys1010
5. IBM Watson Health computes a pair of new solutions to improve healthcare data and security. Published September 10, 2015. Accessed October 21, 2020. https://www.techrepublic.com/article/ibm-watson-health-computes-a-pair-of-new-solutions-to-improve-healthcare-data-and-security
6. Borkowski AA, Kardani A, Mastorides SM, Thomas LB. Warfarin pharmacogenomics: recommendations with available patented clinical technologies. Recent Pat Biotechnol. 2014;8(2):110-115. doi:10.2174/1872208309666140904112003
7. Washington University in St. Louis. Warfarin dosing. Accessed September 15, 2021. http://www.warfarindosing.org/Source/Home.aspx
8. He J, Baxter SL, Xu J, Xu J, Zhou X, Zhang K. The practical implementation of artificial intelligence technologies in medicine. Nat Med. 2019;25(1):30-36. doi:10.1038/s41591-018-0307-0
9. Jiang F, Jiang Y, Zhi H, et al. Artificial intelligence in healthcare: past, present and future. Stroke Vasc Neurol. 2017;2(4):230-243. Published 2017 Jun 21. doi:10.1136/svn-2017-000101
10. Johnson KW, Torres Soto J, Glicksberg BS, et al. Artificial intelligence in cardiology. J Am Coll Cardiol. 2018;71(23):2668-2679. doi:10.1016/j.jacc.2018.03.521
11. Borkowski AA, Wilson CP, Borkowski SA, et al. Comparing artificial intelligence platforms for histopathologic cancer diagnosis. Fed Pract. 2019;36(10):456-463.
12. Cruz-Roa A, Gilmore H, Basavanhally A, et al. High-throughput adaptive sampling for whole-slide histopathology image analysis (HASHI) via convolutional neural networks: application to invasive breast cancer detection. PLoS One. 2018;13(5):e0196828. Published 2018 May 24. doi:10.1371/journal.pone.0196828
13. Nagendran M, Chen Y, Lovejoy CA, et al. Artificial intelligence versus clinicians: systematic review of design, reporting standards, and claims of deep learning studies. BMJ. 2020;368:m689. Published 2020 Mar 25. doi:10.1136/bmj.m689
14. Shimizu H, Nakayama KI. Artificial intelligence in oncology. Cancer Sci. 2020;111(5):1452-1460. doi:10.1111/cas.14377
15. Talebi-Liasi F, Markowitz O. Is artificial intelligence going to replace dermatologists? Cutis. 2020;105(1):28-31.
16. Valliani AA, Ranti D, Oermann EK. Deep learning and neurology: a systematic review. Neurol Ther. 2019;8(2):351-365. doi:10.1007/s40120-019-00153-8
17. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436-444. doi:10.1038/nature14539
18. Graham S, Depp C, Lee EE, et al. Artificial intelligence for mental health and mental illnesses: an overview. Curr Psychiatry Rep. 2019;21(11):116. Published 2019 Nov 7. doi:10.1007/s11920-019-1094-0
19. Golas SB, Shibahara T, Agboola S, et al. A machine learning model to predict the risk of 30-day readmissions in patients with heart failure: a retrospective analysis of electronic medical records data. BMC Med Inform Decis Mak. 2018;18(1):44. Published 2018 Jun 22. doi:10.1186/s12911-018-0620-z
20. Mortazavi BJ, Downing NS, Bucholz EM, et al. Analysis of machine learning techniques for heart failure readmissions. Circ Cardiovasc Qual Outcomes. 2016;9(6):629-640. doi:10.1161/CIRCOUTCOMES.116.003039
21. Meyer-Bäse A, Morra L, Meyer-Bäse U, Pinker K. Current status and future perspectives of artificial intelligence in magnetic resonance breast imaging. Contrast Media Mol Imaging. 2020;2020:6805710. Published 2020 Aug 28. doi:10.1155/2020/6805710
22. Ehteshami Bejnordi B, Veta M, Johannes van Diest P, et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA. 2017;318(22):2199-2210. doi:10.1001/jama.2017.14585
23. Borkowski AA, Viswanadhan NA, Thomas LB, Guzman RD, Deland LA, Mastorides SM. Using artificial intelligence for COVID-19 chest X-ray diagnosis. Fed Pract. 2020;37(9):398-404. doi:10.12788/fp.0045
24. Coudray N, Ocampo PS, Sakellaropoulos T, et al. Classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning. Nat Med. 2018;24(10):1559-1567. doi:10.1038/s41591-018-0177-5
25. Kermany DS, Goldbaum M, Cai W, et al. Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell. 2018;172(5):1122-1131.e9. doi:10.1016/j.cell.2018.02.010
26. Liu X, Faes L, Kale AU, et al. A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis. Lancet Digit Health. 2019;1(6):e271-e297. doi:10.1016/S2589-7500(19)30123-2
27. Nagpal K, Foote D, Liu Y, et al. Development and validation of a deep learning algorithm for improving Gleason scoring of prostate cancer [published correction appears in NPJ Digit Med. 2019 Nov 19;2:113]. NPJ Digit Med. 2019;2:48. Published 2019 Jun 7. doi:10.1038/s41746-019-0112-2
28. Nam JG, Park S, Hwang EJ, et al. Development and validation of deep learning-based automatic detection algorithm for malignant pulmonary nodules on chest radiographs. Radiology. 2019;290(1):218-228. doi:10.1148/radiol.2018180237
29. Schwalbe N, Wahl B. Artificial intelligence and the future of global health. Lancet. 2020;395(10236):1579-1586. doi:10.1016/S0140-6736(20)30226-9
30. Bai HX, Wang R, Xiong Z, et al. Artificial intelligence augmentation of radiologist performance in distinguishing COVID-19 from pneumonia of other origin at chest CT [published correction appears in Radiology. 2021 Apr;299(1):E225]. Radiology. 2020;296(3):E156-E165. doi:10.1148/radiol.2020201491
31. Li L, Qin L, Xu Z, et al. Using artificial intelligence to detect COVID-19 and community-acquired pneumonia based on pulmonary CT: evaluation of the diagnostic accuracy. Radiology. 2020;296(2):E65-E71. doi:10.1148/radiol.2020200905
32. Serag A, Ion-Margineanu A, Qureshi H, et al. Translational AI and deep learning in diagnostic pathology. Front Med (Lausanne). 2019;6:185. Published 2019 Oct 1. doi:10.3389/fmed.2019.00185
33. Wang D, Khosla A, Gargeya R, Irshad H, Beck AH. Deep learning for identifying metastatic breast cancer. ArXiv. 2016 June 18:arXiv:1606.05718v1. Published online June 18, 2016. Accessed September 15, 2021. http://arxiv.org/abs/1606.05718
34. Alabdulkareem A. Artificial intelligence and dermatologists: friends or foes? J Dermatology Dermatol Surg. 2019;23(2):57-60. doi:10.4103/jdds.jdds_19_19
35. Mollalo A, Mao L, Rashidi P, Glass GE. A GIS-based artificial neural network model for spatial distribution of tuberculosis across the continental United States. Int J Environ Res Public Health. 2019;16(1):157. Published 2019 Jan 8. doi:10.3390/ijerph16010157
36. Haddawy P, Hasan AHMI, Kasantikul R, et al. Spatiotemporal Bayesian networks for malaria prediction. Artif Intell Med. 2018;84:127-138. doi:10.1016/j.artmed.2017.12.002
37. Laureano-Rosario AE, Duncan AP, Mendez-Lazaro PA, et al. Application of artificial neural networks for dengue fever outbreak predictions in the northwest coast of Yucatan, Mexico and San Juan, Puerto Rico. Trop Med Infect Dis. 2018;3(1):5. Published 2018 Jan 5. doi:10.3390/tropicalmed3010005
38. Buczak AL, Koshute PT, Babin SM, Feighner BH, Lewis SH. A data-driven epidemiological prediction method for dengue outbreaks using local and remote sensing data. BMC Med Inform Decis Mak. 2012;12:124. Published 2012 Nov 5. doi:10.1186/1472-6947-12-124
39. Scavuzzo JM, Trucco F, Espinosa M, et al. Modeling dengue vector population using remotely sensed data and machine learning. Acta Trop. 2018;185:167-175. doi:10.1016/j.actatropica.2018.05.003
40. Xue H, Bai Y, Hu H, Liang H. Influenza activity surveillance based on multiple regression model and artificial neural network. IEEE Access. 2018;6:563-575. doi:10.1109/ACCESS.2017.2771798
41. Jiang D, Hao M, Ding F, Fu J, Li M. Mapping the transmission risk of Zika virus using machine learning models. Acta Trop. 2018;185:391-399. doi:10.1016/j.actatropica.2018.06.021
42. Bragazzi NL, Dai H, Damiani G, Behzadifar M, Martini M, Wu J. How big data and artificial intelligence can help better manage the COVID-19 pandemic. Int J Environ Res Public Health. 2020;17(9):3176. Published 2020 May 2. doi:10.3390/ijerph17093176
43. Lake IR, Colón-González FJ, Barker GC, Morbey RA, Smith GE, Elliot AJ. Machine learning to refine decision making within a syndromic surveillance service. BMC Public Health. 2019;19(1):559. Published 2019 May 14. doi:10.1186/s12889-019-6916-9
44. Khan OF, Bebb G, Alimohamed NA. Artificial intelligence in medicine: what oncologists need to know about its potential-and its limitations. Oncol Exch. 2017;16(4):8-13. Accessed September 1, 2021. http://www.oncologyex.com/pdf/vol16_no4/feature_khan-ai.pdf
45. Badano LP, Keller DM, Muraru D, Torlasco C, Parati G. Artificial intelligence and cardiovascular imaging: A win-win combination. Anatol J Cardiol. 2020;24(4):214-223. doi:10.14744/AnatolJCardiol.2020.94491
46. Murdoch TB, Detsky AS. The inevitable application of big data to health care. JAMA. 2013;309(13):1351-1352. doi:10.1001/jama.2013.393
47. Greatbatch O, Garrett A, Snape K. The impact of artificial intelligence on the current and future practice of clinical cancer genomics. Genet Res (Camb). 2019;101:e9. Published 2019 Oct 31. doi:10.1017/S0016672319000089
48. Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med. 2019;25(1):44-56. doi:10.1038/s41591-018-0300-7
49. Vollmer S, Mateen BA, Bohner G, et al. Machine learning and artificial intelligence research for patient benefit: 20 critical questions on transparency, replicability, ethics, and effectiveness [published correction appears in BMJ. 2020 Apr 1;369:m1312]. BMJ. 2020;368:l6927. Published 2020 Mar 20. doi:10.1136/bmj.l6927
50. Lindsey R, Daluiski A, Chopra S, et al. Deep neural network improves fracture detection by clinicians. Proc Natl Acad Sci U S A. 2018;115(45):11591-11596. doi:10.1073/pnas.1806905115
51. Zech JR, Badgeley MA, Liu M, Costa AB, Titano JJ, Oermann EK. Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: a cross-sectional study. PLoS Med. 2018;15(11):e1002683. doi:10.1371/journal.pmed.1002683
52. Lakhani P, Sundaram B. Deep learning at chest radiography: automated classification of pulmonary tuberculosis by using convolutional neural networks. Radiology. 2017;284(2):574-582. doi:10.1148/radiol.2017162326
53. Rajpurkar P, Joshi A, Pareek A, et al. CheXpedition: investigating generalization challenges for translation of chest x-ray algorithms to the clinical setting. ArXiv. 2020 Feb 26:arXiv:2002.11379v2. Revised March 11, 2020. Accessed September 15, 2021. http://arxiv.org/abs/2002.11379
54. Salim M, Wåhlin E, Dembrower K, et al. External evaluation of 3 commercial artificial intelligence algorithms for independent assessment of screening mammograms. JAMA Oncol. 2020;6(10):1581-1588. doi:10.1001/jamaoncol.2020.3321
55. Arbabshirani MR, Fornwalt BK, Mongelluzzo GJ, et al. Advanced machine learning in action: identification of intracranial hemorrhage on computed tomography scans of the head with clinical workflow integration. NPJ Digit Med. 2018;1:9. doi:10.1038/s41746-017-0015-z
56. Sheth D, Giger ML. Artificial intelligence in the interpretation of breast cancer on MRI. J Magn Reson Imaging. 2020;51(5):1310-1324. doi:10.1002/jmri.26878
57. McKinney SM, Sieniek M, Godbole V, et al. International evaluation of an AI system for breast cancer screening. Nature. 2020;577(7788):89-94. doi:10.1038/s41586-019-1799-6
58. Booth AL, Abels E, McCaffrey P. Development of a prognostic model for mortality in COVID-19 infection using machine learning. Mod Pathol. 2021;34(3):522-531. doi:10.1038/s41379-020-00700-x
59. Xu B, Kocyigit D, Grimm R, Griffin BP, Cheng F. Applications of artificial intelligence in multimodality cardiovascular imaging: a state-of-the-art review. Prog Cardiovasc Dis. 2020;63(3):367-376. doi:10.1016/j.pcad.2020.03.003
60. Dey D, Slomka PJ, Leeson P, et al. Artificial intelligence in cardiovascular imaging: JACC state-of-the-art review. J Am Coll Cardiol. 2019;73(11):1317-1335. doi:10.1016/j.jacc.2018.12.054
61. Carewell Health. AI powered ECG diagnosis solutions. Accessed November 2, 2020. https://www.carewellhealth.com/products_aiecg.html
62. Strodthoff N, Strodthoff C. Detecting and interpreting myocardial infarction using fully convolutional neural networks. Physiol Meas. 2019;40(1):015001. doi:10.1088/1361-6579/aaf34d
63. Hannun AY, Rajpurkar P, Haghpanahi M, et al. Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network. Nat Med. 2019;25(1):65-69. doi:10.1038/s41591-018-0268-3
64. Kwon JM, Jeon KH, Kim HM, et al. Comparing the performance of artificial intelligence and conventional diagnosis criteria for detecting left ventricular hypertrophy using electrocardiography. Europace. 2020;22(3):412-419. doi:10.1093/europace/euz324
65. Eko. FDA clears Eko’s AFib and heart murmur detection algorithms, making it the first AI-powered stethoscope to screen for serious heart conditions [press release]. Published January 28, 2020. Accessed September 15, 2021. https://www.businesswire.com/news/home/20200128005232/en/FDA-Clears-Eko’s-AFib-and-Heart-Murmur-Detection-Algorithms-Making-It-the-First-AI-Powered-Stethoscope-to-Screen-for-Serious-Heart-Conditions
66. Cruz-Roa A, Gilmore H, Basavanhally A, et al. Accurate and reproducible invasive breast cancer detection in whole-slide images: a deep learning approach for quantifying tumor extent. Sci Rep. 2017;7:46450. doi:10.1038/srep46450
67. Acs B, Rantalainen M, Hartman J. Artificial intelligence as the next step towards precision pathology. J Intern Med. 2020;288(1):62-81. doi:10.1111/joim.13030
68. Mobadersany P, Yousefi S, Amgad M, et al. Predicting cancer outcomes from histology and genomics using convolutional networks. Proc Natl Acad Sci U S A. 2018;115(13):E2970-E2979. doi:10.1073/pnas.1717139115
69. Wang X, Janowczyk A, Zhou Y, et al. Prediction of recurrence in early stage non-small cell lung cancer using computer extracted nuclear features from digital H&E images. Sci Rep. 2017;7:13543. doi:10.1038/s41598-017-13773-7
70. Kulkarni PM, Robinson EJ, Pradhan JS, et al. Deep learning based on standard H&E images of primary melanoma tumors identifies patients at risk for visceral recurrence and death. Clin Cancer Res. 2020;26(5):1126-1134. doi:10.1158/1078-0432.CCR-19-1495
71. Du XL, Li WB, Hu BJ. Application of artificial intelligence in ophthalmology. Int J Ophthalmol. 2018;11(9):1555-1561. doi:10.18240/ijo.2018.09.21
72. Gunasekeran DV, Wong TY. Artificial intelligence in ophthalmology in 2020: a technology on the cusp for translation and implementation. Asia Pac J Ophthalmol (Phila). 2020;9(2):61-66. doi:10.1097/01.APO.0000656984.56467.2c
73. Ting DSW, Pasquale LR, Peng L, et al. Artificial intelligence and deep learning in ophthalmology. Br J Ophthalmol. 2019;103(2):167-175. doi:10.1136/bjophthalmol-2018-313173
74. Gulshan V, Peng L, Coram M, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA. 2016;316(22):2402-2410. doi:10.1001/jama.2016.17216
75. US Food and Drug Administration. FDA permits marketing of artificial intelligence-based device to detect certain diabetes-related eye problems [press release]. Published April 11, 2018. Accessed September 15, 2021. https://www.fda.gov/news-events/press-announcements/fda-permits-marketing-artificial-intelligence-based-device-detect-certain-diabetes-related-eye
76. Long E, Chen J, Wu X, et al. Artificial intelligence manages congenital cataract with individualized prediction and telehealth computing. NPJ Digit Med. 2020;3:112. doi:10.1038/s41746-020-00319-x
77. De Fauw J, Ledsam JR, Romera-Paredes B, et al. Clinically applicable deep learning for diagnosis and referral in retinal disease. Nat Med. 2018;24(9):1342-1350. doi:10.1038/s41591-018-0107-6
78. Esteva A, Kuprel B, Novoa RA, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542(7639):115-118. doi:10.1038/nature21056
79. Brinker TJ, Hekler A, Enk AH, et al. Deep neural networks are superior to dermatologists in melanoma image classification. Eur J Cancer. 2019;119:11-17. doi:10.1016/j.ejca.2019.05.023
80. Brinker TJ, Hekler A, Enk AH, et al. A convolutional neural network trained with dermoscopic images performed on par with 145 dermatologists in a clinical melanoma image classification task. Eur J Cancer. 2019;111:148-154. doi:10.1016/j.ejca.2019.02.005
81. Haenssle HA, Fink C, Schneiderbauer R, et al. Man against machine: diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists. Ann Oncol. 2018;29(8):1836-1842. doi:10.1093/annonc/mdy166
82. Li CX, Shen CB, Xue K, et al. Artificial intelligence in dermatology: past, present, and future. Chin Med J (Engl). 2019;132(17):2017-2020. doi:10.1097/CM9.0000000000000372
83. Tschandl P, Codella N, Akay BN, et al. Comparison of the accuracy of human readers versus machine-learning algorithms for pigmented skin lesion classification: an open, web-based, international, diagnostic study. Lancet Oncol. 2019;20(7):938-947. doi:10.1016/S1470-2045(19)30333-X
84. Han SS, Park I, Eun Chang SE, et al. Augmented intelligence dermatology: deep neural networks empower medical professionals in diagnosing skin cancer and predicting treatment options for 134 skin disorders. J Invest Dermatol. 2020;140(9):1753-1761. doi:10.1016/j.jid.2020.01.019
85. Freeman K, Dinnes J, Chuchu N, et al. Algorithm based smartphone apps to assess risk of skin cancer in adults: systematic review of diagnostic accuracy studies [published correction appears in BMJ. 2020 Feb 25;368:m645]. BMJ. 2020;368:m127. Published 2020 Feb 10. doi:10.1136/bmj.m127
86. Chen YC, Ke WC, Chiu HW. Risk classification of cancer survival using ANN with gene expression data from multiple laboratories. Comput Biol Med. 2014;48:1-7. doi:10.1016/j.compbiomed.2014.02.006
87. Kim W, Kim KS, Lee JE, et al. Development of novel breast cancer recurrence prediction model using support vector machine. J Breast Cancer. 2012;15(2):230-238. doi:10.4048/jbc.2012.15.2.230
88. Merath K, Hyer JM, Mehta R, et al. Use of machine learning for prediction of patient risk of postoperative complications after liver, pancreatic, and colorectal surgery. J Gastrointest Surg. 2020;24(8):1843-1851. doi:10.1007/s11605-019-04338-2
89. Santos-García G, Varela G, Novoa N, Jiménez MF. Prediction of postoperative morbidity after lung resection using an artificial neural network ensemble. Artif Intell Med. 2004;30(1):61-69. doi:10.1016/S0933-3657(03)00059-9
90. Ibragimov B, Xing L. Segmentation of organs-at-risks in head and neck CT images using convolutional neural networks. Med Phys. 2017;44(2):547-557. doi:10.1002/mp.12045
91. Lou B, Doken S, Zhuang T, et al. An image-based deep learning framework for individualizing radiotherapy dose. Lancet Digit Health. 2019;1(3):e136-e147. doi:10.1016/S2589-7500(19)30058-5
92. Xu J, Yang P, Xue S, et al. Translating cancer genomics into precision medicine with artificial intelligence: applications, challenges and future perspectives. Hum Genet. 2019;138(2):109-124. doi:10.1007/s00439-019-01970-5
93. Patel NM, Michelini VV, Snell JM, et al. Enhancing next‐generation sequencing‐guided cancer care through cognitive computing. Oncologist. 2018;23(2):179-185. doi:10.1634/theoncologist.2017-0170
94. Le Berre C, Sandborn WJ, Aridhi S, et al. Application of artificial intelligence to gastroenterology and hepatology. Gastroenterology. 2020;158(1):76-94.e2. doi:10.1053/j.gastro.2019.08.058
95. Yang YJ, Bang CS. Application of artificial intelligence in gastroenterology. World J Gastroenterol. 2019;25(14):1666-1683. doi:10.3748/wjg.v25.i14.1666
96. Wang P, Berzin TM, Glissen Brown JR, et al. Real-time automatic detection system increases colonoscopic polyp and adenoma detection rates: a prospective randomised controlled study. Gut. 2019;68(10):1813-1819. doi:10.1136/gutjnl-2018-317500
97. Gupta R, Krishnam SP, Schaefer PW, Lev MH, Gonzalez RG. An East Coast perspective on artificial intelligence and machine learning: part 2: ischemic stroke imaging and triage. Neuroimaging Clin N Am. 2020;30(4):467-478. doi:10.1016/j.nic.2020.08.002
98. Beli M, Bobi V, Badža M, Šolaja N, Duri-Jovii M, Kosti VS. Artificial intelligence for assisting diagnostics and assessment of Parkinson’s disease—a review. Clin Neurol Neurosurg. 2019;184:105442. doi:10.1016/j.clineuro.2019.105442
99. An S, Kang C, Lee HW. Artificial intelligence and computational approaches for epilepsy. J Epilepsy Res. 2020;10(1):8-17. doi:10.14581/jer.20003
100. Pavel AM, Rennie JM, de Vries LS, et al. A machine-learning algorithm for neonatal seizure recognition: a multicentre, randomised, controlled trial. Lancet Child Adolesc Health. 2020;4(10):740-749. doi:10.1016/S2352-4642(20)30239-X
101. Afzal HMR, Luo S, Ramadan S, Lechner-Scott J. The emerging role of artificial intelligence in multiple sclerosis imaging [published online ahead of print, 2020 Oct 28]. Mult Scler. 2020;1352458520966298. doi:10.1177/1352458520966298
102. Bouton CE. Restoring movement in paralysis with a bioelectronic neural bypass approach: current state and future directions. Cold Spring Harb Perspect Med. 2019;9(11):a034306. doi:10.1101/cshperspect.a034306
103. Durstewitz D, Koppe G, Meyer-Lindenberg A. Deep neural networks in psychiatry. Mol Psychiatry. 2019;24(11):1583-1598. doi:10.1038/s41380-019-0365-9
104. Fonseka TM, Bhat V, Kennedy SH. The utility of artificial intelligence in suicide risk prediction and the management of suicidal behaviors. Aust N Z J Psychiatry. 2019;53(10):954-964. doi:10.1177/0004867419864428
105. Kessler RC, Hwang I, Hoffmire CA, et al. Developing a practical suicide risk prediction model for targeting high-risk patients in the Veterans Health Administration. Int J Methods Psychiatr Res. 2017;26(3):e1575. doi:10.1002/mpr.1575
106. Kessler RC, Bauer MS, Bishop TM, et al. Using administrative data to predict suicide after psychiatric hospitalization in the Veterans Health Administration System. Front Psychiatry. 2020;11:390. doi:10.3389/fpsyt.2020.00390
107. Kessler RC, van Loo HM, Wardenaar KJ, et al. Testing a machine-learning algorithm to predict the persistence and severity of major depressive disorder from baseline self-reports. Mol Psychiatry. 2016;21(10):1366-1371. doi:10.1038/mp.2015.198
108. Horng S, Sontag DA, Halpern Y, Jernite Y, Shapiro NI, Nathanson LA. Creating an automated trigger for sepsis clinical decision support at emergency department triage using machine learning. PLoS One. 2017;12(4):e0174708. doi:10.1371/journal.pone.0174708
109. Soffer S, Klang E, Barash Y, Grossman E, Zimlichman E. Predicting in-hospital mortality at admission to the medical ward: a big-data machine learning model. Am J Med. 2021;134(2):227-234.e4. doi:10.1016/j.amjmed.2020.07.014
110. Labovitz DL, Shafner L, Reyes Gil M, Virmani D, Hanina A. Using artificial intelligence to reduce the risk of nonadherence in patients on anticoagulation therapy. Stroke. 2017;48(5):1416-1419. doi:10.1161/STROKEAHA.116.016281
111. Forlenza GP. Use of artificial intelligence to improve diabetes outcomes in patients using multiple daily injections therapy. Diabetes Technol Ther. 2019;21(S2):S24-S28. doi:10.1089/dia.2019.0077
112. Poser CM. CT scan and the practice of neurology. Arch Neurol. 1977;34(2):132. doi:10.1001/archneur.1977.00500140086023
113. Angus DC. Randomized clinical trials of artificial intelligence. JAMA. 2020;323(11):1043-1045. doi:10.1001/jama.2020.1039
114. Topol EJ. Welcoming new guidelines for AI clinical research. Nat Med. 2020;26(9):1318-1320. doi:10.1038/s41591-020-1042-x
115. Collins GS, Moons KGM. Reporting of artificial intelligence prediction models. Lancet. 2019;393(10181):1577-1579. doi:10.1016/S0140-6736(19)30037-6
116. Cruz Rivera S, Liu X, Chan AW, et al. Guidelines for clinical trial protocols for interventions involving artificial intelligence: the SPIRIT-AI extension. Nat Med. 2020;26(9):1351-1363. doi:10.1038/s41591-020-1037-7
117. Liu X, Cruz Rivera S, Moher D, Calvert MJ, Denniston AK; SPIRIT-AI and CONSORT-AI Working Group. Reporting guidelines for clinical trial reports for interventions involving artificial intelligence: the CONSORT-AI extension. Nat Med. 2020;26(9):1364-1374. doi:10.1038/s41591-020-1034-x
118. McCulloch WS, Pitts W. A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys. 1943;5(4):115-133. doi:10.1007/BF02478259
119. Samuel AL. Some studies in machine learning using the game of Checkers. IBM J Res Dev. 1959;3(3):535-554. Accessed September 15, 2021. https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.368.2254
120. Sonoda M, Takano M, Miyahara J, Kato H. Computed radiography utilizing scanning laser stimulated luminescence. Radiology. 1983;148(3):833-838. doi:10.1148/radiology.148.3.6878707
121. Dechter R. Learning while searching in constraint-satisfaction-problems. AAAI’86: proceedings of the fifth AAAI national conference on artificial intelligence. Published 1986. Accessed September 15, 2021. https://www.aaai.org/Papers/AAAI/1986/AAAI86-029.pdf
122. Le Cun Y, Jackel LD, Boser B, et al. Handwritten digit recognition: applications of neural network chips and automatic learning. IEEE Commun Mag. 1989;27(11):41-46. doi:10.1109/35.41400
123. US Food and Drug Administration. FDA allows marketing of first whole slide imaging system for digital pathology [press release]. Published April 12, 2017. Accessed September 15, 2021. https://www.fda.gov/news-events/press-announcements/fda-allows-marketing-first-whole-slide-imaging-system-digital-pathology
Artificial Intelligence (AI) was first described in 1956 and refers to machines having the ability to learn as they receive and process information, resulting in the ability to “think” like humans.1 AI’s impact in medicine is increasing; currently, at least 29 AI medical devices and algorithms are approved by the US Food and Drug Administration (FDA) in a variety of areas, including radiograph interpretation, managing glucose levels in patients with diabetes mellitus, analyzing electrocardiograms (ECGs), and diagnosing sleep disorders among others.2 Significantly, in 2020, the Centers for Medicare and Medicaid Services (CMS) announced the first reimbursement to hospitals for an AI platform, a model for early detection of strokes.3 AI is rapidly becoming an integral part of health care, and its role will only increase in the future (Table).
As knowledge in medicine is expanding exponentially, AI has great potential to assist with handling complex patient care data. The concept of exponential growth is not a natural one. As Bini described, with exponential growth the volume of knowledge amassed over the past 10 years will now occur in perhaps only 1 year.1 Likewise, equivalent advances over the past year may take just a few months. This phenomenon is partly due to the law of accelerating returns, which states that advances feed on themselves, continually increasing the rate of further advances.4 The volume of medical data doubles every 2 to 5 years.5 Fortunately, the field of AI is growing exponentially as well and can help health care practitioners (HCPs) keep pace, allowing the continued delivery of effective health care.
In this report, we review common terminology, principles, and general applications of AI, followed by current and potential applications of AI for selected medical specialties. Finally, we discuss AI’s future in health care, along with potential risks and pitfalls.
AI Overview
AI refers to machine programs that can “learn” or think based on past experiences. This functionality contrasts with simple rules-based programming available to health care for years. An example of rules-based programming is the warfarindosing.org website developed by Barnes-Jewish Hospital at Washington University Medical Center, which guides initial warfarin dosing.6,7 The prescriber inputs detailed patient information, including age, sex, height, weight, tobacco history, medications, laboratory results, and genotype if available. The application then calculates recommended warfarin dosing regimens to avoid over- or underanticoagulation. While the dosing algorithm may be complex, it depends entirely on preprogrammed rules. The program does not learn to reach its conclusions and recommendations from patient data.
In contrast, one of the most common subsets of AI is machine learning (ML). ML describes a program that “learns from experience and improves its performance as it learns.”1 With ML, the computer is initially provided with a training data set—data with known outcomes or labels. Because the initial data are input from known samples, this type of AI is known as supervised learning.8-10 As an example, we recently reported using ML to diagnose various types of cancer from pathology slides.11 In one experiment, we captured images of colon adenocarcinoma and normal colon (these 2 groups represent the training data set). Unlike traditional programming, we did not define characteristics that would differentiate colon cancer from normal; rather, the machine learned these characteristics independently by assessing the labeled images provided. A second data set (the validation data set) was used to evaluate the program and fine-tune the ML training model’s parameters. Finally, the program was presented with new images of cancer and normal cases for final assessment of accuracy (test data set). Our program learned to recognize differences from the images provided and was able to differentiate normal and cancer images with > 95% accuracy.
Advances in computer processing have allowed for the development of artificial neural networks (ANNs). While there are several types of ANNs, the most common types used for image classification and segmentation are known as convolutional neural networks (CNNs).9,12-14 The programs are designed to work similar to the human brain, specifically the visual cortex.15,16 As data are acquired, they are processed by various layers in the program. Much like neurons in the brain, one layer decides whether to advance information to the next.13,14 CNNs can be many layers deep, leading to the term deep learning: “computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction.”1,13,17
ANNs can process larger volumes of data. This advance has led to the development of unstructured or unsupervised learning. With this type of learning, imputing defined features (ie, predetermined answers) of the training data set described above is no longer required.1,8,10,14 The advantage of unsupervised learning is that the program can be presented raw data and extract meaningful interpretation without human input, often with less bias than may exist with supervised learning.1,18 If shown enough data, the program can extract relevant features to make conclusions independently without predefined definitions, potentially uncovering markers not previously known. For example, several studies have used unsupervised learning to search patient data to assess readmission risks of patients with congestive heart failure.10,19,20 AI compiled features independently and not previously defined, predicting patients at greater risk for readmission superior to traditional methods.
A more detailed description of the various terminologies and techniques of AI is beyond the scope of this review.9,10,17,21 However, in this basic overview, we describe 4 general areas that AI impacts health care (Figure).
Health Care Applications
Image analysis has seen the most AI health care applications.8,15 AI has shown potential in interpreting many types of medical images, including pathology slides, radiographs of various types, retina and other eye scans, and photographs of skin lesions. Many studies have demonstrated that AI can interpret these images as accurately as or even better than experienced clinicians.9,13,22-29 Studies have suggested AI interpretation of radiographs may better distinguish patients infected with COVID-19 from other causes of pneumonia, and AI interpretation of pathology slides may detect specific genetic mutations not previously identified without additional molecular tests.11,14,23,24,30-32
The second area in which AI can impact health care is improving workflow and efficiency. AI has improved surgery scheduling, saving significant revenue, and decreased patient wait times for appointments.1 AI can screen and triage radiographs, allowing attention to be directed to critical patients. This use would be valuable in many busy clinical settings, such as the recent COVID-19 pandemic.8,23 Similarly, AI can screen retina images to prioritize urgent conditions.25 AI has improved pathologists’ efficiency when used to detect breast metastases.33 Finally, AI may reduce medical errors, thereby ensuring patient safety.8,9,34
A third health care benefit of AI is in public health and epidemiology. AI can assist with clinical decision-making and diagnoses in low-income countries and areas with limited health care resources and personnel.25,29 AI can improve identification of infectious outbreaks, such as tuberculosis, malaria, dengue fever, and influenza.29,35-40 AI has been used to predict transmission patterns of the Zika virus and the current COVID-19 pandemic.41,42 Applications can stratify the risk of outbreaks based on multiple factors, including age, income, race, atypical geographic clusters, and seasonal factors like rainfall and temperature.35,36,38,43 AI has been used to assess morbidity and mortality, such as predicting disease severity with malaria and identifying treatment failures in tuberculosis.29
Finally, AI can dramatically impact health care due to processing large data sets or disconnected volumes of patient information—so-called big data.44-46 An example is the widespread use of electronic health records (EHRs) such as the Computerized Patient Record System used in Veteran Affairs medical centers (VAMCs). Much of patient information exists in written text: HCP notes, laboratory and radiology reports, medication records, etc. Natural language processing (NLP) allows platforms to sort through extensive volumes of data on complex patients at rates much faster than human capability, which has great potential to assist with diagnosis and treatment decisions.9
Medical literature is being produced at rates that exceed our ability to digest. More than 200,000 cancer-related articles were published in 2019 alone.14 NLP capabilities of AI have the potential to rapidly sort through this extensive medical literature and relate specific verbiage in patient records guiding therapy.46 IBM Watson, a supercomputer based on ML and NLP, demonstrates this concept with many potential applications, only some of which relate to health care.1,9 Watson has an oncology component to assimilate multiple aspects of patient care, including clinical notes, pathology results, radiograph findings, staging, and a tumor’s genetic profile. It coordinates these inputs from the EHR and mines medical literature and research databases to recommend treatment options.1,46 AI can assess and compile far greater patient data and therapeutic options than would be feasible by individual clinicians, thus providing customized patient care.47 Watson has partnered with numerous medical centers, including MD Anderson Cancer Center and Memorial Sloan Kettering Cancer Center, with variable success.44,47-49 While the full potential of Watson appears not yet realized, these AI-driven approaches will likely play an important role in leveraging the hidden value in the expanding volume of health care information.
Medical Specialty Applications
Radiology
Currently > 70% of FDA-approved AI medical devices are in the field of radiology.2 Most radiology departments have used AI-friendly digital imaging for years, such as the picture archiving and communication systems used by numerous health care systems, including VAMCs.2,15 Gray-scale images common in radiology lend themselves to standardization, although AI is not limited to black-and- white image interpretation.15
An abundance of literature describes plain radiograph interpretation using AI. One FDA-approved platform improved X-ray diagnosis of wrist fractures when used by emergency medicine clinicians.2,50 AI has been applied to chest X-ray (CXR) interpretation of many conditions, including pneumonia, tuberculosis, malignant lung lesions, and COVID-19.23,25,28,44,51-53 For example, Nam and colleagues suggested AI is better at diagnosing malignant pulmonary nodules from CXRs than are trained radiologists.28
In addition to plain radiographs, AI has been applied to many other imaging technologies, including ultrasounds, positron emission tomography, mammograms, computed tomography (CT), and magnetic resonance imaging (MRI).15,26,44,48,54-56 A large study demonstrated that ML platforms significantly reduced the time to diagnose intracranial hemorrhages on CT and identified subtle hemorrhages missed by radiologists.55 Other studies have claimed that AI programs may be better than radiologists in detecting cancer in screening mammograms, and 3 FDA-approved devices focus on mammogram interpretation.2,15,54,57 There is also great interest in MRI applications to detect and predict prognosis for breast cancer based on imaging findings.21,56
Aside from providing accurate diagnoses, other studies focus on AI radiograph interpretation to assist with patient screening, triage, improving time to final diagnosis, providing a rapid “second opinion,” and even monitoring disease progression and offering insights into prognosis.8,21,23,52,55,56,58 These features help in busy urban centers but may play an even greater role in areas with limited access to health care or trained specialists such as radiologists.52
Cardiology
Cardiology has the second highest number of FDA-approved AI applications.2 Many cardiology AI platforms involve image analysis, as described in several recent reviews.45,59,60 AI has been applied to echocardiography to measure ejection fractions, detect valvular disease, and assess heart failure from hypertrophic and restrictive cardiomyopathy and amyloidosis.45,48,59 Applications for cardiac CT scans and CT angiography have successfully quantified both calcified and noncalcified coronary artery plaques and lumen assessments, assessed myocardial perfusion, and performed coronary artery calcium scoring.45,59,60 Likewise, AI applications for cardiac MRI have been used to quantitate ejection fraction, large vessel flow assessment, and cardiac scar burden.45,59
For years ECG devices have provided interpretation with limited accuracy using preprogrammed parameters.48 However, the application of AI allows ECG interpretation on par with trained cardiologists. Numerous such AI applications exist, and 2 FDA-approved devices perform ECG interpretation.2,61-64 One of these devices incorporates an AI-powered stethoscope to detect atrial fibrillation and heart murmurs.65
Pathology
The advancement of whole slide imaging, wherein entire slides can be scanned and digitized at high speed and resolution, creates great potential for AI applications in pathology.12,24,32,33,66 A landmark study demonstrating the potential of AI for assessing whole slide imaging examined sentinel lymph node metastases in patients with breast cancer.22 Multiple algorithms in the study demonstrated that AI was equivalent or better than pathologists in detecting metastases, especially when the pathologists were time-constrained consistent with a normal working environment. Significantly, the most accurate and efficient diagnoses were achieved when the pathologist and AI interpretations were used together.22,33
AI has shown promise in diagnosing many other entities, including cancers of the prostate (including Gleason scoring), lung, colon, breast, and skin.11,12,24,27,32,67 In addition, AI has shown great potential in scoring biomarkers important for prognosis and treatment, such as immunohistochemistry (IHC) labeling of Ki-67 and PD-L1.32 Pathologists can have difficulty classifying certain tumors or determining the site of origin for metastases, often having to rely on IHC with limited success. The unique features of image analysis with AI have the potential to assist in classifying difficult tumors and identifying sites of origin for metastatic disease based on morphology alone.11
Oncology depends heavily on molecular pathology testing to dictate treatment options and determine prognosis. Preliminary studies suggest that AI interpretation alone has the potential to delineate whether certain molecular mutations are present in tumors from various sites.11,14,24,32 One study combined histology and genomic results for AI interpretation that improved prognostic predictions.68 In addition, AI analysis may have potential in predicting tumor recurrence or prognosis based on cellular features, as demonstrated for lung cancer and melanoma.67,69,70
Ophthalmology
AI applications for ophthalmology have focused on diabetic retinopathy, age-related macular degeneration, glaucoma, retinopathy of prematurity, age-related and congenital cataracts, and retinal vein occlusion.71-73 Diabetic retinopathy is a leading cause of blindness and has been studied by numerous platforms with good success, most having used color fundus photography.71,72 One study showed AI could diagnose diabetic retinopathy and diabetic macular edema with specificities similar to ophthalmologists.74 In 2018, the FDA approved the AI platform IDx-DR. This diagnostic system classifies retinal images and recommends referral for patients determined to have “more than mild diabetic retinopathy” and reexamination within a year for other patients.8,75 Significantly, the platform recommendations do not require confirmation by a clinician.8
AI has been applied to other modalities in ophthalmology such as optical coherence tomography (OCT) to diagnose retinal disease and to predict appropriate management of congenital cataracts.25,73,76 For example, an AI application using OCT has been demonstrated to match or exceed the accuracy of retinal experts in diagnosing and triaging patients with a variety of retinal pathologies, including patients needing urgent referrals.77
Dermatology
Multiple studies demonstrate AI performs at least equal to experienced dermatologists in differentiating selected skin lesions.78-81 For example, Esteva and colleagues demonstrated AI could differentiate keratinocyte carcinomas from benign seborrheic keratoses and malignant melanomas from benign nevi with accuracy equal to 21 board-certified dermatologists.78
AI is applicable to various imaging procedures common to dermatology, such as dermoscopy, very high-frequency ultrasound, and reflectance confocal microscopy.82 Several studies have demonstrated that AI interpretation compared favorably to dermatologists evaluating dermoscopy to assess melanocytic lesions.78-81,83
A limitation in these studies is that they differentiate only a few diagnoses.82 Furthermore, dermatologists have sensory input such as touch and visual examination under various conditions, something AI has yet to replicate.15,34,84 Also, most AI devices use no or limited clinical information.81 Dermatologists can recognize rarer conditions for which AI models may have had limited or no training.34 Nevertheless, a recent study assessed AI for the diagnosis of 134 separate skin disorders with promising results, including providing diagnoses with accuracy comparable to that of dermatologists and providing accurate treatment strategies.84 As Topol points out, most skin lesions are diagnosed in the primary care setting where AI can have a greater impact when used in conjunction with the clinical impression, especially where specialists are in limited supply.48,78
Finally, dermatology lends itself to using portable or smartphone applications (apps) wherein the user can photograph a lesion for analysis by AI algorithms to assess the need for further evaluation or make treatment recommendations.34,84,85 Although results from currently available apps are not encouraging, they may play a greater role as the technology advances.34,85
Oncology
Applications of AI in oncology include predicting prognosis for patients with cancer based on histologic and/or genetic information.14,68,86 Programs can predict the risk of complications before and recurrence risks after surgery for malignancies.44,87-89 AI can also assist in treatment planning and predict treatment failure with radiation therapy.90,91
AI has great potential in processing the large volumes of patient data in cancer genomics. Next-generation sequencing has allowed for the identification of millions of DNA sequences in a single tumor to detect genetic anomalies.92 Thousands of mutations can be found in individual tumor samples, and processing this information and determining its significance can be beyond human capability.14 We know little about the effects of various mutation combinations, and most tumors have a heterogeneous molecular profile among different cell populations.14,93 The presence or absence of various mutations can have diagnostic, prognostic, and therapeutic implications.93 AI has great potential to sort through these complex data and identify actionable findings.
More than 200,000 cancer-related articles were published in 2019, and publications in the field of cancer genomics are increasing exponentially.14,92,93 Patel and colleagues assessed the utility of IBM Watson for Genomics against results from a molecular tumor board.93 Watson for Genomics identified potentially significant mutations not identified by the tumor board in 32% of patients. Most mutations were related to new clinical trials not yet added to the tumor board watch list, demonstrating the role AI will have in processing the large volume of genetic data required to deliver personalized medicine moving forward.
Gastroenterology
AI has shown promise in predicting risk or outcomes based on clinical parameters in various common gastroenterology problems, including gastric reflux, acute pancreatitis, gastrointestinal bleeding, celiac disease, and inflammatory bowel disease.94,95 AI endoscopic analysis has demonstrated potential in assessing Barrett’s esophagus, gastric Helicobacter pylori infections, gastric atrophy, and gastric intestinal metaplasia.95 Applications have been used to assess esophageal, gastric, and colonic malignancies, including depth of invasion based on endoscopic images.95 Finally, studies have evaluated AI to assess small colon polyps during colonoscopy, including differentiating benign and premalignant polyps with success comparable to gastroenterologists.94,95 AI has been shown to increase the speed and accuracy of gastroenterologists in detecting small polyps during colonoscopy.48 In a prospective randomized study, colonoscopies performed using an AI device identified significantly more small adenomatous polyps than colonoscopies without AI.96
Neurology
It has been suggested that AI technologies are well suited for application in neurology due to the subtle presentation of many neurologic diseases.16 Viz LVO, the first AI platform to receive CMS approval for reimbursement in the diagnosis of strokes, analyzes CTs to detect early ischemic strokes and alerts the medical team, thus shortening time to treatment.3,97 Many other AI platforms in use or development apply CT and MRI to the early detection of strokes as well as to treatment and prognosis.9,97
AI technologies have been applied to neurodegenerative diseases, such as Alzheimer and Parkinson diseases.16,98 For example, several studies have evaluated patient movements in Parkinson disease for both early diagnosis and to assess response to treatment.98 These evaluations included assessment with both external cameras as well as wearable devices and smartphone apps.
AI has also been applied to seizure disorders, attempting to determine seizure type, localize the area of seizure onset, and address the challenges of identifying seizures in neonates.99,100 Other potential applications range from early detection and prognosis predictions for cases of multiple sclerosis to restoring movement in paralysis from a variety of conditions such as spinal cord injury.9,101,102
Mental Health
Due to the interactive nature of mental health care, the field has been slower to develop AI applications.18 With heavy reliance on textual information (eg, clinic notes, mood rating scales, and documentation of conversations), successful AI applications in this field will likely rely heavily on NLP.18 However, studies investigating the application of AI to mental health have also incorporated data such as brain imaging, smartphone monitoring, and social media platforms, such as Facebook and Twitter.18,103,104
The risk of suicide is higher in veteran patients, and ML algorithms have had limited success in predicting suicide risk in both veteran and nonveteran populations.104-106 While early models have low positive predictive values and low sensitivities, they still promise to be a useful tool in conjunction with traditional risk assessments.106 Kessler and colleagues suggest that combining multiple rather than single ML algorithms might lead to greater success.105,106
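The suggestion of combining algorithms can be illustrated with a simple majority-vote ensemble. The three "models" below are hypothetical stand-ins for trained ML algorithms, and every feature name and threshold is invented for illustration rather than drawn from any validated risk instrument.

```python
# Hypothetical majority-vote ensemble: each "model" stands in for a trained
# ML algorithm; all feature names and thresholds here are invented.
def model_a(patient):
    return patient["prior_attempts"] > 0

def model_b(patient):
    return patient["depression_score"] >= 15

def model_c(patient):
    return patient["recent_loss"] and patient["social_isolation"]

def ensemble_flag(patient, models=(model_a, model_b, model_c)):
    votes = sum(m(patient) for m in models)
    return votes >= 2  # flag for clinician follow-up only when a majority agrees

patient = {"prior_attempts": 1, "depression_score": 17,
           "recent_loss": False, "social_isolation": True}
print(ensemble_flag(patient))  # True: two of the three models agree
```

The point of the sketch is that an ensemble can flag a case that any single weak model, used alone, might handle inconsistently; the flag supplements, rather than replaces, a traditional clinical risk assessment.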
AI may assist in diagnosing other mental health disorders, including major depressive disorder, attention deficit hyperactivity disorder (ADHD), schizophrenia, posttraumatic stress disorder, and Alzheimer disease.103,104,107 These investigations are in the early stages with limited clinical applicability. However, 2 AI applications awaiting FDA approval relate to ADHD and opioid use.2 Furthermore, potential exists for AI to not only assist with prevention and diagnosis of ADHD, but also to identify optimal treatment options.2,103
General and Personalized Medicine
Additional AI applications include diagnosing patients with suspected sepsis, measuring liver iron concentrations, predicting hospital mortality at the time of admission, and more.2,108,109 AI can guide end-of-life decisions such as resuscitation status or whether to initiate mechanical ventilation.48
AI-driven smartphone apps can be beneficial to both patients and clinicians. Examples include predicting nonadherence to anticoagulation therapy, monitoring heart rhythms for atrial fibrillation or signs of hyperkalemia in patients with renal failure, and improving outcomes for patients with diabetes mellitus by decreasing glycemic variability and reducing hypoglycemia.8,48,110,111 The potential of AI applications in health care and personalized medicine is almost limitless.
Discussion
With ever-increasing expectations for all health care sectors to deliver timely, fiscally responsible, high-quality health care, AI has the potential for numerous impacts. AI can improve diagnostic accuracy, limit errors, and enhance patient safety, for example by assisting with prescription delivery.8,9,34 It can screen and triage patients, alerting clinicians to those needing more urgent evaluation.8,23,77,97 AI also may increase a clinician’s efficiency and speed in rendering a diagnosis.12,13,55,97 AI can provide a rapid second opinion, an ability especially beneficial in underserved areas with shortages of specialists.23,25,26,29,34 Similarly, AI may decrease the inter- and intraobserver variability common in many medical specialties.12,27,45 AI applications can also monitor disease progression, identify patients at greatest risk, and provide prognostic information.21,23,56,58 Finally, as described with applications using IBM Watson, AI can allow for an integrated approach to health care that is currently lacking.
We have described many reports suggesting AI can render diagnoses as well as or better than experienced clinicians, and speculation exists that AI will replace many roles currently performed by health care practitioners.9,26 However, most studies demonstrate that AI’s diagnostic benefits are best realized when used to supplement a clinician’s impression.8,22,30,33,52,54,56,69,84 AI is not likely to replace humans in health care in the foreseeable future. The technology can be likened to the impact that CT scans, developed in the 1970s, had on neurology. Before such detailed imaging was available, neurologists spent extensive time performing detailed physical examinations to render diagnoses and locate lesions before surgery. There was mistrust of the new technology and concern that CT scans would eliminate the need for neurologists.112 On the contrary, neurology is alive and well, frequently augmented by the technologies once speculated to replace it.
Commercial AI health care platforms represented a $2 billion industry in 2018 and are growing rapidly each year.13,32 Many AI products are offered ready for implementation for various tasks, including diagnostics, patient management, and improved efficiency. Others will likely be provided as templates suitable for modification to meet the specific needs of the facility, practice, or specialty for its patient population.
AI Risks and Limitations
AI has several risks and limitations. Although there is progress in explainable AI, at times we still struggle to understand how the output of machine learning algorithms was created.44,48 The many layers of a deep learning model determine their own criteria for reaching a conclusion, and these criteria can continually evolve. The parameters of deep learning are not preprogrammed, and there are too many individual data points to be extrapolated or deconvoluted for evaluation at our current level of knowledge.26,51 This apparent lack of constraints raises concern for patient safety and suggests that greater validation and continued scrutiny of validity are required.8,48 Efforts are underway to create explainable AI programs that make their processes more transparent, but such clarification is limited presently.14,26,48,77
Another challenge of AI is determining the amount of training data required to function optimally. Also, if the output describes multiple variables or diagnoses, is each equally valid?113 Furthermore, many AI applications look for a specific process, such as cancer diagnoses on CXRs; how coexisting conditions seen on CXRs, such as cardiomegaly, emphysema, or pneumonia, will affect the diagnosis needs to be considered.51,52 Zech and colleagues provide the example that diagnoses of pneumothorax are frequently rendered on CXRs with chest tubes already in place.51 They suggest that CNNs may develop a bias toward diagnosing pneumothorax based on the presence of chest tubes. Many current studies approach an issue in isolation, a situation not realistic in real-world clinical practice.26
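The chest tube example can be made concrete with a toy, single-feature "learner" on fabricated data. The feature names and cases below are invented; the point is only that when a confounder (the chest tube) tracks the label more reliably than the true radiographic sign, a naive learner prefers the confounder and then misses an untreated case.

```python
# Fabricated training cases: nearly every pneumothorax image also shows a
# chest tube, because treated patients dominate the archive.
train = [
    ({"chest_tube": 1, "lung_edge_visible": 1}, "pneumothorax"),
    ({"chest_tube": 1, "lung_edge_visible": 1}, "pneumothorax"),
    ({"chest_tube": 1, "lung_edge_visible": 0}, "pneumothorax"),  # noisy true sign
    ({"chest_tube": 0, "lung_edge_visible": 0}, "normal"),
    ({"chest_tube": 0, "lung_edge_visible": 0}, "normal"),
    ({"chest_tube": 0, "lung_edge_visible": 1}, "normal"),        # noisy true sign
]

def best_single_feature(train):
    # Toy "learner": pick the single feature whose value agrees with the
    # label most often on the training data.
    features = train[0][0].keys()
    def accuracy(f):
        return sum((x[f] == 1) == (y == "pneumothorax") for x, y in train) / len(train)
    return max(features, key=accuracy)

f = best_single_feature(train)
print(f)  # the confounder wins: "chest_tube"

# An untreated pneumothorax (true sign present, no tube yet) is then missed:
new_case = {"chest_tube": 0, "lung_edge_visible": 1}
print("pneumothorax" if new_case[f] else "normal")  # prints "normal": a miss
```

Real CNNs latch onto such shortcuts in far subtler ways, which is one reason prospective validation on cases the model was not curated toward remains essential.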
Most studies of AI have been retrospective, and the data used to train the programs are frequently preselected.13,26 The data are typically validated on available databases rather than on actual patients in the clinical setting, limiting confidence in the validity of AI output applied to real-world situations. To date, fewer than 12 prospective trials have been published comparing AI with traditional clinical care.13,114 Randomized prospective clinical trials are even fewer, with none yet reported from the United States.13,114 The results of several studies have been shown to diminish when repeated prospectively.114
The FDA has created a new category known as Software as a Medical Device and has a Digital Health Innovation Action Plan to regulate AI platforms. Still, the process of AI regulation is of necessity different from traditional approval processes and is continually evolving.8 The FDA approval process cannot account for the fact that the program’s parameters may continually evolve or adapt.2
Guidelines for investigating and reporting AI research with its unique attributes are being developed. Examples include the TRIPOD-ML statement and others.49,115 In September 2020, 2 publications addressed the paucity of gold-standard randomized clinical trials in clinical AI applications.116,117 The SPIRIT-AI statement expands on the original SPIRIT statement published in 2013 to guide minimal reporting standards for AI clinical trial protocols to promote transparency of design and methodology.116 Similarly, the CONSORT-AI extension, stemming from the original CONSORT statement in 1996, aims to ensure quality reporting of randomized controlled trials in AI.117
Another risk with AI is that while an individual physician’s mistake may adversely affect 1 patient, a single mistake in an AI algorithm could potentially affect thousands of patients.48 Also, AI programs developed for the patient population at one facility may not translate to another. Referred to as overfitting, this phenomenon relates to selection bias in training data sets.15,34,49,51,52 Studies have shown that programs trained on data that underrepresent certain group characteristics, such as age, sex, or race, may be less effective when applied to a population in which these characteristics have differing representations.8,48,49 This problem of underrepresentation has been demonstrated in programs interpreting pathology slides, radiographs, and skin lesions.15,32,51
Admittedly, most of these challenges are not specific to AI and existed in health care previously. Physicians make mistakes, treatments are sometimes used without adequate prospective studies, and medications are given without understanding their mechanism of action, much like AI-facilitated processes reach a conclusion that cannot be fully explained.48
Conclusions
The view that AI will dramatically impact health care in the coming years will likely prove true. However, much work is needed, especially given the paucity of the prospective clinical trials historically required in medical research. Any concern that AI will replace HCPs seems unwarranted. Early studies suggest that even AI programs that appear to exceed human interpretation perform best when working in cooperation with, and under the oversight of, clinicians. AI’s greatest potential appears to be its ability to augment the care delivered by health professionals, improving efficiency and accuracy; this promise should be anticipated with enthusiasm as the field moves forward at an exponential rate.
Acknowledgments
The authors thank Makenna G. Thomas for proofreading and review of the manuscript. This material is the result of work supported with resources and the use of facilities at the James A. Haley Veterans’ Hospital. This research has been approved by the James A. Haley Veterans’ Hospital Office of Communications and Media.
Artificial Intelligence (AI) was first described in 1956 and refers to machines having the ability to learn as they receive and process information, resulting in the ability to “think” like humans.1 AI’s impact in medicine is increasing; currently, at least 29 AI medical devices and algorithms are approved by the US Food and Drug Administration (FDA) in a variety of areas, including radiograph interpretation, managing glucose levels in patients with diabetes mellitus, analyzing electrocardiograms (ECGs), and diagnosing sleep disorders among others.2 Significantly, in 2020, the Centers for Medicare and Medicaid Services (CMS) announced the first reimbursement to hospitals for an AI platform, a model for early detection of strokes.3 AI is rapidly becoming an integral part of health care, and its role will only increase in the future (Table).
As knowledge in medicine is expanding exponentially, AI has great potential to assist with handling complex patient care data. The concept of exponential growth is not a natural one. As Bini described, with exponential growth the volume of knowledge amassed over the past 10 years will now occur in perhaps only 1 year.1 Likewise, equivalent advances over the past year may take just a few months. This phenomenon is partly due to the law of accelerating returns, which states that advances feed on themselves, continually increasing the rate of further advances.4 The volume of medical data doubles every 2 to 5 years.5 Fortunately, the field of AI is growing exponentially as well and can help health care practitioners (HCPs) keep pace, allowing the continued delivery of effective health care.
In this report, we review common terminology, principles, and general applications of AI, followed by current and potential applications of AI for selected medical specialties. Finally, we discuss AI’s future in health care, along with potential risks and pitfalls.
AI Overview
AI refers to machine programs that can “learn” or think based on past experiences. This functionality contrasts with simple rules-based programming available to health care for years. An example of rules-based programming is the warfarindosing.org website developed by Barnes-Jewish Hospital at Washington University Medical Center, which guides initial warfarin dosing.6,7 The prescriber inputs detailed patient information, including age, sex, height, weight, tobacco history, medications, laboratory results, and genotype if available. The application then calculates recommended warfarin dosing regimens to avoid over- or underanticoagulation. While the dosing algorithm may be complex, it depends entirely on preprogrammed rules. The program does not learn to reach its conclusions and recommendations from patient data.
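This distinction can be illustrated with a rules-based dosing calculator reduced to a few fixed branches. The sketch below is hypothetical and greatly simplified; it is not the algorithm used by warfarindosing.org, and its dose values and adjustments are invented placeholders. The point is that every branch is fixed in advance by the programmer and nothing is learned from data.

```python
def initial_warfarin_dose(age: int, weight_kg: float, takes_amiodarone: bool) -> float:
    """Illustrative rules-based dose estimate (mg/day).

    All values below are invented placeholders, NOT the clinical algorithm
    used by warfarindosing.org: the program applies preprogrammed rules and
    never learns from patient outcomes.
    """
    dose = 5.0                      # nominal starting dose
    if age >= 65:
        dose -= 1.0                 # preprogrammed adjustment for age
    if weight_kg < 60:
        dose -= 0.5                 # preprogrammed adjustment for low weight
    if takes_amiodarone:
        dose *= 0.7                 # preprogrammed drug-interaction rule
    return round(dose, 1)

print(initial_warfarin_dose(age=70, weight_kg=55, takes_amiodarone=False))  # 3.5
```

However complex such a rule set becomes, its behavior is fully determined by the branches its authors wrote, which is exactly what separates it from the learning systems described next.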
In contrast, one of the most common subsets of AI is machine learning (ML). ML describes a program that “learns from experience and improves its performance as it learns.”1 With ML, the computer is initially provided with a training data set—data with known outcomes or labels. Because the initial data are input from known samples, this type of AI is known as supervised learning.8-10 As an example, we recently reported using ML to diagnose various types of cancer from pathology slides.11 In one experiment, we captured images of colon adenocarcinoma and normal colon (these 2 groups represent the training data set). Unlike traditional programming, we did not define characteristics that would differentiate colon cancer from normal; rather, the machine learned these characteristics independently by assessing the labeled images provided. A second data set (the validation data set) was used to evaluate the program and fine-tune the ML training model’s parameters. Finally, the program was presented with new images of cancer and normal cases for final assessment of accuracy (test data set). Our program learned to recognize differences from the images provided and was able to differentiate normal and cancer images with > 95% accuracy.
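The supervised workflow described above can be sketched in a few lines. The example below substitutes a trivial nearest-centroid classifier on two invented numeric features for the CNN and real slide images used in the study; only the training/validation/test structure is the point, and the data are synthetic.

```python
import random
random.seed(0)

# Synthetic stand-ins for labeled slide images: each "image" is reduced to
# two invented numeric features.
def make_cases(n, label, center):
    return [([random.gauss(center[0], 1.0), random.gauss(center[1], 1.0)], label)
            for _ in range(n)]

cases = make_cases(100, "cancer", (8.0, 8.0)) + make_cases(100, "normal", (2.0, 2.0))
random.shuffle(cases)
# The three data sets described in the text; the validation set would tune
# model settings (this toy model has none to tune).
train, validation, test = cases[:120], cases[120:160], cases[160:]

def fit(data):
    # "Training": learn one centroid (mean feature vector) per label.
    sums, counts = {}, {}
    for x, y in data:
        s = sums.setdefault(y, [0.0, 0.0])
        s[0] += x[0]; s[1] += x[1]
        counts[y] = counts.get(y, 0) + 1
    return {y: (s[0] / counts[y], s[1] / counts[y]) for y, s in sums.items()}

def predict(centroids, x):
    # Assign the label of the nearest learned centroid.
    return min(centroids, key=lambda y: (x[0] - centroids[y][0]) ** 2
                                        + (x[1] - centroids[y][1]) ** 2)

centroids = fit(train)  # learned from labeled examples, not preprogrammed rules
accuracy = sum(predict(centroids, x) == y for x, y in test) / len(test)
print(f"held-out test accuracy: {accuracy:.2f}")
```

As in the study, the differentiating characteristics are never written into the program; they are estimated from the labeled training set and then checked on data the model has not seen.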
Advances in computer processing have allowed for the development of artificial neural networks (ANNs). While there are several types of ANNs, the most common types used for image classification and segmentation are known as convolutional neural networks (CNNs).9,12-14 These programs are designed to work similarly to the human brain, specifically the visual cortex.15,16 As data are acquired, they are processed by various layers in the program. Much like neurons in the brain, one layer decides whether to advance information to the next.13,14 CNNs can be many layers deep, leading to the term deep learning: “computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction.”1,13,17
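A single convolutional layer can be shown in miniature: slide a small filter across a grid of pixel values, then pass the result through a nonlinearity that decides what advances to the next layer. The filter weights below are hand-picked for illustration; in a trained CNN they are learned from data, and many such layers are stacked.

```python
def conv2d(image, kernel):
    # Slide the kernel over every position of the image and take the
    # weighted sum of the pixels underneath (a "feature map").
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            row.append(sum(image[i + di][j + dj] * kernel[di][dj]
                           for di in range(kh) for dj in range(kw)))
        out.append(row)
    return out

def relu(feature_map):
    # Nonlinearity: only positive responses advance to the next layer.
    return [[max(0.0, v) for v in row] for row in feature_map]

# A hand-picked vertical-edge detector; a trained CNN would learn these weights.
edge_filter = [[-1, 0, 1],
               [-1, 0, 1],
               [-1, 0, 1]]

image = [[0, 0, 1, 1],   # a tiny "image" with a vertical edge in the middle
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]

feature_map = relu(conv2d(image, edge_filter))
print(feature_map)  # [[3, 3], [3, 3]]: the edge responds everywhere it appears
```

Each deeper layer repeats this pattern on the previous layer's feature maps, which is how the network builds the "multiple levels of abstraction" quoted above.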
ANNs can process ever-larger volumes of data, an advance that has led to the development of unstructured or unsupervised learning. With this type of learning, inputting the defined features (ie, predetermined answers) of the training data set described above is no longer required.1,8,10,14 The advantage of unsupervised learning is that the program can be presented raw data and extract meaningful interpretation without human input, often with less bias than may exist with supervised learning.1,18 If shown enough data, the program can extract relevant features to make conclusions independently, without predefined definitions, potentially uncovering markers not previously known. For example, several studies have used unsupervised learning to search patient data to assess readmission risks of patients with congestive heart failure.10,19,20 AI independently compiled features that were not previously defined and predicted patients at greater risk for readmission better than traditional methods did.
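A minimal sketch of unsupervised learning is k-means clustering, a classic method applied here to fabricated, unlabeled records (both features are invented): the program groups similar records without ever being given an outcome label.

```python
import random
random.seed(1)

# Unlabeled synthetic "patients": two invented numeric features per record,
# with no outcome labels attached.
patients = ([(random.gauss(2.0, 0.5), random.gauss(1.0, 0.5)) for _ in range(50)] +
            [(random.gauss(8.0, 0.5), random.gauss(6.0, 0.5)) for _ in range(50)])
random.shuffle(patients)

def kmeans(points, k, steps=10):
    centers = random.sample(points, k)  # start from k random records
    for _ in range(steps):
        groups = [[] for _ in range(k)]
        for p in points:  # assign each record to its nearest center
            i = min(range(k), key=lambda c: (p[0] - centers[c][0]) ** 2
                                            + (p[1] - centers[c][1]) ** 2)
            groups[i].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [(sum(p[0] for p in g) / len(g), sum(p[1] for p in g) / len(g))
                   if g else centers[i] for i, g in enumerate(groups)]
    return centers

centers = kmeans(patients, k=2)
print(sorted(centers))  # two cluster centers, discovered without any labels
```

The two groups emerge from the structure of the data alone; interpreting what a discovered group means (eg, a high-readmission-risk subpopulation) is then a separate, human step.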
A more detailed description of the various terminologies and techniques of AI is beyond the scope of this review.9,10,17,21 However, in this basic overview, we describe 4 general areas that AI impacts health care (Figure).
Health Care Applications
Image analysis has seen the most AI health care applications.8,15 AI has shown potential in interpreting many types of medical images, including pathology slides, radiographs of various types, retina and other eye scans, and photographs of skin lesions. Many studies have demonstrated that AI can interpret these images as accurately as or even better than experienced clinicians.9,13,22-29 Studies have suggested AI interpretation of radiographs may better distinguish pneumonia caused by COVID-19 from pneumonia of other causes, and AI interpretation of pathology slides may detect specific genetic mutations not previously identified without additional molecular tests.11,14,23,24,30-32
The second area in which AI can impact health care is improving workflow and efficiency. AI has improved surgery scheduling, saving significant revenue, and has decreased patient wait times for appointments.1 AI can screen and triage radiographs, allowing attention to be directed to critical patients. This use would be valuable in many busy clinical settings, such as during the recent COVID-19 pandemic.8,23 Similarly, AI can screen retina images to prioritize urgent conditions.25 AI has improved pathologists’ efficiency when used to detect breast metastases.33 Finally, AI may reduce medical errors, thereby ensuring patient safety.8,9,34
A third health care benefit of AI is in public health and epidemiology. AI can assist with clinical decision-making and diagnoses in low-income countries and areas with limited health care resources and personnel.25,29 AI can improve identification of infectious outbreaks, such as tuberculosis, malaria, dengue fever, and influenza.29,35-40 AI has been used to predict transmission patterns of the Zika virus and the current COVID-19 pandemic.41,42 Applications can stratify the risk of outbreaks based on multiple factors, including age, income, race, atypical geographic clusters, and seasonal factors like rainfall and temperature.35,36,38,43 AI has been used to assess morbidity and mortality, such as predicting disease severity with malaria and identifying treatment failures in tuberculosis.29
Finally, AI can dramatically impact health care due to processing large data sets or disconnected volumes of patient information—so-called big data.44-46 An example is the widespread use of electronic health records (EHRs) such as the Computerized Patient Record System used in Veteran Affairs medical centers (VAMCs). Much patient information exists as written text: HCP notes, laboratory and radiology reports, medication records, etc. Natural language processing (NLP) allows platforms to sort through extensive volumes of data on complex patients at rates much faster than human capability, which has great potential to assist with diagnosis and treatment decisions.9
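A toy flavor of the text processing that NLP pipelines automate is sketched below: scanning free-text notes for terms of interest. The notes are fabricated, and production clinical NLP goes far beyond this simple matching, handling negation (as in the third note), abbreviations, misspellings, and context.

```python
import re
from collections import Counter

# Fabricated free-text clinic notes.
notes = [
    "Pt with chest pain, r/o myocardial infarction. Started aspirin.",
    "Follow-up: chest pain resolved. Continue aspirin and metoprolol.",
    "No chest pain today. Metoprolol dose unchanged.",
]

def term_counts(notes, terms):
    # Count literal (case-insensitive) mentions of each term across all notes.
    counts = Counter()
    for note in notes:
        text = note.lower()
        for term in terms:
            counts[term] += len(re.findall(re.escape(term), text))
    return counts

counts = term_counts(notes, ["chest pain", "aspirin", "metoprolol"])
print(counts)
```

Note that naive matching counts "No chest pain" as a mention, which is precisely the kind of error that negation handling in real clinical NLP systems exists to avoid.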
Medical literature is being produced at rates that exceed our ability to digest. More than 200,000 cancer-related articles were published in 2019 alone.14 NLP capabilities of AI have the potential to rapidly sort through this extensive medical literature and relate specific verbiage in patient records guiding therapy.46 IBM Watson, a supercomputer based on ML and NLP, demonstrates this concept with many potential applications, only some of which relate to health care.1,9 Watson has an oncology component to assimilate multiple aspects of patient care, including clinical notes, pathology results, radiograph findings, staging, and a tumor’s genetic profile. It coordinates these inputs from the EHR and mines medical literature and research databases to recommend treatment options.1,46 AI can assess and compile far greater patient data and therapeutic options than would be feasible by individual clinicians, thus providing customized patient care.47 Watson has partnered with numerous medical centers, including MD Anderson Cancer Center and Memorial Sloan Kettering Cancer Center, with variable success.44,47-49 While the full potential of Watson appears not yet realized, these AI-driven approaches will likely play an important role in leveraging the hidden value in the expanding volume of health care information.
Medical Specialty Applications
Radiology
Currently > 70% of FDA-approved AI medical devices are in the field of radiology.2 Most radiology departments have used AI-friendly digital imaging for years, such as the picture archiving and communication systems used by numerous health care systems, including VAMCs.2,15 Gray-scale images common in radiology lend themselves to standardization, although AI is not limited to black-and-white image interpretation.15
An abundance of literature describes plain radiograph interpretation using AI. One FDA-approved platform improved X-ray diagnosis of wrist fractures when used by emergency medicine clinicians.2,50 AI has been applied to chest X-ray (CXR) interpretation of many conditions, including pneumonia, tuberculosis, malignant lung lesions, and COVID-19.23,25,28,44,51-53 For example, Nam and colleagues suggested AI is better at diagnosing malignant pulmonary nodules from CXRs than are trained radiologists.28
In addition to plain radiographs, AI has been applied to many other imaging technologies, including ultrasounds, positron emission tomography, mammograms, computed tomography (CT), and magnetic resonance imaging (MRI).15,26,44,48,54-56 A large study demonstrated that ML platforms significantly reduced the time to diagnose intracranial hemorrhages on CT and identified subtle hemorrhages missed by radiologists.55 Other studies have claimed that AI programs may be better than radiologists in detecting cancer in screening mammograms, and 3 FDA-approved devices focus on mammogram interpretation.2,15,54,57 There is also great interest in MRI applications to detect and predict prognosis for breast cancer based on imaging findings.21,56
Aside from providing accurate diagnoses, other studies focus on AI radiograph interpretation to assist with patient screening, triage, improving time to final diagnosis, providing a rapid “second opinion,” and even monitoring disease progression and offering insights into prognosis.8,21,23,52,55,56,58 These features help in busy urban centers but may play an even greater role in areas with limited access to health care or trained specialists such as radiologists.52
Cardiology
Cardiology has the second highest number of FDA-approved AI applications.2 Many cardiology AI platforms involve image analysis, as described in several recent reviews.45,59,60 AI has been applied to echocardiography to measure ejection fractions, detect valvular disease, and assess heart failure from hypertrophic and restrictive cardiomyopathy and amyloidosis.45,48,59 Applications for cardiac CT scans and CT angiography have successfully quantified both calcified and noncalcified coronary artery plaques, assessed vessel lumens and myocardial perfusion, and performed coronary artery calcium scoring.45,59,60 Likewise, AI applications for cardiac MRI have been used to quantitate ejection fraction, large vessel flow, and cardiac scar burden.45,59
For years ECG devices have provided interpretation with limited accuracy using preprogrammed parameters.48 However, the application of AI allows ECG interpretation on par with trained cardiologists. Numerous such AI applications exist, and 2 FDA-approved devices perform ECG interpretation.2,61-64 One of these devices incorporates an AI-powered stethoscope to detect atrial fibrillation and heart murmurs.65
Pathology
The advancement of whole slide imaging, wherein entire slides can be scanned and digitized at high speed and resolution, creates great potential for AI applications in pathology.12,24,32,33,66 A landmark study demonstrating the potential of AI for assessing whole slide imaging examined sentinel lymph node metastases in patients with breast cancer.22 Multiple algorithms in the study demonstrated that AI was equivalent to or better than pathologists in detecting metastases, especially when the pathologists were time-constrained, consistent with a normal working environment. Significantly, the most accurate and efficient diagnoses were achieved when the pathologist and AI interpretations were used together.22,33
AI has shown promise in diagnosing many other entities, including cancers of the prostate (including Gleason scoring), lung, colon, breast, and skin.11,12,24,27,32,67 In addition, AI has shown great potential in scoring biomarkers important for prognosis and treatment, such as immunohistochemistry (IHC) labeling of Ki-67 and PD-L1.32 Pathologists can have difficulty classifying certain tumors or determining the site of origin for metastases, often having to rely on IHC with limited success. The unique features of image analysis with AI have the potential to assist in classifying difficult tumors and identifying sites of origin for metastatic disease based on morphology alone.11
Oncology depends heavily on molecular pathology testing to dictate treatment options and determine prognosis. Preliminary studies suggest that AI interpretation alone has the potential to delineate whether certain molecular mutations are present in tumors from various sites.11,14,24,32 One study combined histology and genomic results for AI interpretation that improved prognostic predictions.68 In addition, AI analysis may have potential in predicting tumor recurrence or prognosis based on cellular features, as demonstrated for lung cancer and melanoma.67,69,70
Ophthalmology
AI applications for ophthalmology have focused on diabetic retinopathy, age-related macular degeneration, glaucoma, retinopathy of prematurity, age-related and congenital cataracts, and retinal vein occlusion.71-73 Diabetic retinopathy is a leading cause of blindness and has been studied by numerous platforms with good success, most having used color fundus photography.71,72 One study showed AI could diagnose diabetic retinopathy and diabetic macular edema with specificities similar to those of ophthalmologists.74 In 2018, the FDA approved the AI platform IDx-DR. This diagnostic system classifies retinal images and recommends referral for patients determined to have “more than mild diabetic retinopathy” and reexamination within a year for other patients.8,75 Significantly, the platform recommendations do not require confirmation by a clinician.8
AI has been applied to other modalities in ophthalmology such as optical coherence tomography (OCT) to diagnose retinal disease and to predict appropriate management of congenital cataracts.25,73,76 For example, an AI application using OCT has been demonstrated to match or exceed the accuracy of retinal experts in diagnosing and triaging patients with a variety of retinal pathologies, including patients needing urgent referrals.77
Dermatology
Multiple studies demonstrate that AI performs at least as well as experienced dermatologists in differentiating selected skin lesions.78-81 For example, Esteva and colleagues demonstrated AI could differentiate keratinocyte carcinomas from benign seborrheic keratoses and malignant melanomas from benign nevi with accuracy equal to 21 board-certified dermatologists.78
AI is applicable to various imaging procedures common to dermatology, such as dermoscopy, very high-frequency ultrasound, and reflectance confocal microscopy.82 Several studies have demonstrated that AI interpretation compared favorably to dermatologists evaluating dermoscopy to assess melanocytic lesions.78-81,83
A limitation of these studies is that they differentiate among only a few diagnoses.82 Furthermore, dermatologists have sensory input such as touch and visual examination under various conditions, something AI has yet to replicate.15,34,84 Also, most AI devices use little or no clinical information.81 Dermatologists can recognize rarer conditions for which AI models may have had limited or no training.34 Nevertheless, a recent study assessed AI for the diagnosis of 134 separate skin disorders with promising results, including diagnoses with accuracy comparable to that of dermatologists and accurate treatment strategies.84 As Topol points out, most skin lesions are diagnosed in the primary care setting, where AI can have a greater impact when used in conjunction with the clinical impression, especially where specialists are in limited supply.48,78
Finally, dermatology lends itself to using portable or smartphone applications (apps) wherein the user can photograph a lesion for analysis by AI algorithms to assess the need for further evaluation or make treatment recommendations.34,84,85 Although results from currently available apps are not encouraging, they may play a greater role as the technology advances.34,85
Oncology
Applications of AI in oncology include predicting prognosis for patients with cancer based on histologic and/or genetic information.14,68,86 Programs can predict the risk of complications before and recurrence risks after surgery for malignancies.44,87-89 AI can also assist in treatment planning and predict treatment failure with radiation therapy.90,91
AI has great potential in processing the large volumes of patient data in cancer genomics. Next-generation sequencing has allowed for the identification of millions of DNA sequences in a single tumor to detect genetic anomalies.92 Thousands of mutations can be found in individual tumor samples, and processing this information and determining its significance can be beyond human capability.14 We know little about the effects of various mutation combinations, and most tumors have a heterogeneous molecular profile among different cell populations.14,93 The presence or absence of various mutations can have diagnostic, prognostic, and therapeutic implications.93 AI has great potential to sort through these complex data and identify actionable findings.
More than 200,000 cancer-related articles were published in 2019, and publications in the field of cancer genomics are increasing exponentially.14,92,93 Patel and colleagues assessed the utility of IBM Watson for Genomics against results from a molecular tumor board.93 Watson for Genomics identified potentially significant mutations not identified by the tumor board in 32% of patients. Most mutations were related to new clinical trials not yet added to the tumor board watch list, demonstrating the role AI will have in processing the large volume of genetic data required to deliver personalized medicine moving forward.
Gastroenterology
AI has shown promise in predicting risk or outcomes based on clinical parameters in various common gastroenterology problems, including gastric reflux, acute pancreatitis, gastrointestinal bleeding, celiac disease, and inflammatory bowel disease.94,95 AI endoscopic analysis has demonstrated potential in assessing Barrett’s esophagus, gastric Helicobacter pylori infections, gastric atrophy, and gastric intestinal metaplasia.95 Applications have been used to assess esophageal, gastric, and colonic malignancies, including depth of invasion based on endoscopic images.95 Finally, studies have evaluated AI to assess small colon polyps during colonoscopy, including differentiating benign and premalignant polyps with success comparable to gastroenterologists.94,95 AI has been shown to increase the speed and accuracy of gastroenterologists in detecting small polyps during colonoscopy.48 In a prospective randomized study, colonoscopies performed using an AI device identified significantly more small adenomatous polyps than colonoscopies without AI.96
Neurology
It has been suggested that AI technologies are well suited for application in neurology due to the subtle presentation of many neurologic diseases.16 Viz LVO, the first AI platform to receive CMS approval for reimbursement in the diagnosis of strokes, analyzes CTs to detect early ischemic strokes and alerts the medical team, thus shortening time to treatment.3,97 Many other AI platforms that use CT and MRI for the early detection of strokes, as well as for treatment and prognosis, are in use or in development.9,97
AI technologies have been applied to neurodegenerative diseases, such as Alzheimer and Parkinson diseases.16,98 For example, several studies have evaluated patient movements in Parkinson disease both for early diagnosis and to assess response to treatment.98 These evaluations included assessments with external cameras as well as wearable devices and smartphone apps.
AI has also been applied to seizure disorders, attempting to determine seizure type, localize the area of seizure onset, and address the challenges of identifying seizures in neonates.99,100 Other potential applications range from early detection and prognosis predictions for cases of multiple sclerosis to restoring movement in paralysis from a variety of conditions such as spinal cord injury.9,101,102
Mental Health
Due to the interactive nature of mental health care, the field has been slower to develop AI applications.18 With heavy reliance on textual information (eg, clinic notes, mood rating scales, and documentation of conversations), successful AI applications in this field will likely rely heavily on NLP.18 However, studies investigating the application of AI to mental health have also incorporated data such as brain imaging, smartphone monitoring, and social media platforms, such as Facebook and Twitter.18,103,104
The risk of suicide is higher in veteran patients, and ML algorithms have had limited success in predicting suicide risk in both veteran and nonveteran populations.104-106 While early models have low positive predictive values and low sensitivities, they still promise to be a useful tool in conjunction with traditional risk assessments.106 Kessler and colleagues suggest that combining multiple ML algorithms, rather than relying on a single one, might lead to greater success.105,106
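The low positive predictive values reported for these models are driven in large part by the rarity of the outcome, not only by model quality; Bayes' rule makes this concrete. A brief sketch with illustrative numbers (not taken from the cited studies):

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value via Bayes' rule:
    P(outcome | positive prediction) = TP / (TP + FP)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# The same classifier, applied to a rare vs a common outcome
# (sensitivity 0.80, specificity 0.90 are invented for illustration):
print(round(ppv(0.80, 0.90, 0.01), 3))   # 0.075 when prevalence is 1%
print(round(ppv(0.80, 0.90, 0.20), 3))   # 0.667 when prevalence is 20%
```

This is why even reasonably accurate models flag many false positives for a rare outcome such as suicide, and why they are best used alongside, not instead of, traditional risk assessments.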
AI may assist in diagnosing other mental health disorders, including major depressive disorder, attention deficit hyperactivity disorder (ADHD), schizophrenia, posttraumatic stress disorder, and Alzheimer disease.103,104,107 These investigations are in the early stages with limited clinical applicability. However, 2 AI applications awaiting FDA approval relate to ADHD and opioid use.2 Furthermore, potential exists for AI to not only assist with prevention and diagnosis of ADHD, but also to identify optimal treatment options.2,103
General and Personalized Medicine
Additional AI applications include diagnosing patients with suspected sepsis, measuring liver iron concentrations, predicting hospital mortality at the time of admission, and more.2,108,109 AI can guide end-of-life decisions such as resuscitation status or whether to initiate mechanical ventilation.48
AI-driven smartphone apps can be beneficial to both patients and clinicians. Examples include predicting nonadherence to anticoagulation therapy, monitoring heart rhythms for atrial fibrillation or signs of hyperkalemia in patients with renal failure, and improving outcomes for patients with diabetes mellitus by decreasing glycemic variability and reducing hypoglycemia.8,48,110,111 The potential for AI applications in health care and personalized medicine is almost limitless.
Discussion
With ever-increasing expectations for all health care sectors to deliver timely, fiscally responsible, high-quality health care, AI has the potential for numerous impacts. AI can improve diagnostic accuracy, limit errors, and improve patient safety, for example by assisting with prescription delivery.8,9,34 It can screen and triage patients, alerting clinicians to those needing more urgent evaluation.8,23,77,97 AI also may increase a clinician’s efficiency and speed to render a diagnosis.12,13,55,97 AI can provide a rapid second opinion, an ability especially beneficial in underserved areas with shortages of specialists.23,25,26,29,34 Similarly, AI may decrease the inter- and intraobserver variability common in many medical specialties.12,27,45 AI applications can also monitor disease progression, identify patients at greatest risk, and provide prognostic information.21,23,56,58 Finally, as described with applications using IBM Watson, AI can allow for an integrated approach to health care that is currently lacking.
We have described many reports suggesting AI can render diagnoses as well as or better than experienced clinicians, and speculation exists that AI will replace many roles currently performed by health care practitioners.9,26 However, most studies demonstrate that AI’s diagnostic benefits are best realized when used to supplement a clinician’s impression.8,22,30,33,52,54,56,69,84 AI is not likely to replace humans in health care in the foreseeable future. The technology can be likened to the impact on neurology of CT scans developed in the 1970s. Prior to such detailed imaging, neurologists spent extensive time performing detailed physical examinations to render diagnoses and locate lesions before surgery. There was mistrust of this new technology and concern that CT scans would eliminate the need for neurologists.112 On the contrary, neurology is alive and well, frequently augmented by the technologies once speculated to replace it.
Commercial AI health care platforms represented a $2 billion industry in 2018 and are growing rapidly each year.13,32 Many AI products are offered ready for implementation for various tasks, including diagnostics, patient management, and improved efficiency. Others will likely be provided as templates suitable for modification to meet the specific needs of the facility, practice, or specialty for its patient population.
AI Risks and Limitations
AI has several risks and limitations. Although there is progress in explainable AI, at times we still struggle to understand how the output of machine learning algorithms was created.44,48 The many layers associated with deep learning self-determine the criteria used to reach their conclusions, and these criteria can continually evolve. The parameters of deep learning are not preprogrammed, and there are too many individual data points to be extrapolated or deconvoluted for evaluation at our current level of knowledge.26,51 This apparent lack of constraints causes concern for patient safety and suggests that greater validation and continued scrutiny of validity are required.8,48 Efforts are underway to create explainable AI programs to make their processes more transparent, but such clarification is presently limited.14,26,48,77
Another challenge of AI is determining the amount of training data required to function optimally. Also, if the output describes multiple variables or diagnoses, is each equally valid?113 Furthermore, many AI applications look for a specific process, such as cancer diagnoses on CXRs. However, how coexisting conditions seen on CXRs, such as cardiomegaly, emphysema, or pneumonia, will affect the diagnosis needs to be considered.51,52 Zech and colleagues provide the example that diagnoses of pneumothorax are frequently rendered on CXRs with chest tubes in place.51 They suggest that CNNs may develop a bias toward diagnosing pneumothorax when chest tubes are present. Many current studies approach an issue in isolation, a situation not realistic in real-world clinical practice.26
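The chest tube example can be made concrete with a toy, fully invented dataset: a rule that keys on the confounder looks highly accurate on biased data, yet detects nothing in the very cases where detection matters.

```python
# Toy, fully invented dataset of (chest_tube_visible, has_pneumothorax) pairs.
# In the biased training data, most pneumothorax films also show a chest tube.
biased_set = [(True, True)] * 90 + [(False, True)] * 10 + [(False, False)] * 100

def tube_rule(chest_tube_visible):
    """A spurious 'classifier': predict pneumothorax iff a tube is visible."""
    return chest_tube_visible

def accuracy(dataset):
    return sum(tube_rule(tube) == pneumo for tube, pneumo in dataset) / len(dataset)

def sensitivity(dataset):
    positives = [tube for tube, pneumo in dataset if pneumo]
    return sum(tube_rule(tube) for tube in positives) / len(positives)

print(accuracy(biased_set))   # 0.95: the shortcut looks excellent on biased data

# On untreated cases (no tube placed yet), the shortcut detects nothing.
untreated = [(False, True)] * 10 + [(False, False)] * 100
print(sensitivity(untreated)) # 0.0: every pneumothorax is missed
```

A network need not literally apply this rule, but to the extent it learns the chest tube as a feature of pneumothorax, its validation metrics overstate its clinical usefulness.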
Most studies on AI have been retrospective, and frequently data used to train the program are preselected.13,26 The data are typically validated on available databases rather than actual patients in the clinical setting, limiting confidence in the validity of the AI output when applied to real-world situations. Currently, fewer than 12 prospective trials have been published comparing AI with traditional clinical care.13,114 Randomized prospective clinical trials are even fewer, with none currently reported from the United States.13,114 The results from several studies have been shown to diminish when repeated prospectively.114
The FDA has created a new category known as Software as a Medical Device and has a Digital Health Innovation Action Plan to regulate AI platforms. Still, the process of AI regulation is of necessity different from traditional approval processes and is continually evolving.8 The FDA approval process cannot account for the fact that the program’s parameters may continually evolve or adapt.2
Guidelines for investigating and reporting AI research with its unique attributes are being developed. Examples include the TRIPOD-ML statement and others.49,115 In September 2020, 2 publications addressed the paucity of gold-standard randomized clinical trials in clinical AI applications.116,117 The SPIRIT-AI statement expands on the original SPIRIT statement published in 2013 to guide minimal reporting standards for AI clinical trial protocols to promote transparency of design and methodology.116 Similarly, the CONSORT-AI extension, stemming from the original CONSORT statement in 1996, aims to ensure quality reporting of randomized controlled trials in AI.117
Another risk with AI is that while an individual physician making a mistake may adversely affect 1 patient, a single mistake in an AI algorithm could potentially affect thousands of patients.48 Also, AI programs developed for the patient population at one facility may not translate to another. Referred to as overfitting, this phenomenon relates to selection bias in training data sets.15,34,49,51,52 Studies have shown that programs that underrepresent certain group characteristics such as age, sex, or race may be less effective when applied to a population in which these characteristics have differing representations.8,48,49 This problem of underrepresentation has been demonstrated in programs interpreting pathology slides, radiographs, and skin lesions.15,32,51
Admittedly, most of these challenges are not specific to AI and existed in health care previously. Physicians make mistakes, treatments are sometimes used without adequate prospective studies, and medications are given without understanding their mechanism of action, much like AI-facilitated processes reach a conclusion that cannot be fully explained.48
Conclusions
The view that AI will dramatically impact health care in the coming years will likely prove true. However, much work is needed, especially given the paucity of the prospective clinical trials that have historically been required in medical research. Any concern that AI will replace HCPs seems unwarranted. Early studies suggest that even AI programs that appear to exceed human interpretation perform best when working in cooperation with, and under the oversight of, clinicians. AI’s greatest potential appears to be its ability to augment care from health professionals, improving efficiency and accuracy; this should be anticipated with enthusiasm as the field moves forward at an exponential rate.
Acknowledgments
The authors thank Makenna G. Thomas for proofreading and review of the manuscript. This material is the result of work supported with resources and the use of facilities at the James A. Haley Veterans’ Hospital. This research has been approved by the James A. Haley Veterans’ Hospital Office of Communications and Media.
References
1. Bini SA. Artificial intelligence, machine learning, deep learning, and cognitive computing: what do these terms mean and how will they impact health care? J Arthroplasty. 2018;33(8):2358-2361. doi:10.1016/j.arth.2018.02.067
2. Benjamens S, Dhunnoo P, Meskó B. The state of artificial intelligence-based FDA-approved medical devices and algorithms: an online database. NPJ Digit Med. 2020;3:118. doi:10.1038/s41746-020-00324-0
3. Viz. AI powered synchronized stroke care. Accessed September 15, 2021. https://www.viz.ai/ischemic-stroke
4. Buchanan M. The law of accelerating returns. Nat Phys. 2008;4(7):507. doi:10.1038/nphys1010
5. IBM Watson Health computes a pair of new solutions to improve healthcare data and security. Published September 10, 2015. Accessed October 21, 2020. https://www.techrepublic.com/article/ibm-watson-health-computes-a-pair-of-new-solutions-to-improve-healthcare-data-and-security
6. Borkowski AA, Kardani A, Mastorides SM, Thomas LB. Warfarin pharmacogenomics: recommendations with available patented clinical technologies. Recent Pat Biotechnol. 2014;8(2):110-115. doi:10.2174/1872208309666140904112003
7. Washington University in St. Louis. Warfarin dosing. Accessed September 15, 2021. http://www.warfarindosing.org/Source/Home.aspx
8. He J, Baxter SL, Xu J, Xu J, Zhou X, Zhang K. The practical implementation of artificial intelligence technologies in medicine. Nat Med. 2019;25(1):30-36. doi:10.1038/s41591-018-0307-0
9. Jiang F, Jiang Y, Zhi H, et al. Artificial intelligence in healthcare: past, present and future. Stroke Vasc Neurol. 2017;2(4):230-243. Published 2017 Jun 21. doi:10.1136/svn-2017-000101
10. Johnson KW, Torres Soto J, Glicksberg BS, et al. Artificial intelligence in cardiology. J Am Coll Cardiol. 2018;71(23):2668-2679. doi:10.1016/j.jacc.2018.03.521
11. Borkowski AA, Wilson CP, Borkowski SA, et al. Comparing artificial intelligence platforms for histopathologic cancer diagnosis. Fed Pract. 2019;36(10):456-463.
12. Cruz-Roa A, Gilmore H, Basavanhally A, et al. High-throughput adaptive sampling for whole-slide histopathology image analysis (HASHI) via convolutional neural networks: application to invasive breast cancer detection. PLoS One. 2018;13(5):e0196828. Published 2018 May 24. doi:10.1371/journal.pone.0196828
13. Nagendran M, Chen Y, Lovejoy CA, et al. Artificial intelligence versus clinicians: systematic review of design, reporting standards, and claims of deep learning studies. BMJ. 2020;368:m689. Published 2020 Mar 25. doi:10.1136/bmj.m689
14. Shimizu H, Nakayama KI. Artificial intelligence in oncology. Cancer Sci. 2020;111(5):1452-1460. doi:10.1111/cas.14377
15. Talebi-Liasi F, Markowitz O. Is artificial intelligence going to replace dermatologists? Cutis. 2020;105(1):28-31.
16. Valliani AA, Ranti D, Oermann EK. Deep learning and neurology: a systematic review. Neurol Ther. 2019;8(2):351-365. doi:10.1007/s40120-019-00153-8
17. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436-444. doi:10.1038/nature14539
18. Graham S, Depp C, Lee EE, et al. Artificial intelligence for mental health and mental illnesses: an overview. Curr Psychiatry Rep. 2019;21(11):116. Published 2019 Nov 7. doi:10.1007/s11920-019-1094-0
19. Golas SB, Shibahara T, Agboola S, et al. A machine learning model to predict the risk of 30-day readmissions in patients with heart failure: a retrospective analysis of electronic medical records data. BMC Med Inform Decis Mak. 2018;18(1):44. Published 2018 Jun 22. doi:10.1186/s12911-018-0620-z
20. Mortazavi BJ, Downing NS, Bucholz EM, et al. Analysis of machine learning techniques for heart failure readmissions. Circ Cardiovasc Qual Outcomes. 2016;9(6):629-640. doi:10.1161/CIRCOUTCOMES.116.003039
21. Meyer-Bäse A, Morra L, Meyer-Bäse U, Pinker K. Current status and future perspectives of artificial intelligence in magnetic resonance breast imaging. Contrast Media Mol Imaging. 2020;2020:6805710. Published 2020 Aug 28. doi:10.1155/2020/6805710
22. Ehteshami Bejnordi B, Veta M, Johannes van Diest P, et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA. 2017;318(22):2199-2210. doi:10.1001/jama.2017.14585
23. Borkowski AA, Viswanadhan NA, Thomas LB, Guzman RD, Deland LA, Mastorides SM. Using artificial intelligence for COVID-19 chest X-ray diagnosis. Fed Pract. 2020;37(9):398-404. doi:10.12788/fp.0045
24. Coudray N, Ocampo PS, Sakellaropoulos T, et al. Classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning. Nat Med. 2018;24(10):1559-1567. doi:10.1038/s41591-018-0177-5
25. Kermany DS, Goldbaum M, Cai W, et al. Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell. 2018;172(5):1122-1131.e9. doi:10.1016/j.cell.2018.02.010
26. Liu X, Faes L, Kale AU, et al. A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis. Lancet Digit Health. 2019;1(6):e271-e297. doi:10.1016/S2589-7500(19)30123-2
27. Nagpal K, Foote D, Liu Y, et al. Development and validation of a deep learning algorithm for improving Gleason scoring of prostate cancer [published correction appears in NPJ Digit Med. 2019 Nov 19;2:113]. NPJ Digit Med. 2019;2:48. Published 2019 Jun 7. doi:10.1038/s41746-019-0112-2
28. Nam JG, Park S, Hwang EJ, et al. Development and validation of deep learning-based automatic detection algorithm for malignant pulmonary nodules on chest radiographs. Radiology. 2019;290(1):218-228. doi:10.1148/radiol.2018180237
29. Schwalbe N, Wahl B. Artificial intelligence and the future of global health. Lancet. 2020;395(10236):1579-1586. doi:10.1016/S0140-6736(20)30226-9
30. Bai HX, Wang R, Xiong Z, et al. Artificial intelligence augmentation of radiologist performance in distinguishing COVID-19 from pneumonia of other origin at chest CT [published correction appears in Radiology. 2021 Apr;299(1):E225]. Radiology. 2020;296(3):E156-E165. doi:10.1148/radiol.2020201491
31. Li L, Qin L, Xu Z, et al. Using artificial intelligence to detect COVID-19 and community-acquired pneumonia based on pulmonary CT: evaluation of the diagnostic accuracy. Radiology. 2020;296(2):E65-E71. doi:10.1148/radiol.2020200905
32. Serag A, Ion-Margineanu A, Qureshi H, et al. Translational AI and deep learning in diagnostic pathology. Front Med (Lausanne). 2019;6:185. Published 2019 Oct 1. doi:10.3389/fmed.2019.00185
33. Wang D, Khosla A, Gargeya R, Irshad H, Beck AH. Deep learning for identifying metastatic breast cancer. ArXiv. 2016 June 18:arXiv:1606.05718v1. Published online June 18, 2016. Accessed September 15, 2021. http://arxiv.org/abs/1606.05718
34. Alabdulkareem A. Artificial intelligence and dermatologists: friends or foes? J Dermatology Dermatol Surg. 2019;23(2):57-60. doi:10.4103/jdds.jdds_19_19
35. Mollalo A, Mao L, Rashidi P, Glass GE. A GIS-based artificial neural network model for spatial distribution of tuberculosis across the continental United States. Int J Environ Res Public Health. 2019;16(1):157. Published 2019 Jan 8. doi:10.3390/ijerph16010157
36. Haddawy P, Hasan AHMI, Kasantikul R, et al. Spatiotemporal Bayesian networks for malaria prediction. Artif Intell Med. 2018;84:127-138. doi:10.1016/j.artmed.2017.12.002
37. Laureano-Rosario AE, Duncan AP, Mendez-Lazaro PA, et al. Application of artificial neural networks for dengue fever outbreak predictions in the northwest coast of Yucatan, Mexico and San Juan, Puerto Rico. Trop Med Infect Dis. 2018;3(1):5. Published 2018 Jan 5. doi:10.3390/tropicalmed3010005
38. Buczak AL, Koshute PT, Babin SM, Feighner BH, Lewis SH. A data-driven epidemiological prediction method for dengue outbreaks using local and remote sensing data. BMC Med Inform Decis Mak. 2012;12:124. Published 2012 Nov 5. doi:10.1186/1472-6947-12-124
39. Scavuzzo JM, Trucco F, Espinosa M, et al. Modeling dengue vector population using remotely sensed data and machine learning. Acta Trop. 2018;185:167-175. doi:10.1016/j.actatropica.2018.05.003
40. Xue H, Bai Y, Hu H, Liang H. Influenza activity surveillance based on multiple regression model and artificial neural network. IEEE Access. 2018;6:563-575. doi:10.1109/ACCESS.2017.2771798
41. Jiang D, Hao M, Ding F, Fu J, Li M. Mapping the transmission risk of Zika virus using machine learning models. Acta Trop. 2018;185:391-399. doi:10.1016/j.actatropica.2018.06.021
42. Bragazzi NL, Dai H, Damiani G, Behzadifar M, Martini M, Wu J. How big data and artificial intelligence can help better manage the COVID-19 pandemic. Int J Environ Res Public Health. 2020;17(9):3176. Published 2020 May 2. doi:10.3390/ijerph17093176
43. Lake IR, Colón-González FJ, Barker GC, Morbey RA, Smith GE, Elliot AJ. Machine learning to refine decision making within a syndromic surveillance service. BMC Public Health. 2019;19(1):559. Published 2019 May 14. doi:10.1186/s12889-019-6916-9
44. Khan OF, Bebb G, Alimohamed NA. Artificial intelligence in medicine: what oncologists need to know about its potential-and its limitations. Oncol Exch. 2017;16(4):8-13. Accessed September 1, 2021. http://www.oncologyex.com/pdf/vol16_no4/feature_khan-ai.pdf
45. Badano LP, Keller DM, Muraru D, Torlasco C, Parati G. Artificial intelligence and cardiovascular imaging: A win-win combination. Anatol J Cardiol. 2020;24(4):214-223. doi:10.14744/AnatolJCardiol.2020.94491
46. Murdoch TB, Detsky AS. The inevitable application of big data to health care. JAMA. 2013;309(13):1351-1352. doi:10.1001/jama.2013.393
47. Greatbatch O, Garrett A, Snape K. The impact of artificial intelligence on the current and future practice of clinical cancer genomics. Genet Res (Camb). 2019;101:e9. Published 2019 Oct 31. doi:10.1017/S0016672319000089
48. Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med. 2019;25(1):44-56. doi:10.1038/s41591-018-0300-7
49. Vollmer S, Mateen BA, Bohner G, et al. Machine learning and artificial intelligence research for patient benefit: 20 critical questions on transparency, replicability, ethics, and effectiveness [published correction appears in BMJ. 2020 Apr 1;369:m1312]. BMJ. 2020;368:l6927. Published 2020 Mar 20. doi:10.1136/bmj.l6927
50. Lindsey R, Daluiski A, Chopra S, et al. Deep neural network improves fracture detection by clinicians. Proc Natl Acad Sci U S A. 2018;115(45):11591-11596. doi:10.1073/pnas.1806905115
51. Zech JR, Badgeley MA, Liu M, Costa AB, Titano JJ, Oermann EK. Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: a cross-sectional study. PLoS Med. 2018;15(11):e1002683. doi:10.1371/journal.pmed.1002683
52. Lakhani P, Sundaram B. Deep learning at chest radiography: automated classification of pulmonary tuberculosis by using convolutional neural networks. Radiology. 2017;284(2):574-582. doi:10.1148/radiol.2017162326
53. Rajpurkar P, Joshi A, Pareek A, et al. CheXpedition: investigating generalization challenges for translation of chest x-ray algorithms to the clinical setting. ArXiv. 2020 Feb 26:arXiv:2002.11379v2. Revised March 11, 2020. Accessed September 15, 2021. http://arxiv.org/abs/2002.11379
54. Salim M, Wåhlin E, Dembrower K, et al. External evaluation of 3 commercial artificial intelligence algorithms for independent assessment of screening mammograms. JAMA Oncol. 2020;6(10):1581-1588. doi:10.1001/jamaoncol.2020.3321
55. Arbabshirani MR, Fornwalt BK, Mongelluzzo GJ, et al. Advanced machine learning in action: identification of intracranial hemorrhage on computed tomography scans of the head with clinical workflow integration. NPJ Digit Med. 2018;1:9. doi:10.1038/s41746-017-0015-z
56. Sheth D, Giger ML. Artificial intelligence in the interpretation of breast cancer on MRI. J Magn Reson Imaging. 2020;51(5):1310-1324. doi:10.1002/jmri.26878
57. McKinney SM, Sieniek M, Godbole V, et al. International evaluation of an AI system for breast cancer screening. Nature. 2020;577(7788):89-94. doi:10.1038/s41586-019-1799-6
58. Booth AL, Abels E, McCaffrey P. Development of a prognostic model for mortality in COVID-19 infection using machine learning. Mod Pathol. 2021;34(3):522-531. doi:10.1038/s41379-020-00700-x
59. Xu B, Kocyigit D, Grimm R, Griffin BP, Cheng F. Applications of artificial intelligence in multimodality cardiovascular imaging: a state-of-the-art review. Prog Cardiovasc Dis. 2020;63(3):367-376. doi:10.1016/j.pcad.2020.03.003
60. Dey D, Slomka PJ, Leeson P, et al. Artificial intelligence in cardiovascular imaging: JACC state-of-the-art review. J Am Coll Cardiol. 2019;73(11):1317-1335. doi:10.1016/j.jacc.2018.12.054
61. Carewell Health. AI powered ECG diagnosis solutions. Accessed November 2, 2020. https://www.carewellhealth.com/products_aiecg.html
62. Strodthoff N, Strodthoff C. Detecting and interpreting myocardial infarction using fully convolutional neural networks. Physiol Meas. 2019;40(1):015001. doi:10.1088/1361-6579/aaf34d
63. Hannun AY, Rajpurkar P, Haghpanahi M, et al. Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network. Nat Med. 2019;25(1):65-69. doi:10.1038/s41591-018-0268-3
64. Kwon JM, Jeon KH, Kim HM, et al. Comparing the performance of artificial intelligence and conventional diagnosis criteria for detecting left ventricular hypertrophy using electrocardiography. Europace. 2020;22(3):412-419. doi:10.1093/europace/euz324
65. Eko. FDA clears Eko’s AFib and heart murmur detection algorithms, making it the first AI-powered stethoscope to screen for serious heart conditions [press release]. Published January 28, 2020. Accessed September 15, 2021. https://www.businesswire.com/news/home/20200128005232/en/FDA-Clears-Eko’s-AFib-and-Heart-Murmur-Detection-Algorithms-Making-It-the-First-AI-Powered-Stethoscope-to-Screen-for-Serious-Heart-Conditions
66. Cruz-Roa A, Gilmore H, Basavanhally A, et al. Accurate and reproducible invasive breast cancer detection in whole-slide images: a deep learning approach for quantifying tumor extent. Sci Rep. 2017;7:46450. doi:10.1038/srep46450
67. Acs B, Rantalainen M, Hartman J. Artificial intelligence as the next step towards precision pathology. J Intern Med. 2020;288(1):62-81. doi:10.1111/joim.13030
68. Mobadersany P, Yousefi S, Amgad M, et al. Predicting cancer outcomes from histology and genomics using convolutional networks. Proc Natl Acad Sci U S A. 2018;115(13):E2970-E2979. doi:10.1073/pnas.1717139115
69. Wang X, Janowczyk A, Zhou Y, et al. Prediction of recurrence in early stage non-small cell lung cancer using computer extracted nuclear features from digital H&E images. Sci Rep. 2017;7:13543. doi:10.1038/s41598-017-13773-7
70. Kulkarni PM, Robinson EJ, Pradhan JS, et al. Deep learning based on standard H&E images of primary melanoma tumors identifies patients at risk for visceral recurrence and death. Clin Cancer Res. 2020;26(5):1126-1134. doi:10.1158/1078-0432.CCR-19-1495
71. Du XL, Li WB, Hu BJ. Application of artificial intelligence in ophthalmology. Int J Ophthalmol. 2018;11(9):1555-1561. doi:10.18240/ijo.2018.09.21
72. Gunasekeran DV, Wong TY. Artificial intelligence in ophthalmology in 2020: a technology on the cusp for translation and implementation. Asia Pac J Ophthalmol (Phila). 2020;9(2):61-66. doi:10.1097/01.APO.0000656984.56467.2c
73. Ting DSW, Pasquale LR, Peng L, et al. Artificial intelligence and deep learning in ophthalmology. Br J Ophthalmol. 2019;103(2):167-175. doi:10.1136/bjophthalmol-2018-313173
74. Gulshan V, Peng L, Coram M, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA. 2016;316(22):2402-2410. doi:10.1001/jama.2016.17216
75. US Food and Drug Administration. FDA permits marketing of artificial intelligence-based device to detect certain diabetes-related eye problems [press release]. Published April 11, 2018. Accessed September 15, 2021. https://www.fda.gov/news-events/press-announcements/fda-permits-marketing-artificial-intelligence-based-device-detect-certain-diabetes-related-eye
76. Long E, Chen J, Wu X, et al. Artificial intelligence manages congenital cataract with individualized prediction and telehealth computing. NPJ Digit Med. 2020;3:112. doi:10.1038/s41746-020-00319-x
77. De Fauw J, Ledsam JR, Romera-Paredes B, et al. Clinically applicable deep learning for diagnosis and referral in retinal disease. Nat Med. 2018;24(9):1342-1350. doi:10.1038/s41591-018-0107-6
78. Esteva A, Kuprel B, Novoa RA, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542(7639):115-118. doi:10.1038/nature21056
79. Brinker TJ, Hekler A, Enk AH, et al. Deep neural networks are superior to dermatologists in melanoma image classification. Eur J Cancer. 2019;119:11-17. doi:10.1016/j.ejca.2019.05.023
80. Brinker TJ, Hekler A, Enk AH, et al. A convolutional neural network trained with dermoscopic images performed on par with 145 dermatologists in a clinical melanoma image classification task. Eur J Cancer. 2019;111:148-154. doi:10.1016/j.ejca.2019.02.005
81. Haenssle HA, Fink C, Schneiderbauer R, et al. Man against machine: diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists. Ann Oncol. 2018;29(8):1836-1842. doi:10.1093/annonc/mdy166
82. Li CX, Shen CB, Xue K, et al. Artificial intelligence in dermatology: past, present, and future. Chin Med J (Engl). 2019;132(17):2017-2020. doi:10.1097/CM9.0000000000000372
83. Tschandl P, Codella N, Akay BN, et al. Comparison of the accuracy of human readers versus machine-learning algorithms for pigmented skin lesion classification: an open, web-based, international, diagnostic study. Lancet Oncol. 2019;20(7):938-947. doi:10.1016/S1470-2045(19)30333-X
84. Han SS, Park I, Eun Chang SE, et al. Augmented intelligence dermatology: deep neural networks empower medical professionals in diagnosing skin cancer and predicting treatment options for 134 skin disorders. J Invest Dermatol. 2020;140(9):1753-1761. doi:10.1016/j.jid.2020.01.019
85. Freeman K, Dinnes J, Chuchu N, et al. Algorithm based smartphone apps to assess risk of skin cancer in adults: systematic review of diagnostic accuracy studies [published correction appears in BMJ. 2020 Feb 25;368:m645]. BMJ. 2020;368:m127. Published 2020 Feb 10. doi:10.1136/bmj.m127
86. Chen YC, Ke WC, Chiu HW. Risk classification of cancer survival using ANN with gene expression data from multiple laboratories. Comput Biol Med. 2014;48:1-7. doi:10.1016/j.compbiomed.2014.02.006
87. Kim W, Kim KS, Lee JE, et al. Development of novel breast cancer recurrence prediction model using support vector machine. J Breast Cancer. 2012;15(2):230-238. doi:10.4048/jbc.2012.15.2.230
88. Merath K, Hyer JM, Mehta R, et al. Use of machine learning for prediction of patient risk of postoperative complications after liver, pancreatic, and colorectal surgery. J Gastrointest Surg. 2020;24(8):1843-1851. doi:10.1007/s11605-019-04338-2
89. Santos-García G, Varela G, Novoa N, Jiménez MF. Prediction of postoperative morbidity after lung resection using an artificial neural network ensemble. Artif Intell Med. 2004;30(1):61-69. doi:10.1016/S0933-3657(03)00059-9
90. Ibragimov B, Xing L. Segmentation of organs-at-risks in head and neck CT images using convolutional neural networks. Med Phys. 2017;44(2):547-557. doi:10.1002/mp.12045
91. Lou B, Doken S, Zhuang T, et al. An image-based deep learning framework for individualizing radiotherapy dose. Lancet Digit Health. 2019;1(3):e136-e147. doi:10.1016/S2589-7500(19)30058-5
92. Xu J, Yang P, Xue S, et al. Translating cancer genomics into precision medicine with artificial intelligence: applications, challenges and future perspectives. Hum Genet. 2019;138(2):109-124. doi:10.1007/s00439-019-01970-5
93. Patel NM, Michelini VV, Snell JM, et al. Enhancing next‐generation sequencing‐guided cancer care through cognitive computing. Oncologist. 2018;23(2):179-185. doi:10.1634/theoncologist.2017-0170
94. Le Berre C, Sandborn WJ, Aridhi S, et al. Application of artificial intelligence to gastroenterology and hepatology. Gastroenterology. 2020;158(1):76-94.e2. doi:10.1053/j.gastro.2019.08.058
95. Yang YJ, Bang CS. Application of artificial intelligence in gastroenterology. World J Gastroenterol. 2019;25(14):1666-1683. doi:10.3748/wjg.v25.i14.1666
96. Wang P, Berzin TM, Glissen Brown JR, et al. Real-time automatic detection system increases colonoscopic polyp and adenoma detection rates: a prospective randomised controlled study. Gut. 2019;68(10):1813-1819. doi:10.1136/gutjnl-2018-317500
97. Gupta R, Krishnam SP, Schaefer PW, Lev MH, Gonzalez RG. An East Coast perspective on artificial intelligence and machine learning: part 2: ischemic stroke imaging and triage. Neuroimaging Clin N Am. 2020;30(4):467-478. doi:10.1016/j.nic.2020.08.002
98. Belić M, Bobić V, Badža M, Šolaja N, Đurić-Jovičić M, Kostić VS. Artificial intelligence for assisting diagnostics and assessment of Parkinson’s disease—a review. Clin Neurol Neurosurg. 2019;184:105442. doi:10.1016/j.clineuro.2019.105442
99. An S, Kang C, Lee HW. Artificial intelligence and computational approaches for epilepsy. J Epilepsy Res. 2020;10(1):8-17. doi:10.14581/jer.20003
100. Pavel AM, Rennie JM, de Vries LS, et al. A machine-learning algorithm for neonatal seizure recognition: a multicentre, randomised, controlled trial. Lancet Child Adolesc Health. 2020;4(10):740-749. doi:10.1016/S2352-4642(20)30239-X
101. Afzal HMR, Luo S, Ramadan S, Lechner-Scott J. The emerging role of artificial intelligence in multiple sclerosis imaging [published online ahead of print, 2020 Oct 28]. Mult Scler. 2020;1352458520966298. doi:10.1177/1352458520966298
102. Bouton CE. Restoring movement in paralysis with a bioelectronic neural bypass approach: current state and future directions. Cold Spring Harb Perspect Med. 2019;9(11):a034306. doi:10.1101/cshperspect.a034306
103. Durstewitz D, Koppe G, Meyer-Lindenberg A. Deep neural networks in psychiatry. Mol Psychiatry. 2019;24(11):1583-1598. doi:10.1038/s41380-019-0365-9
104. Fonseka TM, Bhat V, Kennedy SH. The utility of artificial intelligence in suicide risk prediction and the management of suicidal behaviors. Aust N Z J Psychiatry. 2019;53(10):954-964. doi:10.1177/0004867419864428
105. Kessler RC, Hwang I, Hoffmire CA, et al. Developing a practical suicide risk prediction model for targeting high-risk patients in the Veterans Health Administration. Int J Methods Psychiatr Res. 2017;26(3):e1575. doi:10.1002/mpr.1575
106. Kessler RC, Bauer MS, Bishop TM, et al. Using administrative data to predict suicide after psychiatric hospitalization in the Veterans Health Administration System. Front Psychiatry. 2020;11:390. doi:10.3389/fpsyt.2020.00390
107. Kessler RC, van Loo HM, Wardenaar KJ, et al. Testing a machine-learning algorithm to predict the persistence and severity of major depressive disorder from baseline self-reports. Mol Psychiatry. 2016;21(10):1366-1371. doi:10.1038/mp.2015.198
108. Horng S, Sontag DA, Halpern Y, Jernite Y, Shapiro NI, Nathanson LA. Creating an automated trigger for sepsis clinical decision support at emergency department triage using machine learning. PLoS One. 2017;12(4):e0174708. doi:10.1371/journal.pone.0174708
109. Soffer S, Klang E, Barash Y, Grossman E, Zimlichman E. Predicting in-hospital mortality at admission to the medical ward: a big-data machine learning model. Am J Med. 2021;134(2):227-234.e4. doi:10.1016/j.amjmed.2020.07.014
110. Labovitz DL, Shafner L, Reyes Gil M, Virmani D, Hanina A. Using artificial intelligence to reduce the risk of nonadherence in patients on anticoagulation therapy. Stroke. 2017;48(5):1416-1419. doi:10.1161/STROKEAHA.116.016281
111. Forlenza GP. Use of artificial intelligence to improve diabetes outcomes in patients using multiple daily injections therapy. Diabetes Technol Ther. 2019;21(S2):S24-S28. doi:10.1089/dia.2019.0077
112. Poser CM. CT scan and the practice of neurology. Arch Neurol. 1977;34(2):132. doi:10.1001/archneur.1977.00500140086023
113. Angus DC. Randomized clinical trials of artificial intelligence. JAMA. 2020;323(11):1043-1045. doi:10.1001/jama.2020.1039
114. Topol EJ. Welcoming new guidelines for AI clinical research. Nat Med. 2020;26(9):1318-1320. doi:10.1038/s41591-020-1042-x
115. Collins GS, Moons KGM. Reporting of artificial intelligence prediction models. Lancet. 2019;393(10181):1577-1579. doi:10.1016/S0140-6736(19)30037-6
116. Cruz Rivera S, Liu X, Chan AW, et al. Guidelines for clinical trial protocols for interventions involving artificial intelligence: the SPIRIT-AI extension. Nat Med. 2020;26(9):1351-1363. doi:10.1038/s41591-020-1037-7
117. Liu X, Cruz Rivera S, Moher D, Calvert MJ, Denniston AK; SPIRIT-AI and CONSORT-AI Working Group. Reporting guidelines for clinical trial reports for interventions involving artificial intelligence: the CONSORT-AI extension. Nat Med. 2020;26(9):1364-1374. doi:10.1038/s41591-020-1034-x
118. McCulloch WS, Pitts W. A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys. 1943;5(4):115-133. doi:10.1007/BF02478259
119. Samuel AL. Some studies in machine learning using the game of Checkers. IBM J Res Dev. 1959;3(3):535-554. Accessed September 15, 2021. https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.368.2254
120. Sonoda M, Takano M, Miyahara J, Kato H. Computed radiography utilizing scanning laser stimulated luminescence. Radiology. 1983;148(3):833-838. doi:10.1148/radiology.148.3.6878707
121. Dechter R. Learning while searching in constraint-satisfaction-problems. AAAI’86: proceedings of the fifth AAAI national conference on artificial intelligence. Published 1986. Accessed September 15, 2021. https://www.aaai.org/Papers/AAAI/1986/AAAI86-029.pdf
122. Le Cun Y, Jackel LD, Boser B, et al. Handwritten digit recognition: applications of neural network chips and automatic learning. IEEE Commun Mag. 1989;27(11):41-46. doi:10.1109/35.41400
123. US Food and Drug Administration. FDA allows marketing of first whole slide imaging system for digital pathology [press release]. Published April 12, 2017. Accessed September 15, 2021. https://www.fda.gov/news-events/press-announcements/fda-allows-marketing-first-whole-slide-imaging-system-digital-pathology
1. Bini SA. Artificial intelligence, machine learning, deep learning, and cognitive computing: what do these terms mean and how will they impact health care? J Arthroplasty. 2018;33(8):2358-2361. doi:10.1016/j.arth.2018.02.067
2. Benjamens S, Dhunnoo P, Meskó B. The state of artificial intelligence-based FDA-approved medical devices and algorithms: an online database. NPJ Digit Med. 2020;3:118. doi:10.1038/s41746-020-00324-0
3. Viz. AI powered synchronized stroke care. Accessed September 15, 2021. https://www.viz.ai/ischemic-stroke
4. Buchanan M. The law of accelerating returns. Nat Phys. 2008;4(7):507. doi:10.1038/nphys1010
5. TechRepublic. IBM Watson Health computes a pair of new solutions to improve healthcare data and security. Published September 10, 2015. Accessed October 21, 2020. https://www.techrepublic.com/article/ibm-watson-health-computes-a-pair-of-new-solutions-to-improve-healthcare-data-and-security
6. Borkowski AA, Kardani A, Mastorides SM, Thomas LB. Warfarin pharmacogenomics: recommendations with available patented clinical technologies. Recent Pat Biotechnol. 2014;8(2):110-115. doi:10.2174/1872208309666140904112003
7. Washington University in St. Louis. Warfarin dosing. Accessed September 15, 2021. http://www.warfarindosing.org/Source/Home.aspx
8. He J, Baxter SL, Xu J, Xu J, Zhou X, Zhang K. The practical implementation of artificial intelligence technologies in medicine. Nat Med. 2019;25(1):30-36. doi:10.1038/s41591-018-0307-0
9. Jiang F, Jiang Y, Zhi H, et al. Artificial intelligence in healthcare: past, present and future. Stroke Vasc Neurol. 2017;2(4):230-243. Published 2017 Jun 21. doi:10.1136/svn-2017-000101
10. Johnson KW, Torres Soto J, Glicksberg BS, et al. Artificial intelligence in cardiology. J Am Coll Cardiol. 2018;71(23):2668-2679. doi:10.1016/j.jacc.2018.03.521
11. Borkowski AA, Wilson CP, Borkowski SA, et al. Comparing artificial intelligence platforms for histopathologic cancer diagnosis. Fed Pract. 2019;36(10):456-463.
12. Cruz-Roa A, Gilmore H, Basavanhally A, et al. High-throughput adaptive sampling for whole-slide histopathology image analysis (HASHI) via convolutional neural networks: application to invasive breast cancer detection. PLoS One. 2018;13(5):e0196828. Published 2018 May 24. doi:10.1371/journal.pone.0196828
13. Nagendran M, Chen Y, Lovejoy CA, et al. Artificial intelligence versus clinicians: systematic review of design, reporting standards, and claims of deep learning studies. BMJ. 2020;368:m689. Published 2020 Mar 25. doi:10.1136/bmj.m689
14. Shimizu H, Nakayama KI. Artificial intelligence in oncology. Cancer Sci. 2020;111(5):1452-1460. doi:10.1111/cas.14377
15. Talebi-Liasi F, Markowitz O. Is artificial intelligence going to replace dermatologists? Cutis. 2020;105(1):28-31.
16. Valliani AA, Ranti D, Oermann EK. Deep learning and neurology: a systematic review. Neurol Ther. 2019;8(2):351-365. doi:10.1007/s40120-019-00153-8
17. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436-444. doi:10.1038/nature14539
18. Graham S, Depp C, Lee EE, et al. Artificial intelligence for mental health and mental illnesses: an overview. Curr Psychiatry Rep. 2019;21(11):116. Published 2019 Nov 7. doi:10.1007/s11920-019-1094-0
19. Golas SB, Shibahara T, Agboola S, et al. A machine learning model to predict the risk of 30-day readmissions in patients with heart failure: a retrospective analysis of electronic medical records data. BMC Med Inform Decis Mak. 2018;18(1):44. Published 2018 Jun 22. doi:10.1186/s12911-018-0620-z
20. Mortazavi BJ, Downing NS, Bucholz EM, et al. Analysis of machine learning techniques for heart failure readmissions. Circ Cardiovasc Qual Outcomes. 2016;9(6):629-640. doi:10.1161/CIRCOUTCOMES.116.003039
21. Meyer-Bäse A, Morra L, Meyer-Bäse U, Pinker K. Current status and future perspectives of artificial intelligence in magnetic resonance breast imaging. Contrast Media Mol Imaging. 2020;2020:6805710. Published 2020 Aug 28. doi:10.1155/2020/6805710
22. Ehteshami Bejnordi B, Veta M, Johannes van Diest P, et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA. 2017;318(22):2199-2210. doi:10.1001/jama.2017.14585
23. Borkowski AA, Viswanadhan NA, Thomas LB, Guzman RD, Deland LA, Mastorides SM. Using artificial intelligence for COVID-19 chest X-ray diagnosis. Fed Pract. 2020;37(9):398-404. doi:10.12788/fp.0045
24. Coudray N, Ocampo PS, Sakellaropoulos T, et al. Classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning. Nat Med. 2018;24(10):1559-1567. doi:10.1038/s41591-018-0177-5
25. Kermany DS, Goldbaum M, Cai W, et al. Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell. 2018;172(5):1122-1131.e9. doi:10.1016/j.cell.2018.02.010
26. Liu X, Faes L, Kale AU, et al. A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis. Lancet Digit Health. 2019;1(6):e271-e297. doi:10.1016/S2589-7500(19)30123-2
27. Nagpal K, Foote D, Liu Y, et al. Development and validation of a deep learning algorithm for improving Gleason scoring of prostate cancer [published correction appears in NPJ Digit Med. 2019 Nov 19;2:113]. NPJ Digit Med. 2019;2:48. Published 2019 Jun 7. doi:10.1038/s41746-019-0112-2
28. Nam JG, Park S, Hwang EJ, et al. Development and validation of deep learning-based automatic detection algorithm for malignant pulmonary nodules on chest radiographs. Radiology. 2019;290(1):218-228. doi:10.1148/radiol.2018180237
29. Schwalbe N, Wahl B. Artificial intelligence and the future of global health. Lancet. 2020;395(10236):1579-1586. doi:10.1016/S0140-6736(20)30226-9
30. Bai HX, Wang R, Xiong Z, et al. Artificial intelligence augmentation of radiologist performance in distinguishing COVID-19 from pneumonia of other origin at chest CT [published correction appears in Radiology. 2021 Apr;299(1):E225]. Radiology. 2020;296(3):E156-E165. doi:10.1148/radiol.2020201491
31. Li L, Qin L, Xu Z, et al. Using artificial intelligence to detect COVID-19 and community-acquired pneumonia based on pulmonary CT: evaluation of the diagnostic accuracy. Radiology. 2020;296(2):E65-E71. doi:10.1148/radiol.2020200905
32. Serag A, Ion-Margineanu A, Qureshi H, et al. Translational AI and deep learning in diagnostic pathology. Front Med (Lausanne). 2019;6:185. Published 2019 Oct 1. doi:10.3389/fmed.2019.00185
33. Wang D, Khosla A, Gargeya R, Irshad H, Beck AH. Deep learning for identifying metastatic breast cancer. ArXiv. 2016 June 18:arXiv:1606.05718v1. Published online June 18, 2016. Accessed September 15, 2021. http://arxiv.org/abs/1606.05718
34. Alabdulkareem A. Artificial intelligence and dermatologists: friends or foes? J Dermatology Dermatol Surg. 2019;23(2):57-60. doi:10.4103/jdds.jdds_19_19
35. Mollalo A, Mao L, Rashidi P, Glass GE. A GIS-based artificial neural network model for spatial distribution of tuberculosis across the continental United States. Int J Environ Res Public Health. 2019;16(1):157. Published 2019 Jan 8. doi:10.3390/ijerph16010157
36. Haddawy P, Hasan AHMI, Kasantikul R, et al. Spatiotemporal Bayesian networks for malaria prediction. Artif Intell Med. 2018;84:127-138. doi:10.1016/j.artmed.2017.12.002
37. Laureano-Rosario AE, Duncan AP, Mendez-Lazaro PA, et al. Application of artificial neural networks for dengue fever outbreak predictions in the northwest coast of Yucatan, Mexico and San Juan, Puerto Rico. Trop Med Infect Dis. 2018;3(1):5. Published 2018 Jan 5. doi:10.3390/tropicalmed3010005
38. Buczak AL, Koshute PT, Babin SM, Feighner BH, Lewis SH. A data-driven epidemiological prediction method for dengue outbreaks using local and remote sensing data. BMC Med Inform Decis Mak. 2012;12:124. Published 2012 Nov 5. doi:10.1186/1472-6947-12-124
39. Scavuzzo JM, Trucco F, Espinosa M, et al. Modeling dengue vector population using remotely sensed data and machine learning. Acta Trop. 2018;185:167-175. doi:10.1016/j.actatropica.2018.05.003
40. Xue H, Bai Y, Hu H, Liang H. Influenza activity surveillance based on multiple regression model and artificial neural network. IEEE Access. 2018;6:563-575. doi:10.1109/ACCESS.2017.2771798
41. Jiang D, Hao M, Ding F, Fu J, Li M. Mapping the transmission risk of Zika virus using machine learning models. Acta Trop. 2018;185:391-399. doi:10.1016/j.actatropica.2018.06.021
42. Bragazzi NL, Dai H, Damiani G, Behzadifar M, Martini M, Wu J. How big data and artificial intelligence can help better manage the COVID-19 pandemic. Int J Environ Res Public Health. 2020;17(9):3176. Published 2020 May 2. doi:10.3390/ijerph17093176
43. Lake IR, Colón-González FJ, Barker GC, Morbey RA, Smith GE, Elliot AJ. Machine learning to refine decision making within a syndromic surveillance service. BMC Public Health. 2019;19(1):559. Published 2019 May 14. doi:10.1186/s12889-019-6916-9
44. Khan OF, Bebb G, Alimohamed NA. Artificial intelligence in medicine: what oncologists need to know about its potential-and its limitations. Oncol Exch. 2017;16(4):8-13. Accessed September 1, 2021. http://www.oncologyex.com/pdf/vol16_no4/feature_khan-ai.pdf
45. Badano LP, Keller DM, Muraru D, Torlasco C, Parati G. Artificial intelligence and cardiovascular imaging: A win-win combination. Anatol J Cardiol. 2020;24(4):214-223. doi:10.14744/AnatolJCardiol.2020.94491
46. Murdoch TB, Detsky AS. The inevitable application of big data to health care. JAMA. 2013;309(13):1351-1352. doi:10.1001/jama.2013.393
47. Greatbatch O, Garrett A, Snape K. The impact of artificial intelligence on the current and future practice of clinical cancer genomics. Genet Res (Camb). 2019;101:e9. Published 2019 Oct 31. doi:10.1017/S0016672319000089
48. Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med. 2019;25(1):44-56. doi:10.1038/s41591-018-0300-7
49. Vollmer S, Mateen BA, Bohner G, et al. Machine learning and artificial intelligence research for patient benefit: 20 critical questions on transparency, replicability, ethics, and effectiveness [published correction appears in BMJ. 2020 Apr 1;369:m1312]. BMJ. 2020;368:l6927. Published 2020 Mar 20. doi:10.1136/bmj.l6927
50. Lindsey R, Daluiski A, Chopra S, et al. Deep neural network improves fracture detection by clinicians. Proc Natl Acad Sci U S A. 2018;115(45):11591-11596. doi:10.1073/pnas.1806905115
51. Zech JR, Badgeley MA, Liu M, Costa AB, Titano JJ, Oermann EK. Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: a cross-sectional study. PLoS Med. 2018;15(11):e1002683. doi:10.1371/journal.pmed.1002683
52. Lakhani P, Sundaram B. Deep learning at chest radiography: automated classification of pulmonary tuberculosis by using convolutional neural networks. Radiology. 2017;284(2):574-582. doi:10.1148/radiol.2017162326
53. Rajpurkar P, Joshi A, Pareek A, et al. CheXpedition: investigating generalization challenges for translation of chest x-ray algorithms to the clinical setting. ArXiv. 2020 Feb 26:arXiv:2002.11379v2. Revised March 11, 2020. Accessed September 15, 2021. http://arxiv.org/abs/2002.11379
54. Salim M, Wåhlin E, Dembrower K, et al. External evaluation of 3 commercial artificial intelligence algorithms for independent assessment of screening mammograms. JAMA Oncol. 2020;6(10):1581-1588. doi:10.1001/jamaoncol.2020.3321
55. Arbabshirani MR, Fornwalt BK, Mongelluzzo GJ, et al. Advanced machine learning in action: identification of intracranial hemorrhage on computed tomography scans of the head with clinical workflow integration. NPJ Digit Med. 2018;1:9. doi:10.1038/s41746-017-0015-z
56. Sheth D, Giger ML. Artificial intelligence in the interpretation of breast cancer on MRI. J Magn Reson Imaging. 2020;51(5):1310-1324. doi:10.1002/jmri.26878
57. McKinney SM, Sieniek M, Godbole V, et al. International evaluation of an AI system for breast cancer screening. Nature. 2020;577(7788):89-94. doi:10.1038/s41586-019-1799-6
58. Booth AL, Abels E, McCaffrey P. Development of a prognostic model for mortality in COVID-19 infection using machine learning. Mod Pathol. 2021;34(3):522-531. doi:10.1038/s41379-020-00700-x
59. Xu B, Kocyigit D, Grimm R, Griffin BP, Cheng F. Applications of artificial intelligence in multimodality cardiovascular imaging: a state-of-the-art review. Prog Cardiovasc Dis. 2020;63(3):367-376. doi:10.1016/j.pcad.2020.03.003
105. Kessler RC, Hwang I, Hoffmire CA, et al. Developing a practical suicide risk prediction model for targeting high-risk patients in the Veterans Health Administration. Int J Methods Psychiatr Res. 2017;26(3):e1575. doi:10.1002/mpr.1575
106. Kessler RC, Bauer MS, Bishop TM, et al. Using administrative data to predict suicide after psychiatric hospitalization in the Veterans Health Administration System. Front Psychiatry. 2020;11:390. doi:10.3389/fpsyt.2020.00390
107. Kessler RC, van Loo HM, Wardenaar KJ, et al. Testing a machine-learning algorithm to predict the persistence and severity of major depressive disorder from baseline self-reports. Mol Psychiatry. 2016;21(10):1366-1371. doi:10.1038/mp.2015.198
108. Horng S, Sontag DA, Halpern Y, Jernite Y, Shapiro NI, Nathanson LA. Creating an automated trigger for sepsis clinical decision support at emergency department triage using machine learning. PLoS One. 2017;12(4):e0174708. doi:10.1371/journal.pone.0174708
109. Soffer S, Klang E, Barash Y, Grossman E, Zimlichman E. Predicting in-hospital mortality at admission to the medical ward: a big-data machine learning model. Am J Med. 2021;134(2):227-234.e4. doi:10.1016/j.amjmed.2020.07.014
110. Labovitz DL, Shafner L, Reyes Gil M, Virmani D, Hanina A. Using artificial intelligence to reduce the risk of nonadherence in patients on anticoagulation therapy. Stroke. 2017;48(5):1416-1419. doi:10.1161/STROKEAHA.116.016281
111. Forlenza GP. Use of artificial intelligence to improve diabetes outcomes in patients using multiple daily injections therapy. Diabetes Technol Ther. 2019;21(S2):S24-S28. doi:10.1089/dia.2019.0077
112. Poser CM. CT scan and the practice of neurology. Arch Neurol. 1977;34(2):132. doi:10.1001/archneur.1977.00500140086023
113. Angus DC. Randomized clinical trials of artificial intelligence. JAMA. 2020;323(11):1043-1045. doi:10.1001/jama.2020.1039
114. Topol EJ. Welcoming new guidelines for AI clinical research. Nat Med. 2020;26(9):1318-1320. doi:10.1038/s41591-020-1042-x
115. Collins GS, Moons KGM. Reporting of artificial intelligence prediction models. Lancet. 2019;393(10181):1577-1579. doi:10.1016/S0140-6736(19)30037-6
116. Cruz Rivera S, Liu X, Chan AW, et al. Guidelines for clinical trial protocols for interventions involving artificial intelligence: the SPIRIT-AI extension. Nat Med. 2020;26(9):1351-1363. doi:10.1038/s41591-020-1037-7
117. Liu X, Cruz Rivera S, Moher D, Calvert MJ, Denniston AK; SPIRIT-AI and CONSORT-AI Working Group. Reporting guidelines for clinical trial reports for interventions involving artificial intelligence: the CONSORT-AI extension. Nat Med. 2020;26(9):1364-1374. doi:10.1038/s41591-020-1034-x
118. McCulloch WS, Pitts W. A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys. 1943;5(4):115-133. doi:10.1007/BF02478259
119. Samuel AL. Some studies in machine learning using the game of Checkers. IBM J Res Dev. 1959;3(3):535-554. Accessed September 15, 2021. https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.368.2254
120. Sonoda M, Takano M, Miyahara J, Kato H. Computed radiography utilizing scanning laser stimulated luminescence. Radiology. 1983;148(3):833-838. doi:10.1148/radiology.148.3.6878707
121. Dechter R. Learning while searching in constraint-satisfaction-problems. AAAI’86: proceedings of the fifth AAAI national conference on artificial intelligence. Published 1986. Accessed September 15, 2021. https://www.aaai.org/Papers/AAAI/1986/AAAI86-029.pdf
122. Le Cun Y, Jackel LD, Boser B, et al. Handwritten digit recognition: applications of neural network chips and automatic learning. IEEE Commun Mag. 1989;27(11):41-46. doi:10.1109/35.41400
123. US Food and Drug Administration. FDA allows marketing of first whole slide imaging system for digital pathology [press release]. Published April 12, 2017. Accessed September 15, 2021. https://www.fda.gov/news-events/press-announcements/fda-allows-marketing-first-whole-slide-imaging-system-digital-pathology
Using Artificial Intelligence for COVID-19 Chest X-ray Diagnosis
The novel coronavirus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which causes the respiratory disease coronavirus disease 2019 (COVID-19), was first identified as a cluster of cases of pneumonia in Wuhan, Hubei Province, China, on December 31, 2019.1 Within a month, the disease had spread significantly, leading the World Health Organization (WHO) to designate COVID-19 a public health emergency of international concern. On March 11, 2020, the WHO declared COVID-19 a global pandemic.2 As of August 18, 2020, the virus had infected > 21 million people and caused > 750,000 deaths worldwide.3 The spread of COVID-19 has had a dramatic impact on social, economic, and health care issues throughout the world, as discussed elsewhere.4
Prior to this century, members of the coronavirus family had minimal impact on human health.5 In the past 20 years, however, outbreaks have highlighted the emerging importance of coronaviruses in morbidity and mortality on a global scale. Although less prevalent than COVID-19, severe acute respiratory syndrome (SARS) in 2002 to 2003 and Middle East respiratory syndrome (MERS) in 2012 likely had higher mortality rates than the current pandemic.5 Based on this recent history, it is reasonable to assume that we will continue to see novel diseases with similarly significant health and societal implications. The challenges that such novel viral pathogens present to health care providers (HCPs) are numerous, including methods for rapid diagnosis, prevention, and treatment. In the current study, we focus on diagnosis, a challenge made evident with COVID-19 by the time required to develop rapid and effective diagnostic modalities.
We have previously reported the utility of using artificial intelligence (AI) in the histopathologic diagnosis of cancer.6-8 AI was first described in 1956 and involves the field of computer science in which machines are trained to learn from experience.9 Machine learning (ML) is a subset of AI and is achieved by using mathematical models to compute sample datasets.10 Current ML employs deep learning with neural network algorithms, which can recognize patterns and achieve complex computational tasks often far more quickly and precisely than humans can.11-13 In addition to applications in pathology, ML algorithms have both prognostic and diagnostic applications in multiple medical specialties, such as radiology, dermatology, ophthalmology, and cardiology.6 It is predicted that AI will impact almost every aspect of health care in the future.14
In this article, we examine the potential for AI to diagnose patients with COVID-19 pneumonia using chest radiographs (CXR) alone. This is done using Microsoft CustomVision (www.customvision.ai), a readily available, automated ML platform. Employing AI to both screen and diagnose emerging health emergencies such as COVID-19 has the potential to dramatically change how we approach medical care in the future. In addition, we describe the creation of a publicly available website (interknowlogy-covid-19.azurewebsites.net) that could augment COVID-19 pneumonia CXR diagnosis.
Methods
For the training dataset, 103 CXR images of COVID-19 were downloaded from the GitHub covid-chest-xray dataset.15 Five hundred images of non-COVID-19 pneumonia and 500 images of normal lungs were downloaded from the Kaggle RSNA Pneumonia Detection Challenge dataset.16 To balance the dataset, we expanded the COVID-19 dataset to 500 images by slight rotation (probability = 1, max rotation = 5) and zooming (probability = 0.5, percentage area = 0.9) of the original images using the Augmentor Python package.17
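The rotation and zoom parameters above correspond to Augmentor's pipeline operations (`rotate` and `zoom_random`). As an illustration of what those augmentations do to an image array, here is a minimal NumPy sketch; the function names are hypothetical, and the zoom here scales each side by the stated fraction (a simplification of Augmentor's area-based parameter).

```python
import numpy as np

def rotate_nn(img, degrees):
    """Rotate a 2D image by a small angle using nearest-neighbor sampling."""
    theta = np.deg2rad(degrees)
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    # Inverse mapping: for each output pixel, find the source pixel to sample.
    sy = np.cos(theta) * (ys - cy) - np.sin(theta) * (xs - cx) + cy
    sx = np.sin(theta) * (ys - cy) + np.cos(theta) * (xs - cx) + cx
    sy = np.clip(np.rint(sy).astype(int), 0, h - 1)
    sx = np.clip(np.rint(sx).astype(int), 0, w - 1)
    return img[sy, sx]

def zoom_random(img, fraction, seed=0):
    """Crop a random window (each side scaled by `fraction`), resize back."""
    h, w = img.shape
    ch, cw = int(h * fraction), int(w * fraction)
    rng = np.random.default_rng(seed)  # fixed seed keeps the sketch reproducible
    top = rng.integers(0, h - ch + 1)
    left = rng.integers(0, w - cw + 1)
    crop = img[top:top + ch, left:left + cw]
    # Nearest-neighbor resize back to the original shape.
    yi = (np.arange(h) * ch / h).astype(int)
    xi = (np.arange(w) * cw / w).astype(int)
    return crop[np.ix_(yi, xi)]
```

Applying each transform with the probabilities in the text, and sampling until 500 augmented images exist, reproduces the balancing strategy described above.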
Validation Dataset
For the validation dataset 30 random CXR images were obtained from the US Department of Veterans Affairs (VA) PACS (picture archiving and communication system). This dataset included 10 CXR images from hospitalized patients with COVID-19, 10 CXR pneumonia images from patients without COVID-19, and 10 normal CXRs. COVID-19 diagnoses were confirmed with a positive test result from the Xpert Xpress SARS-CoV-2 polymerase chain reaction (PCR) platform.18
Microsoft CustomVision
Microsoft CustomVision is an automated image classification and object detection system that is part of Microsoft Azure Cognitive Services (azure.microsoft.com). It has a pay-as-you-go model with fees depending on computing needs and usage, and it offers users a free trial for 2 initial projects. The service is online with an easy-to-follow graphical user interface; no coding skills are necessary.
We created a new classification project in CustomVision and chose a compact general domain for small size and easy export to TensorFlow.js model format. TensorFlow.js is a JavaScript library that enables dynamic download and execution of ML models. After the project was created, we uploaded our image dataset. Each class was uploaded separately and tagged with the appropriate label (covid pneumonia, non-covid pneumonia, or normal lung). The system rejected 16 COVID-19 images as duplicates, so the final CustomVision training dataset consisted of 484 images of COVID-19 pneumonia, 500 images of non-COVID-19 pneumonia, and 500 images of normal lungs. Once the images were uploaded, CustomVision trained itself on the dataset when the program was initiated (Figure 1).
Website Creation
CustomVision was used to train the model. The model can be executed continuously within CustomVision, or it can be compacted and decoupled; in this case, the model was compacted and decoupled for use in an online application. An Angular online application was created with TensorFlow.js. The model is executed within the user’s web browser when an image of a CXR is submitted, and confidence values for each classification are returned. In this design, after the initial webpage and model are downloaded, the webpage no longer needs to access any server components and performs all operations in the browser; at no time is an image submitted to the cloud. Although the solution works well on mobile phone browsers and in low-bandwidth situations, the quality of predictions may depend on the browser and device used.
Results
Overall, our trained model showed 92.9% precision and recall. Precision and recall for each label were 98.9% and 94.8%, respectively, for COVID-19 pneumonia; 91.8% and 89%, respectively, for non-COVID-19 pneumonia; and 88.8% and 95%, respectively, for normal lung (Figure 2). We then validated the trained model by making individual predictions on 30 images from the VA dataset. Our model performed well, with 100% sensitivity (recall), 95% specificity, 97% accuracy, 91% positive predictive value (precision), and 100% negative predictive value (Table).
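The validation metrics above follow from the standard confusion-matrix definitions. As a sketch, the published figures are consistent with 10 true positives, 1 false positive, 19 true negatives, and 0 false negatives for COVID-19 on the 30 VA images; this breakdown is inferred, not stated explicitly in the text.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard screening-test metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # recall: fraction of true cases detected
    specificity = tn / (tn + fp)                 # fraction of non-cases correctly ruled out
    ppv = tp / (tp + fp)                         # precision: positive predictive value
    npv = tn / (tn + fn)                         # negative predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, ppv, npv, accuracy

# Counts inferred from the VA validation results reported above:
sens, spec, ppv, npv, acc = diagnostic_metrics(tp=10, fp=1, tn=19, fn=0)
```

With these counts, the function returns sensitivity 1.0, specificity 0.95, PPV 10/11 ≈ 0.91, NPV 1.0, and accuracy 29/30 ≈ 0.97, matching the reported values.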
Discussion
We successfully demonstrated the potential of using AI algorithms in assessing CXRs for COVID-19. We first trained the CustomVision automated image classification and object detection system to differentiate cases of COVID-19 pneumonia from pneumonia of other etiologies as well as from normal lung CXRs. We then tested our model against known patients from the James A. Haley Veterans’ Hospital in Tampa, Florida. The program achieved 100% sensitivity (recall), 95% specificity, 97% accuracy, 91% positive predictive value (precision), and 100% negative predictive value in differentiating the 3 scenarios. Using the trained ML model, we proceeded to create a website that could augment COVID-19 CXR diagnosis.19 The website works on mobile as well as desktop platforms. A health care provider can take a CXR photo with a mobile phone or upload the image file, and the ML algorithm provides the probability of COVID-19 pneumonia, non-COVID-19 pneumonia, or normal lung (Figure 3).
Emerging diseases such as COVID-19 present numerous challenges to HCPs, governments, and businesses, as well as to individual members of society. As evidenced with COVID-19, the time from first recognition of an emerging pathogen to the development of methods for reliable diagnosis and treatment can be months, even with a concerted international effort. The gold standard for diagnosis of COVID-19 is reverse transcriptase PCR (RT-PCR); however, early RT-PCR testing produced less than optimal results.20-22 Even after the development of reliable tests, making test kits available to HCPs on an adequate scale presents an additional challenge, as was evident with COVID-19.
Use of X-ray vs Computed Tomography
The initial lack of availability of diagnostic RT-PCR for COVID-19 increased reliance on presumptive diagnoses via imaging in some situations.23 Most of the literature evaluating radiographs of patients with COVID-19 focuses on chest computed tomography (CT) findings, with initial results suggesting CT was more accurate than early RT-PCR methodologies.21,22,24 The Radiological Society of North America expert consensus statement on chest CT for COVID-19 states that CT findings can even precede positivity on RT-PCR in some cases.22 However, it does not currently recommend the use of CT scanning as a screening tool. Furthermore, the actual sensitivity and specificity of CT interpretation by radiologists for COVID-19 are unknown.22
Characteristic CT findings include ground-glass opacities (GGOs) and consolidation most commonly in the lung periphery, though a diffuse distribution was found in a minority of patients.21,23,25-27 Lomoro and colleagues recently summarized the CT findings from several reports that described abnormalities as most often bilateral and peripheral, subpleural, and affecting the lower lobes.26 Not surprisingly, CT appears more sensitive at detecting changes with COVID-19 than does CXR, with reports that a minority of patients exhibited CT changes before changes were visible on CXR.23,26
We focused our study on the potential of AI in the examination of CXRs in patients with COVID-19, as there are several limitations to the routine use of CT scans for conditions such as COVID-19. Aside from the considerably longer time required to obtain CT scans, there are issues with contamination of CT suites, sometimes requiring a dedicated COVID-19 CT scanner.23,28 The time constraints of decontamination or limited utilization of CT suites can delay or disrupt services for patients with and without COVID-19. Because of these factors, CXR may be a better resource to minimize the risk of infection to other patients. Also, accurate assessment of CXR abnormalities for COVID-19 may identify the disease in patients whose CXR was performed for other purposes.23 CXR is more readily available than CT, especially in more remote or underdeveloped areas.28 Finally, as with CT, CXR abnormalities are reported to have appeared before RT-PCR tests became positive in a minority of patients.23
CXR findings described in patients with COVID-19 are similar to those of CT and include GGOs, consolidation, and hazy increased opacities.23,25,26,28,29 Like CT, the majority of patients who received CXR demonstrated greater involvement in the lower zones and peripherally.23,25,26,28,29 Most patients showed bilateral involvement. However, while these findings are common in patients with COVID-19, they are not specific and can be seen in other conditions, such as other viral pneumonia, bacterial pneumonia, injury from drug toxicity, inhalation injury, connective tissue disease, and idiopathic conditions.
Application of AI for COVID-19
Applications of AI in interpreting radiographs of various types are numerous, and extensive literature has been written on the topic.30 Using deep learning algorithms, AI has multiple possible roles to augment traditional radiograph interpretation. These include the potential for screening, triaging, and increasing the speed to render diagnoses. It also can provide a rapid “second opinion” to support the radiologist's final interpretation, and in areas with critical shortages of radiologists, AI potentially can be used to render the definitive diagnosis. In COVID-19, imaging studies have been shown to correlate with disease severity and mortality, and AI could assist in monitoring the course of the disease as it progresses and potentially identify patients at greatest risk.27 Furthermore, early PCR results were considered suboptimal, and patients with COVID-19 are known to test negative initially even with reliable testing methodologies. As the technology progresses, AI-assisted interpretation could detect disease and guide triage and treatment of patients with high suspicion of COVID-19 but negative initial PCR results, or in situations where test availability is limited or results are delayed. A rapid diagnostic test as simple as a CXR that could reliably identify disease early in its course would have numerous potential benefits for containing and preventing the spread of contagions such as COVID-19.
Few studies have assessed the use of AI in the radiologic diagnosis of COVID-19, and most of those use CT scanning. Bai and colleagues demonstrated increased accuracy, sensitivity, and specificity in distinguishing chest CTs of patients with COVID-19 from those with other types of pneumonia.21,31 A separate study demonstrated the utility of using AI to differentiate COVID-19 from community-acquired pneumonia with CT.32 However, the effective utility of AI for CXR interpretation also has been demonstrated.14,33 Implementation of convolutional neural network layers has allowed for reliable differentiation of viral and bacterial pneumonia with CXR imaging.34 Evidence suggests that there is great potential in the application of AI to the interpretation of radiographs of all types.
Finally, we have developed a publicly available website based on our studies.19 This website is for research use only, as it is based on data from our preliminary investigation. Protected health information must be removed from images before they are uploaded to the website. The information on the website, including text, graphics, and images, is for research and may not be appropriate for all circumstances. The website does not provide medical, professional, or licensed advice and is not a substitute for consultation with an HCP. Medical advice should be sought from a qualified HCP for any questions, and the website should not be used for medical diagnosis or treatment.
Limitations
In our preliminary study, we have demonstrated the potential impact AI can have on multiple aspects of patient care for emerging pathogens such as COVID-19 using a test as readily available as a CXR. However, several limitations of this investigation should be mentioned. The study is retrospective, with a limited sample size and with X-rays from patients at various stages of COVID-19 pneumonia. Also, cases of non-COVID-19 pneumonia were not stratified into different types or etiologies. We intended to demonstrate the potential of AI in differentiating COVID-19 pneumonia from non-COVID-19 pneumonia of any etiology, though future studies should compare COVID-19 cases with more specific types of pneumonia, such as those of bacterial or viral origin. Furthermore, the present study does not address any potential effects of additional radiographic findings from coexistent conditions, such as pulmonary edema as seen in congestive heart failure, pleural effusions (which can be seen with COVID-19 pneumonia, though rarely), and interstitial lung disease. Future studies are required to address these issues. Ultimately, prospective studies of AI-assisted radiographic interpretation in conditions such as COVID-19 are required to demonstrate the impact on diagnosis, treatment, outcome, and patient safety as these technologies are implemented.
Conclusions
We have used a readily available, commercial platform to demonstrate the potential of AI to assist in the successful diagnosis of COVID-19 pneumonia on CXR images. While this technology has numerous applications in radiology, we have focused on the potential impact on future world health crises such as COVID-19. The findings have implications for screening and triage, initial diagnosis, monitoring disease progression, and identifying patients at increased risk of morbidity and mortality. Based on the data, a website was created to demonstrate how such technologies could be shared and distributed to others to combat entities such as COVID-19 moving forward. Our study offers a small window into the potential for how AI will likely dramatically change the practice of medicine in the future.
1. World Health Organization. Coronavirus disease (COVID-19) pandemic. https://www.who.int/emergencies/diseases/novel-coronavirus-2019. Updated August 23, 2020. Accessed August 24, 2020.
2. World Health Organization. WHO Director-General’s opening remarks at the media briefing on COVID-19 - 11 March 2020. https://www.who.int/dg/speeches/detail/who-director-general-s-opening-remarks-at-the-media-briefing-on-covid-19---11-march-2020. Published March 11, 2020. Accessed August 24, 2020.
3. World Health Organization. Coronavirus disease (COVID-19): situation report--209. https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200816-covid-19-sitrep-209.pdf. Updated August 16, 2020. Accessed August 24, 2020.
4. Nicola M, Alsafi Z, Sohrabi C, et al. The socio-economic implications of the coronavirus pandemic (COVID-19): a review. Int J Surg. 2020;78:185-193. doi:10.1016/j.ijsu.2020.04.018
5. da Costa VG, Moreli ML, Saivish MV. The emergence of SARS, MERS and novel SARS-2 coronaviruses in the 21st century. Arch Virol. 2020;165(7):1517-1526. doi:10.1007/s00705-020-04628-0
6. Borkowski AA, Wilson CP, Borkowski SA, et al. Comparing artificial intelligence platforms for histopathologic cancer diagnosis. Fed Pract. 2019;36(10):456-463.
7. Borkowski AA, Wilson CP, Borkowski SA, Thomas LB, Deland LA, Mastorides SM. Apple machine learning algorithms successfully detect colon cancer but fail to predict KRAS mutation status. http://arxiv.org/abs/1812.04660. Updated January 15, 2019. Accessed August 24, 2020.
8. Borkowski AA, Wilson CP, Borkowski SA, Deland LA, Mastorides SM. Using Apple machine learning algorithms to detect and subclassify non-small cell lung cancer. http://arxiv.org/abs/1808.08230. Updated January 15, 2019. Accessed August 24, 2020.
9. Moor J. The Dartmouth College artificial intelligence conference: the next fifty years. AI Mag. 2006;27(4):87. doi:10.1609/AIMAG.V27I4.1911
10. Samuel AL. Some studies in machine learning using the game of checkers. IBM J Res Dev. 1959;3(3):210-229. doi:10.1147/rd.33.0210
11. Sarle WS. Neural networks and statistical models. https://people.orie.cornell.edu/davidr/or474/nn_sas.pdf. Published April 1994. Accessed August 24, 2020.
12. Schmidhuber J. Deep learning in neural networks: an overview. Neural Netw. 2015;61:85-117. doi:10.1016/j.neunet.2014.09.003
13. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436-444. doi:10.1038/nature14539
14. Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med. 2019;25(1):44-56. doi:10.1038/s41591-018-0300-7
15. Cohen JP, Morrison P, Dao L. COVID-19 Image Data Collection. Published online March 25, 2020. Accessed May 13, 2020. http://arxiv.org/abs/2003.11597
16. Radiological Society of North America. RSNA pneumonia detection challenge. https://www.kaggle.com/c/rsna-pneumonia-detection-challenge. Accessed August 24, 2020.
17. Bloice MD, Roth PM, Holzinger A. Biomedical image augmentation using Augmentor. Bioinformatics. 2019;35(21):4522-4524. doi:10.1093/bioinformatics/btz259
18. Cepheid. Xpert Xpress SARS-CoV-2. https://www.cepheid.com/coronavirus. Accessed August 24, 2020.
19. Interknowlogy. COVID-19 detection in chest X-rays. https://interknowlogy-covid-19.azurewebsites.net. Accessed August 27, 2020.
20. Bernheim A, Mei X, Huang M, et al. Chest CT Findings in Coronavirus Disease-19 (COVID-19): Relationship to Duration of Infection. Radiology. 2020;295(3):200463. doi:10.1148/radiol.2020200463
21. Ai T, Yang Z, Hou H, et al. Correlation of chest CT and RT-PCR testing for coronavirus disease 2019 (COVID-19) in China: a report of 1014 cases. Radiology. 2020;296(2):E32-E40. doi:10.1148/radiol.2020200642
22. Simpson S, Kay FU, Abbara S, et al. Radiological Society of North America Expert Consensus Statement on Reporting Chest CT Findings Related to COVID-19. Endorsed by the Society of Thoracic Radiology, the American College of Radiology, and RSNA - Secondary Publication. J Thorac Imaging. 2020;35(4):219-227. doi:10.1097/RTI.0000000000000524
23. Wong HYF, Lam HYS, Fong AH, et al. Frequency and distribution of chest radiographic findings in patients positive for COVID-19. Radiology. 2020;296(2):E72-E78. doi:10.1148/radiol.2020201160
24. Fang Y, Zhang H, Xie J, et al. Sensitivity of chest CT for COVID-19: comparison to RT-PCR. Radiology. 2020;296(2):E115-E117. doi:10.1148/radiol.2020200432
25. Chen N, Zhou M, Dong X, et al. Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study. Lancet. 2020;395(10223):507-513. doi:10.1016/S0140-6736(20)30211-7
26. Lomoro P, Verde F, Zerboni F, et al. COVID-19 pneumonia manifestations at the admission on chest ultrasound, radiographs, and CT: single-center study and comprehensive radiologic literature review. Eur J Radiol Open. 2020;7:100231. doi:10.1016/j.ejro.2020.100231
27. Salehi S, Abedi A, Balakrishnan S, Gholamrezanezhad A. Coronavirus disease 2019 (COVID-19) imaging reporting and data system (COVID-RADS) and common lexicon: a proposal based on the imaging data of 37 studies. Eur Radiol. 2020;30(9):4930-4942. doi:10.1007/s00330-020-06863-0
28. Jacobi A, Chung M, Bernheim A, Eber C. Portable chest X-ray in coronavirus disease-19 (COVID-19): a pictorial review. Clin Imaging. 2020;64:35-42. doi:10.1016/j.clinimag.2020.04.001
29. Bhat R, Hamid A, Kunin JR, et al. Chest imaging in patients hospitalized With COVID-19 infection - a case series. Curr Probl Diagn Radiol. 2020;49(4):294-301. doi:10.1067/j.cpradiol.2020.04.001
30. Liu X, Faes L, Kale AU, et al. A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis. Lancet Digit Health. 2019;1(6):e271-e297. doi:10.1016/S2589-7500(19)30123-2
31. Bai HX, Wang R, Xiong Z, et al. Artificial intelligence augmentation of radiologist performance in distinguishing COVID-19 from pneumonia of other origin at chest CT. Radiology. 2020;296(3):E156-E165. doi:10.1148/radiol.2020201491
32. Li L, Qin L, Xu Z, et al. Using artificial intelligence to detect COVID-19 and community-acquired pneumonia based on pulmonary CT: evaluation of the diagnostic accuracy. Radiology. 2020;296(2):E65-E71. doi:10.1148/radiol.2020200905
33. Rajpurkar P, Joshi A, Pareek A, et al. CheXpedition: investigating generalization challenges for translation of chest x-ray algorithms to the clinical setting. http://arxiv.org/abs/2002.11379. Updated March 11, 2020. Accessed August 24, 2020.
34. Kermany DS, Goldbaum M, Cai W, et al. Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell. 2018;172(5):1122-1131.e9. doi:10.1016/j.cell.2018.02.010
The novel coronavirus severe acute respiratory syndrome coronavirus 2 (SARSCoV- 2), which causes the respiratory disease coronavirus disease-19 (COVID- 19), was first identified as a cluster of cases of pneumonia in Wuhan, Hubei Province of China on December 31, 2019.1 Within a month, the disease had spread significantly, leading the World Health Organization (WHO) to designate COVID-19 a public health emergency of international concern. On March 11, 2020, the WHO declared COVID-19 a global pandemic.2 As of August 18, 2020, the virus has infected > 21 million people, with > 750,000 deaths worldwide.3 The spread of COVID-19 has had a dramatic impact on social, economic, and health care issues throughout the world, which has been discussed elsewhere.4
Prior to the this century, members of the coronavirus family had minimal impact on human health.5 However, in the past 20 years, outbreaks have highlighted an emerging importance of coronaviruses in morbidity and mortality on a global scale. Although less prevalent than COVID-19, severe acute respiratory syndrome (SARS) in 2002 to 2003 and Middle East respiratory syndrome (MERS) in 2012 likely had higher mortality rates than the current pandemic.5 Based on this recent history, it is reasonable to assume that we will continue to see novel diseases with similar significant health and societal implications. The challenges presented to health care providers (HCPs) by such novel viral pathogens are numerous, including methods for rapid diagnosis, prevention, and treatment. In the current study, we focus on diagnosis issues, which were evident with COVID-19 with the time required to develop rapid and effective diagnostic modalities.
We have previously reported the utility of using artificial intelligence (AI) in the histopathologic diagnosis of cancer.6-8 AI was first described in 1956 and involves the field of computer science in which machines are trained to learn from experience.9 Machine learning (ML) is a subset of AI and is achieved by using mathematic models to compute sample datasets.10 Current ML employs deep learning with neural network algorithms, which can recognize patterns and achieve complex computational tasks often far quicker and with increased precision than can humans.11-13 In addition to applications in pathology, ML algorithms have both prognostic and diagnostic applications in multiple medical specialties, such as radiology, dermatology, ophthalmology, and cardiology.6 It is predicted that AI will impact almost every aspect of health care in the future.14
In this article, we examine the potential for AI to diagnose patients with COVID-19 pneumonia using chest radiographs (CXR) alone. This is done using Microsoft CustomVision (www.customvision.ai), a readily available, automated ML platform. Employing AI to both screen and diagnose emerging health emergencies such as COVID-19 has the potential to dramatically change how we approach medical care in the future. In addition, we describe the creation of a publicly available website (interknowlogy-covid-19.azurewebsites.net) that could augment COVID-19 pneumonia CXR diagnosis.
Methods
For the training dataset, 103 CXR images of COVID-19 were downloaded from the GitHub covid-chest-xray dataset.15 Five hundred images of non-COVID-19 pneumonia and 500 images of normal lungs were downloaded from the Kaggle RSNA Pneumonia Detection Challenge dataset.16 To balance the dataset, we expanded the COVID-19 set to 500 images by slight rotation (probability = 1, max rotation = 5) and zooming (probability = 0.5, percentage area = 0.9) of the original images using the Augmentor Python package.17
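The expansion step can be sketched without the Augmentor dependency. The following is a minimal Pillow version mirroring the reported parameters (rotation up to 5°, a random crop keeping roughly 90% of the image area, zoomed back to full size); it operates on placeholder images, not the authors' actual pipeline or dataset:

```python
import random
from PIL import Image

def augment(img: Image.Image) -> Image.Image:
    """One augmented copy: slight rotation (up to +/-5 degrees), then a
    random crop covering ~90% of the image area, resized back to the
    original dimensions (analogous to Augmentor's max_rotation=5 and
    percentage_area=0.9 settings cited in the text)."""
    out = img.rotate(random.uniform(-5, 5))
    w, h = out.size
    scale = 0.9 ** 0.5                      # keep ~90% of the area
    cw, ch = int(w * scale), int(h * scale)
    left = random.randint(0, w - cw)
    top = random.randint(0, h - ch)
    return out.crop((left, top, left + cw, top + ch)).resize((w, h))

# Expand 103 originals to a balanced 500 by cycling through them.
originals = [Image.new("L", (224, 224)) for _ in range(103)]  # placeholders
augmented = list(originals)
i = 0
while len(augmented) < 500:
    augmented.append(augment(originals[i % len(originals)]))
    i += 1
print(len(augmented))  # 500
```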
Validation Dataset
For the validation dataset, 30 random CXR images were obtained from the US Department of Veterans Affairs (VA) picture archiving and communication system (PACS). This dataset included 10 CXR images from hospitalized patients with COVID-19, 10 CXR pneumonia images from patients without COVID-19, and 10 normal CXRs. COVID-19 diagnoses were confirmed by a positive test result on the Xpert Xpress SARS-CoV-2 polymerase chain reaction (PCR) platform.18
Microsoft CustomVision
Microsoft CustomVision is an automated image classification and object detection system that is part of Microsoft Azure Cognitive Services (azure.microsoft.com). It uses a pay-as-you-go model with fees that depend on computing needs and usage, and it offers a free trial for 2 initial projects. The service is online with an easy-to-follow graphical user interface; no coding skills are necessary.
We created a new classification project in CustomVision and chose a compact general domain for small size and easy export to TensorFlow.js model format. TensorFlow.js is a JavaScript library that enables dynamic download and execution of ML models. After the project was created, we proceeded to upload our image dataset. Each class was uploaded separately and tagged with the appropriate label (covid pneumonia, non-covid pneumonia, or normal lung). The system rejected 16 COVID-19 images as duplicates. The final CustomVision training dataset consisted of 484 images of COVID-19 pneumonia, 500 images of non-COVID-19 pneumonia, and 500 images of normal lungs. Once uploaded, CustomVision self-trains using the dataset upon initiating the program (Figure 1).
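The duplicate rejection described above can be approximated client-side before upload by hashing image bytes; this is a hypothetical pre-check for illustration, not part of the CustomVision service or the authors' workflow:

```python
import hashlib

def dedupe(images: list[bytes]) -> list[bytes]:
    """Drop byte-identical images, keeping first occurrences -- a simple
    client-side approximation of duplicate rejection during upload."""
    seen, unique = set(), []
    for data in images:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(data)
    return unique

# Toy byte strings standing in for image files.
batch = [b"img-a", b"img-b", b"img-a", b"img-c", b"img-b"]
print(len(dedupe(batch)))  # 3
```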
Website Creation
CustomVision was used to train the model. It can be used to execute the model continuously, or the model can be compacted and decoupled from CustomVision. In this case, the model was compacted and decoupled for use in an online application. An Angular online application was created with TensorFlow.js. Within a user’s web browser, the model is executed when an image of a CXR is submitted. Confidence values for each classification are returned. In this design, after the initial webpage and model is downloaded, the webpage no longer needs to access any server components and performs all operations in the browser. Although the solution works well on mobile phone browsers and in low bandwidth situations, the quality of predictions may depend on the browser and device used. At no time does an image get submitted to the cloud.
Results
Overall, our trained model showed 92.9% precision and recall. Precision and recall for each label were 98.9% and 94.8%, respectively, for COVID-19 pneumonia; 91.8% and 89%, respectively, for non-COVID-19 pneumonia; and 88.8% and 95%, respectively, for normal lung (Figure 2). Next, we validated the trained model by making individual predictions on the 30 images of the VA dataset. Our model performed well, with 100% sensitivity (recall), 95% specificity, 97% accuracy, 91% positive predictive value (precision), and 100% negative predictive value (Table).
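The summary statistics follow from a 2x2 confusion matrix treating the task as binary (COVID-19 pneumonia vs. not). The counts below are our reconstruction consistent with the reported percentages on the 30-image set (10 COVID-19, 20 non-COVID), not figures published by the authors:

```python
# Reconstructed confusion matrix for the 30-image VA validation set.
tp, fn = 10, 0   # all 10 COVID-19 images correctly flagged
tn, fp = 19, 1   # 19 of 20 non-COVID images correctly cleared

sensitivity = tp / (tp + fn)                 # recall: 10/10 = 1.00
specificity = tn / (tn + fp)                 # 19/20 = 0.95
ppv = tp / (tp + fp)                         # precision: 10/11 ~ 0.91
npv = tn / (tn + fn)                         # 19/19 = 1.00
accuracy = (tp + tn) / (tp + tn + fp + fn)   # 29/30 ~ 0.97

print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f} "
      f"ppv={ppv:.2f} npv={npv:.2f} accuracy={accuracy:.2f}")
```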
Discussion
We successfully demonstrated the potential of using AI algorithms in assessing CXRs for COVID-19. We first trained the CustomVision automated image classification and object detection system to differentiate cases of COVID-19 pneumonia from pneumonia of other etiologies as well as from normal lung CXRs. We then tested our model against known patients from the James A. Haley Veterans’ Hospital in Tampa, Florida. The program achieved 100% sensitivity (recall), 95% specificity, 97% accuracy, 91% positive predictive value (precision), and 100% negative predictive value in differentiating the 3 scenarios. Using the trained ML model, we proceeded to create a website that could augment COVID-19 CXR diagnosis.19 The website works on mobile as well as desktop platforms. A health care provider can take a CXR photo with a mobile phone or upload the image file. The ML algorithm then provides the probability of a COVID-19 pneumonia, non-COVID-19 pneumonia, or normal lung diagnosis (Figure 3).
Emerging diseases such as COVID-19 present numerous challenges to HCPs, governments, and businesses, as well as to individual members of society. As evidenced with COVID-19, the time from first recognition of an emerging pathogen to the development of methods for reliable diagnosis and treatment can be months, even with a concerted international effort. The gold standard for diagnosis of COVID-19 is by reverse transcriptase PCR (RT-PCR) technologies; however, early RT-PCR testing produced less than optimal results.20-22 Even after the development of reliable tests for detection, making test kits readily available to health care providers on an adequate scale presents an additional challenge as evident with COVID-19.
Use of X-ray vs Computed Tomography
The initial lack of available diagnostic RT-PCR for COVID-19 placed increased reliance on presumptive diagnosis via imaging in some situations.23 Most of the literature evaluating radiographs of patients with COVID-19 focuses on chest computed tomography (CT) findings, with initial results suggesting CT was more accurate than early RT-PCR methodologies.21,22,24 The Radiological Society of North America expert consensus statement on chest CT for COVID-19 states that CT findings can even precede positivity on RT-PCR in some cases.22 However, it currently does not recommend CT scanning as a screening tool. Furthermore, the actual sensitivity and specificity of CT interpretation by radiologists for COVID-19 are unknown.22
Characteristic CT findings include ground-glass opacities (GGOs) and consolidation most commonly in the lung periphery, though a diffuse distribution was found in a minority of patients.21,23,25-27 Lomoro and colleagues recently summarized the CT findings from several reports that described abnormalities as most often bilateral and peripheral, subpleural, and affecting the lower lobes.26 Not surprisingly, CT appears more sensitive at detecting changes with COVID-19 than does CXR, with reports that a minority of patients exhibited CT changes before changes were visible on CXR.23,26
We focused our study on the potential of AI in the examination of CXRs in patients with COVID-19, as there are several limitations to the routine use of CT scans with conditions such as COVID-19. Aside from the more considerable time required to obtain CTs, there are issues with contamination of CT suites, sometimes requiring a dedicated COVID-19 CT scanner.23,28 The time constraints of decontamination or limited utilization of CT suites can delay or disrupt services for patients with and without COVID-19. Because of these factors, CXR may be a better resource to minimize the risk of infection to other patients. Also, accurate assessment of abnormalities on CXR for COVID-19 may identify patients in whom the CXR was performed for other purposes.23 CXR is more readily available than CT, especially in more remote or underdeveloped areas.28 Finally, as with CT, CXR abnormalities are reported to have appeared before RT-PCR tests became positive for a minority of patients.23
CXR findings described in patients with COVID-19 are similar to those of CT and include GGOs, consolidation, and hazy increased opacities.23,25,26,28,29 Like CT, the majority of patients who received CXR demonstrated greater involvement in the lower zones and peripherally.23,25,26,28,29 Most patients showed bilateral involvement. However, while these findings are common in patients with COVID-19, they are not specific and can be seen in other conditions, such as other viral pneumonia, bacterial pneumonia, injury from drug toxicity, inhalation injury, connective tissue disease, and idiopathic conditions.
Application of AI for COVID-19
Applications of AI in interpreting radiographs of various types are numerous, and extensive literature has been written on the topic.30 Using deep learning algorithms, AI has multiple possible roles to augment traditional radiograph interpretation. These include the potential for screening, triaging, and increasing the speed to render diagnoses. It also can provide a rapid “second opinion” to the radiologist to support the final interpretation. In areas with critical shortages of radiologists, AI potentially can be used to render the definitive diagnosis. In COVID-19, imaging studies have been shown to correlate with disease severity and mortality, and AI could assist in monitoring the course of the disease as it progresses and potentially identify patients at greatest risk.27 Furthermore, early results from PCR have been considered suboptimal, and it is known that patients with COVID-19 can test negative initially even by reliable testing methodologies. As AI technology progresses, interpretation can detect and guide triage and treatment of patients with high suspicions of COVID-19 but negative initial PCR results, or in situations where test availability is limited or results are delayed. There are numerous potential benefits should a rapid diagnostic test as simple as a CXR be able to reliably impact containment and prevention of the spread of contagions such as COVID-19 early in its course.
Few studies have assessed the use of AI in the radiologic diagnosis of COVID-19, and most of those use CT scanning. Bai and colleagues demonstrated increased accuracy, sensitivity, and specificity in distinguishing chest CTs of patients with COVID-19 from those with other types of pneumonia.21,31 A separate study demonstrated the utility of using AI to differentiate COVID-19 from community-acquired pneumonia with CT.32 However, the effective utility of AI for CXR interpretation also has been demonstrated.14,33 Implementation of convolutional neural network layers has allowed for reliable differentiation of viral and bacterial pneumonia with CXR imaging.34 Evidence suggests that there is great potential in the application of AI in the interpretation of radiographs of all types.
Finally, we have developed a publicly available website based on our studies.19 The website is for research use only, as it is based on data from our preliminary investigation. Protected health information must be removed from images before they are uploaded to the website. The information on the website, including text, graphics, and images, is for research and may not be appropriate for all circumstances. The website does not provide medical, professional, or licensed advice and is not a substitute for consultation with an HCP. Medical advice should be sought from a qualified HCP for any questions, and the website should not be used for medical diagnosis or treatment.
Limitations
In our preliminary study, we have demonstrated the potential impact AI can have in multiple aspects of patient care for emerging pathogens such as COVID-19 using a test as readily available as a CXR. However, several limitations to this investigation should be mentioned. The study is retrospective in nature with limited sample size and with X-rays from patients with various stages of COVID-19 pneumonia. Also, cases of non-COVID-19 pneumonia are not stratified into different types or etiologies. We intend to demonstrate the potential of AI in differentiating COVID-19 pneumonia from non-COVID-19 pneumonia of any etiology, though future studies should address comparison of COVID-19 cases to more specific types of pneumonias, such as of bacterial or viral origin. Furthermore, the present study does not address any potential effects of additional radiographic findings from coexistent conditions, such as pulmonary edema as seen in congestive heart failure, pleural effusions (which can be seen with COVID-19 pneumonia, though rarely), interstitial lung disease, etc. Future studies are required to address these issues. Ultimately, prospective studies to assess AI-assisted radiographic interpretation in conditions such as COVID-19 are required to demonstrate the impact on diagnosis, treatment, outcome, and patient safety as these technologies are implemented.
Conclusions
We have used a readily available, commercial platform to demonstrate the potential of AI to assist in the successful diagnosis of COVID-19 pneumonia on CXR images. While this technology has numerous applications in radiology, we have focused on the potential impact on future world health crises such as COVID-19. The findings have implications for screening and triage, initial diagnosis, monitoring disease progression, and identifying patients at increased risk of morbidity and mortality. Based on the data, a website was created to demonstrate how such technologies could be shared and distributed to others to combat entities such as COVID-19 moving forward. Our study offers a small window into the potential for how AI will likely dramatically change the practice of medicine in the future.
1. World Health Organization. Coronavirus disease (COVID-19) pandemic. https://www.who.int/emergencies/diseases/novel-coronavirus-2019. Updated August 23, 2020. Accessed August 24, 2020.
2. World Health Organization. WHO Director-General’s opening remarks at the media briefing on COVID-19 - 11 March 2020. https://www.who.int/dg/speeches/detail/who-director-general-s-opening-remarks-at-the-media-briefing-on-covid-19---11-march-2020. Published March 11, 2020. Accessed August 24, 2020.
3. World Health Organization. Coronavirus disease (COVID-19): situation report--209. https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200816-covid-19-sitrep-209.pdf. Updated August 16, 2020. Accessed August 24, 2020.
4. Nicola M, Alsafi Z, Sohrabi C, et al. The socio-economic implications of the coronavirus pandemic (COVID-19): a review. Int J Surg. 2020;78:185-193. doi:10.1016/j.ijsu.2020.04.018
5. da Costa VG, Moreli ML, Saivish MV. The emergence of SARS, MERS and novel SARS-2 coronaviruses in the 21st century. Arch Virol. 2020;165(7):1517-1526. doi:10.1007/s00705-020-04628-0
6. Borkowski AA, Wilson CP, Borkowski SA, et al. Comparing artificial intelligence platforms for histopathologic cancer diagnosis. Fed Pract. 2019;36(10):456-463.
7. Borkowski AA, Wilson CP, Borkowski SA, Thomas LB, Deland LA, Mastorides SM. Apple machine learning algorithms successfully detect colon cancer but fail to predict KRAS mutation status. http://arxiv.org/abs/1812.04660. Updated January 15, 2019. Accessed August 24, 2020.
8. Borkowski AA, Wilson CP, Borkowski SA, Deland LA, Mastorides SM. Using Apple machine learning algorithms to detect and subclassify non-small cell lung cancer. http://arxiv.org/abs/1808.08230. Updated January 15, 2019. Accessed August 24, 2020.
9. Moor J. The Dartmouth College artificial intelligence conference: the next fifty years. AI Mag. 2006;27(4):87. doi:10.1609/AIMAG.V27I4.1911
10. Samuel AL. Some studies in machine learning using the game of checkers. IBM J Res Dev. 1959;3(3):210-229. doi:10.1147/rd.33.0210
11. Sarle WS. Neural networks and statistical models. https://people.orie.cornell.edu/davidr/or474/nn_sas.pdf. Published April 1994. Accessed August 24, 2020.
12. Schmidhuber J. Deep learning in neural networks: an overview. Neural Netw. 2015;61:85-117. doi:10.1016/j.neunet.2014.09.003
13. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436-444. doi:10.1038/nature14539
14. Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med. 2019;25(1):44- 56. doi:10.1038/s41591-018-0300-7
15. Cohen JP, Morrison P, Dao L. COVID-19 Image Data Collection. Published online March 25, 2020. Accessed May 13, 2020. http://arxiv.org/abs/2003.11597
16. Radiological Society of North America. RSNA pneumonia detection challenge. https://www.kaggle.com/c/rsna-pneumonia-detection-challenge. Accessed August 24, 2020.
17. Bloice MD, Roth PM, Holzinger A. Biomedical image augmentation using Augmentor. Bioinformatics. 2019;35(21):4522-4524. doi:10.1093/bioinformatics/btz259
18. Cepheid. Xpert Xpress SARS-CoV-2. https://www.cepheid.com/coronavirus. Accessed August 24, 2020.
19. Interknowlogy. COVID-19 detection in chest X-rays. https://interknowlogy-covid-19.azurewebsites.net. Accessed August 27, 2020.
20. Bernheim A, Mei X, Huang M, et al. Chest CT Findings in Coronavirus Disease-19 (COVID-19): Relationship to Duration of Infection. Radiology. 2020;295(3):200463. doi:10.1148/radiol.2020200463
21. Ai T, Yang Z, Hou H, et al. Correlation of chest CT and RT-PCR testing for coronavirus disease 2019 (COVID-19) in China: a report of 1014 cases. Radiology. 2020;296(2):E32-E40. doi:10.1148/radiol.2020200642
22. Simpson S, Kay FU, Abbara S, et al. Radiological Society of North America Expert Consensus Statement on Reporting Chest CT Findings Related to COVID-19. Endorsed by the Society of Thoracic Radiology, the American College of Radiology, and RSNA - Secondary Publication. J Thorac Imaging. 2020;35(4):219-227. doi:10.1097/RTI.0000000000000524
23. Wong HYF, Lam HYS, Fong AH, et al. Frequency and distribution of chest radiographic findings in patients positive for COVID-19. Radiology. 2020;296(2):E72-E78. doi:10.1148/radiol.2020201160
24. Fang Y, Zhang H, Xie J, et al. Sensitivity of chest CT for COVID-19: comparison to RT-PCR. Radiology. 2020;296(2):E115-E117. doi:10.1148/radiol.2020200432
25. Chen N, Zhou M, Dong X, et al. Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study. Lancet. 2020;395(10223):507-513. doi:10.1016/S0140-6736(20)30211-7
26. Lomoro P, Verde F, Zerboni F, et al. COVID-19 pneumonia manifestations at the admission on chest ultrasound, radiographs, and CT: single-center study and comprehensive radiologic literature review. Eur J Radiol Open. 2020;7:100231. doi:10.1016/j.ejro.2020.100231
27. Salehi S, Abedi A, Balakrishnan S, Gholamrezanezhad A. Coronavirus disease 2019 (COVID-19) imaging reporting and data system (COVID-RADS) and common lexicon: a proposal based on the imaging data of 37 studies. Eur Radiol. 2020;30(9):4930-4942. doi:10.1007/s00330-020-06863-0
28. Jacobi A, Chung M, Bernheim A, Eber C. Portable chest X-ray in coronavirus disease-19 (COVID-19): a pictorial review. Clin Imaging. 2020;64:35-42. doi:10.1016/j.clinimag.2020.04.001
29. Bhat R, Hamid A, Kunin JR, et al. Chest imaging in patients hospitalized with COVID-19 infection - a case series. Curr Probl Diagn Radiol. 2020;49(4):294-301. doi:10.1067/j.cpradiol.2020.04.001
30. Liu X, Faes L, Kale AU, et al. A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis. Lancet Digit Health. 2019;1(6):E271-E297. doi:10.1016/S2589-7500(19)30123-2
31. Bai HX, Wang R, Xiong Z, et al. Artificial intelligence augmentation of radiologist performance in distinguishing COVID-19 from pneumonia of other origin at chest CT. Radiology. 2020;296(3):E156-E165. doi:10.1148/radiol.2020201491
32. Li L, Qin L, Xu Z, et al. Using artificial intelligence to detect COVID-19 and community-acquired pneumonia based on pulmonary CT: evaluation of the diagnostic accuracy. Radiology. 2020;296(2):E65-E71. doi:10.1148/radiol.2020200905
33. Rajpurkar P, Joshi A, Pareek A, et al. CheXpedition: investigating generalization challenges for translation of chest x-ray algorithms to the clinical setting. http://arxiv.org/abs/2002.11379. Updated March 11, 2020. Accessed August 24, 2020.
34. Kermany DS, Goldbaum M, Cai W, et al. Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell. 2018;172(5):1122-1131.e9. doi:10.1016/j.cell.2018.02.010
Comparing Artificial Intelligence Platforms for Histopathologic Cancer Diagnosis
Artificial intelligence (AI) encompasses the field of computer science in which machines are trained to learn from experience. The term was coined at the 1956 Dartmouth College Summer Research Project on Artificial Intelligence.1 The field of AI is rapidly growing and has the potential to affect many aspects of our lives. The emerging importance of AI is demonstrated by a February 2019 executive order that launched the American AI Initiative, allocating resources and funding for AI development.2 The executive order stresses the potential impact of AI in the health care field, including its potential utility to diagnose disease. Federal agencies were directed to invest in AI research and development to promote rapid breakthroughs in AI technology that may impact multiple areas of society.
Machine learning (ML), a subset of AI, was defined in 1959 by Arthur Samuel and is achieved by employing mathematical models to compute sample data sets.3 Originating from statistical linear models, neural networks were conceived to accomplish these tasks.4 These pioneering achievements led to the recent development of deep neural networks, models built to recognize patterns and achieve complex computational tasks within a matter of minutes, often far exceeding human ability.5 ML can increase efficiency, offering decreased computation time with precision and recall that compare favorably with human decision making.6
ML has the potential for numerous applications in the health care field.7-9 One promising application is in the field of anatomic pathology. ML allows representative images to be used to train a computer to recognize patterns from labeled photographs. Based on a set of images selected to represent a specific tissue or disease process, the computer can be trained to evaluate and recognize new and unique images from patients and render a diagnosis.10 Prior to modern ML models, users would have to import many thousands of training images to produce algorithms that could recognize patterns with high accuracy. Modern ML algorithms allow for a model known as transfer learning, such that far fewer images are required for training.11-13
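The economy of transfer learning described above can be sketched in miniature: a pretrained network supplies fixed feature vectors, and only a small classifier "head" is trained on them, which is why far fewer labeled images are needed. The toy Python below is illustrative only; the feature values and class names are hypothetical, and a nearest-centroid rule stands in for the platforms' actual classifier heads.

```python
import math

# Hypothetical, precomputed feature vectors. In real transfer learning these
# would come from a pretrained network's penultimate layer; values are made up.
training_features = {
    "benign":         [[0.9, 0.1], [0.8, 0.2], [1.0, 0.0]],
    "adenocarcinoma": [[0.1, 0.9], [0.2, 0.8], [0.0, 1.0]],
}

def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

# "Training" the head is just computing one centroid per class --
# cheap enough that a handful of examples per class suffices.
centroids = {label: centroid(vecs) for label, vecs in training_features.items()}

def classify(feature_vector):
    """Assign the class whose centroid is nearest in Euclidean distance."""
    return min(centroids, key=lambda lbl: math.dist(centroids[lbl], feature_vector))
```

A new image whose (hypothetical) features are `[0.15, 0.85]` would be assigned to the adenocarcinoma class, since that centroid is nearest.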
Two novel ML platforms available for public use are offered through Google (Mountain View, CA) and Apple (Cupertino, CA).14,15 They each offer a user-friendly interface with minimal experience required in computer science. Google AutoML uses ML via cloud services to store and retrieve data with ease. No coding knowledge is required. The Apple Create ML Module provides computer-based ML, requiring only a few lines of code.
The Veterans Health Administration (VHA) is the largest single health care system in the US, and nearly 50 000 cancer cases are diagnosed at the VHA annually.16 Cancers of the lung and colon are among the most common sources of invasive cancer and are the 2 most common causes of cancer deaths in America.16 We have previously reported using Apple ML to detect non-small cell lung cancers (NSCLCs), including adenocarcinomas and squamous cell carcinomas (SCCs), as well as colon cancers, with high accuracy.17,18 In the present study, we expand on these findings by comparing Apple and Google ML platforms in a variety of common pathologic scenarios in veteran patients. Using limited training data, both programs are compared for precision and recall in differentiating conditions involving lung and colon pathology.
In the first 4 experiments, we evaluated the ability of the platforms to differentiate normal lung tissue from cancerous lung tissue, to distinguish lung adenocarcinoma from SCC, and to differentiate colon adenocarcinoma from normal colon tissue. Next, cases of colon adenocarcinoma were assessed to determine whether the presence or absence of a mutation in the KRAS proto-oncogene could be determined histologically using the AI platforms. KRAS mutations are found in a variety of cancers, including about 40% of colon adenocarcinomas.19 For colon cancers, the presence or absence of a KRAS mutation has important implications for patients, as it determines whether the tumor will respond to specific chemotherapy agents.20 The presence of a KRAS mutation is currently determined by complex molecular testing of tumor tissue.21 However, we assessed the potential of ML to determine whether the mutation is present by computerized morphologic analysis alone. Our last experiment examined the ability of the Apple and Google platforms to differentiate between adenocarcinomas of lung origin vs colon origin. This has potential utility in determining the site of origin of metastatic carcinoma.22
Methods
Fifty cases of lung SCC, 50 cases of lung adenocarcinoma, and 50 cases of colon adenocarcinoma were randomly retrieved from our molecular database. Twenty-five colon adenocarcinoma cases were positive for mutation in KRAS, while 25 cases were negative for mutation in KRAS. Seven hundred fifty total images of lung tissue (250 benign lung tissue, 250 lung adenocarcinomas, and 250 lung SCCs) and 500 total images of colon tissue (250 benign colon tissue and 250 colon adenocarcinoma) were obtained using a Leica Microscope MC190 HD Camera (Wetzlar, Germany) connected to an Olympus BX41 microscope (Center Valley, PA) and the Leica Acquire 9072 software for Apple computers. All the images were captured at a resolution of 1024 x 768 pixels using a 60x dry objective. Lung tissue images were captured and saved on a 2012 Apple MacBook Pro computer, and colon images were captured and saved on a 2011 Apple iMac computer. Both computers were running macOS v10.13.
Creating Image Classifier Models Using Apple Create ML
Apple Create ML is a suite of products that use various tools to create and train custom ML models on Apple computers.15 The suite contains many features, including image classification to train a ML model to classify images, natural language processing to classify natural language text, and tabular data to train models that deal with labeling information or estimating new quantities. We used Create ML Image Classification to create image classifier models for our project (Appendix A).
Creating ML Modules Using Google Cloud AutoML Vision Beta
Google Cloud AutoML is a suite of machine learning products, including AutoML Vision, AutoML Natural Language, and AutoML Translation.14 All Cloud AutoML machine learning products were in beta version at the time of experimentation. We used Cloud AutoML Vision beta to create ML modules for our project. Unlike Apple Create ML, which runs on a local Apple computer, Google Cloud AutoML runs online using a Google Cloud account. There are no minimum specification requirements for the local computer, since computation uses a cloud-based architecture (Appendix B).
Experiment 1
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to detect and subclassify NSCLC based on the histopathologic images. We created 3 classes of images (250 images each): benign lung tissue, lung adenocarcinoma, and lung SCC.
Experiment 2
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to differentiate between normal lung tissue and NSCLC histopathologic images with 50/50 mixture of lung adenocarcinoma and lung SCC. We created 2 classes of images (250 images each): benign lung tissue and lung NSCLC.
Experiment 3
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to differentiate between lung adenocarcinoma and lung SCC histopathologic images. We created 2 classes of images (250 images each): adenocarcinoma and SCC.
Experiment 4
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to detect colon cancer histopathologic images regardless of mutation in KRAS status. We created 2 classes of images (250 images each): benign colon tissue and colon adenocarcinoma.
Experiment 5
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to differentiate between colon adenocarcinoma with mutations in KRAS and colon adenocarcinoma without the mutation in KRAS histopathologic images. We created 2 classes of images (125 images each): colon adenocarcinoma cases with mutation in KRAS and colon adenocarcinoma cases without the mutation in KRAS.
Experiment 6
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to differentiate between lung adenocarcinoma and colon adenocarcinoma histopathologic images. We created 2 classes of images (250 images each): colon adenocarcinoma and lung adenocarcinoma.
Results
Twelve machine learning models were created in 6 experiments using the Apple Create ML and the Google AutoML (Table). To investigate recall and precision differences between the Apple and Google ML algorithms, we performed 2-tailed, paired t tests. No statistically significant differences were found (P = .52 for recall and P = .60 for precision).
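As a sketch of this statistical comparison, the paired t statistic over per-model recall can be computed as below. The recall values here are illustrative stand-ins, not the study's actual Table values, and the helper name is our own.

```python
import math

# Hypothetical recall values for the 6 models on each platform
# (the paper's actual per-model values appear in its Table).
apple_recall  = [0.97, 0.95, 0.94, 0.98, 0.70, 0.96]
google_recall = [0.96, 0.97, 0.91, 0.98, 0.71, 0.94]

def paired_t_statistic(a, b):
    """t = mean(d) / (sd(d) / sqrt(n)) for paired differences d = a - b."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / (n - 1)  # sample variance
    return mean_d / math.sqrt(var_d / n)

t = paired_t_statistic(apple_recall, google_recall)
# |t| is then compared against the two-tailed critical value with
# n - 1 = 5 degrees of freedom (about 2.571 at P = .05); a smaller |t|
# means no statistically significant difference between the platforms.
```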
Overall, each model performed well in distinguishing between normal and neoplastic tissue for both lung and colon cancers. In subclassifying NSCLC into adenocarcinoma and SCC, the models were shown to have high levels of precision and recall. The models also were successful in distinguishing between lung and colonic origin of adenocarcinoma (Figures 1-4). However, both systems had trouble discerning colon adenocarcinoma with mutations in KRAS from adenocarcinoma without mutations in KRAS.
Discussion
Image classifier models using ML algorithms hold promise to revolutionize the health care field. ML products, such as the modules offered by Apple and Google, are easy to use and have a simple graphic user interface that allows individuals to train models to perform humanlike tasks in real time. In our experiments, we compared multiple algorithms to determine their ability to differentiate and subclassify histopathologic images with high precision and recall using common scenarios in treating veteran patients.
Analysis of the results revealed high precision and recall values, illustrating the models’ ability to differentiate benign lung tissue from lung SCC and lung adenocarcinoma in ML model 1, benign lung from NSCLC in ML model 2, and benign colon from colonic adenocarcinoma in ML model 4. In ML models 3 and 6, both ML algorithms performed at a high level to differentiate lung SCC from lung adenocarcinoma and lung adenocarcinoma from colonic adenocarcinoma, respectively. Of note, ML model 5 had the lowest precision and recall values across both algorithms, demonstrating the models’ limited utility in predicting molecular profiles, such as the KRAS mutations tested here. This is not surprising, as pathologists currently require complex molecular tests to detect KRAS mutations reliably in colon cancer.
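For readers less familiar with the metrics discussed throughout, precision and recall derive directly from confusion counts; a minimal sketch with hypothetical counts:

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Precision = TP/(TP+FP): of images flagged as cancer, how many were.
    Recall = TP/(TP+FN): of actual cancer images, how many were flagged."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall

# Hypothetical counts for one class in a held-out test set of 50 images:
# 45 correctly flagged, 3 false alarms, 5 missed.
p, r = precision_recall(45, 3, 5)  # p = 0.9375, r = 0.90
```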
Both modules require minimal programming experience and are easy to use. In our comparison, we demonstrated critical distinguishing characteristics that differentiate the 2 products.
Apple Create ML Image Classifier is available for use on local Mac computers running Xcode version 10 and macOS 10.14 or later, with just 3 lines of code required to perform computations. Although this product is limited to Apple computers, it is free to use, and images are stored on the computer hard drive. Uniquely on the Apple platform, imported images can be augmented to enhance model training: they can be cropped, rotated, blurred, and flipped to improve the model’s ability to recognize test images and perform pattern recognition. This feature is not as readily available on the Google platform. Apple Create ML Image Classifier’s default training set consists of 75% of the total imported images, with 5% of the total images randomly used as a validation set; the remaining 20% comprise the testing set. The module’s computational analysis to train the model takes about 2 minutes on average. The score threshold is fixed at 50% and cannot be manipulated per image class as it can in Google AutoML Vision.
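The augmentation options described above (flips, rotations, and the like) can be illustrated on a toy 2-D image in plain Python. This is only a sketch of the idea, not Create ML's implementation, which exposes these operations as training parameters on real pixel data.

```python
# Toy 2-D "image" as a list of rows; real augmentation operates on pixel arrays.
image = [
    [1, 2, 3],
    [4, 5, 6],
]

def flip_horizontal(img):
    """Mirror each row left-to-right."""
    return [row[::-1] for row in img]

def rotate_90_clockwise(img):
    """Rotate the image 90 degrees clockwise (reverse rows, then transpose)."""
    return [list(row) for row in zip(*img[::-1])]

# Each augmented copy counts as an extra training example,
# stretching a small labeled set further.
augmented = [image, flip_horizontal(image), rotate_90_clockwise(image)]
```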
Google AutoML Vision is platform independent and can be accessed from many devices. It stores images on remote Google servers and requires computing fees after an initial $300 credit valid for 12 months. On AutoML Vision, a random 80% of the total images are used for the training set, 10% for the validation set, and 10% for the testing set. It is important to highlight the different default percentages used by the respective modules. With default computational power, training Google AutoML Vision takes longer on average than Apple Create ML, at about 8 minutes per module. However, it is possible to purchase more computational power for an additional fee and decrease module training time. The user receives e-mail alerts when the computation begins and completes; the computation time is calculated by subtracting the time of the initial e-mail from that of the final e-mail.
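The two platforms' default splits differ only in their fractions. A small sketch of such a shuffled train/validation/test split follows; the helper and file names are illustrative, not either vendor's code.

```python
import random

def split_dataset(items, train_frac, val_frac, seed=0):
    """Shuffle and split items into train/validation/test by the given fractions."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed for a reproducible sketch
    n = len(items)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

images = [f"img_{i:03}.jpg" for i in range(250)]   # one 250-image class
apple_split  = split_dataset(images, 0.75, 0.05)   # Create ML default: 75/5/20
google_split = split_dataset(images, 0.80, 0.10)   # AutoML Vision default: 80/10/10
```

With 250 images per class, the defaults yield 187/12/51 images on the Apple split and 200/25/25 on the Google split.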
Based on our calculations, we determined there was no significant difference between the 2 machine learning algorithms tested at the default settings with recall and precision values obtained. These findings demonstrate the promise of using a ML algorithm to assist in the performance of human tasks and behaviors, specifically the diagnosis of histopathologic images. These results have numerous potential uses in clinical medicine. ML algorithms have been successfully applied to diagnostic and prognostic endeavors in pathology,23-28 dermatology,29-31 ophthalmology,32 cardiology,33 and radiology.34-36
Pathologists often use additional tests, such as special staining of tissues or molecular tests, to assist with accurate classification of tumors. ML platforms offer the potential of an additional tool for pathologists to use along with human microscopic interpretation.37,38 In addition, the number of pathologists in the US is dramatically decreasing, and many other countries have marked physician shortages, especially in fields of specialized training such as pathology.39-42 These models could readily assist physicians in underserved countries and impact shortages of pathologists elsewhere by providing more specific diagnoses in an expedited manner.43
Finally, although we have explored the application of these platforms in common cancer scenarios, great potential exists to use similar techniques in the detection of other conditions. These include the potential for classification and risk assessment of precancerous lesions, infectious processes in tissue (eg, detection of tuberculosis or malaria),24,44 inflammatory conditions (eg, arthritis subtypes, gout),45 blood disorders (eg, abnormal blood cell morphology),46 and many others. The potential of these technologies to improve health care delivery to veteran patients seems to be limited only by the imagination of the user.47
Regarding the limited effectiveness in determining the presence or absence of KRAS mutations in colon adenocarcinoma, as noted, pathologists currently rely on complex molecular tests to detect the mutations at the DNA level.21 It is possible that more extensive training data sets may improve recall and precision in such cases, which warrants further study. Our experiments were limited to the stipulations of the free trial software agreements; no costs were expended to use the algorithms, though an Apple computer was required.
Conclusion
We have demonstrated the successful application of 2 readily available ML platforms in providing diagnostic guidance in differentiation between common cancer conditions in veteran patient populations. Although both platforms performed very well with no statistically significant differences in results, some distinctions are worth noting. Apple Create ML can be used on local computers but is limited to an Apple operating system. Google AutoML is not platform-specific but runs only via Google Cloud with associated computational fees. Using these readily available models, we demonstrated the vast potential of AI in diagnostic pathology. The application of AI to clinical medicine remains in the very early stages. The VA is uniquely poised to provide leadership as AI technologies will continue to dramatically change the future of health care, both in veteran and nonveteran patients nationwide.
Acknowledgments
The authors thank Paul Borkowski for his constructive criticism and proofreading of this manuscript. This material is the result of work supported with resources and the use of facilities at the James A. Haley Veterans’ Hospital.
1. Moor J. The Dartmouth College artificial intelligence conference: the next fifty years. AI Mag. 2006;27(4):87-91.
2. Trump D. Accelerating America’s leadership in artificial intelligence. https://www.whitehouse.gov/articles/accelerating-americas-leadership-in-artificial-intelligence. Published February 11, 2019. Accessed September 4, 2019.
3. Samuel AL. Some studies in machine learning using the game of checkers. IBM J Res Dev. 1959;3(3):210-229.
4. Sarle WS. Neural networks and statistical models. In: Proceedings of the Nineteenth Annual SAS Users Group International Conference. Cary, NC: SAS Institute; 1994:1538-1550. http://www.sascommunity.org/sugi/SUGI94/Sugi-94-255%20Sarle.pdf. Accessed September 16, 2019.
5. Schmidhuber J. Deep learning in neural networks: an overview. Neural Networks. 2015;61:85-117.
6. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436-444.
7. Jiang F, Jiang Y, Li H, et al. Artificial intelligence in healthcare: past, present and future. Stroke Vasc Neurol. 2017;2(4):230-243.
8. Erickson BJ, Korfiatis P, Akkus Z, Kline TL. Machine learning for medical imaging. Radiographics. 2017;37(2):505-515.
9. Deo RC. Machine learning in medicine. Circulation. 2015;132(20):1920-1930.
10. Janowczyk A, Madabhushi A. Deep learning for digital pathology image analysis: a comprehensive tutorial with selected use cases. J Pathol Inform. 2016;7(1):29.
11. Oquab M, Bottou L, Laptev I, Sivic J. Learning and transferring mid-level image representations using convolutional neural networks. Presented at: IEEE Conference on Computer Vision and Pattern Recognition, 2014. http://openaccess.thecvf.com/content_cvpr_2014/html/Oquab_Learning_and_Transferring_2014_CVPR_paper.html. Accessed September 4, 2019.
12. Shin HC, Roth HR, Gao M, et al. Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Trans Med Imaging. 2016;35(5):1285-1298.
13. Tajbakhsh N, Shin JY, Gurudu SR, et al. Convolutional neural networks for medical image analysis: full training or fine tuning? IEEE Trans Med Imaging. 2016;35(5):1299-1312.
14. Cloud AutoML. https://cloud.google.com/automl. Accessed September 4, 2019.
15. Create ML. https://developer.apple.com/documentation/createml. Accessed September 4, 2019.
16. Zullig LL, Sims KJ, McNeil R, et al. Cancer incidence among patients of the U.S. Veterans Affairs Health Care System: 2010 update. Mil Med. 2017;182(7):e1883-e1891.
17. Borkowski AA, Wilson CP, Borkowski SA, Deland LA, Mastorides SM. Using Apple machine learning algorithms to detect and subclassify non-small cell lung cancer. https://arxiv.org/ftp/arxiv/papers/1808/1808.08230.pdf. Accessed September 4, 2019.
18. Borkowski AA, Wilson CP, Borkowski SA, Thomas LB, Deland LA, Mastorides SM. Apple machine learning algorithms successfully detect colon cancer but fail to predict KRAS mutation status. http://arxiv.org/abs/1812.04660. Revised January 15, 2019. Accessed September 4, 2019.
19. Armaghany T, Wilson JD, Chu Q, Mills G. Genetic alterations in colorectal cancer. Gastrointest Cancer Res. 2012;5(1):19-27.
20. Herzig DO, Tsikitis VL. Molecular markers for colon diagnosis, prognosis and targeted therapy. J Surg Oncol. 2015;111(1):96-102.
21. Ma W, Brodie S, Agersborg S, Funari VA, Albitar M. Significant improvement in detecting BRAF, KRAS, and EGFR mutations using next-generation sequencing as compared with FDA-cleared kits. Mol Diagn Ther. 2017;21(5):571-579.
22. Greco FA. Molecular diagnosis of the tissue of origin in cancer of unknown primary site: useful in patient management. Curr Treat Options Oncol. 2013;14(4):634-642.
23. Bejnordi BE, Veta M, van Diest PJ, et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA. 2017;318(22):2199-2210.
24. Xiong Y, Ba X, Hou A, Zhang K, Chen L, Li T. Automatic detection of mycobacterium tuberculosis using artificial intelligence. J Thorac Dis. 2018;10(3):1936-1940.
25. Cruz-Roa A, Gilmore H, Basavanhally A, et al. Accurate and reproducible invasive breast cancer detection in whole-slide images: a deep learning approach for quantifying tumor extent. Sci Rep. 2017;7:46450.
26. Coudray N, Ocampo PS, Sakellaropoulos T, et al. Classification and mutation prediction from non–small cell lung cancer histopathology images using deep learning. Nat Med. 2018;24(10):1559-1567.
27. Ertosun MG, Rubin DL. Automated grading of gliomas using deep learning in digital pathology images: a modular approach with ensemble of convolutional neural networks. AMIA Annu Symp Proc. 2015;2015:1899-1908.
28. Wahab N, Khan A, Lee YS. Two-phase deep convolutional neural network for reducing class skewness in histopathological images based breast cancer detection. Comput Biol Med. 2017;85:86-97.
29. Esteva A, Kuprel B, Novoa RA, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542(7639):115-118.
30. Han SS, Park GH, Lim W, et al. Deep neural networks show an equivalent and often superior performance to dermatologists in onychomycosis diagnosis: automatic construction of onychomycosis datasets by region-based convolutional deep neural network. PLoS One. 2018;13(1):e0191493.
31. Fujisawa Y, Otomo Y, Ogata Y, et al. Deep-learning-based, computer-aided classifier developed with a small dataset of clinical images surpasses board-certified dermatologists in skin tumour diagnosis. Br J Dermatol. 2019;180(2):373-381.
32. Gulshan V, Peng L, Coram M, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA. 2016;316(22):2402-2410.
33. Weng SF, Reps J, Kai J, Garibaldi JM, Qureshi N. Can machine-learning improve cardiovascular risk prediction using routine clinical data? PLoS One. 2017;12(4):e0174944.
34. Cheng J-Z, Ni D, Chou Y-H, et al. Computer-aided diagnosis with deep learning architecture: applications to breast lesions in US images and pulmonary nodules in CT scans. Sci Rep. 2016;6(1):24454.
35. Wang X, Yang W, Weinreb J, et al. Searching for prostate cancer by fully automated magnetic resonance imaging classification: deep learning versus non-deep learning. Sci Rep. 2017;7(1):15415.
36. Lakhani P, Sundaram B. Deep learning at chest radiography: automated classification of pulmonary tuberculosis by using convolutional neural networks. Radiology. 2017;284(2):574-582.
37. Bardou D, Zhang K, Ahmad SM. Classification of breast cancer based on histology images using convolutional neural networks. IEEE Access. 2018;6(6):24680-24693.
38. Sheikhzadeh F, Ward RK, van Niekerk D, Guillaud M. Automatic labeling of molecular biomarkers of immunohistochemistry images using fully convolutional networks. PLoS One. 2018;13(1):e0190783.
39. Metter DM, Colgan TJ, Leung ST, Timmons CF, Park JY. Trends in the US and Canadian pathologist workforces from 2007 to 2017. JAMA Netw Open. 2019;2(5):e194337.
40. Benediktsson H, Whitelaw J, Roy I. Pathology services in developing countries: a challenge. Arch Pathol Lab Med. 2007;131(11):1636-1639.
41. Graves D. The impact of the pathology workforce crisis on acute health care. Aust Health Rev. 2007;31(suppl 1):S28-S30.
42. NHS pathology shortages cause cancer diagnosis delays. https://www.gmjournal.co.uk/nhs-pathology-shortages-are-causing-cancer-diagnosis-delays. Published September 18, 2018. Accessed September 4, 2019.
43. Abbott LM, Smith SD. Smartphone apps for skin cancer diagnosis: Implications for patients and practitioners. Australas J Dermatol. 2018;59(3):168-170.
44. Poostchi M, Silamut K, Maude RJ, Jaeger S, Thoma G. Image analysis and machine learning for detecting malaria. Transl Res. 2018;194:36-55.
45. Orange DE, Agius P, DiCarlo EF, et al. Identification of three rheumatoid arthritis disease subtypes by machine learning integration of synovial histologic features and RNA sequencing data. Arthritis Rheumatol. 2018;70(5):690-701.
46. Rodellar J, Alférez S, Acevedo A, Molina A, Merino A. Image processing and machine learning in the morphological analysis of blood cells. Int J Lab Hematol. 2018;40(suppl 1):46-53.
47. Litjens G, Kooi T, Bejnordi BE, et al. A survey on deep learning in medical image analysis. Med Image Anal. 2017;42:60-88.
Artificial intelligence (AI) encompasses the field of computer science in which machines are trained to learn from experience. The term was coined at the 1956 Dartmouth College Summer Research Project on Artificial Intelligence.1 The field of AI is rapidly growing and has the potential to affect many aspects of our lives. The emerging importance of AI is demonstrated by a February 2019 executive order that launched the American AI Initiative, allocating resources and funding for AI development.2 The executive order stresses the potential impact of AI in the health care field, including its potential utility to diagnose disease. Federal agencies were directed to invest in AI research and development to promote rapid breakthroughs in AI technology that may affect multiple areas of society.
Machine learning (ML), a subset of AI, was defined in 1959 by Arthur Samuel and is achieved by employing mathematic models to compute sample data sets.3 Originating from statistical linear models, neural networks were conceived to accomplish these tasks.4 These pioneering scientific achievements led to recent developments of deep neural networks. These models are developed to recognize patterns and achieve complex computational tasks within a matter of minutes, often far exceeding human ability.5 ML can increase efficiency with decreased computation time, high precision, and recall when compared with that of human decision making.6
ML has the potential for numerous applications in the health care field.7-9 One promising application is in the field of anatomic pathology. ML allows a computer to be trained to recognize patterns from labeled representative images. Based on a set of images selected to represent a specific tissue or disease process, the computer can be trained to evaluate and recognize new and unique images from patients and render a diagnosis.10 Before modern ML models, users had to import many thousands of training images to produce algorithms that could recognize patterns with high accuracy. Modern ML algorithms support a technique known as transfer learning, in which far fewer images are required for training.11-13
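The transfer-learning idea can be sketched in a few lines: a fixed, "pretrained" feature extractor maps images to feature vectors, and only a small classifier on top is fit to the new task. The extractor, the synthetic "tissue" data, and the nearest-centroid head below are stand-ins for illustration, not the actual models used by either platform.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained network: a fixed (frozen) feature extractor.
W_frozen = rng.normal(size=(64, 8))

def extract_features(images):
    # In real transfer learning this would be a deep network's penultimate
    # layer, reused as-is; only a small classifier on top is trained.
    return np.maximum(images @ W_frozen, 0.0)

# Tiny labeled set: two synthetic "tissue" classes (hypothetical data),
# differing only in mean pixel intensity.
benign = rng.normal(loc=0.2, scale=0.1, size=(20, 64))
tumor = rng.normal(loc=0.8, scale=0.1, size=(20, 64))

# "Training" the head: here, just one centroid per class in feature space.
c_benign = extract_features(benign).mean(axis=0)
c_tumor = extract_features(tumor).mean(axis=0)

def classify(image):
    f = extract_features(image[None, :])[0]
    d_tumor = np.linalg.norm(f - c_tumor)
    d_benign = np.linalg.norm(f - c_benign)
    return "tumor" if d_tumor < d_benign else "benign"

print(classify(np.full(64, 0.8)))  # a new, uniformly tumor-like patch
```

Because the frozen extractor does the heavy lifting, 20 examples per class suffice here, which is the practical appeal of transfer learning for small medical image sets.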
Two novel ML platforms available for public use are offered through Google (Mountain View, CA) and Apple (Cupertino, CA).14,15 They each offer a user-friendly interface with minimal experience required in computer science. Google AutoML uses ML via cloud services to store and retrieve data with ease. No coding knowledge is required. The Apple Create ML Module provides computer-based ML, requiring only a few lines of code.
The Veterans Health Administration (VHA) is the largest single health care system in the US, and nearly 50 000 cancer cases are diagnosed at the VHA annually.16 Cancers of the lung and colon are among the most common sources of invasive cancer and are the 2 most common causes of cancer deaths in America.16 We have previously reported the accurate detection of non-small cell lung cancers (NSCLCs), including adenocarcinomas and squamous cell carcinomas (SCCs), as well as colon cancers, using Apple ML.17,18 In the present study, we expand on these findings by comparing the Apple and Google ML platforms in a variety of common pathologic scenarios in veteran patients. Using limited training data, both programs are compared for precision and recall in differentiating conditions involving lung and colon pathology.
In the first 4 experiments, we evaluated the ability of the platforms to differentiate normal lung tissue from cancerous lung tissue, to distinguish lung adenocarcinoma from SCC, and to differentiate colon adenocarcinoma from normal colon tissue. Next, cases of colon adenocarcinoma were assessed to determine whether the presence or absence of a KRAS mutation could be determined histologically using the AI platforms. KRAS mutations are found in a variety of cancers, including about 40% of colon adenocarcinomas.19 For colon cancers, the presence or absence of a KRAS mutation has important implications for patients, as it determines whether the tumor will respond to specific chemotherapy agents.20 The presence of a KRAS mutation is currently determined by complex molecular testing of tumor tissue.21 However, we assessed the potential of ML to determine whether the mutation is present by computerized morphologic analysis alone. Our last experiment examined the ability of the Apple and Google platforms to differentiate between adenocarcinomas of lung origin vs colon origin. This has potential utility in determining the site of origin of metastatic carcinoma.22
Methods
Fifty cases of lung SCC, 50 cases of lung adenocarcinoma, and 50 cases of colon adenocarcinoma were randomly retrieved from our molecular database. Twenty-five colon adenocarcinoma cases were positive for mutation in KRAS, while 25 cases were negative for mutation in KRAS. Seven hundred fifty total images of lung tissue (250 benign lung tissue, 250 lung adenocarcinomas, and 250 lung SCCs) and 500 total images of colon tissue (250 benign colon tissue and 250 colon adenocarcinoma) were obtained using a Leica Microscope MC190 HD Camera (Wetzlar, Germany) connected to an Olympus BX41 microscope (Center Valley, PA) and the Leica Acquire 9072 software for Apple computers. All the images were captured at a resolution of 1024 x 768 pixels using a 60x dry objective. Lung tissue images were captured and saved on a 2012 Apple MacBook Pro computer, and colon images were captured and saved on a 2011 Apple iMac computer. Both computers were running macOS v10.13.
Creating Image Classifier Models Using Apple Create ML
Apple Create ML is a suite of products that use various tools to create and train custom ML models on Apple computers.15 The suite contains many features, including image classification to train an ML model to classify images, natural language processing to classify natural language text, and tabular data to train models that label information or estimate new quantities. We used Create ML Image Classification to create image classifier models for our project (Appendix A).
Creating ML Modules Using Google Cloud AutoML Vision Beta
Google Cloud AutoML is a suite of machine learning products, including AutoML Vision, AutoML Natural Language, and AutoML Translation.14 All Cloud AutoML machine learning products were in beta version at the time of experimentation. We used Cloud AutoML Vision beta to create ML modules for our project. Unlike Apple Create ML, which runs on a local Apple computer, Google Cloud AutoML runs online through a Google Cloud account. There are no minimum specification requirements for the local computer, since the service uses a cloud-based architecture (Appendix B).
Experiment 1
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to detect and subclassify NSCLC based on the histopathologic images. We created 3 classes of images (250 images each): benign lung tissue, lung adenocarcinoma, and lung SCC.
Experiment 2
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to differentiate between normal lung tissue and NSCLC histopathologic images with a 50/50 mixture of lung adenocarcinoma and lung SCC. We created 2 classes of images (250 images each): benign lung tissue and lung NSCLC.
Experiment 3
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to differentiate between lung adenocarcinoma and lung SCC histopathologic images. We created 2 classes of images (250 images each): adenocarcinoma and SCC.
Experiment 4
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to detect colon cancer histopathologic images regardless of KRAS mutation status. We created 2 classes of images (250 images each): benign colon tissue and colon adenocarcinoma.
Experiment 5
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to differentiate between histopathologic images of colon adenocarcinoma with and without KRAS mutations. We created 2 classes of images (125 images each): colon adenocarcinoma with a KRAS mutation and colon adenocarcinoma without a KRAS mutation.
Experiment 6
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to differentiate between lung adenocarcinoma and colon adenocarcinoma histopathologic images. We created 2 classes of images (250 images each): colon adenocarcinoma and lung adenocarcinoma.
Results
Twelve machine learning models were created in 6 experiments using Apple Create ML and Google AutoML (Table). To investigate recall and precision differences between the Apple and Google ML algorithms, we performed 2-tailed, paired t tests. No statistically significant differences were found (P = .52 for recall and P = .60 for precision).
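The statistical comparison above can be sketched with a paired, 2-tailed t test: one paired observation per model, as in the 6-experiment design. The recall values below are illustrative placeholders, not the study's data, and `scipy` is assumed to be available.

```python
from scipy import stats

# Hypothetical per-model recall for the two platforms (NOT the study's
# values): one paired observation per ML model across 6 experiments.
apple_recall = [0.97, 0.95, 0.99, 0.90, 0.62, 0.96]
google_recall = [0.96, 0.96, 0.98, 0.91, 0.63, 0.95]

# Paired test: each model's two recall values come from the same images,
# so the test is run on the per-model differences.
t_stat, p_value = stats.ttest_rel(apple_recall, google_recall)
print(round(p_value, 2))
```

A paired test is the right choice here because the two platforms were trained and tested on the same image sets, so the per-experiment differences, not the raw values, carry the signal.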
Overall, each model performed well in distinguishing between normal and neoplastic tissue for both lung and colon cancers. In subclassifying NSCLC into adenocarcinoma and SCC, the models were shown to have high levels of precision and recall. The models also were successful in distinguishing between lung and colonic origin of adenocarcinoma (Figures 1-4). However, both systems had trouble discerning colon adenocarcinoma with mutations in KRAS from adenocarcinoma without mutations in KRAS.
Discussion
Image classifier models using ML algorithms hold a promising future to revolutionize the health care field. ML products, such as those modules offered by Apple and Google, are easy to use and have a simple graphic user interface to allow individuals to train models to perform humanlike tasks in real time. In our experiments, we compared multiple algorithms to determine their ability to differentiate and subclassify histopathologic images with high precision and recall using common scenarios in treating veteran patients.
Analysis of the results revealed high precision and recall values, illustrating the models’ ability to distinguish benign lung tissue from lung SCC and lung adenocarcinoma in ML model 1, benign lung from NSCLC in ML model 2, and benign colon from colonic adenocarcinoma in ML model 4. In ML models 3 and 6, both ML algorithms performed at a high level in differentiating lung SCC from lung adenocarcinoma and lung adenocarcinoma from colonic adenocarcinoma, respectively. Of note, ML model 5 had the lowest precision and recall values across both algorithms, demonstrating the models’ limited utility in predicting molecular profiles, such as the KRAS mutations tested here. This is not surprising, as pathologists currently require complex molecular tests to detect KRAS mutations reliably in colon cancer.
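Precision and recall, the two metrics reported for every model, come directly from confusion counts. The counts below are made up for illustration:

```python
# Hypothetical confusion counts for one class (e.g., "adenocarcinoma"):
tp, fp, fn = 45, 5, 5  # true positives, false positives, false negatives

precision = tp / (tp + fp)  # of images called adenocarcinoma, how many were
recall = tp / (tp + fn)     # of true adenocarcinomas, how many were found

print(precision, recall)  # 0.9 0.9
```

High precision with low recall would mean the model is cautious but misses tumors; high recall with low precision would mean it over-calls tumors, which is why both metrics are reported per class.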
Both modules require minimal programming experience and are easy to use. In our comparison, we demonstrated critical distinguishing characteristics that differentiate the 2 products.
Apple Create ML Image Classifier is available for use on local Mac computers running Xcode version 10 and macOS 10.14 or later, with just 3 lines of code required to perform computations. Although this product is limited to Apple computers, it is free to use, and images are stored on the computer hard drive. Uniquely on the Apple platform, images can be augmented to alter their appearance and enhance model training. For example, imported images can be cropped, rotated, blurred, and flipped to improve the model’s ability to recognize test images and perform pattern recognition. This feature is not as readily available on the Google platform. The Create ML Image Classifier’s default training set consists of 75% of the total imported images, with 5% of the total images randomly used as a validation set. The remaining 20% of images comprise the testing set. The module’s computational analysis to train the model takes about 2 minutes on average. The score threshold is set at 50% and, unlike in Google AutoML Vision, cannot be adjusted for each image class.
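The augmentations described above (flips, rotations, crops, and blurs) can be imitated for any array-based image; the numpy operations below stand in for the platform's built-in options, using a tiny made-up patch rather than real histology data.

```python
import numpy as np

image = np.arange(16).reshape(4, 4)  # stand-in for a 4x4 grayscale patch

augmented = [
    np.fliplr(image),   # horizontal flip
    np.flipud(image),   # vertical flip
    np.rot90(image),    # 90-degree rotation
    image[1:3, 1:3],    # center crop
]

# Each variant is a new training example derived from the same source
# image, which is how augmentation stretches a limited labeled set.
print(len(augmented))  # 4
```

For pathology slides these transforms are safe because a tumor remains a tumor under flipping or rotation; augmentations that could change the label (heavy color shifts, for example) would need more care.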
Google AutoML Vision is open and can be accessed from many devices. It stores images on remote Google servers but requires computing fees after a $300 credit for 12 months. On AutoML Vision, a random 80% of the total images are used for the training set, 10% for the validation set, and 10% for the testing set. It is important to highlight the different default percentages used by the respective modules. The time to train Google AutoML Vision with default computational power is longer on average than Apple Create ML, at about 8 minutes per machine learning module. However, it is possible to purchase more computational power to decrease module training time. The user receives e-mail alerts when the computation begins and when it is completed; the computation time is calculated by subtracting the time of the initial e-mail from that of the final e-mail.
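The two platforms' differing default splits can be made concrete with a deterministic shuffle. The 500-image figure matches the colon data set described in Methods, but the splitting function itself is an illustrative sketch, not either platform's implementation.

```python
import random

def split(items, train_frac, val_frac, seed=0):
    # Shuffle deterministically, then carve out train/validation/test sets.
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * train_frac)
    n_val = int(len(items) * val_frac)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

images = [f"img_{i:03}.png" for i in range(500)]  # hypothetical filenames

apple = split(images, 0.75, 0.05)   # Create ML default: 75 / 5 / 20
google = split(images, 0.80, 0.10)  # AutoML Vision default: 80 / 10 / 10

print([len(s) for s in apple], [len(s) for s in google])
# [375, 25, 100] [400, 50, 50]
```

The practical consequence of the differing defaults: on the same 500 images, Create ML tests on twice as many held-out images as AutoML Vision, which is worth remembering when comparing the platforms' reported precision and recall.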
Our calculations showed no significant difference between the 2 machine learning algorithms tested at the default settings, based on the recall and precision values obtained. These findings demonstrate the promise of using ML algorithms to assist in the performance of human tasks and behaviors, specifically the diagnosis of histopathologic images. These results have numerous potential uses in clinical medicine. ML algorithms have been successfully applied to diagnostic and prognostic endeavors in pathology,23-28 dermatology,29-31 ophthalmology,32 cardiology,33 and radiology.34-36
Pathologists often use additional tests, such as special staining of tissues or molecular tests, to assist with accurate classification of tumors. ML platforms offer the potential of an additional tool for pathologists to use along with human microscopic interpretation.37,38 In addition, the number of pathologists in the US is dramatically decreasing, and many other countries have marked physician shortages, especially in fields of specialized training such as pathology.39-42 These models could readily assist physicians in underserved countries and impact shortages of pathologists elsewhere by providing more specific diagnoses in an expedited manner.43
Finally, although we have explored the application of these platforms in common cancer scenarios, great potential exists to use similar techniques in the detection of other conditions. These include the potential for classification and risk assessment of precancerous lesions, infectious processes in tissue (eg, detection of tuberculosis or malaria),24,44 inflammatory conditions (eg, arthritis subtypes, gout),45 blood disorders (eg, abnormal blood cell morphology),46 and many others. The potential of these technologies to improve health care delivery to veteran patients seems to be limited only by the imagination of the user.47
Regarding the limited effectiveness in determining the presence or absence of KRAS mutations in colon adenocarcinoma, pathologists currently rely on complex molecular tests to detect these mutations at the DNA level.21 It is possible that more extensive training data sets would improve recall and precision in cases such as these, which warrants further study. Our experiments were limited by the stipulations of the free trial software agreements; no costs were expended to use the algorithms, though an Apple computer was required.
Conclusion
We have demonstrated the successful application of 2 readily available ML platforms in providing diagnostic guidance in differentiation between common cancer conditions in veteran patient populations. Although both platforms performed very well with no statistically significant differences in results, some distinctions are worth noting. Apple Create ML can be used on local computers but is limited to an Apple operating system. Google AutoML is not platform-specific but runs only via Google Cloud with associated computational fees. Using these readily available models, we demonstrated the vast potential of AI in diagnostic pathology. The application of AI to clinical medicine remains in the very early stages. The VA is uniquely poised to provide leadership as AI technologies will continue to dramatically change the future of health care, both in veteran and nonveteran patients nationwide.
Acknowledgments
The authors thank Paul Borkowski for his constructive criticism and proofreading of this manuscript. This material is the result of work supported with resources and the use of facilities at the James A. Haley Veterans’ Hospital.
Artificial intelligence (AI), first described in 1956, encompasses the field of computer science in which machines are trained to learn from experience. The term was popularized by the 1956 Dartmouth College Summer Research Project on Artificial Intelligence.1 The field of AI is rapidly growing and has the potential to affect many aspects of our lives. The emerging importance of AI is demonstrated by a February 2019 executive order that launched the American AI Initiative, allocating resources and funding for AI development.2 The executive order stresses the potential impact of AI in the health care field, including its potential utility to diagnose disease. Federal agencies were directed to invest in AI research and development to promote rapid breakthroughs in AI technology that may impact multiple areas of society.
Machine learning (ML), a subset of AI, was defined in 1959 by Arthur Samuel and is achieved by employing mathematic models to compute sample data sets.3 Originating from statistical linear models, neural networks were conceived to accomplish these tasks.4 These pioneering scientific achievements led to recent developments of deep neural networks. These models are developed to recognize patterns and achieve complex computational tasks within a matter of minutes, often far exceeding human ability.5 ML can increase efficiency with decreased computation time, high precision, and recall when compared with that of human decision making.6
ML has the potential for numerous applications in the health care field.7-9 One promising application is in the field of anatomic pathology. ML allows representative images to be used to train a computer to recognize patterns from labeled photographs. Based on a set of images selected to represent a specific tissue or disease process, the computer can be trained to evaluate and recognize new and unique images from patients and render a diagnosis.10 Prior to modern ML models, users would have to import many thousands of training images to produce algorithms that could recognize patterns with high accuracy. Modern ML algorithms allow for a model known as transfer learning, such that far fewer images are required for training.11-13
Two novel ML platforms available for public use are offered through Google (Mountain View, CA) and Apple (Cupertino, CA).14,15 They each offer a user-friendly interface with minimal experience required in computer science. Google AutoML uses ML via cloud services to store and retrieve data with ease. No coding knowledge is required. The Apple Create ML Module provides computer-based ML, requiring only a few lines of code.
The Veterans Health Administration (VHA) is the largest single health care system in the US, and nearly 50 000 cancer cases are diagnosed at the VHA annually.16 Cancers of the lung and colon are among the most common sources of invasive cancer and are the 2 most common causes of cancer deaths in America.16 We have previously reported using Apple ML in detecting non-small cell lung cancers (NSCLCs), including adenocarcinomas and squamous cell carcinomas (SCCs); and colon cancers with accuracy.17,18 In the present study, we expand on these findings by comparing Apple and Google ML platforms in a variety of common pathologic scenarios in veteran patients. Using limited training data, both programs are compared for precision and recall in differentiating conditions involving lung and colon pathology.
In the first 4 experiments, we evaluated the ability of the platforms to differentiate normal lung tissue from cancerous lung tissue, to distinguish lung adenocarcinoma from SCC, and to differentiate colon adenocarcinoma from normal colon tissue. Next, cases of colon adenocarcinoma were assessed to determine whether the presence or absence of the KRAS proto-oncogene could be determined histologically using the AI platforms. KRAS is found in a variety of cancers, including about 40% of colon adenocarcinomas.19 For colon cancers, the presence or absence of the mutation in KRAS has important implications for patients as it determines whether the tumor will respond to specific chemotherapy agents.20 The presence of the KRAS gene is currently determined by complex molecular testing of tumor tissue.21 However, we assessed the potential of ML to determine whether the mutation is present by computerized morphologic analysis alone. Our last experiment examined the ability of the Apple and Google platforms to differentiate between adenocarcinomas of lung origin vs colon origin. This has potential utility in determining the site of origin of metastatic carcinoma.22
Methods
Fifty cases of lung SCC, 50 cases of lung adenocarcinoma, and 50 cases of colon adenocarcinoma were randomly retrieved from our molecular database. Twenty-five colon adenocarcinoma cases were positive for mutation in KRAS, while 25 cases were negative for mutation in KRAS. Seven hundred fifty total images of lung tissue (250 benign lung tissue, 250 lung adenocarcinomas, and 250 lung SCCs) and 500 total images of colon tissue (250 benign colon tissue and 250 colon adenocarcinoma) were obtained using a Leica Microscope MC190 HD Camera (Wetzlar, Germany) connected to an Olympus BX41 microscope (Center Valley, PA) and the Leica Acquire 9072 software for Apple computers. All the images were captured at a resolution of 1024 x 768 pixels using a 60x dry objective. Lung tissue images were captured and saved on a 2012 Apple MacBook Pro computer, and colon images were captured and saved on a 2011 Apple iMac computer. Both computers were running macOS v10.13.
Creating Image Classifier Models Using Apple Create ML
Apple Create ML is a suite of products that use various tools to create and train custom ML models on Apple computers.15 The suite contains many features, including image classification to train a ML model to classify images, natural language processing to classify natural language text, and tabular data to train models that deal with labeling information or estimating new quantities. We used Create ML Image Classification to create image classifier models for our project (Appendix A).
Creating ML Modules Using Google Cloud AutoML Vision Beta
Google Cloud AutoML is a suite of machine learning products, including AutoML Vision, AutoML Natural Language and AutoML Translation.14 All Cloud AutoML machine learning products were in beta version at the time of experimentation. We used Cloud AutoML Vision beta to create ML modules for our project. Unlike Apple Create ML, which is run on a local Apple computer, the Google Cloud AutoML is run online using a Google Cloud account. There are no minimum specifications requirements for the local computer since it is using the cloud-based architecture (Appendix B).
Experiment 1
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to detect and subclassify NSCLC based on the histopathologic images. We created 3 classes of images (250 images each): benign lung tissue, lung adenocarcinoma, and lung SCC.
Experiment 2
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to differentiate between normal lung tissue and NSCLC histopathologic images with 50/50 mixture of lung adenocarcinoma and lung SCC. We created 2 classes of images (250 images each): benign lung tissue and lung NSCLC.
Experiment 3
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to differentiate between lung adenocarcinoma and lung SCC histopathologic images. We created 2 classes of images (250 images each): adenocarcinoma and SCC.
Experiment 4
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to detect colon cancer histopathologic images regardless of mutation in KRAS status. We created 2 classes of images (250 images each): benign colon tissue and colon adenocarcinoma.
Experiment 5
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to differentiate between colon adenocarcinoma with mutations in KRAS and colon adenocarcinoma without the mutation in KRAS histopathologic images. We created 2 classes of images (125 images each): colon adenocarcinoma cases with mutation in KRAS and colon adenocarcinoma cases without the mutation in KRAS.
Experiment 6
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to differentiate between lung adenocarcinoma and colon adenocarcinoma histopathologic images. We created 2 classes of images (250 images each): colon adenocarcinoma lung adenocarcinoma.
Results
Twelve machine learning models were created in 6 experiments using the Apple Create ML and the Google AutoML (Table). To investigate recall and precision differences between the Apple and the Google ML algorithms, we performed 2-tailed distribution, paired t tests. No statistically significant differences were found (P = .52 for recall and .60 for precision).
Overall, each model performed well in distinguishing between normal and neoplastic tissue for both lung and colon cancers. In subclassifying NSCLC into adenocarcinoma and SCC, the models were shown to have high levels of precision and recall. The models also were successful in distinguishing between lung and colonic origin of adenocarcinoma (Figures 1-4). However, both systems had trouble discerning colon adenocarcinoma with mutations in KRAS from adenocarcinoma without mutations in KRAS.
Discussion
Image classifier models using ML algorithms hold a promising future to revolutionize the health care field. ML products, such as those modules offered by Apple and Google, are easy to use and have a simple graphic user interface to allow individuals to train models to perform humanlike tasks in real time. In our experiments, we compared multiple algorithms to determine their ability to differentiate and subclassify histopathologic images with high precision and recall using common scenarios in treating veteran patients.
Analysis of the results revealed high precision and recall values illustrating the models’ ability to differentiate and detect benign lung tissue from lung SCC and lung adenocarcinoma in ML model 1, benign lung from NSCLC carcinoma in ML model 2, and benign colon from colonic adenocarcinoma in ML model 4. In ML model 3 and 6, both ML algorithms performed at a high level to differentiate lung SCC from lung adenocarcinoma and lung adenocarcinoma from colonic adenocarcinoma, respectively. Of note, ML model 5 had the lowest precision and recall values across both algorithms demonstrating the models’ limited utility in predicting molecular profiles, such as mutations in KRAS as tested here. This is not surprising as pathologists currently require complex molecular tests to detect mutations in KRAS reliably in colon cancer.
Both modules require minimal programming experience and are easy to use. In our comparison, we demonstrated critical distinguishing characteristics that differentiate the 2 products.
Apple Create ML image classifier is available for use on local Mac computers running Xcode version 10 and macOS 10.14 or later, with just 3 lines of code required to perform computations. Although this product is limited to Apple computers, it is free to use, and images are stored on the local hard drive. A distinguishing feature of the Apple platform is that imported images can be augmented to enhance model training: they can be cropped, rotated, blurred, and flipped to improve the model’s ability to recognize test images and perform pattern recognition. This feature is not as readily available on the Google platform. By default, the Apple Create ML image classifier uses 75% of the total imported images as the training set and a randomly selected 5% as the validation set; the remaining 20% comprise the testing set. Training the model takes about 2 minutes on average. The score threshold is fixed at 50% and, unlike in Google AutoML Vision, cannot be adjusted for each image class.
Google AutoML Vision is platform independent and can be accessed from many devices. It stores images on remote Google servers but requires computing fees after a $300 credit valid for 12 months. AutoML Vision uses a random 80% of the total images as the training set, 10% as the validation set, and 10% as the testing set; it is important to highlight these different default percentages on the respective modules. With default computational power, training Google AutoML Vision takes about 8 minutes, longer on average than Apple Create ML; however, more computational power can be purchased for an additional fee to decrease training time. The user receives e-mail alerts when computation begins and completes, and we calculated computation time as the difference between the timestamps of these 2 e-mails.
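The 2 default splits described above amount to a random partition of the image set; the sketch below uses hypothetical file names and shows the Create ML-style 75/5/20 division (an AutoML Vision-style split would use fractions 0.80 and 0.10).

```python
import random

def split(items, train_frac, val_frac, seed=0):
    """Randomly partition items into train / validation / test subsets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # deterministic shuffle for the sketch
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

images = [f"slide_{i:03d}.jpg" for i in range(100)]  # hypothetical file names
train, val, test = split(images, 0.75, 0.05)  # Create ML-style 75/5/20 default
print(len(train), len(val), len(test))
```

Because the split is random, the class balance of the validation and testing sets can vary from run to run, which is one reason the 2 modules' different default percentages matter when comparing results.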
Based on our calculations, we determined there was no significant difference in the recall and precision values obtained between the 2 machine learning algorithms tested at default settings. These findings demonstrate the promise of using ML algorithms to assist in the performance of human tasks and behaviors, specifically the diagnosis of histopathologic images. These results have numerous potential uses in clinical medicine. ML algorithms have been successfully applied to diagnostic and prognostic endeavors in pathology,23-28 dermatology,29-31 ophthalmology,32 cardiology,33 and radiology.34-36
Pathologists often use additional tests, such as special staining of tissues or molecular tests, to assist with accurate classification of tumors. ML platforms offer the potential of an additional tool for pathologists to use along with human microscopic interpretation.37,38 In addition, the number of pathologists in the US is dramatically decreasing, and many other countries have marked physician shortages, especially in fields of specialized training such as pathology.39-42 These models could readily assist physicians in underserved countries and impact shortages of pathologists elsewhere by providing more specific diagnoses in an expedited manner.43
Finally, although we have explored the application of these platforms in common cancer scenarios, great potential exists to use similar techniques in the detection of other conditions. These include the potential for classification and risk assessment of precancerous lesions, infectious processes in tissue (eg, detection of tuberculosis or malaria),24,44 inflammatory conditions (eg, arthritis subtypes, gout),45 blood disorders (eg, abnormal blood cell morphology),46 and many others. The potential of these technologies to improve health care delivery to veteran patients seems to be limited only by the imagination of the user.47
Regarding the limited effectiveness in determining the presence or absence of mutations in KRAS in colon adenocarcinoma, pathologists currently rely on complex molecular tests to detect these mutations at the DNA level.21 It is possible that more extensive training data sets may improve recall and precision in cases such as these; this warrants further study. Our experiments were limited by the stipulations of the free-trial software agreements; no costs were expended to use the algorithms, although an Apple computer was required.
Conclusion
We have demonstrated the successful application of 2 readily available ML platforms in providing diagnostic guidance in differentiation between common cancer conditions in veteran patient populations. Although both platforms performed very well with no statistically significant differences in results, some distinctions are worth noting. Apple Create ML can be used on local computers but is limited to an Apple operating system. Google AutoML is not platform-specific but runs only via Google Cloud with associated computational fees. Using these readily available models, we demonstrated the vast potential of AI in diagnostic pathology. The application of AI to clinical medicine remains in the very early stages. The VA is uniquely poised to provide leadership as AI technologies will continue to dramatically change the future of health care, both in veteran and nonveteran patients nationwide.
Acknowledgments
The authors thank Paul Borkowski for his constructive criticism and proofreading of this manuscript. This material is the result of work supported with resources and the use of facilities at the James A. Haley Veterans’ Hospital.
1. Moor J. The Dartmouth College artificial intelligence conference: the next fifty years. AI Mag. 2006;27(4):87-91.
2. Trump D. Accelerating America’s leadership in artificial intelligence. https://www.whitehouse.gov/articles/accelerating-americas-leadership-in-artificial-intelligence. Published February 11, 2019. Accessed September 4, 2019.
3. Samuel AL. Some studies in machine learning using the game of checkers. IBM J Res Dev. 1959;3(3):210-229.
4. SAS Users Group International. Neural networks and statistical models. In: Sarle WS. Proceedings of the Nineteenth Annual SAS Users Group International Conference. SAS Institute: Cary, North Carolina; 1994:1538-1550. http://www.sascommunity.org/sugi/SUGI94/Sugi-94-255%20Sarle.pdf. Accessed September 16, 2019.
5. Schmidhuber J. Deep learning in neural networks: an overview. Neural Networks. 2015;61:85-117.
6. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436-444.
7. Jiang F, Jiang Y, Li H, et al. Artificial intelligence in healthcare: past, present and future. Stroke Vasc Neurol. 2017;2(4):230-243.
8. Erickson BJ, Korfiatis P, Akkus Z, Kline TL. Machine learning for medical imaging. Radiographics. 2017;37(2):505-515.
9. Deo RC. Machine learning in medicine. Circulation. 2015;132(20):1920-1930.
10. Janowczyk A, Madabhushi A. Deep learning for digital pathology image analysis: a comprehensive tutorial with selected use cases. J Pathol Inform. 2016;7(1):29.
11. Oquab M, Bottou L, Laptev I, Sivic J. Learning and transferring mid-level image representations using convolutional neural networks. Presented at: IEEE Conference on Computer Vision and Pattern Recognition, 2014. http://openaccess.thecvf.com/content_cvpr_2014/html/Oquab_Learning_and_Transferring_2014_CVPR_paper.html. Accessed September 4, 2019.
12. Shin HC, Roth HR, Gao M, et al. Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Trans Med Imaging. 2016;35(5):1285-1298.
13. Tajbakhsh N, Shin JY, Gurudu SR, et al. Convolutional neural networks for medical image analysis: full training or fine tuning? IEEE Trans Med Imaging. 2016;35(5):1299-1312.
14. Cloud AutoML. https://cloud.google.com/automl. Accessed September 4, 2019.
15. Create ML. https://developer.apple.com/documentation/createml. Accessed September 4, 2019.
16. Zullig LL, Sims KJ, McNeil R, et al. Cancer incidence among patients of the U.S. Veterans Affairs Health Care System: 2010 update. Mil Med. 2017;182(7):e1883-e1891.
17. Borkowski AA, Wilson CP, Borkowski SA, Deland LA, Mastorides SM. Using Apple machine learning algorithms to detect and subclassify non-small cell lung cancer. https://arxiv.org/ftp/arxiv/papers/1808/1808.08230.pdf. Accessed September 4, 2019.
18. Borkowski AA, Wilson CP, Borkowski SA, Thomas LB, Deland LA, Mastorides SM. Apple machine learning algorithms successfully detect colon cancer but fail to predict KRAS mutation status. http://arxiv.org/abs/1812.04660. Revised January 15, 2019. Accessed September 4, 2019.
19. Armaghany T, Wilson JD, Chu Q, Mills G. Genetic alterations in colorectal cancer. Gastrointest Cancer Res. 2012;5(1):19-27.
20. Herzig DO, Tsikitis VL. Molecular markers for colon diagnosis, prognosis and targeted therapy. J Surg Oncol. 2015;111(1):96-102.
21. Ma W, Brodie S, Agersborg S, Funari VA, Albitar M. Significant improvement in detecting BRAF, KRAS, and EGFR mutations using next-generation sequencing as compared with FDA-cleared kits. Mol Diagn Ther. 2017;21(5):571-579.
22. Greco FA. Molecular diagnosis of the tissue of origin in cancer of unknown primary site: useful in patient management. Curr Treat Options Oncol. 2013;14(4):634-642.
23. Bejnordi BE, Veta M, van Diest PJ, et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA. 2017;318(22):2199-2210.
24. Xiong Y, Ba X, Hou A, Zhang K, Chen L, Li T. Automatic detection of mycobacterium tuberculosis using artificial intelligence. J Thorac Dis. 2018;10(3):1936-1940.
25. Cruz-Roa A, Gilmore H, Basavanhally A, et al. Accurate and reproducible invasive breast cancer detection in whole-slide images: a deep learning approach for quantifying tumor extent. Sci Rep. 2017;7:46450.
26. Coudray N, Ocampo PS, Sakellaropoulos T, et al. Classification and mutation prediction from non–small cell lung cancer histopathology images using deep learning. Nat Med. 2018;24(10):1559-1567.
27. Ertosun MG, Rubin DL. Automated grading of gliomas using deep learning in digital pathology images: a modular approach with ensemble of convolutional neural networks. AMIA Annu Symp Proc. 2015;2015:1899-1908.
28. Wahab N, Khan A, Lee YS. Two-phase deep convolutional neural network for reducing class skewness in histopathological images based breast cancer detection. Comput Biol Med. 2017;85:86-97.
29. Esteva A, Kuprel B, Novoa RA, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542(7639):115-118.
30. Han SS, Park GH, Lim W, et al. Deep neural networks show an equivalent and often superior performance to dermatologists in onychomycosis diagnosis: automatic construction of onychomycosis datasets by region-based convolutional deep neural network. PLoS One. 2018;13(1):e0191493.
31. Fujisawa Y, Otomo Y, Ogata Y, et al. Deep-learning-based, computer-aided classifier developed with a small dataset of clinical images surpasses board-certified dermatologists in skin tumour diagnosis. Br J Dermatol. 2019;180(2):373-381.
32. Gulshan V, Peng L, Coram M, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA. 2016;316(22):2402-2010.
33. Weng SF, Reps J, Kai J, Garibaldi JM, Qureshi N. Can machine-learning improve cardiovascular risk prediction using routine clinical data? PLoS One. 2017;12(4):e0174944.
34. Cheng J-Z, Ni D, Chou Y-H, et al. Computer-aided diagnosis with deep learning architecture: applications to breast lesions in US images and pulmonary nodules in CT scans. Sci Rep. 2016;6(1):24454.
35. Wang X, Yang W, Weinreb J, et al. Searching for prostate cancer by fully automated magnetic resonance imaging classification: deep learning versus non-deep learning. Sci Rep. 2017;7(1):15415.
36. Lakhani P, Sundaram B. Deep learning at chest radiography: automated classification of pulmonary tuberculosis by using convolutional neural networks. Radiology. 2017;284(2):574-582.
37. Bardou D, Zhang K, Ahmad SM. Classification of breast cancer based on histology images using convolutional neural networks. IEEE Access. 2018;6(6):24680-24693.
38. Sheikhzadeh F, Ward RK, van Niekerk D, Guillaud M. Automatic labeling of molecular biomarkers of immunohistochemistry images using fully convolutional networks. PLoS One. 2018;13(1):e0190783.
39. Metter DM, Colgan TJ, Leung ST, Timmons CF, Park JY. Trends in the US and Canadian pathologist workforces from 2007 to 2017. JAMA Netw Open. 2019;2(5):e194337.
40. Benediktsson, H, Whitelaw J, Roy I. Pathology services in developing countries: a challenge. Arch Pathol Lab Med. 2007;131(11):1636-1639.
41. Graves D. The impact of the pathology workforce crisis on acute health care. Aust Health Rev. 2007;31(suppl 1):S28-S30.
42. NHS pathology shortages cause cancer diagnosis delays. https://www.gmjournal.co.uk/nhs-pathology-shortages-are-causing-cancer-diagnosis-delays. Published September 18, 2018. Accessed September 4, 2019.
43. Abbott LM, Smith SD. Smartphone apps for skin cancer diagnosis: Implications for patients and practitioners. Australas J Dermatol. 2018;59(3):168-170.
44. Poostchi M, Silamut K, Maude RJ, Jaeger S, Thoma G. Image analysis and machine learning for detecting malaria. Transl Res. 2018;194:36-55.
45. Orange DE, Agius P, DiCarlo EF, et al. Identification of three rheumatoid arthritis disease subtypes by machine learning integration of synovial histologic features and RNA sequencing data. Arthritis Rheumatol. 2018;70(5):690-701.
46. Rodellar J, Alférez S, Acevedo A, Molina A, Merino A. Image processing and machine learning in the morphological analysis of blood cells. Int J Lab Hematol. 2018;40(suppl 1):46-53.
47. Litjens G, Kooi T, Bejnordi BE, et al. A survey on deep learning in medical image analysis. Med Image Anal. 2017;42:60-88.