Using Artificial Intelligence for COVID-19 Chest X-ray Diagnosis
The novel coronavirus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which causes the respiratory disease coronavirus disease-19 (COVID-19), was first identified as a cluster of cases of pneumonia in Wuhan, Hubei Province of China on December 31, 2019.1 Within a month, the disease had spread significantly, leading the World Health Organization (WHO) to designate COVID-19 a public health emergency of international concern. On March 11, 2020, the WHO declared COVID-19 a global pandemic.2 As of August 18, 2020, the virus has infected > 21 million people, with > 750,000 deaths worldwide.3 The spread of COVID-19 has had a dramatic impact on social, economic, and health care issues throughout the world, which has been discussed elsewhere.4
Prior to this century, members of the coronavirus family had minimal impact on human health.5 However, in the past 20 years, outbreaks have highlighted an emerging importance of coronaviruses in morbidity and mortality on a global scale. Although less prevalent than COVID-19, severe acute respiratory syndrome (SARS) in 2002 to 2003 and Middle East respiratory syndrome (MERS) in 2012 likely had higher mortality rates than the current pandemic.5 Based on this recent history, it is reasonable to assume that we will continue to see novel diseases with similar significant health and societal implications. The challenges presented to health care providers (HCPs) by such novel viral pathogens are numerous, including methods for rapid diagnosis, prevention, and treatment. In the current study, we focus on diagnostic issues, which became evident with COVID-19 in the time required to develop rapid and effective diagnostic modalities.
We have previously reported the utility of using artificial intelligence (AI) in the histopathologic diagnosis of cancer.6-8 AI was first described in 1956 and involves the field of computer science in which machines are trained to learn from experience.9 Machine learning (ML) is a subset of AI and is achieved by using mathematical models to compute sample datasets.10 Current ML employs deep learning with neural network algorithms, which can recognize patterns and achieve complex computational tasks, often far more quickly and with greater precision than humans can.11-13 In addition to applications in pathology, ML algorithms have both prognostic and diagnostic applications in multiple medical specialties, such as radiology, dermatology, ophthalmology, and cardiology.6 It is predicted that AI will impact almost every aspect of health care in the future.14
In this article, we examine the potential for AI to diagnose patients with COVID-19 pneumonia using chest radiographs (CXR) alone. This is done using Microsoft CustomVision (www.customvision.ai), a readily available, automated ML platform. Employing AI to both screen and diagnose emerging health emergencies such as COVID-19 has the potential to dramatically change how we approach medical care in the future. In addition, we describe the creation of a publicly available website (interknowlogy-covid-19.azurewebsites.net) that could augment COVID-19 pneumonia CXR diagnosis.
Methods
For the training dataset, 103 CXR images of COVID-19 were downloaded from the GitHub covid-chest-xray dataset.15 Five hundred images of non-COVID-19 pneumonia and 500 images of the normal lung were downloaded from the Kaggle RSNA Pneumonia Detection Challenge dataset.16 To balance the dataset, we expanded the COVID-19 dataset to 500 images by slight rotation (probability = 1, max rotation = 5) and zooming (probability = 0.5, percentage area = 0.9) of the original images using the Augmentor Python package.17
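The augmentation step described above can be sketched with the Augmentor package as follows. This is a minimal illustration, not the authors' script: the directory name in the usage note is hypothetical, the rotation limit is assumed to apply equally to the left and right, and the sketch assumes the 103 original images were kept alongside the augmented copies (the text does not say whether they were).

```python
def samples_needed(original_count: int, target_count: int) -> int:
    """Number of augmented images required to reach the target class size."""
    return max(target_count - original_count, 0)


def build_pipeline(source_dir: str):
    """Build an Augmentor pipeline using the rotation and zoom settings from the text."""
    # Imported lazily so the helper above works without the package installed.
    import Augmentor

    pipeline = Augmentor.Pipeline(source_dir)
    # Rotate every image by a random angle of up to 5 degrees either way
    # (assumption: "max rotation = 5" applies to both directions).
    pipeline.rotate(probability=1.0, max_left_rotation=5, max_right_rotation=5)
    # Zoom into half of the images, keeping 90% of the original area.
    pipeline.zoom_random(probability=0.5, percentage_area=0.9)
    return pipeline


def augment_covid_images(source_dir: str, original_count: int = 103,
                         target_count: int = 500) -> int:
    """Generate enough augmented images to bring the class up to target_count."""
    extra = samples_needed(original_count, target_count)
    build_pipeline(source_dir).sample(extra)  # Augmentor writes to <source_dir>/output
    return extra
```

Calling `augment_covid_images("covid_cxr")`, where `covid_cxr` is a hypothetical folder holding the 103 source images, would request 397 augmented images under these assumptions.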
Validation Dataset
For the validation dataset, 30 random CXR images were obtained from the US Department of Veterans Affairs (VA) PACS (picture archiving and communication system). This dataset included 10 CXR images from hospitalized patients with COVID-19, 10 CXR pneumonia images from patients without COVID-19, and 10 normal CXRs. COVID-19 diagnoses were confirmed with a positive test result from the Xpert Xpress SARS-CoV-2 polymerase chain reaction (PCR) platform.18
Microsoft CustomVision
Microsoft CustomVision is an automated image classification and object detection system that is a part of Microsoft Azure Cognitive Services (azure.microsoft.com). It has a pay-as-you-go model with fees depending on the computing needs and usage. It offers a free trial to users for 2 initial projects. The service is online with an easy-to-follow graphical user interface. No coding skills are necessary.
We created a new classification project in CustomVision and chose a compact general domain for small size and easy export to TensorFlow.js model format. TensorFlow.js is a JavaScript library that enables dynamic download and execution of ML models. After the project was created, we proceeded to upload our image dataset. Each class was uploaded separately and tagged with the appropriate label (covid pneumonia, non-covid pneumonia, or normal lung). The system rejected 16 COVID-19 images as duplicates. The final CustomVision training dataset consisted of 484 images of COVID-19 pneumonia, 500 images of non-COVID-19 pneumonia, and 500 images of normal lungs. Once uploaded, CustomVision self-trains using the dataset upon initiating the program (Figure 1).
Website Creation
CustomVision was used to train the model. It can be used to execute the model continuously, or the model can be compacted and decoupled from CustomVision. In this case, the model was compacted and decoupled for use in an online application. An Angular online application was created with TensorFlow.js. Within a user’s web browser, the model is executed when an image of a CXR is submitted. Confidence values for each classification are returned. In this design, after the initial webpage and model are downloaded, the webpage no longer needs to access any server components and performs all operations in the browser. Although the solution works well on mobile phone browsers and in low bandwidth situations, the quality of predictions may depend on the browser and device used. At no time is an image submitted to the cloud.
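How per-class confidence values might be turned into a ranked result can be sketched as follows. The deployed application runs in the browser with TensorFlow.js; this Python sketch only illustrates the post-processing idea. It assumes the model returns one confidence value per label in a fixed order, and both the function name and the example values are hypothetical.

```python
# Label names as used when tagging the training images.
LABELS = ("covid pneumonia", "non-covid pneumonia", "normal lung")


def rank_predictions(confidences, labels=LABELS):
    """Pair each confidence with its label and sort from most to least likely."""
    if len(confidences) != len(labels):
        raise ValueError("expected one confidence value per label")
    return sorted(zip(labels, confidences), key=lambda pair: pair[1], reverse=True)


# Made-up confidence values for a single image, for illustration only.
example = [0.07, 0.91, 0.02]
for label, confidence in rank_predictions(example):
    print(f"{label}: {confidence:.0%}")
```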
Results
Overall, our trained model showed 92.9% precision and recall. Precision and recall results for each label were 98.9% and 94.8%, respectively, for COVID-19 pneumonia; 91.8% and 89%, respectively, for non-COVID-19 pneumonia; and 88.8% and 95%, respectively, for normal lung (Figure 2). Next, we proceeded to validate the trained model on the VA data by making individual predictions on 30 images from the VA dataset. Our model performed well with 100% sensitivity (recall), 95% specificity, 97% accuracy, 91% positive predictive value (precision), and 100% negative predictive value (Table).
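The validation figures can be reproduced from a 2 × 2 confusion matrix. The counts below are not taken from the Table, which is not reproduced here. They are inferred on the assumption that the metrics treat COVID-19 (10 images) versus the other two classes (20 images) as a binary problem, in which case 10 true positives, 1 false positive, 0 false negatives, and 19 true negatives are the only counts consistent with the reported sensitivity and specificity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # COVID-19 images predicted as COVID-19
    fp: int  # non-COVID-19 images predicted as COVID-19
    fn: int  # COVID-19 images predicted as something else
    tn: int  # non-COVID-19 images predicted as non-COVID-19


def binary_metrics(c: ConfusionCounts) -> dict:
    """Standard diagnostic metrics derived from a 2 x 2 confusion matrix."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # recall
        "specificity": c.tn / (c.tn + c.fp),
        "accuracy": (c.tp + c.tn) / total,
        "ppv": c.tp / (c.tp + c.fp),          # precision
        "npv": c.tn / (c.tn + c.fn),
    }


# Inferred counts (an assumption, see the paragraph above).
counts = ConfusionCounts(tp=10, fp=1, fn=0, tn=19)
for name, value in binary_metrics(counts).items():
    print(f"{name}: {value:.1%}")
```

Under these assumed counts, accuracy is 29/30 and positive predictive value is 10/11, which round to the reported 97% and 91%.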
Discussion
We successfully demonstrated the potential of using AI algorithms in assessing CXRs for COVID-19. We first trained the CustomVision automated image classification and object detection system to differentiate cases of COVID-19 from pneumonia of other etiologies as well as from normal lung CXRs. We then tested our model against known patients from the James A. Haley Veterans’ Hospital in Tampa, Florida. The program achieved 100% sensitivity (recall), 95% specificity, 97% accuracy, 91% positive predictive value (precision), and 100% negative predictive value in differentiating the 3 scenarios. Using the trained ML model, we proceeded to create a website that could augment COVID-19 CXR diagnosis.19 The website works on mobile as well as desktop platforms. A health care provider can take a CXR photo with a mobile phone or upload the image file. The ML algorithm then provides the probability of COVID-19 pneumonia, non-COVID-19 pneumonia, or normal lung diagnosis (Figure 3).
Emerging diseases such as COVID-19 present numerous challenges to HCPs, governments, and businesses, as well as to individual members of society. As evidenced with COVID-19, the time from first recognition of an emerging pathogen to the development of methods for reliable diagnosis and treatment can be months, even with a concerted international effort. The gold standard for diagnosis of COVID-19 is by reverse transcriptase PCR (RT-PCR) technologies; however, early RT-PCR testing produced less than optimal results.20-22 Even after the development of reliable tests for detection, making test kits readily available to health care providers on an adequate scale presents an additional challenge as evident with COVID-19.
Use of X-ray vs Computed Tomography
The lack of availability of diagnostic RT-PCR with COVID-19 initially placed increased reliance on presumptive diagnoses via imaging in some situations.23 Most of the literature evaluating radiographs of patients with COVID-19 focuses on chest computed tomography (CT) findings, with initial results suggesting CT was more accurate than early RT-PCR methodologies.21,22,24 The Radiological Society of North America Expert consensus statement on chest CT for COVID-19 states that CT findings can even precede positivity on RT-PCR in some cases.22 However, currently it does not recommend the use of CT scanning as a screening tool. Furthermore, the actual sensitivity and specificity of CT interpretation by radiologists for COVID-19 are unknown.22
Characteristic CT findings include ground-glass opacities (GGOs) and consolidation most commonly in the lung periphery, though a diffuse distribution was found in a minority of patients.21,23,25-27 Lomoro and colleagues recently summarized the CT findings from several reports that described abnormalities as most often bilateral and peripheral, subpleural, and affecting the lower lobes.26 Not surprisingly, CT appears more sensitive at detecting changes with COVID-19 than does CXR, with reports that a minority of patients exhibited CT changes before changes were visible on CXR.23,26
We focused our study on the potential of AI in the examination of CXRs in patients with COVID-19, as there are several limitations to the routine use of CT scans with conditions such as COVID-19. Aside from the more considerable time required to obtain CTs, there are issues with contamination of CT suites, sometimes requiring a dedicated COVID-19 CT scanner.23,28 The time constraints of decontamination or limited utilization of CT suites can delay or disrupt services for patients with and without COVID-19. Because of these factors, CXR may be a better resource to minimize the risk of infection to other patients. Also, accurate assessment of abnormalities on CXR for COVID-19 may identify patients in whom the CXR was performed for other purposes.23 CXR is more readily available than CT, especially in more remote or underdeveloped areas.28 Finally, as with CT, CXR abnormalities are reported to have appeared before RT-PCR tests became positive for a minority of patients.23
CXR findings described in patients with COVID-19 are similar to those of CT and include GGOs, consolidation, and hazy increased opacities.23,25,26,28,29 Like CT, the majority of patients who received CXR demonstrated greater involvement in the lower zones and peripherally.23,25,26,28,29 Most patients showed bilateral involvement. However, while these findings are common in patients with COVID-19, they are not specific and can be seen in other conditions, such as other viral pneumonia, bacterial pneumonia, injury from drug toxicity, inhalation injury, connective tissue disease, and idiopathic conditions.
Application of AI for COVID-19
Applications of AI in interpreting radiographs of various types are numerous, and extensive literature has been written on the topic.30 Using deep learning algorithms, AI has multiple possible roles to augment traditional radiograph interpretation. These include the potential for screening, triaging, and increasing the speed to render diagnoses. It also can provide a rapid “second opinion” to the radiologist to support the final interpretation. In areas with critical shortages of radiologists, AI potentially can be used to render the definitive diagnosis. In COVID-19, imaging studies have been shown to correlate with disease severity and mortality, and AI could assist in monitoring the course of the disease as it progresses and potentially identify patients at greatest risk.27 Furthermore, early results from PCR have been considered suboptimal, and it is known that patients with COVID-19 can test negative initially even by reliable testing methodologies. As AI technology progresses, interpretation can detect and guide triage and treatment of patients with high suspicions of COVID-19 but negative initial PCR results, or in situations where test availability is limited or results are delayed. There are numerous potential benefits should a rapid diagnostic test as simple as a CXR be able to reliably impact containment and prevention of the spread of contagions such as COVID-19 early in its course.
Few studies have assessed using AI in the radiologic diagnosis of COVID-19, most of which use CT scanning. Bai and colleagues demonstrated increased accuracy, sensitivity, and specificity in distinguishing chest CTs of COVID-19 patients from other types of pneumonia.21,31 A separate study demonstrated the utility of using AI to differentiate COVID-19 from community-acquired pneumonia with CT.32 However, the effective utility of AI for CXR interpretation also has been demonstrated.14,33 Implementation of convolutional neural network layers has allowed for reliable differentiation of viral and bacterial pneumonia with CXR imaging.34 Evidence suggests that there is great potential in the application of AI in the interpretation of radiographs of all types.
Finally, we have developed a publicly available website based on our studies.19 This website is for research use only as it is based on data from our preliminary investigation. To appear within the website, images must have protected health information removed before uploading. The information on the website, including text, graphics, images, or other material, is for research and may not be appropriate for all circumstances. The website does not provide medical, professional, or licensed advice and is not a substitute for consultation with an HCP. Medical advice should be sought from a qualified HCP for any questions, and the website should not be used for medical diagnosis or treatment.
Limitations
In our preliminary study, we have demonstrated the potential impact AI can have in multiple aspects of patient care for emerging pathogens such as COVID-19 using a test as readily available as a CXR. However, several limitations to this investigation should be mentioned. The study is retrospective in nature with limited sample size and with X-rays from patients with various stages of COVID-19 pneumonia. Also, cases of non-COVID-19 pneumonia are not stratified into different types or etiologies. We intend to demonstrate the potential of AI in differentiating COVID-19 pneumonia from non-COVID-19 pneumonia of any etiology, though future studies should address comparison of COVID-19 cases to more specific types of pneumonias, such as of bacterial or viral origin. Furthermore, the present study does not address any potential effects of additional radiographic findings from coexistent conditions, such as pulmonary edema as seen in congestive heart failure, pleural effusions (which can be seen with COVID-19 pneumonia, though rarely), interstitial lung disease, etc. Future studies are required to address these issues. Ultimately, prospective studies to assess AI-assisted radiographic interpretation in conditions such as COVID-19 are required to demonstrate the impact on diagnosis, treatment, outcome, and patient safety as these technologies are implemented.
Conclusions
We have used a readily available, commercial platform to demonstrate the potential of AI to assist in the successful diagnosis of COVID-19 pneumonia on CXR images. While this technology has numerous applications in radiology, we have focused on the potential impact on future world health crises such as COVID-19. The findings have implications for screening and triage, initial diagnosis, monitoring disease progression, and identifying patients at increased risk of morbidity and mortality. Based on the data, a website was created to demonstrate how such technologies could be shared and distributed to others to combat entities such as COVID-19 moving forward. Our study offers a small window into the potential for how AI will likely dramatically change the practice of medicine in the future.
1. World Health Organization. Coronavirus disease (COVID-19) pandemic. https://www.who.int/emergencies/diseases/novel-coronavirus-2019. Updated August 23, 2020. Accessed August 24, 2020.
2. World Health Organization. WHO Director-General’s opening remarks at the media briefing on COVID-19 - 11 March 2020. https://www.who.int/dg/speeches/detail/who-director-general-s-opening-remarks-at-the-media-briefing-on-covid-19---11-march-2020. Published March 11, 2020. Accessed August 24, 2020.
3. World Health Organization. Coronavirus disease (COVID-19): situation report--209. https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200816-covid-19-sitrep-209.pdf. Updated August 16, 2020. Accessed August 24, 2020.
4. Nicola M, Alsafi Z, Sohrabi C, et al. The socio-economic implications of the coronavirus pandemic (COVID-19): a review. Int J Surg. 2020;78:185-193. doi:10.1016/j.ijsu.2020.04.018
5. da Costa VG, Moreli ML, Saivish MV. The emergence of SARS, MERS and novel SARS-2 coronaviruses in the 21st century. Arch Virol. 2020;165(7):1517-1526. doi:10.1007/s00705-020-04628-0
6. Borkowski AA, Wilson CP, Borkowski SA, et al. Comparing artificial intelligence platforms for histopathologic cancer diagnosis. Fed Pract. 2019;36(10):456-463.
7. Borkowski AA, Wilson CP, Borkowski SA, Thomas LB, Deland LA, Mastorides SM. Apple machine learning algorithms successfully detect colon cancer but fail to predict KRAS mutation status. http://arxiv.org/abs/1812.04660. Updated January 15, 2019. Accessed August 24, 2020.
8. Borkowski AA, Wilson CP, Borkowski SA, Deland LA, Mastorides SM. Using Apple machine learning algorithms to detect and subclassify non-small cell lung cancer. http://arxiv.org/abs/1808.08230. Updated January 15, 2019. Accessed August 24, 2020.
9. Moor J. The Dartmouth College artificial intelligence conference: the next fifty years. AI Mag. 2006;27(4):87. doi:10.1609/AIMAG.V27I4.1911
10. Samuel AL. Some studies in machine learning using the game of checkers. IBM J Res Dev. 1959;3(3):210-229. doi:10.1147/rd.33.0210
11. Sarle WS. Neural networks and statistical models. https://people.orie.cornell.edu/davidr/or474/nn_sas.pdf. Published April 1994. Accessed August 24, 2020.
12. Schmidhuber J. Deep learning in neural networks: an overview. Neural Netw. 2015;61:85-117. doi:10.1016/j.neunet.2014.09.003
13. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436-444. doi:10.1038/nature14539
14. Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med. 2019;25(1):44-56. doi:10.1038/s41591-018-0300-7
15. Cohen JP, Morrison P, Dao L. COVID-19 Image Data Collection. Published online March 25, 2020. Accessed May 13, 2020. http://arxiv.org/abs/2003.11597
16. Radiological Society of North America. RSNA pneumonia detection challenge. https://www.kaggle.com/c/rsna-pneumonia-detection-challenge. Accessed August 24, 2020.
17. Bloice MD, Roth PM, Holzinger A. Biomedical image augmentation using Augmentor. Bioinformatics. 2019;35(21):4522-4524. doi:10.1093/bioinformatics/btz259
18. Cepheid. Xpert Xpress SARS-CoV-2. https://www.cepheid.com/coronavirus. Accessed August 24, 2020.
19. Interknowlogy. COVID-19 detection in chest X-rays. https://interknowlogy-covid-19.azurewebsites.net. Accessed August 27, 2020.
20. Bernheim A, Mei X, Huang M, et al. Chest CT Findings in Coronavirus Disease-19 (COVID-19): Relationship to Duration of Infection. Radiology. 2020;295(3):200463. doi:10.1148/radiol.2020200463
21. Ai T, Yang Z, Hou H, et al. Correlation of Chest CT and RT-PCR Testing for Coronavirus Disease 2019 (COVID-19) in China: a report of 1014 cases. Radiology. 2020;296(2):E32-E40. doi:10.1148/radiol.2020200642
22. Simpson S, Kay FU, Abbara S, et al. Radiological Society of North America Expert Consensus Statement on Reporting Chest CT Findings Related to COVID-19. Endorsed by the Society of Thoracic Radiology, the American College of Radiology, and RSNA - Secondary Publication. J Thorac Imaging. 2020;35(4):219-227. doi:10.1097/RTI.0000000000000524
23. Wong HYF, Lam HYS, Fong AH, et al. Frequency and distribution of chest radiographic findings in patients positive for COVID-19. Radiology. 2020;296(2):E72-E78. doi:10.1148/radiol.2020201160
24. Fang Y, Zhang H, Xie J, et al. Sensitivity of chest CT for COVID-19: comparison to RT-PCR. Radiology. 2020;296(2):E115-E117. doi:10.1148/radiol.2020200432
25. Chen N, Zhou M, Dong X, et al. Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study. Lancet. 2020;395(10223):507-513. doi:10.1016/S0140-6736(20)30211-7
26. Lomoro P, Verde F, Zerboni F, et al. COVID-19 pneumonia manifestations at the admission on chest ultrasound, radiographs, and CT: single-center study and comprehensive radiologic literature review. Eur J Radiol Open. 2020;7:100231. doi:10.1016/j.ejro.2020.100231
27. Salehi S, Abedi A, Balakrishnan S, Gholamrezanezhad A. Coronavirus disease 2019 (COVID-19) imaging reporting and data system (COVID-RADS) and common lexicon: a proposal based on the imaging data of 37 studies. Eur Radiol. 2020;30(9):4930-4942. doi:10.1007/s00330-020-06863-0
28. Jacobi A, Chung M, Bernheim A, Eber C. Portable chest X-ray in coronavirus disease-19 (COVID-19): a pictorial review. Clin Imaging. 2020;64:35-42. doi:10.1016/j.clinimag.2020.04.001
29. Bhat R, Hamid A, Kunin JR, et al. Chest imaging in patients hospitalized With COVID-19 infection - a case series. Curr Probl Diagn Radiol. 2020;49(4):294-301. doi:10.1067/j.cpradiol.2020.04.001
30. Liu X, Faes L, Kale AU, et al. A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis. Lancet Digit Health. 2019;1(6):E271-E297. doi:10.1016/S2589-7500(19)30123-2
31. Bai HX, Wang R, Xiong Z, et al. Artificial intelligence augmentation of radiologist performance in distinguishing COVID-19 from pneumonia of other origin at chest CT. Radiology. 2020;296(3):E156-E165. doi:10.1148/radiol.2020201491
32. Li L, Qin L, Xu Z, et al. Using artificial intelligence to detect COVID-19 and community-acquired pneumonia based on pulmonary CT: evaluation of the diagnostic accuracy. Radiology. 2020;296(2):E65-E71. doi:10.1148/radiol.2020200905
33. Rajpurkar P, Joshi A, Pareek A, et al. CheXpedition: investigating generalization challenges for translation of chest x-ray algorithms to the clinical setting. http://arxiv.org/abs/2002.11379. Updated March 11, 2020. Accessed August 24, 2020.
34. Kermany DS, Goldbaum M, Cai W, et al. Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell. 2018;172(5):1122-1131.e9. doi:10.1016/j.cell.2018.02.010
Application of AI for COVID-19
Applications of AI in interpreting radiographs of various types are numerous, and extensive literature has been written on the topic.30 Using deep learning algorithms, AI has multiple possible roles to augment traditional radiograph interpretation. These include the potential for screening, triaging, and increasing the speed to render diagnoses. It also can provide a rapid “second opinion” to the radiologist to support the final interpretation. In areas with critical shortages of radiologists, AI potentially can be used to render the definitive diagnosis. In COVID-19, imaging studies have been shown to correlate with disease severity and mortality, and AI could assist in monitoring the course of the disease as it progresses and potentially identify patients at greatest risk.27 Furthermore, early results from PCR have been considered suboptimal, and it is known that patients with COVID-19 can test negative initially even by reliable testing methodologies. As AI technology progresses, AI-assisted interpretation could help detect disease and guide triage and treatment of patients with high suspicion of COVID-19 but negative initial PCR results, or in situations where test availability is limited or results are delayed. There are numerous potential benefits should a rapid diagnostic test as simple as a CXR be able to reliably impact containment and prevention of the spread of contagions such as COVID-19 early in its course.
Few studies have assessed using AI in the radiologic diagnosis of COVID-19, most of which use CT scanning. Bai and colleagues demonstrated increased accuracy, sensitivity, and specificity in distinguishing chest CTs of COVID-19 patients from other types of pneumonia.21,31 A separate study demonstrated the utility of using AI to differentiate COVID-19 from community-acquired pneumonia with CT.32 However, the effective utility of AI for CXR interpretation also has been demonstrated.14,33 Implementation of convolutional neural network layers has allowed for reliable differentiation of viral and bacterial pneumonia with CXR imaging.34 Evidence suggests that there is great potential in the application of AI in the interpretation of radiographs of all types.
Finally, we have developed a publicly available website based on our studies.19 This website is for research use only as it is based on data from our preliminary investigation. Images must have protected health information removed before they are uploaded to the website. The information on the website, including text, graphics, images, or other material, is for research and may not be appropriate for all circumstances. The website does not provide medical, professional, or licensed advice and is not a substitute for consultation with an HCP. Medical advice should be sought from a qualified HCP for any questions, and the website should not be used for medical diagnosis or treatment.
Limitations
In our preliminary study, we have demonstrated the potential impact AI can have in multiple aspects of patient care for emerging pathogens such as COVID-19 using a test as readily available as a CXR. However, several limitations to this investigation should be mentioned. The study is retrospective in nature with limited sample size and with X-rays from patients with various stages of COVID-19 pneumonia. Also, cases of non-COVID-19 pneumonia are not stratified into different types or etiologies. We intend to demonstrate the potential of AI in differentiating COVID-19 pneumonia from non-COVID-19 pneumonia of any etiology, though future studies should address comparison of COVID-19 cases to more specific types of pneumonias, such as of bacterial or viral origin. Furthermore, the present study does not address any potential effects of additional radiographic findings from coexistent conditions, such as pulmonary edema as seen in congestive heart failure, pleural effusions (which can be seen with COVID-19 pneumonia, though rarely), interstitial lung disease, etc. Future studies are required to address these issues. Ultimately, prospective studies to assess AI-assisted radiographic interpretation in conditions such as COVID-19 are required to demonstrate the impact on diagnosis, treatment, outcome, and patient safety as these technologies are implemented.
Conclusions
We have used a readily available, commercial platform to demonstrate the potential of AI to assist in the successful diagnosis of COVID-19 pneumonia on CXR images. While this technology has numerous applications in radiology, we have focused on the potential impact on future world health crises such as COVID-19. The findings have implications for screening and triage, initial diagnosis, monitoring disease progression, and identifying patients at increased risk of morbidity and mortality. Based on the data, a website was created to demonstrate how such technologies could be shared and distributed to others to combat entities such as COVID-19 moving forward. Our study offers a small window into the potential for how AI will likely dramatically change the practice of medicine in the future.
In this article, we examine the potential for AI to diagnose patients with COVID-19 pneumonia using chest radiographs (CXRs) alone. This is done using Microsoft CustomVision (www.customvision.ai), a readily available, automated ML platform. Employing AI to both screen and diagnose emerging health emergencies such as COVID-19 has the potential to dramatically change how we approach medical care in the future. In addition, we describe the creation of a publicly available website (interknowlogy-covid-19.azurewebsites.net) that could augment COVID-19 pneumonia CXR diagnosis.
Methods
For the training dataset, 103 CXR images of COVID-19 were downloaded from the GitHub covid-chest-xray dataset.15 Five hundred images of non-COVID-19 pneumonia and 500 images of the normal lung were downloaded from the Kaggle RSNA Pneumonia Detection Challenge dataset.16 To balance the dataset, we expanded the COVID-19 dataset to 500 images by slight rotation (probability = 1, max rotation = 5) and zooming (probability = 0.5, percentage area = 0.9) of the original images using the Augmentor Python package.17
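The augmentation step described above can be sketched with the Augmentor package as follows. This is a minimal illustration, not the authors' script: the folder names are hypothetical, mapping the reported "max rotation = 5" onto both of Augmentor's `max_left_rotation` and `max_right_rotation` arguments is an assumption, and so is generating all 500 images in a single sampling call.

```python
# Parameters reported in the text. Applying the single "max rotation = 5"
# value to both the left and right limits is an assumption.
AUGMENT_PARAMS = {
    "rotate": {"probability": 1.0, "max_left_rotation": 5, "max_right_rotation": 5},
    "zoom_random": {"probability": 0.5, "percentage_area": 0.9},
}
TARGET_IMAGES = 500  # size of each of the other two classes


def build_pipeline(source_dir, output_dir="augmented"):
    """Build an Augmentor pipeline with slight rotation and random zoom.

    source_dir and output_dir are hypothetical folder names.
    """
    import Augmentor  # third-party package: pip install Augmentor

    pipeline = Augmentor.Pipeline(
        source_directory=source_dir, output_directory=output_dir
    )
    pipeline.rotate(**AUGMENT_PARAMS["rotate"])
    pipeline.zoom_random(**AUGMENT_PARAMS["zoom_random"])
    return pipeline
```

Under these assumptions, calling `build_pipeline("covid_cxr").sample(TARGET_IMAGES)` would write 500 augmented images derived from the originals in the hypothetical `covid_cxr` folder.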
Validation Dataset
For the validation dataset, 30 random CXR images were obtained from the US Department of Veterans Affairs (VA) PACS (picture archiving and communication system). This dataset included 10 CXR images from hospitalized patients with COVID-19, 10 CXR pneumonia images from patients without COVID-19, and 10 normal CXRs. COVID-19 diagnoses were confirmed with a positive test result from the Xpert Xpress SARS-CoV-2 polymerase chain reaction (PCR) platform.18
Microsoft CustomVision
Microsoft CustomVision is an automated image classification and object detection system that is a part of Microsoft Azure Cognitive Services (azure.microsoft.com). It has a pay-as-you-go model with fees depending on the computing needs and usage. It offers a free trial to users for 2 initial projects. The service is online with an easy-to-follow graphical user interface. No coding skills are necessary.
We created a new classification project in CustomVision and chose a compact general domain for small size and easy export to TensorFlow.js model format. TensorFlow.js is a JavaScript library that enables dynamic download and execution of ML models. After the project was created, we proceeded to upload our image dataset. Each class was uploaded separately and tagged with the appropriate label (covid pneumonia, non-covid pneumonia, or normal lung). The system rejected 16 COVID-19 images as duplicates. The final CustomVision training dataset consisted of 484 images of COVID-19 pneumonia, 500 images of non-COVID-19 pneumonia, and 500 images of normal lungs. Once uploaded, CustomVision self-trains using the dataset upon initiating the program (Figure 1).
Website Creation
CustomVision was used to train the model. It can be used to execute the model continuously, or the model can be compacted and decoupled from CustomVision. In this case, the model was compacted and decoupled for use in an online application. An Angular online application was created with TensorFlow.js. Within a user’s web browser, the model is executed when an image of a CXR is submitted. Confidence values for each classification are returned. In this design, after the initial webpage and model are downloaded, the webpage no longer needs to access any server components and performs all operations in the browser. Although the solution works well on mobile phone browsers and in low bandwidth situations, the quality of predictions may depend on the browser and device used. At no time does an image get submitted to the cloud.
Results
Overall, our trained model showed 92.9% precision and recall. Precision and recall results for each label were 98.9% and 94.8%, respectively, for COVID-19 pneumonia; 91.8% and 89%, respectively, for non-COVID-19 pneumonia; and 88.8% and 95%, respectively, for normal lung (Figure 2). Next, we proceeded to validate the training model on the VA data by making individual predictions on 30 images from the VA dataset. Our model performed well with 100% sensitivity (recall), 95% specificity, 97% accuracy, 91% positive predictive value (precision), and 100% negative predictive value (Table).
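As a worked check of how the validation metrics relate to one another, the sketch below computes them from confusion-matrix counts. The counts are an assumption: they are back-calculated from the reported percentages and the 10 COVID-19 versus 20 non-COVID-19 split of the validation set (with the two non-COVID-19 classes pooled), not taken from the Table.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return standard diagnostic metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),                # recall
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),                        # precision
        "npv": tn / (tn + fn),
    }


# Assumed counts, inferred from the reported percentages:
# all 10 COVID-19 images flagged, 1 of 20 non-COVID-19 images misflagged.
metrics = binary_metrics(tp=10, fp=1, tn=19, fn=0)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With these assumed counts, the values round to the reported 100% sensitivity, 95% specificity, 97% accuracy, 91% positive predictive value, and 100% negative predictive value, which suggests that a single false-positive image would account for the gap from perfect performance.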
1. World Health Organization. Coronavirus disease (COVID-19) pandemic. https://www.who.int/emergencies/diseases/novel-coronavirus-2019. Updated August 23, 2020. Accessed August 24, 2020.
2. World Health Organization. WHO Director-General’s opening remarks at the media briefing on COVID-19 - 11 March 2020. https://www.who.int/dg/speeches/detail/who-director-general-s-opening-remarks-at-the-media-briefing-on-covid-19---11-march-2020. Published March 11, 2020. Accessed August 24, 2020.
3. World Health Organization. Coronavirus disease (COVID-19): situation report--209. https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200816-covid-19-sitrep-209.pdf. Updated August 16, 2020. Accessed August 24, 2020.
4. Nicola M, Alsafi Z, Sohrabi C, et al. The socio-economic implications of the coronavirus pandemic (COVID-19): a review. Int J Surg. 2020;78:185-193. doi:10.1016/j.ijsu.2020.04.018
5. da Costa VG, Moreli ML, Saivish MV. The emergence of SARS, MERS and novel SARS-2 coronaviruses in the 21st century. Arch Virol. 2020;165(7):1517-1526. doi:10.1007/s00705-020-04628-0
6. Borkowski AA, Wilson CP, Borkowski SA, et al. Comparing artificial intelligence platforms for histopathologic cancer diagnosis. Fed Pract. 2019;36(10):456-463.
7. Borkowski AA, Wilson CP, Borkowski SA, Thomas LB, Deland LA, Mastorides SM. Apple machine learning algorithms successfully detect colon cancer but fail to predict KRAS mutation status. http://arxiv.org/abs/1812.04660. Updated January 15, 2019. Accessed August 24, 2020.
8. Borkowski AA, Wilson CP, Borkowski SA, Deland LA, Mastorides SM. Using Apple machine learning algorithms to detect and subclassify non-small cell lung cancer. http://arxiv.org/abs/1808.08230. Updated January 15, 2019. Accessed August 24, 2020.
9. Moor J. The Dartmouth College artificial intelligence conference: the next fifty years. AI Mag. 2006;27(4):87. doi:10.1609/AIMAG.V27I4.1911
10. Samuel AL. Some studies in machine learning using the game of checkers. IBM J Res Dev. 1959;3(3):210-229. doi:10.1147/rd.33.0210
11. Sarle WS. Neural networks and statistical models. https://people.orie.cornell.edu/davidr/or474/nn_sas.pdf. Published April 1994. Accessed August 24, 2020.
12. Schmidhuber J. Deep learning in neural networks: an overview. Neural Netw. 2015;61:85-117. doi:10.1016/j.neunet.2014.09.003
13. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436-444. doi:10.1038/nature14539
14. Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med. 2019;25(1):44-56. doi:10.1038/s41591-018-0300-7
15. Cohen JP, Morrison P, Dao L. COVID-19 Image Data Collection. Published online March 25, 2020. Accessed May 13, 2020. http://arxiv.org/abs/2003.11597
16. Radiological Society of North America. RSNA pneumonia detection challenge. https://www.kaggle.com/c/rsna-pneumonia-detection-challenge. Accessed August 24, 2020.
17. Bloice MD, Roth PM, Holzinger A. Biomedical image augmentation using Augmentor. Bioinformatics. 2019;35(21):4522-4524. doi:10.1093/bioinformatics/btz259
18. Cepheid. Xpert Xpress SARS-CoV-2. https://www.cepheid.com/coronavirus. Accessed August 24, 2020.
19. Interknowlogy. COVID-19 detection in chest X-rays. https://interknowlogy-covid-19.azurewebsites.net. Accessed August 27, 2020.
20. Bernheim A, Mei X, Huang M, et al. Chest CT Findings in Coronavirus Disease-19 (COVID-19): Relationship to Duration of Infection. Radiology. 2020;295(3):200463. doi:10.1148/radiol.2020200463
21. Ai T, Yang Z, Hou H, et al. Correlation of Chest CT and RT-PCR Testing for Coronavirus Disease 2019 (COVID-19) in China: a report of 1014 cases. Radiology. 2020;296(2):E32-E40. doi:10.1148/radiol.2020200642
22. Simpson S, Kay FU, Abbara S, et al. Radiological Society of North America Expert Consensus Statement on Reporting Chest CT Findings Related to COVID-19. Endorsed by the Society of Thoracic Radiology, the American College of Radiology, and RSNA - Secondary Publication. J Thorac Imaging. 2020;35(4):219-227. doi:10.1097/RTI.0000000000000524
23. Wong HYF, Lam HYS, Fong AH, et al. Frequency and distribution of chest radiographic findings in patients positive for COVID-19. Radiology. 2020;296(2):E72-E78. doi:10.1148/radiol.2020201160
24. Fang Y, Zhang H, Xie J, et al. Sensitivity of chest CT for COVID-19: comparison to RT-PCR. Radiology. 2020;296(2):E115-E117. doi:10.1148/radiol.2020200432
25. Chen N, Zhou M, Dong X, et al. Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study. Lancet. 2020;395(10223):507-513. doi:10.1016/S0140-6736(20)30211-7
26. Lomoro P, Verde F, Zerboni F, et al. COVID-19 pneumonia manifestations at the admission on chest ultrasound, radiographs, and CT: single-center study and comprehensive radiologic literature review. Eur J Radiol Open. 2020;7:100231. doi:10.1016/j.ejro.2020.100231
27. Salehi S, Abedi A, Balakrishnan S, Gholamrezanezhad A. Coronavirus disease 2019 (COVID-19) imaging reporting and data system (COVID-RADS) and common lexicon: a proposal based on the imaging data of 37 studies. Eur Radiol. 2020;30(9):4930-4942. doi:10.1007/s00330-020-06863-0
28. Jacobi A, Chung M, Bernheim A, Eber C. Portable chest X-ray in coronavirus disease-19 (COVID-19): a pictorial review. Clin Imaging. 2020;64:35-42. doi:10.1016/j.clinimag.2020.04.001
29. Bhat R, Hamid A, Kunin JR, et al. Chest imaging in patients hospitalized With COVID-19 infection - a case series. Curr Probl Diagn Radiol. 2020;49(4):294-301. doi:10.1067/j.cpradiol.2020.04.001
30. Liu X, Faes L, Kale AU, et al. A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis. Lancet Digit Health. 2019;1(6):E271-E297. doi:10.1016/S2589-7500(19)30123-2
31. Bai HX, Wang R, Xiong Z, et al. Artificial intelligence augmentation of radiologist performance in distinguishing COVID-19 from pneumonia of other origin at chest CT. Radiology. 2020;296(3):E156-E165. doi:10.1148/radiol.2020201491
32. Li L, Qin L, Xu Z, et al. Using artificial intelligence to detect COVID-19 and community-acquired pneumonia based on pulmonary CT: evaluation of the diagnostic accuracy. Radiology. 2020;296(2):E65-E71. doi:10.1148/radiol.2020200905
33. Rajpurkar P, Joshi A, Pareek A, et al. CheXpedition: investigating generalization challenges for translation of chest x-ray algorithms to the clinical setting. http://arxiv.org/abs/2002.11379. Updated March 11, 2020. Accessed August 24, 2020.
34. Kermany DS, Goldbaum M, Cai W, et al. Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell. 2018;172(5):1122-1131.e9. doi:10.1016/j.cell.2018.02.010
Comparing Artificial Intelligence Platforms for Histopathologic Cancer Diagnosis
Artificial intelligence (AI), first described in 1956, encompasses the field of computer science in which machines are trained to learn from experience. The term was popularized by the 1956 Dartmouth College Summer Research Project on Artificial Intelligence.1 The field of AI is rapidly growing and has the potential to affect many aspects of our lives. The emerging importance of AI is demonstrated by a February 2019 executive order that launched the American AI Initiative, allocating resources and funding for AI development.2 The executive order stresses the potential impact of AI in the health care field, including its potential utility to diagnose disease. Federal agencies were directed to invest in AI research and development to promote rapid breakthroughs in AI technology that may impact multiple areas of society.
Machine learning (ML), a subset of AI, was defined in 1959 by Arthur Samuel and is achieved by employing mathematical models to compute sample data sets.3 Originating from statistical linear models, neural networks were conceived to accomplish these tasks.4 These pioneering scientific achievements led to recent developments of deep neural networks. These models are developed to recognize patterns and achieve complex computational tasks within a matter of minutes, often far exceeding human ability.5 Compared with human decision making, ML can increase efficiency by decreasing computation time while achieving high precision and recall.6
ML has the potential for numerous applications in the health care field.7-9 One promising application is in the field of anatomic pathology. ML allows representative images to be used to train a computer to recognize patterns from labeled photographs. Based on a set of images selected to represent a specific tissue or disease process, the computer can be trained to evaluate and recognize new and unique images from patients and render a diagnosis.10 Prior to modern ML models, users would have to import many thousands of training images to produce algorithms that could recognize patterns with high accuracy. Modern ML algorithms allow for a technique known as transfer learning, such that far fewer images are required for training.11-13
Two novel ML platforms available for public use are offered through Google (Mountain View, CA) and Apple (Cupertino, CA).14,15 They each offer a user-friendly interface with minimal experience required in computer science. Google AutoML uses ML via cloud services to store and retrieve data with ease. No coding knowledge is required. The Apple Create ML Module provides computer-based ML, requiring only a few lines of code.
The Veterans Health Administration (VHA) is the largest single health care system in the US, and nearly 50 000 cancer cases are diagnosed at the VHA annually.16 Cancers of the lung and colon are among the most common sources of invasive cancer and are the 2 most common causes of cancer deaths in America.16 We have previously reported using Apple ML in detecting non-small cell lung cancers (NSCLCs), including adenocarcinomas and squamous cell carcinomas (SCCs); and colon cancers with accuracy.17,18 In the present study, we expand on these findings by comparing Apple and Google ML platforms in a variety of common pathologic scenarios in veteran patients. Using limited training data, both programs are compared for precision and recall in differentiating conditions involving lung and colon pathology.
In the first 4 experiments, we evaluated the ability of the platforms to differentiate normal lung tissue from cancerous lung tissue, to distinguish lung adenocarcinoma from SCC, and to differentiate colon adenocarcinoma from normal colon tissue. Next, cases of colon adenocarcinoma were assessed to determine whether the presence or absence of a mutation in the KRAS proto-oncogene could be determined histologically using the AI platforms. Mutations in KRAS are found in a variety of cancers, including about 40% of colon adenocarcinomas.19 For colon cancers, the presence or absence of the mutation in KRAS has important implications for patients as it determines whether the tumor will respond to specific chemotherapy agents.20 The presence of the mutation in KRAS is currently determined by complex molecular testing of tumor tissue.21 However, we assessed the potential of ML to determine whether the mutation is present by computerized morphologic analysis alone. Our last experiment examined the ability of the Apple and Google platforms to differentiate between adenocarcinomas of lung origin vs colon origin. This has potential utility in determining the site of origin of metastatic carcinoma.22
Methods
Fifty cases of lung SCC, 50 cases of lung adenocarcinoma, and 50 cases of colon adenocarcinoma were randomly retrieved from our molecular database. Twenty-five colon adenocarcinoma cases were positive for mutation in KRAS, while 25 cases were negative for mutation in KRAS. Seven hundred fifty total images of lung tissue (250 benign lung tissue, 250 lung adenocarcinomas, and 250 lung SCCs) and 500 total images of colon tissue (250 benign colon tissue and 250 colon adenocarcinoma) were obtained using a Leica Microscope MC190 HD Camera (Wetzlar, Germany) connected to an Olympus BX41 microscope (Center Valley, PA) and the Leica Acquire 9072 software for Apple computers. All the images were captured at a resolution of 1024 x 768 pixels using a 60x dry objective. Lung tissue images were captured and saved on a 2012 Apple MacBook Pro computer, and colon images were captured and saved on a 2011 Apple iMac computer. Both computers were running macOS v10.13.
Creating Image Classifier Models Using Apple Create ML
Apple Create ML is a suite of products that use various tools to create and train custom ML models on Apple computers.15 The suite contains many features, including image classification to train an ML model to classify images, natural language processing to classify natural language text, and tabular data to train models that deal with labeling information or estimating new quantities. We used Create ML Image Classification to create image classifier models for our project (Appendix A).
Creating ML Modules Using Google Cloud AutoML Vision Beta
Google Cloud AutoML is a suite of machine learning products, including AutoML Vision, AutoML Natural Language, and AutoML Translation.14 All Cloud AutoML machine learning products were in beta version at the time of experimentation. We used Cloud AutoML Vision beta to create ML modules for our project. Unlike Apple Create ML, which is run on a local Apple computer, Google Cloud AutoML is run online using a Google Cloud account. There are no minimum specification requirements for the local computer because the product uses a cloud-based architecture (Appendix B).
Experiment 1
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to detect and subclassify NSCLC based on the histopathologic images. We created 3 classes of images (250 images each): benign lung tissue, lung adenocarcinoma, and lung SCC.
Experiment 2
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to differentiate between normal lung tissue and NSCLC histopathologic images with a 50/50 mixture of lung adenocarcinoma and lung SCC. We created 2 classes of images (250 images each): benign lung tissue and lung NSCLC.
Experiment 3
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to differentiate between lung adenocarcinoma and lung SCC histopathologic images. We created 2 classes of images (250 images each): adenocarcinoma and SCC.
Experiment 4
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to detect colon cancer histopathologic images regardless of KRAS mutation status. We created 2 classes of images (250 images each): benign colon tissue and colon adenocarcinoma.
Experiment 5
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to differentiate between histopathologic images of colon adenocarcinoma with mutations in KRAS and those of colon adenocarcinoma without mutations in KRAS. We created 2 classes of images (125 images each): colon adenocarcinoma cases with mutation in KRAS and colon adenocarcinoma cases without the mutation in KRAS.
Experiment 6
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to differentiate between lung adenocarcinoma and colon adenocarcinoma histopathologic images. We created 2 classes of images (250 images each): colon adenocarcinoma and lung adenocarcinoma.
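The 6 experiments above can be summarized as a small configuration table. The following Python sketch is illustrative only: the class labels are paraphrased from the experiment descriptions, and the names EXPERIMENTS, PLATFORMS, images_in_experiment, and planned_models are hypothetical and are not part of either platform.

```python
# Illustrative summary of the 6 experiments described above.
# All names here are hypothetical; neither platform uses this structure.
EXPERIMENTS = {
    1: {"classes": ["benign lung tissue", "lung adenocarcinoma", "lung SCC"],
        "images_per_class": 250},
    2: {"classes": ["benign lung tissue", "lung NSCLC"],
        "images_per_class": 250},
    3: {"classes": ["lung adenocarcinoma", "lung SCC"],
        "images_per_class": 250},
    4: {"classes": ["benign colon tissue", "colon adenocarcinoma"],
        "images_per_class": 250},
    5: {"classes": ["colon adenocarcinoma with KRAS mutation",
                    "colon adenocarcinoma without KRAS mutation"],
        "images_per_class": 125},
    6: {"classes": ["colon adenocarcinoma", "lung adenocarcinoma"],
        "images_per_class": 250},
}

PLATFORMS = ("Apple Create ML", "Google AutoML Vision")


def images_in_experiment(number):
    """Total images used in one experiment (classes x images per class)."""
    spec = EXPERIMENTS[number]
    return len(spec["classes"]) * spec["images_per_class"]


def planned_models():
    """One model per experiment per platform."""
    return [(number, platform)
            for number in sorted(EXPERIMENTS)
            for platform in PLATFORMS]


if __name__ == "__main__":
    for number in sorted(EXPERIMENTS):
        print(f"Experiment {number}: {images_in_experiment(number)} images")
    print(f"{len(planned_models())} models in total")
```

Pairing each experiment with both platforms gives 12 models in total.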
Results
Twelve machine learning models were created in 6 experiments using the Apple Create ML and the Google AutoML (Table). To investigate recall and precision differences between the Apple and the Google ML algorithms, we performed 2-tailed, paired t tests. No statistically significant differences were found (P = .52 for recall and .60 for precision).
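For readers unfamiliar with the paired t test, the following Python sketch shows how the test statistic is formed from 6 paired values, 1 pair per experiment. It is a sketch under stated assumptions, not the analysis code used in the study: the recall values below are made-up placeholders and are not taken from the Table, and the function name paired_t_statistic is hypothetical. Instead of computing an exact P value, the sketch compares the statistic with the standard 2-tailed critical value for 5 degrees of freedom at α = .05 (approximately 2.571).

```python
import math
from statistics import mean, stdev


def paired_t_statistic(a, b):
    """Return (t, degrees of freedom) for a paired t test on two samples."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two samples of equal length (at least 2 pairs)")
    diffs = [x - y for x, y in zip(a, b)]
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    if sd == 0:
        raise ValueError("paired differences are all identical")
    n = len(diffs)
    return mean(diffs) / (sd / math.sqrt(n)), n - 1


# Two-tailed critical value of Student's t for alpha = .05 and df = 5.
T_CRITICAL_DF5 = 2.571

if __name__ == "__main__":
    # Placeholder values for illustration only; NOT the study's results.
    apple_recall = [0.95, 0.98, 0.90, 0.97, 0.60, 0.99]
    google_recall = [0.96, 0.97, 0.93, 0.98, 0.55, 0.98]
    t, df = paired_t_statistic(apple_recall, google_recall)
    print(f"t = {t:.3f}, df = {df}")
    print("significant at .05" if abs(t) > T_CRITICAL_DF5 else "not significant at .05")
```

Because each experiment yields 1 value per platform, the test operates on the 6 within-experiment differences, not on 2 independent groups.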
Overall, each model performed well in distinguishing between normal and neoplastic tissue for both lung and colon cancers. In subclassifying NSCLC into adenocarcinoma and SCC, the models were shown to have high levels of precision and recall. The models also were successful in distinguishing between lung and colonic origin of adenocarcinoma (Figures 1-4). However, both systems had trouble discerning colon adenocarcinoma with mutations in KRAS from adenocarcinoma without mutations in KRAS.
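Precision and recall, the 2 metrics reported throughout, can be computed from the counts of true-positive, false-positive, and false-negative predictions for a given class. The Python sketch below uses their standard definitions; the counts in the example are hypothetical and are not taken from the study, and the function name precision_recall is an assumption for illustration.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Standard definitions for one class.

    precision = TP / (TP + FP): of the images labeled as the class,
                the fraction that truly belong to it.
    recall    = TP / (TP + FN): of the images truly in the class,
                the fraction the model found.
    """
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical counts for a single class, for illustration only.
    p, r = precision_recall(true_pos=45, false_pos=5, false_neg=5)
    print(f"precision = {p:.2f}, recall = {r:.2f}")
```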
Discussion
Image classifier models using ML algorithms hold promise for revolutionizing the health care field. ML products, such as those modules offered by Apple and Google, are easy to use and have a simple graphical user interface that allows individuals to train models to perform humanlike tasks in real time. In our experiments, we compared multiple algorithms to determine their ability to differentiate and subclassify histopathologic images with high precision and recall using common scenarios in treating veteran patients.
Analysis of the results revealed high precision and recall values illustrating the models’ ability to differentiate and detect benign lung tissue from lung SCC and lung adenocarcinoma in ML model 1, benign lung from NSCLC in ML model 2, and benign colon from colonic adenocarcinoma in ML model 4. In ML models 3 and 6, both ML algorithms performed at a high level to differentiate lung SCC from lung adenocarcinoma and lung adenocarcinoma from colonic adenocarcinoma, respectively. Of note, ML model 5 had the lowest precision and recall values across both algorithms, demonstrating the models’ limited utility in predicting molecular profiles, such as mutations in KRAS as tested here. This is not surprising, as pathologists currently require complex molecular tests to detect mutations in KRAS reliably in colon cancer.
Both modules require minimal programming experience and are easy to use. In our comparison, we demonstrated critical distinguishing characteristics that differentiate the 2 products.
Apple Create ML Image Classifier is available for use on local Mac computers that use Xcode version 10 and macOS 10.14 or later, with just 3 lines of code required to perform computations. Although this product is limited to Apple computers, it is free to use, and images are stored on the computer hard drive. Of unique significance on the Apple platform, images can be augmented to alter their appearance to enhance model training. For example, imported images can be cropped, rotated, blurred, and flipped to improve the model’s ability to recognize test images and perform pattern recognition. This feature is not as readily available on the Google platform. By default, Apple Create ML Image Classifier uses 75% of the total imported images as the training set and a randomly selected 5% as the validation set. The remaining 20% of images comprise the testing set. The module’s computational analysis to train the model is achieved in about 2 minutes on average. The score threshold is set at 50% and, unlike in Google AutoML Vision, cannot be adjusted for each image class.
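To make the idea of augmentation concrete, the Python sketch below applies 2 of the transformations mentioned above, flipping and rotating, to a tiny image represented as a nested list of pixel values. It is a generic illustration and does not use or reproduce the Create ML implementation; the function names are hypothetical.

```python
def flip_horizontal(image):
    """Mirror the image left to right (reverse every row)."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate the image 90 degrees clockwise."""
    # Reversing the row order and then transposing gives a clockwise turn.
    return [list(column) for column in zip(*image[::-1])]


def augment(image):
    """Return the original image plus two altered copies for training."""
    return [image, flip_horizontal(image), rotate_90_clockwise(image)]


if __name__ == "__main__":
    tiny_image = [[1, 2],
                  [3, 4]]
    for variant in augment(tiny_image):
        print(variant)
```

Each altered copy keeps the label of the original image, so a small set of labeled photographs yields a larger and more varied training set.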
Google AutoML Vision is open and can be accessed from many devices. It stores images on remote Google servers but requires computing fees after a $300 credit for 12 months. On AutoML Vision, a random 80% of the total images are used in the training set, 10% are used in the validation set, and 10% are used in the testing set. It is important to highlight the different percentages used in the default settings on the respective modules. The time to train the Google AutoML Vision with default computational power is longer on average than Apple Create ML, with about 8 minutes required to train the machine learning module. However, it is possible to choose more computational power for an additional fee and decrease module training time. The user will receive e-mail alerts when the computation begins and when it is completed. The computation time is calculated by subtracting the time of the initial e-mail from that of the final e-mail.
Based on our calculations, we determined there was no statistically significant difference in the recall and precision values obtained by the 2 machine learning algorithms tested at their default settings. These findings demonstrate the promise of using an ML algorithm to assist in the performance of human tasks and behaviors, specifically the diagnosis of histopathologic images. These results have numerous potential uses in clinical medicine. ML algorithms have been successfully applied to diagnostic and prognostic endeavors in pathology,23-28 dermatology,29-31 ophthalmology,32 cardiology,33 and radiology.34-36
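The practical effect of the different default splits can be seen by applying them to a 500-image experiment (2 classes of 250 images). The Python sketch below is an illustration under stated assumptions: it shuffles the whole image pool at random, whereas how each platform actually selects images (for example, whether it balances the split within each class) is not described here, and the function name split_dataset is hypothetical.

```python
import random


def split_dataset(items, train_fraction, validation_fraction, seed=0):
    """Shuffle items and cut them into training, validation, and testing sets.

    Whatever remains after the training and validation sets are taken
    becomes the testing set.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = round(len(shuffled) * train_fraction)
    n_validation = round(len(shuffled) * validation_fraction)
    training = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_validation]
    testing = shuffled[n_train + n_validation:]
    return training, validation, testing


if __name__ == "__main__":
    image_ids = range(500)  # stand-ins for 500 image files
    defaults = {
        "Apple Create ML (75/5/20)": (0.75, 0.05),
        "Google AutoML Vision (80/10/10)": (0.80, 0.10),
    }
    for name, (train_fraction, validation_fraction) in defaults.items():
        parts = split_dataset(image_ids, train_fraction, validation_fraction)
        print(name, [len(part) for part in parts])
```

With 500 images, the Apple default leaves 100 images for testing and the Google default leaves 50, so the 2 platforms are evaluated on test sets of different sizes.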
Pathologists often use additional tests, such as special staining of tissues or molecular tests, to assist with accurate classification of tumors. ML platforms offer the potential of an additional tool for pathologists to use along with human microscopic interpretation.37,38 In addition, the number of pathologists in the US is dramatically decreasing, and many other countries have marked physician shortages, especially in fields of specialized training such as pathology.39-42 These models could readily assist physicians in underserved countries and impact shortages of pathologists elsewhere by providing more specific diagnoses in an expedited manner.43
Finally, although we have explored the application of these platforms in common cancer scenarios, great potential exists to use similar techniques in the detection of other conditions. These include the potential for classification and risk assessment of precancerous lesions, infectious processes in tissue (eg, detection of tuberculosis or malaria),24,44 inflammatory conditions (eg, arthritis subtypes, gout),45 blood disorders (eg, abnormal blood cell morphology),46 and many others. The potential of these technologies to improve health care delivery to veteran patients seems to be limited only by the imagination of the user.47
Regarding the limited effectiveness in determining the presence or absence of mutations in KRAS for colon adenocarcinoma, pathologists, as noted above, currently rely on complex molecular tests to detect the mutations at the DNA level.21 More extensive training data sets may improve recall and precision in cases such as these, and this possibility warrants further study. Our experiments were limited to the stipulations placed by the free trial software agreements; no costs were expended to use the algorithms, though an Apple computer was required.
Conclusion
We have demonstrated the successful application of 2 readily available ML platforms in providing diagnostic guidance in differentiation between common cancer conditions in veteran patient populations. Although both platforms performed very well with no statistically significant differences in results, some distinctions are worth noting. Apple Create ML can be used on local computers but is limited to an Apple operating system. Google AutoML is not platform-specific but runs only via Google Cloud with associated computational fees. Using these readily available models, we demonstrated the vast potential of AI in diagnostic pathology. The application of AI to clinical medicine remains in the very early stages. The VA is uniquely poised to provide leadership as AI technologies will continue to dramatically change the future of health care, both in veteran and nonveteran patients nationwide.
Acknowledgments
The authors thank Paul Borkowski for his constructive criticism and proofreading of this manuscript. This material is the result of work supported with resources and the use of facilities at the James A. Haley Veterans’ Hospital.
1. Moor J. The Dartmouth College artificial intelligence conference: the next fifty years. AI Mag. 2006;27(4):87-91.
2. Trump D. Accelerating America’s leadership in artificial intelligence. https://www.whitehouse.gov/articles/accelerating-americas-leadership-in-artificial-intelligence. Published February 11, 2019. Accessed September 4, 2019.
3. Samuel AL. Some studies in machine learning using the game of checkers. IBM J Res Dev. 1959;3(3):210-229.
4. Sarle WS. Neural networks and statistical models. In: Proceedings of the Nineteenth Annual SAS Users Group International Conference. SAS Institute: Cary, North Carolina; 1994:1538-1550. http://www.sascommunity.org/sugi/SUGI94/Sugi-94-255%20Sarle.pdf. Accessed September 16, 2019.
5. Schmidhuber J. Deep learning in neural networks: an overview. Neural Networks. 2015;61:85-117.
6. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436-444.
7. Jiang F, Jiang Y, Li H, et al. Artificial intelligence in healthcare: past, present and future. Stroke Vasc Neurol. 2017;2(4):230-243.
8. Erickson BJ, Korfiatis P, Akkus Z, Kline TL. Machine learning for medical imaging. Radiographics. 2017;37(2):505-515.
9. Deo RC. Machine learning in medicine. Circulation. 2015;132(20):1920-1930.
10. Janowczyk A, Madabhushi A. Deep learning for digital pathology image analysis: a comprehensive tutorial with selected use cases. J Pathol Inform. 2016;7(1):29.
11. Oquab M, Bottou L, Laptev I, Sivic J. Learning and transferring mid-level image representations using convolutional neural networks. Presented at: IEEE Conference on Computer Vision and Pattern Recognition, 2014. http://openaccess.thecvf.com/content_cvpr_2014/html/Oquab_Learning_and_Transferring_2014_CVPR_paper.html. Accessed September 4, 2019.
12. Shin HC, Roth HR, Gao M, et al. Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Trans Med Imaging. 2016;35(5):1285-1298.
13. Tajbakhsh N, Shin JY, Gurudu SR, et al. Convolutional neural networks for medical image analysis: full training or fine tuning? IEEE Trans Med Imaging. 2016;35(5):1299-1312.
14. Cloud AutoML. https://cloud.google.com/automl. Accessed September 4, 2019.
15. Create ML. https://developer.apple.com/documentation/createml. Accessed September 4, 2019.
16. Zullig LL, Sims KJ, McNeil R, et al. Cancer incidence among patients of the U.S. Veterans Affairs Health Care System: 2010 Update. Mil Med. 2017;182(7):e1883-e1891.
17. Borkowski AA, Wilson CP, Borkowski SA, Deland LA, Mastorides SM. Using Apple machine learning algorithms to detect and subclassify non-small cell lung cancer. https://arxiv.org/ftp/arxiv/papers/1808/1808.08230.pdf. Accessed September 4, 2019.
18. Borkowski AA, Wilson CP, Borkowski SA, Thomas LB, Deland LA, Mastorides SM. Apple machine learning algorithms successfully detect colon cancer but fail to predict KRAS mutation status. http://arxiv.org/abs/1812.04660. Revised January 15, 2019. Accessed September 4, 2019.
19. Armaghany T, Wilson JD, Chu Q, Mills G. Genetic alterations in colorectal cancer. Gastrointest Cancer Res. 2012;5(1):19-27.
20. Herzig DO, Tsikitis VL. Molecular markers for colon diagnosis, prognosis and targeted therapy. J Surg Oncol. 2015;111(1):96-102.
21. Ma W, Brodie S, Agersborg S, Funari VA, Albitar M. Significant improvement in detecting BRAF, KRAS, and EGFR mutations using next-generation sequencing as compared with FDA-cleared kits. Mol Diagn Ther. 2017;21(5):571-579.
22. Greco FA. Molecular diagnosis of the tissue of origin in cancer of unknown primary site: useful in patient management. Curr Treat Options Oncol. 2013;14(4):634-642.
23. Bejnordi BE, Veta M, van Diest PJ, et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA. 2017;318(22):2199-2210.
24. Xiong Y, Ba X, Hou A, Zhang K, Chen L, Li T. Automatic detection of mycobacterium tuberculosis using artificial intelligence. J Thorac Dis. 2018;10(3):1936-1940.
25. Cruz-Roa A, Gilmore H, Basavanhally A, et al. Accurate and reproducible invasive breast cancer detection in whole-slide images: a deep learning approach for quantifying tumor extent. Sci Rep. 2017;7:46450.
26. Coudray N, Ocampo PS, Sakellaropoulos T, et al. Classification and mutation prediction from non–small cell lung cancer histopathology images using deep learning. Nat Med. 2018;24(10):1559-1567.
27. Ertosun MG, Rubin DL. Automated grading of gliomas using deep learning in digital pathology images: a modular approach with ensemble of convolutional neural networks. AMIA Annu Symp Proc. 2015;2015:1899-1908.
28. Wahab N, Khan A, Lee YS. Two-phase deep convolutional neural network for reducing class skewness in histopathological images based breast cancer detection. Comput Biol Med. 2017;85:86-97.
29. Esteva A, Kuprel B, Novoa RA, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542(7639):115-118.
30. Han SS, Park GH, Lim W, et al. Deep neural networks show an equivalent and often superior performance to dermatologists in onychomycosis diagnosis: automatic construction of onychomycosis datasets by region-based convolutional deep neural network. PLoS One. 2018;13(1):e0191493.
31. Fujisawa Y, Otomo Y, Ogata Y, et al. Deep-learning-based, computer-aided classifier developed with a small dataset of clinical images surpasses board-certified dermatologists in skin tumour diagnosis. Br J Dermatol. 2019;180(2):373-381.
32. Gulshan V, Peng L, Coram M, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA. 2016;316(22):2402-2410.
33. Weng SF, Reps J, Kai J, Garibaldi JM, Qureshi N. Can machine-learning improve cardiovascular risk prediction using routine clinical data? PLoS One. 2017;12(4):e0174944.
34. Cheng J-Z, Ni D, Chou Y-H, et al. Computer-aided diagnosis with deep learning architecture: applications to breast lesions in US images and pulmonary nodules in CT scans. Sci Rep. 2016;6(1):24454.
35. Wang X, Yang W, Weinreb J, et al. Searching for prostate cancer by fully automated magnetic resonance imaging classification: deep learning versus non-deep learning. Sci Rep. 2017;7(1):15415.
36. Lakhani P, Sundaram B. Deep learning at chest radiography: automated classification of pulmonary tuberculosis by using convolutional neural networks. Radiology. 2017;284(2):574-582.
37. Bardou D, Zhang K, Ahmad SM. Classification of breast cancer based on histology images using convolutional neural networks. IEEE Access. 2018;6(6):24680-24693.
38. Sheikhzadeh F, Ward RK, van Niekerk D, Guillaud M. Automatic labeling of molecular biomarkers of immunohistochemistry images using fully convolutional networks. PLoS One. 2018;13(1):e0190783.
39. Metter DM, Colgan TJ, Leung ST, Timmons CF, Park JY. Trends in the US and Canadian pathologist workforces from 2007 to 2017. JAMA Netw Open. 2019;2(5):e194337.
40. Benediktsson H, Whitelaw J, Roy I. Pathology services in developing countries: a challenge. Arch Pathol Lab Med. 2007;131(11):1636-1639.
41. Graves D. The impact of the pathology workforce crisis on acute health care. Aust Health Rev. 2007;31(suppl 1):S28-S30.
42. NHS pathology shortages cause cancer diagnosis delays. https://www.gmjournal.co.uk/nhs-pathology-shortages-are-causing-cancer-diagnosis-delays. Published September 18, 2018. Accessed September 4, 2019.
43. Abbott LM, Smith SD. Smartphone apps for skin cancer diagnosis: Implications for patients and practitioners. Australas J Dermatol. 2018;59(3):168-170.
44. Poostchi M, Silamut K, Maude RJ, Jaeger S, Thoma G. Image analysis and machine learning for detecting malaria. Transl Res. 2018;194:36-55.
45. Orange DE, Agius P, DiCarlo EF, et al. Identification of three rheumatoid arthritis disease subtypes by machine learning integration of synovial histologic features and RNA sequencing data. Arthritis Rheumatol. 2018;70(5):690-701.
46. Rodellar J, Alférez S, Acevedo A, Molina A, Merino A. Image processing and machine learning in the morphological analysis of blood cells. Int J Lab Hematol. 2018;40(suppl 1):46-53.
47. Litjens G, Kooi T, Bejnordi BE, et al. A survey on deep learning in medical image analysis. Med Image Anal. 2017;42:60-88.
Artificial intelligence (AI), first described in 1956, encompasses the field of computer science in which machines are trained to learn from experience. The term was popularized by the 1956 Dartmouth College Summer Research Project on Artificial Intelligence.1 The field of AI is rapidly growing and has the potential to affect many aspects of our lives. The emerging importance of AI is demonstrated by a February 2019 executive order that launched the American AI Initiative, allocating resources and funding for AI development.2 The executive order stresses the potential impact of AI in the health care field, including its potential utility to diagnose disease. Federal agencies were directed to invest in AI research and development to promote rapid breakthroughs in AI technology that may impact multiple areas of society.
Machine learning (ML), a subset of AI, was defined in 1959 by Arthur Samuel and is achieved by employing mathematic models to compute sample data sets.3 Originating from statistical linear models, neural networks were conceived to accomplish these tasks.4 These pioneering scientific achievements led to recent developments of deep neural networks. These models are developed to recognize patterns and achieve complex computational tasks within a matter of minutes, often far exceeding human ability.5 ML can increase efficiency with decreased computation time, high precision, and recall when compared with that of human decision making.6
ML has the potential for numerous applications in the health care field.7-9 One promising application is in the field of anatomic pathology. ML allows representative images to be used to train a computer to recognize patterns from labeled photographs. Based on a set of images selected to represent a specific tissue or disease process, the computer can be trained to evaluate and recognize new and unique images from patients and render a diagnosis.10 Prior to modern ML models, users would have to import many thousands of training images to produce algorithms that could recognize patterns with high accuracy. Modern ML algorithms allow for a model known as transfer learning, such that far fewer images are required for training.11-13
Two novel ML platforms available for public use are offered through Google (Mountain View, CA) and Apple (Cupertino, CA).14,15 They each offer a user-friendly interface with minimal experience required in computer science. Google AutoML uses ML via cloud services to store and retrieve data with ease. No coding knowledge is required. The Apple Create ML Module provides computer-based ML, requiring only a few lines of code.
The Veterans Health Administration (VHA) is the largest single health care system in the US, and nearly 50 000 cancer cases are diagnosed at the VHA annually.16 Cancers of the lung and colon are among the most common sources of invasive cancer and are the 2 most common causes of cancer deaths in America.16 We have previously reported using Apple ML to accurately detect non-small cell lung cancers (NSCLCs), including adenocarcinomas and squamous cell carcinomas (SCCs), as well as colon cancers.17,18 In the present study, we expand on these findings by comparing Apple and Google ML platforms in a variety of common pathologic scenarios in veteran patients. Using limited training data, both programs are compared for precision and recall in differentiating conditions involving lung and colon pathology.
In the first 4 experiments, we evaluated the ability of the platforms to differentiate normal lung tissue from cancerous lung tissue, to distinguish lung adenocarcinoma from SCC, and to differentiate colon adenocarcinoma from normal colon tissue. Next, cases of colon adenocarcinoma were assessed to determine whether the presence or absence of mutations in the KRAS proto-oncogene could be determined histologically using the AI platforms. Mutations in KRAS are found in a variety of cancers, including about 40% of colon adenocarcinomas.19 For colon cancers, the presence or absence of the mutation in KRAS has important implications for patients as it determines whether the tumor will respond to specific chemotherapy agents.20 The presence of the mutation is currently determined by complex molecular testing of tumor tissue.21 However, we assessed the potential of ML to determine whether the mutation is present by computerized morphologic analysis alone. Our last experiment examined the ability of the Apple and Google platforms to differentiate between adenocarcinomas of lung origin vs colon origin. This has potential utility in determining the site of origin of metastatic carcinoma.22
Methods
Fifty cases of lung SCC, 50 cases of lung adenocarcinoma, and 50 cases of colon adenocarcinoma were randomly retrieved from our molecular database. Twenty-five colon adenocarcinoma cases were positive for mutation in KRAS, while 25 cases were negative for mutation in KRAS. Seven hundred fifty total images of lung tissue (250 benign lung tissue, 250 lung adenocarcinomas, and 250 lung SCCs) and 500 total images of colon tissue (250 benign colon tissue and 250 colon adenocarcinoma) were obtained using a Leica Microscope MC190 HD Camera (Wetzlar, Germany) connected to an Olympus BX41 microscope (Center Valley, PA) and the Leica Acquire 9072 software for Apple computers. All the images were captured at a resolution of 1024 x 768 pixels using a 60x dry objective. Lung tissue images were captured and saved on a 2012 Apple MacBook Pro computer, and colon images were captured and saved on a 2011 Apple iMac computer. Both computers were running macOS v10.13.
Creating Image Classifier Models Using Apple Create ML
Apple Create ML is a suite of products that use various tools to create and train custom ML models on Apple computers.15 The suite contains many features, including image classification to train an ML model to classify images, natural language processing to classify natural language text, and tabular data to train models that deal with labeling information or estimating new quantities. We used Create ML Image Classification to create image classifier models for our project (Appendix A).
Creating ML Modules Using Google Cloud AutoML Vision Beta
Google Cloud AutoML is a suite of machine learning products, including AutoML Vision, AutoML Natural Language, and AutoML Translation.14 All Cloud AutoML machine learning products were in beta version at the time of experimentation. We used Cloud AutoML Vision beta to create ML modules for our project. Unlike Apple Create ML, which is run on a local Apple computer, Google Cloud AutoML is run online using a Google Cloud account. There are no minimum specification requirements for the local computer because the platform uses cloud-based architecture (Appendix B).
Experiment 1
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to detect and subclassify NSCLC based on the histopathologic images. We created 3 classes of images (250 images each): benign lung tissue, lung adenocarcinoma, and lung SCC.
Experiment 2
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to differentiate between normal lung tissue and NSCLC histopathologic images with 50/50 mixture of lung adenocarcinoma and lung SCC. We created 2 classes of images (250 images each): benign lung tissue and lung NSCLC.
Experiment 3
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to differentiate between lung adenocarcinoma and lung SCC histopathologic images. We created 2 classes of images (250 images each): adenocarcinoma and SCC.
Experiment 4
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to detect colon cancer histopathologic images regardless of KRAS mutation status. We created 2 classes of images (250 images each): benign colon tissue and colon adenocarcinoma.
Experiment 5
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to differentiate between histopathologic images of colon adenocarcinoma with mutations in KRAS and those of colon adenocarcinoma without mutations in KRAS. We created 2 classes of images (125 images each): colon adenocarcinoma cases with mutation in KRAS and colon adenocarcinoma cases without the mutation in KRAS.
Experiment 6
We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to differentiate between lung adenocarcinoma and colon adenocarcinoma histopathologic images. We created 2 classes of images (250 images each): colon adenocarcinoma and lung adenocarcinoma.
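The class layout of the 6 experiments can be summarized in a small data structure. The Python sketch below is illustrative only and was not part of the study workflow; the name `EXPERIMENTS`, the helper functions, and the label strings are shorthand chosen for this example and are not identifiers used by either platform. The image counts are those stated in the experiment descriptions above.

```python
# Class layout of the 6 experiments; counts are taken from the text above.
# EXPERIMENTS and the label strings are illustrative shorthand only.
EXPERIMENTS = {
    1: {"benign lung": 250, "lung adenocarcinoma": 250, "lung SCC": 250},
    2: {"benign lung": 250, "lung NSCLC": 250},
    3: {"lung adenocarcinoma": 250, "lung SCC": 250},
    4: {"benign colon": 250, "colon adenocarcinoma": 250},
    5: {"colon adenocarcinoma, KRAS mutated": 125,
        "colon adenocarcinoma, KRAS not mutated": 125},
    6: {"colon adenocarcinoma": 250, "lung adenocarcinoma": 250},
}


def total_images(experiment: int) -> int:
    """Return the number of images used across all classes of one experiment."""
    return sum(EXPERIMENTS[experiment].values())


def is_balanced(experiment: int) -> bool:
    """Return True when every class in the experiment has the same image count."""
    return len(set(EXPERIMENTS[experiment].values())) == 1


# Print experiment number, total image count, and whether classes are balanced.
for number in sorted(EXPERIMENTS):
    print(number, total_images(number), is_balanced(number))
```

Every experiment uses equally sized classes, so a classifier that always predicted a single class would be correct for only half of the images in the 2-class experiments and one-third in the 3-class experiment.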
Results
Twelve machine learning models were created in 6 experiments using the Apple Create ML and the Google AutoML (Table). To investigate recall and precision differences between the Apple and the Google ML algorithms, we performed 2-tailed paired t tests. No statistically significant differences were found (P = .52 for recall and .60 for precision).
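For readers who wish to reproduce this type of comparison, a 2-tailed paired t test can be sketched in Python using only the standard library. This is a minimal sketch, not the software used in the study: the function names are hypothetical, the P value is obtained by numerically integrating the Student t density with Simpson's rule, and the 2 input lists are placeholder numbers invented for illustration rather than the values reported in the Table.

```python
import math
from statistics import mean, stdev


def t_pdf(x: float, df: int) -> float:
    """Probability density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def paired_t_test(a, b, steps: int = 10_000):
    """Two-tailed paired t test; returns (t statistic, P value).

    steps must be even because Simpson's rule is used for the integration.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    n = len(diffs)
    t = mean(diffs) / (sd / math.sqrt(n))
    df = n - 1
    # Integrate the density from 0 to |t| with Simpson's rule.
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3
    # Two-tailed P value: probability mass outside [-|t|, |t|].
    return t, 1 - 2 * central_half


# Placeholder values for 6 paired models; NOT the values from the study.
platform_a = [0.90, 0.95, 0.88, 0.97, 0.60, 0.93]
platform_b = [0.92, 0.94, 0.90, 0.96, 0.65, 0.91]
t_stat, p_value = paired_t_test(platform_a, platform_b)
print(f"t = {t_stat:.3f}, P = {p_value:.3f}")
```

The test is paired because each experiment yields one value per platform on the same image set, so the comparison is made on the 6 within-experiment differences rather than on 2 independent groups.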
Overall, each model performed well in distinguishing between normal and neoplastic tissue for both lung and colon cancers. In subclassifying NSCLC into adenocarcinoma and SCC, the models were shown to have high levels of precision and recall. The models also were successful in distinguishing between lung and colonic origin of adenocarcinoma (Figures 1-4). However, both systems had trouble discerning colon adenocarcinoma with mutations in KRAS from adenocarcinoma without mutations in KRAS.
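Precision and recall, the 2 measures reported here, follow standard definitions: precision is the fraction of images assigned to a class that truly belong to it, and recall is the fraction of images truly in a class that the model assigns to it. The Python sketch below illustrates these definitions for a single class. The function name and the toy labels are assumptions made for illustration; the sketch does not reproduce how either platform aggregates its reported values.

```python
def precision_recall(true_labels, predicted_labels, positive):
    """Return (precision, recall) for one class treated as the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy example with invented labels (not study data).
truth = ["cancer", "cancer", "cancer", "benign", "benign"]
predicted = ["cancer", "cancer", "benign", "cancer", "benign"]
print(precision_recall(truth, predicted, positive="cancer"))
```

In this toy example, 2 of the 3 images predicted as cancer are truly cancer, and 2 of the 3 true cancer images are detected, so both measures equal two-thirds.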
Discussion
Image classifier models using ML algorithms hold promise for revolutionizing the health care field. ML products, such as those modules offered by Apple and Google, are easy to use and have a simple graphical user interface to allow individuals to train models to perform humanlike tasks in real time. In our experiments, we compared multiple algorithms to determine their ability to differentiate and subclassify histopathologic images with high precision and recall using common scenarios in treating veteran patients.
Analysis of the results revealed high precision and recall values illustrating the models’ ability to differentiate and detect benign lung tissue from lung SCC and lung adenocarcinoma in ML model 1, benign lung from NSCLC in ML model 2, and benign colon from colonic adenocarcinoma in ML model 4. In ML models 3 and 6, both ML algorithms performed at a high level to differentiate lung SCC from lung adenocarcinoma and lung adenocarcinoma from colonic adenocarcinoma, respectively. Of note, ML model 5 had the lowest precision and recall values across both algorithms, demonstrating the models’ limited utility in predicting molecular profiles, such as mutations in KRAS as tested here. This is not surprising as pathologists currently require complex molecular tests to detect mutations in KRAS reliably in colon cancer.
Both modules require minimal programming experience and are easy to use. In our comparison, we demonstrated critical distinguishing characteristics that differentiate the 2 products.
Apple Create ML image classifier is available for use on local Mac computers that use Xcode version 10 and macOS 10.14 or later, with just 3 lines of code required to perform computations. Although this product is limited to Apple computers, it is free to use, and images are stored on the computer hard drive. Of unique significance on the Apple system platform, images can be augmented to alter their appearance to enhance model training. For example, imported images can be cropped, rotated, blurred, and flipped, in order to optimize the model’s training abilities to recognize test images and perform pattern recognition. This feature is not as readily available on the Google platform. Apple Create ML Image classifier’s default training set consists of 75% of total imported images with 5% of the total images being randomly used as a validation set. The remaining 20% of images comprise the testing set. The module’s computational analysis to train the model is achieved in about 2 minutes on average. The score threshold is set at 50% and cannot be manipulated for each image class as in Google AutoML Vision.
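The role of a score threshold can be illustrated with a short Python sketch. This is an assumption-laden illustration, not either platform's implementation: the function name `labels_above_threshold` and the confidence scores are hypothetical, and the sketch simply keeps the class labels whose confidence meets or exceeds the chosen threshold.

```python
def labels_above_threshold(scores: dict, threshold: float = 0.5) -> list:
    """Return class labels whose confidence is at least the threshold, highest first."""
    kept = [(label, score) for label, score in scores.items() if score >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return [label for label, _ in kept]


# Hypothetical confidence scores for one image (not study data).
scores = {"lung adenocarcinoma": 0.62, "lung SCC": 0.38}
print(labels_above_threshold(scores))        # default threshold of 50%
print(labels_above_threshold(scores, 0.70))  # stricter threshold
```

Raising the threshold typically yields fewer but more confident predictions, which tends to increase precision at the expense of recall; a fixed 50% threshold removes the option to make that trade-off.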
Google AutoML Vision is open and can be accessed from many devices. It stores images on remote Google servers but requires computing fees after a $300 credit for 12 months. On AutoML Vision, a random 80% of the total images are used in the training set, 10% are used in the validation set, and 10% are used in the testing set. It is important to highlight the different percentages used in the default settings on the respective modules. The time to train the Google AutoML Vision with default computational power is longer on average than Apple Create ML, with about 8 minutes required to train the machine learning module. However, it is possible to choose more computational power for an additional fee and decrease module training time. The user will receive e-mail alerts when the computer time begins and is completed. The computation time is calculated by subtracting the time of the initial e-mail from the final e-mail.
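The effect of the 2 default splits can be illustrated with a brief Python sketch. The function `split_images`, the fixed random seed, and the file names are assumptions made for this example; the sketch mimics a random partition in the stated proportions and does not reproduce the internal sampling of either platform.

```python
import random


def split_images(paths, train, validation, test, seed=0):
    """Randomly partition paths into training, validation, and testing lists."""
    if abs(train + validation + test - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(paths)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * validation)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])  # remainder becomes the testing set


# 500 hypothetical file names, matching the size of a 2-class experiment.
images = [f"image_{i:03d}.jpg" for i in range(500)]

apple_like = split_images(images, 0.75, 0.05, 0.20)   # 75% / 5% / 20%
google_like = split_images(images, 0.80, 0.10, 0.10)  # 80% / 10% / 10%
print([len(part) for part in apple_like])
print([len(part) for part in google_like])
```

For a 500-image experiment, the first split leaves 100 images for testing and the second leaves 50, so the reported precision and recall values rest on testing sets of different sizes under the default settings.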
Based on our calculations, we determined there was no significant difference in the recall and precision values obtained between the 2 machine learning algorithms tested at the default settings. These findings demonstrate the promise of using an ML algorithm to assist in the performance of human tasks and behaviors, specifically the diagnosis of histopathologic images. These results have numerous potential uses in clinical medicine. ML algorithms have been successfully applied to diagnostic and prognostic endeavors in pathology,23-28 dermatology,29-31 ophthalmology,32 cardiology,33 and radiology.34-36
Pathologists often use additional tests, such as special staining of tissues or molecular tests, to assist with accurate classification of tumors. ML platforms offer the potential of an additional tool for pathologists to use along with human microscopic interpretation.37,38 In addition, the number of pathologists in the US is dramatically decreasing, and many other countries have marked physician shortages, especially in fields of specialized training such as pathology.39-42 These models could readily assist physicians in underserved countries and impact shortages of pathologists elsewhere by providing more specific diagnoses in an expedited manner.43
Finally, although we have explored the application of these platforms in common cancer scenarios, great potential exists to use similar techniques in the detection of other conditions. These include the potential for classification and risk assessment of precancerous lesions, infectious processes in tissue (eg, detection of tuberculosis or malaria),24,44 inflammatory conditions (eg, arthritis subtypes, gout),45 blood disorders (eg, abnormal blood cell morphology),46 and many others. The potential of these technologies to improve health care delivery to veteran patients seems to be limited only by the imagination of the user.47
Regarding the limited effectiveness in determining the presence or absence of mutations in KRAS for colon adenocarcinoma, as noted above, pathologists currently rely on complex molecular tests to detect the mutations at the DNA level.21 It is possible that the use of more extensive training data sets may improve recall and precision in cases such as these, which warrants further study. Our experiments were limited to the stipulations placed by the free trial software agreements; no costs were expended to use the algorithms, though an Apple computer was required.
Conclusion
We have demonstrated the successful application of 2 readily available ML platforms in providing diagnostic guidance in differentiation between common cancer conditions in veteran patient populations. Although both platforms performed very well with no statistically significant differences in results, some distinctions are worth noting. Apple Create ML can be used on local computers but is limited to an Apple operating system. Google AutoML is not platform-specific but runs only via Google Cloud with associated computational fees. Using these readily available models, we demonstrated the vast potential of AI in diagnostic pathology. The application of AI to clinical medicine remains in the very early stages. The VA is uniquely poised to provide leadership as AI technologies will continue to dramatically change the future of health care, both in veteran and nonveteran patients nationwide.
Acknowledgments
The authors thank Paul Borkowski for his constructive criticism and proofreading of this manuscript. This material is the result of work supported with resources and the use of facilities at the James A. Haley Veterans’ Hospital.
1. Moor J. The Dartmouth College artificial intelligence conference: the next fifty years. AI Mag. 2006;27(4):87-91.
2. Trump D. Accelerating America’s leadership in artificial intelligence. https://www.whitehouse.gov/articles/accelerating-americas-leadership-in-artificial-intelligence. Published February 11, 2019. Accessed September 4, 2019.
3. Samuel AL. Some studies in machine learning using the game of checkers. IBM J Res Dev. 1959;3(3):210-229.
4. SAS Users Group International. Neural networks and statistical models. In: Sarle WS. Proceedings of the Nineteenth Annual SAS Users Group International Conference. SAS Institute: Cary, North Carolina; 1994:1538-1550. http://www.sascommunity.org/sugi/SUGI94/Sugi-94-255%20Sarle.pdf. Accessed September 16, 2019.
5. Schmidhuber J. Deep learning in neural networks: an overview. Neural Networks. 2015;61:85-117.
6. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436-444.
7. Jiang F, Jiang Y, Li H, et al. Artificial intelligence in healthcare: past, present and future. Stroke Vasc Neurol. 2017;2(4):230-243.
8. Erickson BJ, Korfiatis P, Akkus Z, Kline TL. Machine learning for medical imaging. Radiographics. 2017;37(2):505-515.
9. Deo RC. Machine learning in medicine. Circulation. 2015;132(20):1920-1930.
10. Janowczyk A, Madabhushi A. Deep learning for digital pathology image analysis: a comprehensive tutorial with selected use cases. J Pathol Inform. 2016;7(1):29.
11. Oquab M, Bottou L, Laptev I, Sivic J. Learning and transferring mid-level image representations using convolutional neural networks. Presented at: IEEE Conference on Computer Vision and Pattern Recognition, 2014. http://openaccess.thecvf.com/content_cvpr_2014/html/Oquab_Learning_and_Transferring_2014_CVPR_paper.html. Accessed September 4, 2019.
12. Shin HC, Roth HR, Gao M, et al. Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Trans Med Imaging. 2016;35(5):1285-1298.
13. Tajbakhsh N, Shin JY, Gurudu SR, et al. Convolutional neural networks for medical image analysis: full training or fine tuning? IEEE Trans Med Imaging. 2016;35(5):1299-1312.
14. Cloud AutoML. https://cloud.google.com/automl. Accessed September 4, 2019.
15. Create ML. https://developer.apple.com/documentation/createml. Accessed September 4, 2019.
16. Zullig LL, Sims KJ, McNeil R, et al. Cancer incidence among patients of the U.S. Veterans Affairs Health Care System: 2010 Update. Mil Med. 2017;182(7):e1883-e1891.
17. Borkowski AA, Wilson CP, Borkowski SA, Deland LA, Mastorides SM. Using Apple machine learning algorithms to detect and subclassify non-small cell lung cancer. https://arxiv.org/ftp/arxiv/papers/1808/1808.08230.pdf. Accessed September 4, 2019.
18. Borkowski AA, Wilson CP, Borkowski SA, Thomas LB, Deland LA, Mastorides SM. Apple machine learning algorithms successfully detect colon cancer but fail to predict KRAS mutation status. http://arxiv.org/abs/1812.04660. Revised January 15, 2019. Accessed September 4, 2019.
19. Armaghany T, Wilson JD, Chu Q, Mills G. Genetic alterations in colorectal cancer. Gastrointest Cancer Res. 2012;5(1):19-27.
20. Herzig DO, Tsikitis VL. Molecular markers for colon diagnosis, prognosis and targeted therapy. J Surg Oncol. 2015;111(1):96-102.
21. Ma W, Brodie S, Agersborg S, Funari VA, Albitar M. Significant improvement in detecting BRAF, KRAS, and EGFR mutations using next-generation sequencing as compared with FDA-cleared kits. Mol Diagn Ther. 2017;21(5):571-579.
22. Greco FA. Molecular diagnosis of the tissue of origin in cancer of unknown primary site: useful in patient management. Curr Treat Options Oncol. 2013;14(4):634-642.
23. Bejnordi BE, Veta M, van Diest PJ, et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA. 2017;318(22):2199-2210.
24. Xiong Y, Ba X, Hou A, Zhang K, Chen L, Li T. Automatic detection of mycobacterium tuberculosis using artificial intelligence. J Thorac Dis. 2018;10(3):1936-1940.
25. Cruz-Roa A, Gilmore H, Basavanhally A, et al. Accurate and reproducible invasive breast cancer detection in whole-slide images: a deep learning approach for quantifying tumor extent. Sci Rep. 2017;7:46450.
26. Coudray N, Ocampo PS, Sakellaropoulos T, et al. Classification and mutation prediction from non–small cell lung cancer histopathology images using deep learning. Nat Med. 2018;24(10):1559-1567.
27. Ertosun MG, Rubin DL. Automated grading of gliomas using deep learning in digital pathology images: a modular approach with ensemble of convolutional neural networks. AMIA Annu Symp Proc. 2015;2015:1899-1908.
28. Wahab N, Khan A, Lee YS. Two-phase deep convolutional neural network for reducing class skewness in histopathological images based breast cancer detection. Comput Biol Med. 2017;85:86-97.
29. Esteva A, Kuprel B, Novoa RA, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542(7639):115-118.
30. Han SS, Park GH, Lim W, et al. Deep neural networks show an equivalent and often superior performance to dermatologists in onychomycosis diagnosis: automatic construction of onychomycosis datasets by region-based convolutional deep neural network. PLoS One. 2018;13(1):e0191493.
31. Fujisawa Y, Otomo Y, Ogata Y, et al. Deep-learning-based, computer-aided classifier developed with a small dataset of clinical images surpasses board-certified dermatologists in skin tumour diagnosis. Br J Dermatol. 2019;180(2):373-381.
32. Gulshan V, Peng L, Coram M, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA. 2016;316(22):2402-2410.
33. Weng SF, Reps J, Kai J, Garibaldi JM, Qureshi N. Can machine-learning improve cardiovascular risk prediction using routine clinical data? PLoS One. 2017;12(4):e0174944.
34. Cheng J-Z, Ni D, Chou Y-H, et al. Computer-aided diagnosis with deep learning architecture: applications to breast lesions in US images and pulmonary nodules in CT scans. Sci Rep. 2016;6(1):24454.
35. Wang X, Yang W, Weinreb J, et al. Searching for prostate cancer by fully automated magnetic resonance imaging classification: deep learning versus non-deep learning. Sci Rep. 2017;7(1):15415.
36. Lakhani P, Sundaram B. Deep learning at chest radiography: automated classification of pulmonary tuberculosis by using convolutional neural networks. Radiology. 2017;284(2):574-582.
37. Bardou D, Zhang K, Ahmad SM. Classification of breast cancer based on histology images using convolutional neural networks. IEEE Access. 2018;6(6):24680-24693.
38. Sheikhzadeh F, Ward RK, van Niekerk D, Guillaud M. Automatic labeling of molecular biomarkers of immunohistochemistry images using fully convolutional networks. PLoS One. 2018;13(1):e0190783.
39. Metter DM, Colgan TJ, Leung ST, Timmons CF, Park JY. Trends in the US and Canadian pathologist workforces from 2007 to 2017. JAMA Netw Open. 2019;2(5):e194337.
40. Benediktsson H, Whitelaw J, Roy I. Pathology services in developing countries: a challenge. Arch Pathol Lab Med. 2007;131(11):1636-1639.
41. Graves D. The impact of the pathology workforce crisis on acute health care. Aust Health Rev. 2007;31(suppl 1):S28-S30.
42. NHS pathology shortages cause cancer diagnosis delays. https://www.gmjournal.co.uk/nhs-pathology-shortages-are-causing-cancer-diagnosis-delays. Published September 18, 2018. Accessed September 4, 2019.
43. Abbott LM, Smith SD. Smartphone apps for skin cancer diagnosis: Implications for patients and practitioners. Australas J Dermatol. 2018;59(3):168-170.
44. Poostchi M, Silamut K, Maude RJ, Jaeger S, Thoma G. Image analysis and machine learning for detecting malaria. Transl Res. 2018;194:36-55.
45. Orange DE, Agius P, DiCarlo EF, et al. Identification of three rheumatoid arthritis disease subtypes by machine learning integration of synovial histologic features and RNA sequencing data. Arthritis Rheumatol. 2018;70(5):690-701.
46. Rodellar J, Alférez S, Acevedo A, Molina A, Merino A. Image processing and machine learning in the morphological analysis of blood cells. Int J Lab Hematol. 2018;40(suppl 1):46-53.
47. Litjens G, Kooi T, Bejnordi BE, et al. A survey on deep learning in medical image analysis. Med Image Anal. 2017;42:60-88.