Clinical Decision Support Systems Leverage Machine Learning for Predictive Analytics – Part 1

By Tahir Hameed, Girard School of Business, Merrimack College, North Andover, USA

July 2019

Medical practice has long been at the forefront of data-driven decision-making. For instance, primary care physicians have commonly used several types of risk scores and diagnostic data to predict morbidity and mortality in their patients. However, with terabyte (TB)-scale structured and unstructured datasets now abundant, a massive shift is underway in clinical decision support. The cost, efficiency, and effectiveness of decision-making for care planning, diagnosis, treatment, adherence monitoring, and management of patient health outcomes have been improving at unprecedented rates over the last two decades.

New types of structured healthcare big data include 1) Electronic Health Record (EHR) data (including treatment history, diagnostic reports, medical images, and sensor data) [1, 2], 2) interdisciplinary omics data (genomics, proteomics, etc.) [3], 3) costing data (activities, insurance claims, and payments), and 4) drug R&D and trials data. Unstructured healthcare datasets are also available in the form of patient perception, behavior, and sentiment data, either 5) reported personally on social media (Twitter and Facebook groups, communities like patientslikeme.com, and personal blogs) [4] or 6) collected through wearables [5]. Whereas big data technologies provide the foundational infrastructure and capabilities for collecting, aggregating, storing, and accessing large datasets, it is the data processing and analysis capabilities that generate a major share of business value for health providers [6].

Predictive analytics approaches commonly include statistical analysis, modelling, and machine learning (ML), each serving specific purposes based on its strengths. Statistical analysis and modelling techniques are more suitable for inferring relationships among variables as well as for forecasting, but they are limited in their ability to handle a large number of input variables and very large datasets. In contrast, ML not only recognizes complex patterns in large datasets but also retains (learns) them and applies them to the comparative evaluation of decision options, without going into too many details of the underlying relationships. It rests on signal processing and learning techniques such as Bayesian Networks (BNs), Neural Networks (NNs, including deep learning), Support Vector Machines (SVMs), and Decision Trees (DTs) for predictive modelling in clinical decision support systems (CDSS). This short article highlights the most prominent application areas of ML-based predictive analytics in CDSS.

1. Computer Aided Diagnosis (CAD) and prognosis (progression)

Disease prediction, diagnosis, and prognosis command the most attention from ML researchers in healthcare decision support. Physicians base their diagnosis on the physiological symptoms and pathology of the patient, in addition to a variety of diagnostic scans. Correct identification of a disease or infection generally relies on the correct interpretation of physical, lab, and image data, for which the acceptable ranges and deviations have already been well researched. That makes it feasible to train CAD systems on historic diagnosis data comprising hundreds of parameters. As a result, CAD systems can diagnose the existence of diseases with much higher accuracy and speed without being overloaded by the complexity of the information (for example, see Kourou et al.'s study on cancer diagnosis and prognosis [7]). Images have traditionally been harder to interpret, but that is changing, as demonstrated by Madabhushi and Lee (2016) [8] in their paper on image analysis for digital pathology. In recent studies involving thousands of chest x-rays, ML-based CAD systems have outperformed radiologists in the correct diagnosis of malignant tumors, pulmonary tuberculosis, and other lung lesions [9]. Similarly, CAD systems trained on MRI data have learned to diagnose and suggest a prognosis for Parkinson's and Alzheimer's diseases and several other neurological disorders [10, 11, 12]. These CAD systems and mobile applications are in either early or advanced stages of adoption in clinics worldwide.

2. Risk stratification and preventive healthcare

Risk stratification of patients into high-, medium-, and low-risk groups for the most common diseases has long been normal practice in healthcare, guiding providers in managing patient outcomes effectively and in a timely manner. Early detection of diseases and preventive responses save lives. Sedentary lifestyles, smoking, drinking, and unhealthy diets bring obesity and disease, which are further aggravated by comorbidities. Given that all such patient information typically resides in longitudinal EHR data, classification of high-cost and high-risk patients has become more frequent, more valuable, and easier [13, 14].
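To make the idea of stratification concrete, the following is a minimal sketch in Python. It assumes that some predictive model has already produced a risk probability for each patient. The cutoffs (0.2 and 0.6), the function names, and the sample patients are hypothetical and chosen only for illustration; in practice the thresholds would be set clinically.

```python
from collections import defaultdict

# Hypothetical cutoffs for illustration only; not taken from any study.
LOW_CUTOFF = 0.2
HIGH_CUTOFF = 0.6


def stratify(risk):
    """Map a predicted risk probability in [0, 1] to a risk tier."""
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must be a probability between 0 and 1")
    if risk >= HIGH_CUTOFF:
        return "high"
    if risk >= LOW_CUTOFF:
        return "medium"
    return "low"


def group_by_tier(predicted_risks):
    """Group patient IDs by tier, given a {patient_id: risk} mapping."""
    tiers = defaultdict(list)
    for patient_id, risk in predicted_risks.items():
        tiers[stratify(risk)].append(patient_id)
    return dict(tiers)


# Made-up patients and risk scores.
example_risks = {"P001": 0.05, "P002": 0.35, "P003": 0.82, "P004": 0.60}
example_tiers = group_by_tier(example_risks)
```

Here a score exactly on a cutoff falls into the higher tier, which is a design choice rather than a clinical rule.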

Another popular application of ML-based risk stratification and preventive response is the reduction of 30-day (or unnecessary) hospital readmission rates [15]. Readmissions within 30 days can often be prevented through better discharge and care planning; they reflect poorly on care quality and waste limited resources. A quick review of the 30-day readmission literature shows more emphasis on risk stratification for patients with specific clinical or non-clinical conditions than on all-condition hospitalizations, owing to the complexity involved.
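Before any readmission model can be trained, each hospital stay needs a label saying whether it was followed by a readmission. The sketch below shows one way this could be derived from admission and discharge dates. It assumes an all-cause definition in which a stay counts as readmitted when the same patient's next admission begins 0 to 30 days after discharge; real definitions often exclude planned readmissions and transfers. The function name and the sample stays are hypothetical.

```python
from datetime import date


def label_readmissions(stays, window_days=30):
    """Label each stay True if the same patient is admitted again
    within `window_days` of that stay's discharge date.

    `stays` is a list of (patient_id, admitted, discharged) tuples.
    Returns a dict keyed by (patient_id, admitted).
    """
    by_patient = {}
    for patient_id, admitted, discharged in stays:
        by_patient.setdefault(patient_id, []).append((admitted, discharged))

    labels = {}
    for patient_id, visits in by_patient.items():
        visits.sort()  # chronological order by admission date
        for i, (admitted, discharged) in enumerate(visits):
            readmitted = False
            if i + 1 < len(visits):
                next_admitted = visits[i + 1][0]
                gap_days = (next_admitted - discharged).days
                readmitted = 0 <= gap_days <= window_days
            labels[(patient_id, admitted)] = readmitted
    return labels


# Made-up stays for two patients.
example_stays = [
    ("A", date(2019, 1, 1), date(2019, 1, 5)),
    ("A", date(2019, 1, 20), date(2019, 1, 25)),
    ("A", date(2019, 4, 1), date(2019, 4, 3)),
    ("B", date(2019, 2, 10), date(2019, 2, 12)),
]
example_labels = label_readmissions(example_stays)
readmission_rate = sum(example_labels.values()) / len(example_labels)
```

Labels of this kind can then serve as the target variable for a classifier, and their mean gives the observed readmission rate for the cohort.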

In the next part of this article, we will continue to discuss other CDSS areas impacted by ML-based predictive analytics, including clinical pathways optimization [16], genomics [17], and precision medicine [18].


  1. K. Häyrinen, K. Saranto, and P. Nykänen, “Definition, structure, content, use and impacts of electronic health records: a review of the research literature,” International journal of medical informatics, vol. 77, pp. 291-304, 2008.
  2. R. Kohli and S. S.-L. Tan, “Electronic health records: how can IS researchers contribute to transforming healthcare?,” MIS Quarterly, vol. 40, pp. 553-573, 2016.
  3. C. Manzoni, D. A. Kia, J. Vandrovcova, J. Hardy, N. W. Wood, P. A. Lewis, et al., “Genome, transcriptome and proteome: the rise of omics data and their integration in biomedical sciences,” Briefings in bioinformatics, vol. 19, pp. 286-302, 2016.
  4. M. L. Antheunis, K. Tates, and T. E. Nieboer, “Patients’ and health professionals’ use of social media in health care: motives, barriers and expectations,” Patient education and counseling, vol. 92, pp. 426-431, 2013.
  5. G. Appelboom, E. Camacho, M. E. Abraham, S. S. Bruce, E. L. Dumont, B. E. Zacharia, et al., “Smart wearable body sensors for patient self-assessment and monitoring,” Archives of public health, vol. 72, p. 28, 2014.
  6. Y. Wang and N. Hajli, “Exploring the path to big data analytics success in healthcare,” Journal of Business Research, vol. 70, pp. 287-299, 2017.
  7. K. Kourou, T. P. Exarchos, K. P. Exarchos, M. V. Karamouzis, and D. I. Fotiadis, “Machine learning applications in cancer prognosis and prediction,” Computational and structural biotechnology journal, vol. 13, pp. 8-17, 2015.
  8. A. Madabhushi and G. Lee, “Image analysis and machine learning in digital pathology: Challenges and opportunities,” Elsevier, 2016.
  9. J. G. Nam, S. Park, E. J. Hwang, J. H. Lee, K.-N. Jin, K. Y. Lim, et al., “Development and validation of deep learning–based automatic detection algorithm for malignant pulmonary nodules on chest radiographs,” Radiology, vol. 290, pp. 218-228, 2018.
  10. G. Litjens, T. Kooi, B. E. Bejnordi, A. A. A. Setio, F. Ciompi, M. Ghafoorian, et al., “A survey on deep learning in medical image analysis,” Medical image analysis, vol. 42, pp. 60-88, 2017.
  11. M. Nilashi, O. Ibrahim, H. Ahmadi, L. Shahmoradi, and M. Farahmand, “A hybrid intelligent system for the prediction of Parkinson’s Disease progression using machine learning techniques,” Biocybernetics and Biomedical Engineering, vol. 38, pp. 1-15, 2018.
  12. M. A. Mazurowski, M. Buda, A. Saha, and M. R. Bashir, “Deep learning in radiology: An overview of the concepts and a survey of the state of the art with focus on MRI,” Journal of Magnetic Resonance Imaging, vol. 49, pp. 939-954, 2019.
  13. D. W. Bates, S. Saria, L. Ohno-Machado, A. Shah, and G. Escobar, “Big data in health care: using analytics to identify and manage high-risk and high-cost patients,” Health Affairs, vol. 33, pp. 1123-1131, 2014.
  14. J. H. Chen and S. M. Asch, “Machine learning and prediction in medicine—beyond the peak of inflated expectations,” The New England journal of medicine, vol. 376, p. 2507, 2017.
  15. E. Demir, “A decision support tool for predicting patients at risk of readmission: A comparison of classification trees, logistic regression, generalized additive models, and multivariate adaptive regression splines,” Decision Sciences, vol. 45, pp. 849-880, 2014.
  16. Z. Huang, W. Dong, L. Ji, and H. Duan, “Predictive monitoring of clinical pathways,” Expert Systems with Applications, vol. 56, pp. 227-241, 2016.
  17. M. W. Libbrecht and W. S. Noble, “Machine learning applications in genetics and genomics,” Nature Reviews Genetics, vol. 16, p. 321, 2015.
  18. S.-I. Lee, S. Celik, B. A. Logsdon, S. M. Lundberg, T. J. Martins, V. G. Oehler, et al., “A machine learning approach to integrate big data for precision medicine in acute myeloid leukemia,” Nature communications, vol. 9, p. 42, 2018.

Dr. Tahir Hameed has been with Merrimack College since 2018, where he teaches courses related to information systems. Prior to joining Merrimack, Dr. Hameed was associated with SolBridge International School of Business in South Korea from 2012 to 2018, where he taught in the areas of information systems, technology management, and business analytics at the master's and bachelor's levels. Dr. Hameed obtained his Ph.D. in Information Technology Management from the Korea Advanced Institute of Science and Technology (KAIST) and his Master's in Computer Science from Lahore University of Management Sciences (LUMS). His current research focuses on health analytics, health IT, IT standards, technology commercialization, IT-enabled change management, and knowledge management. He has published in prestigious journals such as Computers in Human Behavior, Sustainability, Journal of Knowledge Management, Telecommunications Policy, Technological Forecasting and Social Change, and World Development. He has presented several papers at leading conferences, including the International Conference on Health Informatics, the IEEE conference on Industrial Engineering and Engineering Management, and the Australasian Conference on Information Systems. He can be reached at hameedt@merrimack.edu.


Syed Hashim Raza Bukhari received his Ph.D. degree in Electrical Engineering in 2017 from COMSATS University Islamabad (CUI), Wah Cantt, Pakistan. Earlier, he received his M.S. and B.Eng. degrees in Computer Engineering in 2011 and 2007, respectively. Hashim has more than 12 years of experience in academia and has received numerous commendations for his contributions to the improvement of standards. He also received the research productivity award from COMSATS University Islamabad for his research contributions in 2017. His research interests include issues in wireless sensor networks with dynamic spectrum access, cognitive radio networks, and ad hoc networks. He is currently editor of the IEEE Future Directions Newsletter: Technology, Policy & Ethics and guest editor of the Springer Journal of Network and Systems Management (JNSM). He is also a reviewer for several prestigious journals, including IEEE Communications Magazine, IEEE Transactions on Industrial Informatics, IEEE Transactions on Wireless Communications, IEEE Transactions on Vehicular Technology, IEEE Communications Letters, IEEE Access, Elsevier's Computers and Electrical Engineering (CAEE) journal, Elsevier Journal of Network and Computer Applications (JNCA), Ad Hoc & Sensor Wireless Networks (AHSWN) Journal, Springer Wireless Networks Journal, Elsevier Pervasive and Mobile Computing journal, and the Journal of Communications and Networks (JCN).