Development and Investigation of Cost-Sensitive Pruned Decision Tree Model for Improved Schizophrenia Diagnosis
Abstract
Schizophrenia, often characterized by delusions, hallucinations, and other cognitive difficulties, affects approximately seventy million adults globally. This study presents a cost-sensitive pruned Decision Tree J48 model for fast and accurate diagnosis of schizophrenia. The model is trained by supervised learning and evaluated with 10-fold cross-validation resampling, and it uses an unstructured filter to replace missing values in the data with the modal values of the corresponding features. Features are selected using Pearson's correlation on one-hot encoded data to detect redundancy in the data. A cost matrix is designed to reduce the tendency of the J48 algorithm to predict false negative outcomes, which in turn reduces the error of diagnosing a schizophrenia candidate as free from the disease. The model diagnoses schizophrenia with 78% accuracy, 89.7% sensitivity, 57.4% specificity, and an area under the Receiver Operating Characteristic (ROC) curve of 0.895. The ROC curve also indicates that the model distinguishes schizophrenia from other conditions with similar symptoms. These results show the potential of machine-learning models for quick, effective diagnosis of schizophrenia.
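Three of the steps named in the abstract (modal-value imputation, a cost matrix that penalizes false negatives, and the reported diagnostic metrics) can be illustrated with a minimal Python sketch. This is not the authors' J48 implementation: the function names, the 5:1 cost ratio, and the toy labels are assumptions chosen only for illustration, and the cost-sensitive step is shown as a generic minimum-expected-cost decision rule applied to a classifier's predicted probability.

```python
from collections import Counter


def impute_mode(rows):
    """Replace None entries in each column with that column's most frequent value.

    Assumes every column has at least one observed (non-None) value.
    """
    columns = list(zip(*rows))
    modes = [
        Counter(v for v in col if v is not None).most_common(1)[0][0]
        for col in columns
    ]
    return [
        [modes[j] if v is None else v for j, v in enumerate(row)]
        for row in rows
    ]


# Hypothetical cost matrix, indexed COST[actual][predicted].
# Class 0 = no schizophrenia, class 1 = schizophrenia.
# A false negative (actual 1, predicted 0) is assumed to cost 5x a false positive.
COST = [
    [0.0, 1.0],
    [5.0, 0.0],
]


def min_cost_class(p_positive, cost=COST):
    """Return the class (0 or 1) with the lowest expected misclassification cost."""
    probs = [1.0 - p_positive, p_positive]
    expected = [
        sum(probs[actual] * cost[actual][pred] for actual in range(2))
        for pred in range(2)
    ]
    return min(range(2), key=lambda pred: expected[pred])


def diagnostic_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, and specificity from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }
```

With the assumed 5:1 costs, the rule predicts the positive class whenever the estimated probability of schizophrenia exceeds 1/6 rather than 1/2. Lowering the threshold in this way trades specificity for sensitivity, which matches the direction of the pattern reported above (89.7% sensitivity against 57.4% specificity).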
Copyright (c) 2020 Ephraim Nwoye, Wai Lok Woo, Obinna Fidelis, Charles Umeh, Bin Gao
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Copyright © by the authors; licensee Research Lake International Inc., Canada. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution Non-Commercial License (CC BY-NC) (http://creative-commons.org/licenses/by-nc/4.0/).