Interpretation of the Result and Visualization of the Tree
The tree produced by the J48 algorithm shows which attribute values were used to classify the dataset. The node at the very top of the tree (the root) represents the most important decision maker. Moving down from the root, the lower nodes are ordered by how much each contributes to a correct decision.
The algorithm was run to predict thoracic CT involvement, the extent of such involvement, and the need for intensive care. Three separate decision trees were therefore formed and illustrated.
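How an attribute comes to sit at the root can be illustrated with a small sketch. J48 is WEKA's implementation of C4.5, which ranks candidate attributes by gain ratio (information gain divided by split information). The code below is a simplified illustration for nominal attributes only; the attribute names and toy records are hypothetical and are not taken from the study dataset, and pruning and numeric thresholds are left out.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def gain_ratio(values, labels):
    """Gain ratio of a nominal attribute: information gain / split information."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted entropy of the class labels after splitting on the attribute
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(values)  # entropy of the attribute's own value distribution
    return info_gain / split_info if split_info > 0 else 0.0

# Hypothetical toy records (illustration only, not study data)
fever = ["yes", "yes", "no", "no", "yes", "no"]
cough = ["yes", "no", "yes", "no", "yes", "no"]
ct_involvement = ["pos", "pos", "neg", "neg", "pos", "neg"]

scores = {"fever": gain_ratio(fever, ct_involvement),
          "cough": gain_ratio(cough, ct_involvement)}
root = max(scores, key=scores.get)  # attribute with the highest gain ratio
print(scores, "root:", root)
```

In this toy example the attribute that separates the classes perfectly receives the highest score and would be placed at the root; the same ranking is then repeated on each branch to choose the lower nodes.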
After the dataset was prepared in WEKA, the J48 Decision Tree and linear regression algorithms were run in our study. For each algorithm, the dataset was divided into training and testing subsets. Cross-validation was used as the first testing criterion: based on the specified number of folds, the corresponding fraction of the dataset was held out for testing in each round. Percentage split was used as the second testing criterion, in which a fixed percentage of the dataset is used for training and the remainder for testing. The classification performance obtained under the two criteria was then compared.
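The difference between the two testing criteria can be sketched in a few lines. The values used below (10 folds, a 66% training split, 100 samples) are assumptions chosen for illustration, since the text does not state the settings used; the function names are hypothetical, and the sketch does not stratify folds by class.

```python
import random

def percentage_split(n_samples, train_percent, seed=0):
    """Shuffle indices once; the first part trains, the remainder tests."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = round(n_samples * train_percent / 100)
    return indices[:cut], indices[cut:]

def k_fold_splits(n_samples, k, seed=0):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# Assumed settings for illustration: 100 samples, 66% training, 10 folds
train, test = percentage_split(100, 66)
folds = list(k_fold_splits(100, 10))
print(len(train), len(test), len(folds))
```

With percentage split a model is evaluated once on a single held-out part, whereas with cross-validation it is trained and evaluated once per fold and the results are averaged, which makes the estimate less dependent on one particular split.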
The linear regression algorithm, which was chosen as the second step, is a statistical method for modelling and formulating the relationship between a dependent variable and the independent variables. When there is more than one input parameter, the multiple form of the linear regression method is used. A formula is obtained as a result of this algorithm. When calculating the formula, the difference between the real value and the estimated value is squared for each sample, so that every term is non-negative. These squared differences are then added over all samples. Minimizing this total gives the best-fitting formula.
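The fitting procedure described above corresponds to ordinary least squares. The following is a minimal sketch under stated assumptions: it solves the normal equations in pure Python, assumes the input columns are not collinear, and uses hypothetical data generated from y = 1 + 2*x1 + 3*x2. It does not reproduce the additional options of WEKA's own linear regression implementation.

```python
def fit_least_squares(X, y):
    """Fit y ~ b0 + b1*x1 + ... by solving the normal equations (A^T A) b = A^T y."""
    A = [[1.0] + list(row) for row in X]  # prepend a column of ones for the intercept
    n = len(A[0])
    # Augmented matrix [A^T A | A^T y]
    M = [[sum(r[i] * r[j] for r in A) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(A, y))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def sum_squared_errors(coeffs, X, y):
    """Sum over all samples of (real value - estimated value) squared."""
    return sum((t - (coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], row)))) ** 2
               for row, t in zip(X, y))

# Hypothetical data generated from y = 1 + 2*x1 + 3*x2 (illustration only)
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y = [1, 3, 4, 6, 8]
coeffs = fit_least_squares(X, y)
print(coeffs, sum_squared_errors(coeffs, X, y))
```

Any other choice of coefficients gives a larger sum of squared errors on the same data, which is the sense in which the fitted formula is optimal.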