Interpretation of the Result and Visualization of the Tree
The tree produced by the J48 algorithm shows which attribute values
were used to classify the dataset. The node at the very top of the tree
(the root) is the most important decision maker. Moving down from the
root, each lower node tests the attribute that best separates the
samples reaching it, so nodes nearer the top contribute more to a
correct decision.
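As an illustration of how the root node can be chosen, the sketch below
computes the gain ratio, the split criterion of C4.5 (J48 is WEKA's
implementation of C4.5). It is a simplified sketch, not the WEKA code:
it handles only nominal attributes and omits pruning and missing values.
The attribute names and the toy data are hypothetical and are not taken
from the study.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(values, labels):
    """Gain ratio of a nominal attribute with respect to the class labels."""
    total = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Information gain: entropy before the split minus weighted entropy after.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalises attributes with many distinct values.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data (assumption, for illustration only).
fever = ["yes", "yes", "no", "no", "yes", "no"]
cough = ["yes", "no", "yes", "no", "yes", "no"]
ct_involvement = ["pos", "pos", "neg", "neg", "pos", "neg"]

ratios = {"fever": gain_ratio(fever, ct_involvement),
          "cough": gain_ratio(cough, ct_involvement)}
# The attribute with the highest gain ratio would be placed at the root.
root = max(ratios, key=ratios.get)
print(ratios, "root attribute:", root)
```

In this toy example the attribute that separates the classes perfectly
receives the highest score and would therefore sit at the top of the
tree.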
The algorithm was run to predict thoracic CT involvement, the extent of
that involvement, and the need for intensive care. Accordingly, three
separate decision trees were built and illustrated.
After the dataset was prepared in WEKA, the J48 decision tree and linear
regression algorithms were run in our study. For each algorithm, the
dataset was divided into training and testing portions.
Cross-validation was used as the first testing criterion: the dataset
is divided into the specified number of folds, and each fold in turn is
held out for testing while the remaining folds are used for training.
Percentage split was used as the second testing criterion: the
specified percentage of the dataset is used for training and the
remainder for testing. Classification performance was compared on this
basis.
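The two testing criteria can be sketched as index-splitting routines.
This is a minimal illustration under stated assumptions: the function
names, the sample count, the 70% training share and the 5 folds are
hypothetical values chosen for the example, since the settings used in
the study are not given here. The sketch also does not stratify folds by
class, which evaluation tools may do.

```python
import random


def percentage_split(n_samples, train_percent, seed=0):
    """Return (train_idx, test_idx) after shuffling the sample indices once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    cut = round(n_samples * train_percent / 100)
    return idx[:cut], idx[cut:]


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


# Hypothetical settings: 20 samples, 70% training share, 5 folds.
train_idx, test_idx = percentage_split(n_samples=20, train_percent=70)
print("percentage split:", len(train_idx), "train /", len(test_idx), "test")

for fold_train, fold_test in k_fold_splits(n_samples=20, k=5):
    print("fold:", len(fold_train), "train /", len(fold_test), "test")
```

With percentage split each sample is used either for training or for
testing, whereas with cross-validation every sample is tested once, at
the cost of training the model once per fold.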
The linear regression algorithm, which was chosen as the second step, is
a statistical method for modelling and formulating the relationship
between a dependent variable and one or more independent variables.
When there is more than one input parameter, the method is referred to
as multiple linear regression. A formula is obtained as a result of this
algorithm. To derive the formula, the difference between the real value
and the estimated value is squared for each sample, which makes every
term non-negative. These squared differences are then added over all
samples. The coefficients that minimize this sum give the optimum
formula.
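The least-squares idea can be sketched as follows. This is plain
ordinary least squares solved through the normal equations, written as
an illustration only; it is not a reproduction of WEKA's linear
regression implementation, which may apply additional steps. The helper
names and the data are hypothetical; the targets were generated from
y = 1 + 2*x1 + 3*x2 so that the fit can be checked by eye.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_least_squares(rows, targets):
    """Return [intercept, w1, w2, ...] minimising the sum of squared errors."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 is the intercept term
    p = len(design[0])
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(d[i] * y for d, y in zip(design, targets)) for i in range(p)]
    return solve(xtx, xty)


def sum_squared_error(weights, rows, targets):
    """Sum over all samples of (real value - estimated value) squared."""
    total = 0.0
    for r, y in zip(rows, targets):
        predicted = weights[0] + sum(w * v for w, v in zip(weights[1:], r))
        total += (y - predicted) ** 2
    return total


# Hypothetical data generated from y = 1 + 2*x1 + 3*x2 (assumption).
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
targets = [1, 3, 4, 6, 8]

weights = fit_least_squares(rows, targets)
print("coefficients:", [round(w, 6) for w in weights])
print("sum of squared errors:", sum_squared_error(weights, rows, targets))
```

Any other choice of coefficients gives a larger sum of squared errors on
the same data, which is the sense in which the fitted formula is
optimal.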