Decision Tree
A decision tree is a supervised learning algorithm in which decision nodes, which test features of the data, and leaf nodes, which assign target classes, together form a tree-like model. One of the most widely known decision tree algorithms is the Iterative Dichotomiser 3 (ID3). Its counterpart in WEKA is J48, an implementation of ID3's successor C4.5. The major difference between J48 and ID3 is that J48 normalizes the information gain statistically: the entropy-based gain of each candidate split is kept as a proportion of the split's own entropy, known as the gain ratio. A pruning procedure is also carried out at the end to obtain the simplest form of the tree.
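As a concrete illustration, a minimal WEKA sketch for training a pruned J48 tree might look like the following; the ARFF file name is a placeholder, and the options shown are J48's defaults (-C sets the pruning confidence factor, -M the minimum number of instances per leaf):

```java
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.Utils;
import weka.core.converters.ConverterUtils.DataSource;

public class J48Example {
    public static void main(String[] args) throws Exception {
        // Load a dataset; "data.arff" is a placeholder file name.
        Instances data = DataSource.read("data.arff");
        // Tell WEKA which attribute is the class (here: the last one).
        data.setClassIndex(data.numAttributes() - 1);

        J48 tree = new J48();
        // -C 0.25: pruning confidence factor; -M 2: minimum instances per leaf.
        // These are J48's default values.
        tree.setOptions(Utils.splitOptions("-C 0.25 -M 2"));
        tree.buildClassifier(data);

        // Print the induced (pruned) decision tree.
        System.out.println(tree);
    }
}
```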
For each node of the tree, it chooses the feature that most effectively splits the sample set into subsets enriched in one class or another. The algorithm classifies the information in the dataset recursively; at each step, the feature with the highest normalized information gain is chosen to make the decision.
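In standard notation (not taken verbatim from the source), the quantities involved are the entropy of a sample set S, the information gain of a split on feature A, and its normalized form, the gain ratio used by J48/C4.5:

```latex
% Entropy of a sample set S with class proportions p_i
H(S) = -\sum_{i} p_i \log_2 p_i

% Information gain of splitting S on feature A
\mathrm{Gain}(S, A) = H(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\, H(S_v)

% J48/C4.5 normalizes the gain by the split's own entropy
\mathrm{SplitInfo}(S, A) = -\sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|} \log_2 \frac{|S_v|}{|S|},
\qquad
\mathrm{GainRatio}(S, A) = \frac{\mathrm{Gain}(S, A)}{\mathrm{SplitInfo}(S, A)}
```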
Considered at their core, these base cases guide the algorithm along a correct solution path for a problem. First, when all the samples in a dataset belong to the same class, the algorithm forms a leaf node that tells the decision tree to choose that class. Second, if the chosen feature provides no information for a decision, a decision node is formed using the expected (majority) value. Finally, if the tree encounters a sample of a class it has not seen before, a decision node is formed higher up in the tree, again using the expected value. A summary of the process is as follows: all algorithm steps are carried out iteratively. At each iteration, the features in the dataset are reviewed and a value, the information gain, is calculated for each one. The feature with the best value is added to the tree as a decision node, and the process continues by forming new decision branches under the most recently added node. A minimal sketch of this recursion is given below.
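The following is a hypothetical ID3-style sketch (categorical features only, plain information gain, no gain-ratio normalization or pruning); the class and method names are invented for illustration and are not WEKA's:

```java
import java.util.*;

/** Hypothetical ID3-style tree builder over categorical data (last column = class). */
public class Id3Sketch {
    // A node is either a leaf (label != null) or a split on a feature index.
    static class Node {
        String label;                  // set for leaf nodes
        int feature = -1;              // feature index for decision nodes
        Map<String, Node> children = new HashMap<>();
    }

    static Node build(List<String[]> rows, Set<Integer> features) {
        // Base case 1: all samples share one class -> a leaf choosing that class.
        // Base case 2: no informative feature left -> a leaf with the expected
        // (majority) value.
        Set<String> classes = new HashSet<>();
        for (String[] r : rows) classes.add(r[r.length - 1]);
        if (classes.size() == 1 || features.isEmpty()) {
            Node leaf = new Node();
            leaf.label = majorityClass(rows);
            return leaf;
        }
        // Review every feature and compute its information gain; keep the best.
        int best = -1;
        double bestGain = -1;
        for (int f : features) {
            double gain = entropy(rows) - splitEntropy(rows, f);
            if (gain > bestGain) { bestGain = gain; best = f; }
        }
        // Add the best feature to the tree as a decision node ...
        Node node = new Node();
        node.feature = best;
        Set<Integer> rest = new HashSet<>(features);
        rest.remove(best);
        // ... and grow new decision branches under the node just added.
        for (Map.Entry<String, List<String[]>> e : partition(rows, best).entrySet())
            node.children.put(e.getKey(), build(e.getValue(), rest));
        return node;
    }

    static Map<String, List<String[]>> partition(List<String[]> rows, int f) {
        Map<String, List<String[]>> parts = new HashMap<>();
        for (String[] r : rows) parts.computeIfAbsent(r[f], k -> new ArrayList<>()).add(r);
        return parts;
    }

    static double entropy(List<String[]> rows) {
        Map<String, Integer> counts = new HashMap<>();
        for (String[] r : rows) counts.merge(r[r.length - 1], 1, Integer::sum);
        double h = 0;
        for (int c : counts.values()) {
            double p = (double) c / rows.size();
            h -= p * (Math.log(p) / Math.log(2));
        }
        return h;
    }

    static double splitEntropy(List<String[]> rows, int f) {
        double h = 0;
        for (List<String[]> part : partition(rows, f).values())
            h += (double) part.size() / rows.size() * entropy(part);
        return h;
    }

    static String majorityClass(List<String[]> rows) {
        Map<String, Integer> counts = new HashMap<>();
        for (String[] r : rows) counts.merge(r[r.length - 1], 1, Integer::sum);
        return Collections.max(counts.entrySet(), Map.Entry.comparingByValue()).getKey();
    }

    public static void main(String[] args) {
        // Toy dataset: feature 0 = outlook, feature 1 = windy, last column = play.
        List<String[]> rows = Arrays.asList(
            new String[]{"sunny", "false", "no"},
            new String[]{"sunny", "true", "no"},
            new String[]{"overcast", "false", "yes"},
            new String[]{"rain", "false", "yes"},
            new String[]{"rain", "true", "no"});
        Node root = build(rows, new HashSet<>(Arrays.asList(0, 1)));
        System.out.println("Root splits on feature " + root.feature);
    }
}
```

J48 differs from this sketch in that it ranks features by gain ratio rather than raw information gain and prunes the finished tree.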