Publication year: [2023]
Paper information |
|
Paper title (Korean) |
[Vol.18, No.1] Comparison of Two Classifiers for Handling Incomplete Data with Decision Trees: C4.5, SVM |
|
Author |
Jong Chan Lee |
|
Abstract |
This paper introduces an algorithm that uses decision trees to estimate missing values in incomplete data. Two classifiers with different characteristics, C4.5 and an SVM-based algorithm, are used to construct the decision tree, and their characteristics and performance are examined through implementation. A decision tree is chosen for handling incomplete data because each node holds classification information (a hyperplane) for the input patterns, and the path from the root to a terminal node combines these hyperplanes to delimit a single domain. The key idea is therefore to enter the event with the missing value at the root and, by traversing the tree, find the domain most similar to it. An estimate of the missing value is then obtained from the events in that domain. In the implementation, the training data is divided into lossy data (events with missing values) and non-lossy data (complete events), and the decision tree is built by feeding the non-lossy data to C4.5/SVM. The lossy data is then entered into this tree, and traversal is repeated until a terminal node is reached, following the condition for finding the most similar property. |
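The procedure described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation, and it rests on several assumptions. The splitter is a simplified entropy-based stand-in for C4.5 (no gain ratio, no pruning). The abstract does not define the "condition for finding the most similar property", so the sketch assumes that when a node tests a missing attribute, traversal follows the child whose training events are closest on the observed attributes. The estimate is assumed to be the mean of the missing attribute over the events in the terminal domain. The names `build`, `find_domain`, and `impute` are hypothetical. An SVM-based node would replace the threshold test with the sign of a learned hyperplane and is not shown.

```python
from math import log2
from statistics import mean

def entropy(labels):
    n = len(labels)
    return -sum((labels.count(c) / n) * log2(labels.count(c) / n) for c in set(labels))

def build(rows, labels, depth=0, max_depth=3):
    """Build a tree from complete (non-lossy) events; every node keeps its events."""
    if depth == max_depth or len(set(labels)) == 1:
        return {"rows": rows}
    best = None
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate axis-aligned hyperplane: x[f] = t
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            score = sum(len(s) * entropy([labels[i] for i in s]) for s in (left, right))
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # all events identical, no split possible
        return {"rows": rows}
    _, f, t, left, right = best
    return {
        "feature": f, "threshold": t, "rows": rows,
        "left": build([rows[i] for i in left], [labels[i] for i in left], depth + 1, max_depth),
        "right": build([rows[i] for i in right], [labels[i] for i in right], depth + 1, max_depth),
    }

def distance(event, rows):
    # Mean squared distance to a set of events, using observed attributes only.
    seen = [j for j, v in enumerate(event) if v is not None]
    return mean(sum((event[j] - r[j]) ** 2 for j in seen) for r in rows)

def find_domain(node, event):
    """Traverse from the root to a terminal node and return its events."""
    while "feature" in node:
        v = event[node["feature"]]
        if v is not None:
            node = node["left"] if v <= node["threshold"] else node["right"]
        else:
            # ASSUMPTION: the tested attribute is missing, so follow the child
            # whose events are most similar on the attributes that are observed.
            node = min((node["left"], node["right"]),
                       key=lambda child: distance(event, child["rows"]))
    return node["rows"]

def impute(tree, event):
    # ASSUMPTION: the estimate is the mean of the attribute within the domain.
    domain = find_domain(tree, event)
    return [v if v is not None else mean(r[j] for r in domain)
            for j, v in enumerate(event)]

# Toy data (invented for illustration): two well-separated groups.
complete = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2), (5.0, 5.0), (5.2, 4.8), (4.8, 5.2)]
labels = ["a", "a", "a", "b", "b", "b"]
tree = build(complete, labels)
print(impute(tree, (None, 5.1)))  # missing first attribute
print(impute(tree, (1.1, None)))  # missing second attribute
```

Keeping the training events at every node is what makes the similarity comparison at a missing attribute possible. A production version would more likely store per-node summary statistics instead of raw events.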
|