S. No. | DM Algorithm | Technique Used | Papers Implementing It (Ref. Nos.) | Merits | Limitations |
1 | Support Vector Machines | Classification | 2, 15, 16, 18, 21, 25 | Highly accurate; less prone to overfitting; robust to noise. | Inherently a binary classifier; kernel selection and limited interpretability are weaknesses, especially in multi-class settings; computationally expensive and slow to run. |
2 | Bayesian Networks | Classification | 2, 15, 16 | Missing data entries can be handled successfully; overfitting can be avoided | Quality and extent of prior knowledge play an important role; significant computational cost |
3 | Decision Tree | Classification | 2, 15, 16, 18 | Easy to understand; easy to generate rules | Prone to overfitting; does not handle non-numeric data easily; trees can grow quite large, so pruning is necessary |
4 | C4.5 | Classification | 15, 16, 25 | Quite fast; output is human-readable. | A small variation in the data can lead to a different decision tree; does not work well on small training sets; prone to overfitting |
5 | K-Nearest Neighbor | Classification | 2, 15, 16, 18, 25 | Easy to understand and implement; with a suitable distance metric, k-NN can be quite accurate | Computationally expensive; noisy data can throw off k-NN classifications; generally requires more storage than eager classifiers; selecting a good distance metric is crucial to k-NN's accuracy |
6 | K-Means | Clustering | 2, 15, 16, 18 | Fast and efficient, especially over large datasets | Sensitive to outliers and to the initial choice of centroids; designed to operate on continuous data, so extra preprocessing is needed for discrete data |
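To make the k-NN row concrete, the following is a minimal sketch of a k-nearest-neighbor classifier (hypothetical toy example, standard library only; the function names and data are illustrative, not from any of the surveyed papers). It shows the two points the table makes: the whole training set must be stored (hence the higher storage cost than eager classifiers), and the prediction depends directly on the chosen distance metric.

```python
# Minimal k-nearest-neighbor classifier sketch (stdlib only).
from collections import Counter
import math

def euclidean(a, b):
    """Euclidean distance between two feature tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train, query, k=3, metric=euclidean):
    """Majority vote among the k training points nearest to `query`.
    train: list of (features, label) pairs -- the entire training set
    is kept in memory, which is k-NN's storage cost."""
    neighbors = sorted(train, key=lambda pair: metric(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy data: two well-separated clusters, labels "a" and "b".
train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
print(knn_predict(train, (0.5, 0.5)))  # near the "a" cluster -> "a"
print(knn_predict(train, (5.5, 5.5)))  # near the "b" cluster -> "b"
```

Swapping `metric` for, say, a Manhattan distance changes which neighbors are selected, which is why the table flags metric choice as crucial to accuracy.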
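Similarly, the k-means row can be illustrated with a minimal sketch of Lloyd's algorithm (a hypothetical stdlib-only example; names and data are illustrative). Note how the caller supplies the initial centroids, which is exactly the sensitivity the table warns about, and how the update step averages coordinates, which is why the method assumes continuous numeric data.

```python
# Minimal Lloyd's k-means sketch (stdlib only).
import math

def kmeans(points, centroids, iters=10):
    """Alternate assignment and update steps for a fixed number of
    iterations. The final clustering depends on the initial `centroids`."""
    clusters = [[] for _ in centroids]
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            i = min(range(len(centroids)),
                    key=lambda j: math.dist(p, centroids[j]))
            clusters[i].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (this averaging is what requires continuous features).
        centroids = [
            tuple(sum(dim) / len(cl) for dim in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids, clusters

# Toy data: two obvious groups; initial centroids seeded in each group.
points = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (9, 9)])
```

With poorly chosen initial centroids (e.g. both inside one group), the same data can converge to a worse partition, which is the initialization sensitivity noted in the limitations column.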