Fuzzy Nearest Neighbor Algorithms: Taxonomy, Experimental analysis and Prospects - Complementary Material
This Website contains complementary material to the paper:
J. Derrac, S. García and F. Herrera, Fuzzy Nearest Neighbor Algorithms: Taxonomy, Experimental analysis and Prospects. Information Sciences 260 (2014) 98-119, doi: 10.1016/j.ins.2013.10.038
The website is organized as follows:
- Abstract
- Survey of Fuzzy Nearest Neighbor Methods
- Experimental Framework
- Experimental Study
  - Stage 1: Comparison of fuzzy nearest neighbor classifiers
  - Stage 2: Comparison with crisp nearest neighbor approaches
Experimental Study
The full results of the experimental study are presented here. The study is composed of two stages: 1) the first analyzes the performance of the fuzzy nearest neighbor classifiers; 2) the second compares the best performing fuzzy nearest neighbor classifiers with crisp nearest neighbor approaches.
Comparison of fuzzy nearest neighbor classifiers
In this first stage, the 18 fuzzy nearest neighbor classifiers are tested. Table 4 summarizes the results obtained, considering the following performance measures: accuracy and kappa in the training phase (with the best K value/configuration for each method), accuracy and kappa in the test phase (with the best K value/configuration for each method), accuracy and kappa in the test phase (with a fixed K value/configuration for each method), and running time.
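As a reminder of what the two quality measures capture, the following is a minimal sketch of accuracy and Cohen's kappa computed from true and predicted labels. It is not the code used in the study; the function name `accuracy_and_kappa` and the toy labels are assumptions made for illustration only.

```python
from collections import Counter

def accuracy_and_kappa(y_true, y_pred):
    """Return (accuracy, Cohen's kappa) for two equal-length label sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(y_true)
    # Observed agreement: fraction of instances classified correctly.
    p_observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: per class, product of true and predicted frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_chance = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_chance == 1.0:
        # Degenerate case (one class everywhere): kappa is undefined, report 0.0.
        return p_observed, 0.0
    kappa = (p_observed - p_chance) / (1.0 - p_chance)
    return p_observed, kappa

# Toy labels, invented for illustration only.
y_true = ["a", "a", "a", "b", "b", "c"]
y_pred = ["a", "a", "b", "b", "b", "c"]
acc, kappa = accuracy_and_kappa(y_true, y_pred)
print(f"accuracy={acc:.4f} kappa={kappa:.4f}")
```

Kappa discounts the agreement expected by chance, which is why the kappa values in the table are systematically lower than the corresponding accuracy values.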
Table 4. Summary results of Stage 1: Fuzzy nearest neighbor classifiers
Accuracy (Training) | | Accuracy (Test, Best K) | | Accuracy (Test, Fixed K) | | | Kappa (Training) | | Kappa (Test, Best K) | | Kappa (Test, Fixed K) | | | Running time | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Method | Average | Method | Average | Method | Average | K value | Method | Average | Method | Average | Method | Average | K value | Method | Time (s) |
GAFuzzyKNN | 0.8517 | GAFuzzyKNN | 0.8204 | GAFuzzyKNN | 0.8130 | 5 | GAFuzzyKNN | 0.7142 | GAFuzzyKNN | 0.6558 | GAFuzzyKNN | 0.6415 | 5 | FuzzyNPC | 0.0409 |
VWFuzzyKNN | 0.8421 | FuzzyKNN | 0.8190 | IT2FKNN | 0.8111 | 7 | VWFuzzyKNN | 0.6987 | FuzzyKNN | 0.6524 | IT2FKNN | 0.6354 | 7 | PosIBL | 2.7363 |
FENN | 0.8388 | IT2FKNN | 0.8181 | FuzzyKNN | 0.8110 | 5 | FENN | 0.6858 | IT2FKNN | 0.6484 | FuzzyKNN | 0.6366 | 7 | FRNN-VQRS | 2.9070 |
FuzzyKNN | 0.8362 | D-SKNN | 0.8136 | D-SKNN | 0.7985 | 5 | FuzzyKNN | 0.6837 | D-SKNN | 0.6468 | D-SKNN | 0.6167 | 5 | FRNN-FRS | 3.0145 |
IT2FKNN | 0.8350 | IF-KNN | 0.8062 | IF-KNN | 0.7972 | 3 | IT2FKNN | 0.6795 | IF-KNN | 0.6321 | IF-KNN | 0.6157 | 3 | D-SKNN | 3.0955 |
IF-KNN | 0.8201 | FENN | 0.8009 | FENN | 0.7926 | 5 | IF-KNN | 0.6549 | FENN | 0.6150 | FRNN-FRS | 0.6130 | 3 | FCMKNN | 4.0568 |
D-SKNN | 0.8112 | PFKNN | 0.7961 | PosIBL | 0.7883 | * | D-SKNN | 0.6387 | FRNN-FRS | 0.6138 | PosIBL | 0.6071 | * | VWFuzzyKNN | 5.6256 |
PFKNN | 0.7928 | PosIBL | 0.7913 | PFKNN | 0.7877 | 9 | PFKNN | 0.6097 | PosIBL | 0.6134 | FRNN-VQRS | 0.6061 | 5 | IFSKNN | 6.4927 |
FRNN-FRS | 0.7843 | FRNN-FRS | 0.7880 | FRNN-FRS | 0.7875 | 3 | FRNN-FRS | 0.6076 | PFKNN | 0.6130 | FENN | 0.5993 | 5 | FuzzyKNN | 6.5322 |
PosIBL | 0.7836 | VWFuzzyKNN | 0.7869 | FRNN-VQRS | 0.7799 | 5 | FRNN-VQRS | 0.6055 | FRNN-VQRS | 0.6104 | PFKNN | 0.5992 | 7 | CFKNN | 6.7276 |
FRKNNA | 0.7833 | FRNN-VQRS | 0.7825 | VWFuzzyKNN | 0.7775 | 3 | PosIBL | 0.5986 | VWFuzzyKNN | 0.5936 | VWFuzzyKNN | 0.5793 | 3 | FENN | 6.9731 |
FRNN-VQRS | 0.7794 | FRKNNA | 0.7738 | FRKNNA | 0.7640 | 3 | FRKNNA | 0.5953 | IFSKNN | 0.5890 | IFSKNN | 0.5705 | 3 | FRKNNA | 7.2246 |
IFSKNN | 0.7691 | IFSKNN | 0.7713 | IFSKNN | 0.7585 | 5 | IFSKNN | 0.5839 | FRKNNA | 0.5779 | FRKNNA | 0.5612 | 3 | IF-KNN | 7.9749 |
FRNN | 0.7375 | FRNN | 0.7408 | FRNN | 0.7408 | * | FuzzyNPC | 0.5285 | FuzzyNPC | 0.5079 | FuzzyNPC | 0.5079 | * | IFV-NP | 11.1111 |
FuzzyNPC | 0.7112 | FuzzyNPC | 0.6975 | FuzzyNPC | 0.6975 | * | CFKNN | 0.5162 | CFKNN | 0.5000 | CFKNN | 0.4925 | 3 | IT2FKNN | 13.1984 |
CFKNN | 0.7052 | CFKNN | 0.6931 | CFKNN | 0.6885 | 3 | FCMKNN | 0.4647 | FCMKNN | 0.4497 | FRNN | 0.4403 | * | FRNN | 28.5193 |
FCMKNN | 0.6549 | FCMKNN | 0.6469 | FCMKNN | 0.6397 | 5 | IFV-NP | 0.4466 | FRNN | 0.4403 | FCMKNN | 0.4390 | 3 | PFKNN | 725.8243 |
IFV-NP | 0.6450 | IFV-NP | 0.6337 | IFV-NP | 0.6085 | * | FRNN | 0.4364 | IFV-NP | 0.4299 | IFV-NP | 0.4153 | * | GAFuzzyKNN | 1275.4415 |
Note that the values shown in the table are averages over the 44 data sets of the experimental study. Table 5 provides XLS sheets with the detailed results for each data set and configuration, one sheet per method. An XLS sheet with the complete results is also provided.
Table 5. Full results of Stage 1: Fuzzy nearest neighbor classifiers
Table 6 presents the results of the Friedman and Shaffer tests for the accuracy and kappa measures. First, a column shows the ranks obtained in the Friedman test: the lower the rank, the better the behavior of the corresponding algorithm. The p-values obtained by the Friedman test are 1.38E-10 and 1.08E-10 (for accuracy and kappa, respectively), which means that significant differences exist among the algorithms.
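The Friedman ranks can be sketched as follows: each algorithm is ranked on every data set (rank 1 for the best result, tied results sharing the average rank), and the ranks are then averaged over the data sets. This is a minimal illustration, not the software used in the study; the function names and the toy scores are assumptions, and the statistic shown is the basic Friedman chi-square form without a tie correction.

```python
def average_ranks(scores):
    """scores[d][a] is the score of algorithm a on data set d (higher is better).
    Returns the mean rank per algorithm; rank 1 is best, ties share the average rank."""
    n_datasets = len(scores)
    k = len(scores[0])
    totals = [0.0] * k
    for row in scores:
        # Algorithm indices ordered from best to worst score on this data set.
        order = sorted(range(k), key=lambda a: -row[a])
        i = 0
        while i < k:
            j = i
            # Extend j over the run of algorithms tied with position i.
            while j + 1 < k and row[order[j + 1]] == row[order[i]]:
                j += 1
            shared = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
            for pos in range(i, j + 1):
                totals[order[pos]] += shared
            i = j + 1
    return [t / n_datasets for t in totals]

def friedman_statistic(mean_ranks, n_datasets):
    """Basic Friedman chi-square statistic (no tie correction)."""
    k = len(mean_ranks)
    sum_sq = sum(r * r for r in mean_ranks)
    return 12 * n_datasets / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4)

# Toy accuracies for 3 algorithms on 4 data sets, invented for illustration.
scores = [
    [0.90, 0.85, 0.80],
    [0.70, 0.75, 0.60],
    [0.88, 0.88, 0.80],
    [0.95, 0.90, 0.85],
]
ranks = average_ranks(scores)
print(ranks, friedman_statistic(ranks, len(scores)))
```

A useful sanity check is that the mean ranks of k algorithms always sum to k(k+1)/2; the accuracy ranks reported in Table 6 sum to 171, as expected for 18 algorithms.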
A Shaffer test is then conducted to characterize these differences. With 18 algorithms, 153 pairwise hypotheses can be established. For accuracy, 66 of them are significant at the α = 0.1 level of significance (54 at the α = 0.01 level). For kappa, 61 and 50 hypotheses are significant, respectively.
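The number of pairwise hypotheses and the statistic behind each comparison can be sketched as below. The sketch uses the usual normal approximation for the difference between two Friedman mean ranks and reports unadjusted p-values only; the Shaffer adjustment itself is not implemented here, so this code does not reproduce the counts of significant hypotheses. The function names are assumptions made for illustration; the two ranks are taken from the accuracy columns of Table 6.

```python
import math

def num_pairwise_hypotheses(k):
    """Number of distinct pairs among k algorithms."""
    return k * (k - 1) // 2

def pairwise_z(rank_i, rank_j, k, n_datasets):
    """z statistic for the difference between two Friedman mean ranks."""
    standard_error = math.sqrt(k * (k + 1) / (6.0 * n_datasets))
    return (rank_i - rank_j) / standard_error

def unadjusted_p_value(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))

k, n_datasets = 18, 44
print(num_pairwise_hypotheses(k))  # 153 pairs for 18 algorithms

# Best and worst accuracy ranks reported in Table 6 (IT2FKNN and IFV-NP).
z = pairwise_z(4.9659, 15.0682, k, n_datasets)
print(z, unadjusted_p_value(z))
```

A multiple comparison procedure such as Shaffer's then adjusts these raw p-values to account for the full family of pairwise hypotheses being tested simultaneously.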
For each method, Table 6 shows the number of algorithms which are significantly improved by it (the "+" column) and the number of algorithms which are significantly improved or equal (the "+=" column).
An XLS sheet containing the Friedman ranks and the 153 hypotheses with their associated p-values can be downloaded here.
Table 6. Full results of Friedman and Shaffer tests (Accuracy and Kappa) - Stage 1: Fuzzy nearest neighbor classifiers
Accuracy | | α level = 0.1 | | α level = 0.01 | | Kappa | | α level = 0.1 | | α level = 0.01 | |
---|---|---|---|---|---|---|---|---|---|---|---|
Algorithm | Rank | + | += | + | += | Algorithm | Rank | + | += | + | += |
IT2FKNN | 4.9659 | 10 | 18 | 9 | 18 | FuzzyKNN | 5.4091 | 9 | 18 | 8 | 18 |
FuzzyKNN | 5.3409 | 10 | 18 | 8 | 18 | IT2FKNN | 5.4659 | 9 | 18 | 8 | 18 |
GAFuzzyKNN | 5.3523 | 10 | 18 | 8 | 18 | GAFuzzyKNN | 5.5909 | 8 | 18 | 8 | 18 |
D-SKNN | 6.6818 | 6 | 18 | 5 | 18 | D-SKNN | 7.3977 | 5 | 18 | 5 | 18 |
IF-KNN | 7.0909 | 6 | 18 | 5 | 18 | IF-KNN | 7.4091 | 5 | 18 | 5 | 18 |
FENN | 7.2614 | 5 | 18 | 5 | 18 | PosIBL | 8.1364 | 5 | 18 | 4 | 18 |
PFKNN | 7.9318 | 5 | 18 | 4 | 18 | FENN | 8.3636 | 4 | 18 | 4 | 18 |
PosIBL | 8.6023 | 4 | 18 | 3 | 18 | PFKNN | 8.4432 | 4 | 18 | 4 | 18 |
FRNN-FRS | 9.2386 | 3 | 15 | 2 | 18 | FRNN-FRS | 8.6591 | 4 | 18 | 2 | 18 |
VWFuzzyKNN | 9.7386 | 2 | 15 | 2 | 17 | FRNN-VQRS | 9.2500 | 4 | 16 | 2 | 18 |
FRNN-VQRS | 9.9091 | 2 | 15 | 2 | 15 | IFSKNN | 10.0341 | 2 | 15 | 0 | 15 |
IFSKNN | 10.1591 | 2 | 15 | 1 | 15 | VWFuzzyKNN | 10.0795 | 2 | 15 | 0 | 15 |
FRNN | 10.8977 | 1 | 13 | 0 | 15 | FuzzyNPC | 10.7273 | 0 | 15 | 0 | 15 |
FuzzyNPC | 12.1250 | 0 | 12 | 0 | 12 | CFKNN | 12.0000 | 0 | 12 | 0 | 13 |
FRKNNA | 12.8409 | 0 | 11 | 0 | 11 | FRKNNA | 12.9773 | 0 | 8 | 0 | 10 |
CFKNN | 13.2841 | 0 | 9 | 0 | 10 | FRNN | 13.0682 | 0 | 8 | 0 | 10 |
FCMKNN | 14.5114 | 0 | 6 | 0 | 7 | FCMKNN | 13.9318 | 0 | 6 | 0 | 8 |
IFV-NP | 15.0682 | 0 | 5 | 0 | 6 | IFV-NP | 14.0568 | 0 | 6 | 0 | 8 |
Comparison with crisp nearest neighbor approaches
In this second stage, the 7 best performing fuzzy nearest neighbor classifiers are compared with 7 state-of-the-art crisp nearest neighbor classifiers. The 7 fuzzy nearest neighbor classifiers selected are FuzzyKNN and the best performing method of each family (GAFuzzyKNN, IT2FKNN, D-SKNN, IF-KNN, FRNN-FRS and FENN).
Table 7 summarizes the results obtained, considering the following performance measures: accuracy and kappa in the training phase (with the best K value/configuration for each method), accuracy and kappa in the test phase (with the best K value/configuration for each method), accuracy and kappa in the test phase (with a fixed K value/configuration for each method), and running time.
Table 7. Summary results of Stage 2: Comparison with crisp nearest neighbor approaches
Accuracy (Training) | | Accuracy (Test, Best K) | | Accuracy (Test, Fixed K) | | | Kappa (Training) | | Kappa (Test, Best K) | | Kappa (Test, Fixed K) | | | Running time | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Method | Average | Method | Average | Method | Average | K value | Method | Average | Method | Average | Method | Average | K value | Method | Time (s) |
NSC | 0.9617 | GAFuzzyKNN | 0.8204 | GAFuzzyKNN | 0.8130 | 5 | NSC | 0.9250 | GAFuzzyKNN | 0.6558 | GAFuzzyKNN | 0.6415 | 5 | FRNN-FRS | 3.0145 |
KSNN | 0.8651 | FuzzyKNN | 0.8190 | IT2FKNN | 0.8111 | 7 | KSNN | 0.7438 | FuzzyKNN | 0.6524 | FuzzyKNN | 0.6366 | 7 | D-SKNN | 3.0955 |
GAFuzzyKNN | 0.8517 | IT2FKNN | 0.8181 | FuzzyKNN | 0.8110 | 5 | GAFuzzyKNN | 0.7142 | IT2FKNN | 0.6484 | IT2FKNN | 0.6354 | 7 | KNN | 3.3452 |
IDIBL | 0.8408 | D-SKNN | 0.8136 | D-SKNN | 0.7985 | 5 | IDIBL | 0.6906 | D-SKNN | 0.6468 | D-SKNN | 0.6167 | 5 | NSC | 4.8859 |
FENN | 0.8388 | KSNN | 0.8098 | IF-KNN | 0.7972 | 3 | FENN | 0.6858 | NSC | 0.6379 | KSNN | 0.6160 | 3 | ENN | 5.0116 |
FuzzyKNN | 0.8362 | IF-KNN | 0.8062 | KSNN | 0.7970 | 5 | FuzzyKNN | 0.6837 | KSNN | 0.6359 | IF-KNN | 0.6157 | 3 | KNNAdaptive | 6.1195 |
IT2FKNN | 0.8350 | NSC | 0.8020 | FENN | 0.7926 | 5 | IT2FKNN | 0.6795 | IF-KNN | 0.6321 | FRNN-FRS | 0.6130 | 3 | KSNN | 6.2721 |
IF-KNN | 0.8201 | FENN | 0.8009 | IDIBL | 0.7902 | * | PW | 0.6638 | FENN | 0.6150 | KNN | 0.6028 | 7 | FuzzyKNN | 6.5322 |
PW | 0.8151 | KNN | 0.7933 | FRNN-FRS | 0.7875 | 3 | IF-KNN | 0.6549 | KNN | 0.6143 | FENN | 0.5993 | 5 | FENN | 6.9731 |
ENN | 0.8113 | KNNAdaptive | 0.7927 | KNNAdaptive | 0.7856 | 3 | ENN | 0.6396 | FRNN-FRS | 0.6138 | PW | 0.5955 | * | IF-KNN | 7.9749 |
D-SKNN | 0.8112 | IDIBL | 0.7902 | KNN | 0.7815 | 7 | D-SKNN | 0.6387 | KNNAdaptive | 0.6131 | NSC | 0.5814 | * | IT2FKNN | 13.1984 |
KNNAdaptive | 0.8024 | ENN | 0.7901 | NSC | 0.7801 | * | KNNAdaptive | 0.6377 | PW | 0.6042 | IDIBL | 0.5807 | * | PW | 22.7235 |
KNN | 0.7901 | FRNN-FRS | 0.7880 | PW | 0.7793 | * | KNN | 0.6089 | IDIBL | 0.5946 | ENN | 0.5740 | 3 | IDIBL | 409.3493 |
FRNN-FRS | 0.7843 | PW | 0.7828 | ENN | 0.7784 | 5 | FRNN-FRS | 0.6076 | ENN | 0.5936 | KNNAdaptive | 0.5697 | 3 | GAFuzzyKNN | 1275.4415 |
Note that the values shown in the table are averages over the 44 data sets of the experimental study. Table 8 provides XLS sheets with the detailed results for each data set and configuration, one sheet per method. An XLS sheet with the complete results is also provided.
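The difference between the "Best K" and "Fixed K" columns can be illustrated with a small sketch. It rests on an assumption that this page does not spell out: "Best K" is taken to mean that the best K is chosen separately for each data set before averaging, whereas "Fixed K" is taken to mean that a single K, the one with the best average, is used for all data sets. The function names and the toy accuracies are hypothetical.

```python
def best_k_average(results):
    """results[dataset][k] -> accuracy.
    Mean over data sets of the best accuracy reached on each data set."""
    return sum(max(by_k.values()) for by_k in results.values()) / len(results)

def fixed_k_average(results):
    """Return (k, average) for the single K with the highest mean accuracy
    over all data sets. Assumes every data set was run with the same K values."""
    ks = next(iter(results.values())).keys()
    averages = {
        k: sum(by_k[k] for by_k in results.values()) / len(results) for k in ks
    }
    best = max(averages, key=averages.get)
    return best, averages[best]

# Toy accuracies for two data sets and three K values, invented for illustration.
results = {
    "d1": {3: 0.80, 5: 0.85, 7: 0.82},
    "d2": {3: 0.70, 5: 0.66, 7: 0.68},
}
print(best_k_average(results))   # per-data-set best K, then averaged
print(fixed_k_average(results))  # one K shared by all data sets
```

Under this reading the "Best K" average can never be lower than the "Fixed K" average, which is consistent with the values reported in the summary tables.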
Table 8. Full results of Stage 2: Comparison with crisp nearest neighbor approaches
The next step of the study involves the use of the Friedman and Shaffer tests to contrast the results shown in the former tables. For the sake of generality, these statistical analyses have been carried out considering the results presented above, using the accuracy and kappa performance measures with a fixed K value.
Table 9 presents the results of both tests for the accuracy and kappa measures. First, a column shows the ranks obtained in the Friedman test: the lower the rank, the better the behavior of the corresponding algorithm. The p-values obtained by the Friedman test are 1.17E-6 and 5.59E-6 (for accuracy and kappa, respectively), which means that significant differences exist among the algorithms.
A Shaffer test is then conducted to characterize these differences. With 14 algorithms, 91 pairwise hypotheses can be established. For accuracy, 10 of them are significant at the α = 0.1 level of significance (4 at the α = 0.01 level). For kappa, 8 and 3 hypotheses are significant, respectively.
For each method, Table 9 shows the number of algorithms which are significantly improved by it (the "+" column) and the number of algorithms which are significantly improved or equal (the "+=" column).
An XLS sheet containing the Friedman ranks and the 91 hypotheses with their associated p-values can be downloaded here.
Table 9. Full results of Friedman and Shaffer tests (Accuracy and Kappa) - Stage 2: Comparison with crisp nearest neighbor approaches
Accuracy | | α level = 0.1 | | α level = 0.01 | | Kappa | | α level = 0.1 | | α level = 0.01 | |
---|---|---|---|---|---|---|---|---|---|---|---|
Algorithm | Rank | + | += | + | += | Algorithm | Rank | + | += | + | += |
IT2FKNN | 5.5795 | 4 | 18 | 2 | 18 | GAFuzzyKNN | 5.4773 | 4 | 18 | 1 | 18 |
GAFuzzyKNN | 5.8295 | 3 | 18 | 1 | 18 | IT2FKNN | 5.6591 | 2 | 18 | 1 | 18 |
FuzzyKNN | 5.8864 | 3 | 18 | 1 | 18 | FuzzyKNN | 5.7614 | 2 | 18 | 1 | 18 |
KSNN | 6.5455 | 0 | 18 | 0 | 18 | KSNN | 6.9318 | 0 | 18 | 0 | 18 |
KNNAdaptive | 6.5682 | 0 | 18 | 0 | 18 | KNNAdaptive | 7.2273 | 0 | 18 | 0 | 18 |
D-SKNN | 7.3977 | 0 | 18 | 0 | 18 | IF-KNN | 7.2386 | 0 | 18 | 0 | 18 |
IF-KNN | 7.4318 | 0 | 18 | 0 | 18 | D-SKNN | 7.5682 | 0 | 18 | 0 | 18 |
KNN | 7.6591 | 0 | 18 | 0 | 18 | PW | 7.8636 | 0 | 18 | 0 | 18 |
FENN | 7.8409 | 0 | 18 | 0 | 18 | KNN | 7.9432 | 0 | 18 | 0 | 18 |
IDIBL | 8.3295 | 0 | 18 | 0 | 18 | FRNN-FRS | 8.1705 | 0 | 18 | 0 | 18 |
PW | 8.5455 | 0 | 17 | 0 | 18 | IDIBL | 8.3977 | 0 | 17 | 0 | 18 |
ENN | 8.9659 | 0 | 15 | 0 | 18 | FENN | 8.4205 | 0 | 17 | 0 | 18 |
FRNN-FRS | 9.1023 | 0 | 15 | 0 | 17 | NSC | 8.8068 | 0 | 15 | 0 | 18 |
NSC | 9.3182 | 0 | 15 | 0 | 15 | ENN | 9.5341 | 0 | 15 | 0 | 15 |