Fuzzy Nearest Neighbor Algorithms: Taxonomy, Experimental analysis and Prospects - Complementary Material

This website contains complementary material to the paper:

J. Derrac, S. García and F. Herrera, Fuzzy Nearest Neighbor Algorithms: Taxonomy, Experimental analysis and Prospects. Information Sciences 260 (2014) 98-119, doi: 10.1016/j.ins.2013.10.038

This website is organized as follows:

  1. Abstract
  2. Survey of Fuzzy Nearest Neighbor Methods
  3. Experimental Framework
  4. Experimental Study
    1. Stage 1: Comparison of fuzzy nearest neighbor classifiers
    2. Stage 2: Comparison with crisp nearest neighbor approaches

Experimental Study

The full results of the experimental study are presented here. The study comprises two stages: (1) an analysis of the performance of the fuzzy nearest neighbor classifiers, and (2) a comparison of the best performing fuzzy nearest neighbor classifiers with crisp nearest neighbor based approaches.

Comparison of fuzzy nearest neighbor classifiers

In this first stage, the 18 fuzzy nearest neighbor classifiers are tested. Table 4 summarizes the results obtained, considering the following performance measures: accuracy and kappa in the training phase (considering the best K value/configuration for each method), accuracy and kappa in the test phase (considering the best K value/configuration for each method), accuracy and kappa in the test phase (considering a fixed K value/configuration for each method), and running time.
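To make the two quality measures concrete, the following minimal Python sketch computes accuracy and kappa from lists of true and predicted labels. It assumes that kappa refers to Cohen's kappa; the function names and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted label equals the true label."""
    hits = sum(t == p for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: product of the marginal class frequencies, summed over classes.
    p_chance = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_chance == 1.0:
        # Kappa is undefined when only one class occurs; returning 1.0 is a
        # convention chosen for this sketch only.
        return 1.0
    return (p_observed - p_chance) / (1.0 - p_chance)


# Toy example (hypothetical labels, not data from the study).
y_true = ["a", "a", "b", "b"]
y_pred = ["a", "b", "b", "b"]
print(accuracy(y_true, y_pred))     # 0.75
print(cohen_kappa(y_true, y_pred))  # 0.5
```

Kappa is lower than accuracy in this example because part of the observed agreement would be expected by chance alone, which is why the tables report both measures.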

Table 4. Summary results of Stage 1: Fuzzy nearest neighbor classifiers

Accuracy (Training) Accuracy (Test, Best K) Accuracy (Test, Fixed K) Kappa (Training) Kappa (Test, Best K) Kappa (Test, Fixed K) Running time
Method Average Method Average Method Average K value Method Average Method Average Method Average K value Method Time (s)
GAFuzzyKNN 0.8517 GAFuzzyKNN 0.8204 GAFuzzyKNN 0.8130 5 GAFuzzyKNN 0.7142 GAFuzzyKNN 0.6558 GAFuzzyKNN 0.6415 5 FuzzyNPC 0.0409
VWFuzzyKNN 0.8421 FuzzyKNN 0.8190 IT2FKNN 0.8111 7 VWFuzzyKNN 0.6987 FuzzyKNN 0.6524 IT2FKNN 0.6354 7 PosIBL 2.7363
FENN 0.8388 IT2FKNN 0.8181 FuzzyKNN 0.8110 5 FENN 0.6858 IT2FKNN 0.6484 FuzzyKNN 0.6366 7 FRNN-VQRS 2.9070
FuzzyKNN 0.8362 D-SKNN 0.8136 D-SKNN 0.7985 5 FuzzyKNN 0.6837 D-SKNN 0.6468 D-SKNN 0.6167 5 FRNN-FRS 3.0145
IT2FKNN 0.8350 IF-KNN 0.8062 IF-KNN 0.7972 3 IT2FKNN 0.6795 IF-KNN 0.6321 IF-KNN 0.6157 3 D-SKNN 3.0955
IF-KNN 0.8201 FENN 0.8009 FENN 0.7926 5 IF-KNN 0.6549 FENN 0.6150 FRNN-FRS 0.6130 3 FCMKNN 4.0568
D-SKNN 0.8112 PFKNN 0.7961 PosIBL 0.7883 * D-SKNN 0.6387 FRNN-FRS 0.6138 PosIBL 0.6071 * VWFuzzyKNN 5.6256
PFKNN 0.7928 PosIBL 0.7913 PFKNN 0.7877 9 PFKNN 0.6097 PosIBL 0.6134 FRNN-VQRS 0.6061 5 IFSKNN 6.4927
FRNN-FRS 0.7843 FRNN-FRS 0.7880 FRNN-FRS 0.7875 3 FRNN-FRS 0.6076 PFKNN 0.6130 FENN 0.5993 5 FuzzyKNN 6.5322
PosIBL 0.7836 VWFuzzyKNN 0.7869 FRNN-VQRS 0.7799 5 FRNN-VQRS 0.6055 FRNN-VQRS 0.6104 PFKNN 0.5992 7 CFKNN 6.7276
FRKNNA 0.7833 FRNN-VQRS 0.7825 VWFuzzyKNN 0.7775 3 PosIBL 0.5986 VWFuzzyKNN 0.5936 VWFuzzyKNN 0.5793 3 FENN 6.9731
FRNN-VQRS 0.7794 FRKNNA 0.7738 FRKNNA 0.7640 3 FRKNNA 0.5953 IFSKNN 0.5890 IFSKNN 0.5705 3 FRKNNA 7.2246
IFSKNN 0.7691 IFSKNN 0.7713 IFSKNN 0.7585 5 IFSKNN 0.5839 FRKNNA 0.5779 FRKNNA 0.5612 3 IF-KNN 7.9749
FRNN 0.7375 FRNN 0.7408 FRNN 0.7408 * FuzzyNPC 0.5285 FuzzyNPC 0.5079 FuzzyNPC 0.5079 * IFV-NP 11.1111
FuzzyNPC 0.7112 FuzzyNPC 0.6975 FuzzyNPC 0.6975 * CFKNN 0.5162 CFKNN 0.5000 CFKNN 0.4925 3 IT2FKNN 13.1984
CFKNN 0.7052 CFKNN 0.6931 CFKNN 0.6885 3 FCMKNN 0.4647 FCMKNN 0.4497 FRNN 0.4403 * FRNN 28.5193
FCMKNN 0.6549 FCMKNN 0.6469 FCMKNN 0.6397 5 IFV-NP 0.4466 FRNN 0.4403 FCMKNN 0.4390 3 PFKNN 725.8243
IFV-NP 0.6450 IFV-NP 0.6337 IFV-NP 0.6085 * FRNN 0.4364 IFV-NP 0.4299 IFV-NP 0.4153 * GAFuzzyKNN 1275.4415
 

Note that the values shown in the table are averages over the 44 data sets of the experimental study. Table 5 provides XLS sheets with the detailed results for each data set and configuration, with one XLS sheet per method. An XLS sheet with the complete results is also provided.

Table 5. Full results of Stage 1: Fuzzy nearest neighbor classifiers

Method XLS file Method XLS file Method XLS file Method XLS file Method XLS file Method XLS file
CFKNN iconExcel.jpg D-SKNN iconExcel.jpg FCMKNN iconExcel.jpg FENN iconExcel.jpg FRKNNA iconExcel.jpg FRNN iconExcel.jpg
FRNN-FRS iconExcel.jpg FRNN-VQRS iconExcel.jpg FuzzyKNN iconExcel.jpg FuzzyNPC iconExcel.jpg GAFuzzyKNN iconExcel.jpg IF-KNN iconExcel.jpg
IFSKNN iconExcel.jpg IFV-NP iconExcel.jpg IT2FKNN iconExcel.jpg PFKNN iconExcel.jpg PosIBL iconExcel.jpg VWFuzzyKNN iconExcel.jpg
Complete results iconExcel.jpg
 
The next step of the study applies the Friedman and Shaffer tests to contrast the results shown in the preceding tables. For the sake of generality, these statistical analyses consider the accuracy and kappa results obtained with a fixed K value.

Table 6 presents the results of both tests for the accuracy and kappa measures. For each measure, the first column shows the ranks obtained in the Friedman test; the lower the rank, the better the respective algorithm has performed. The p-values obtained by the Friedman test are 1.38E-10 and 1.08E-10 (for accuracy and kappa, respectively), which means that significant differences exist among the algorithms.
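As an illustration of how Friedman ranks of this kind arise, the following minimal Python sketch ranks the algorithms on each data set (rank 1 for the best result, tied results sharing the average rank), averages the ranks over the data sets, and computes the Friedman chi-square statistic without tie correction. The function names and the toy scores are hypothetical; this is not the code used in the study, and the p-value computation is omitted.

```python
def average_ranks(scores):
    """scores[d][a] is the result of algorithm a on data set d (higher is better).

    Returns the mean rank of each algorithm; rank 1 is the best and tied
    results share the average of the positions they occupy.
    """
    n_algorithms = len(scores[0])
    totals = [0.0] * n_algorithms
    for row in scores:
        # Algorithm indices ordered from best to worst result on this data set.
        order = sorted(range(n_algorithms), key=lambda a: -row[a])
        i = 0
        while i < n_algorithms:
            j = i
            # Extend j over the run of algorithms tied with position i.
            while j + 1 < n_algorithms and row[order[j + 1]] == row[order[i]]:
                j += 1
            shared = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
            for pos in range(i, j + 1):
                totals[order[pos]] += shared
            i = j + 1
    return [t / len(scores) for t in totals]


def friedman_statistic(mean_ranks, n_datasets):
    """Friedman chi-square statistic computed from mean ranks (no tie correction)."""
    k = len(mean_ranks)
    sum_sq = sum(r * r for r in mean_ranks)
    return 12 * n_datasets / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4)


# Toy example: 3 data sets x 3 algorithms (hypothetical accuracies).
toy_scores = [
    [0.9, 0.8, 0.7],
    [0.8, 0.9, 0.7],
    [0.9, 0.8, 0.8],
]
ranks = average_ranks(toy_scores)
print(ranks)
print(friedman_statistic(ranks, len(toy_scores)))
```

In the toy example the first algorithm obtains the lowest mean rank, which is the sense in which a lower rank indicates better behavior.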

The Shaffer test is then conducted to characterize these differences. With 18 algorithms, 153 pairwise hypotheses can be established. For accuracy, 66 of them are significant at the α = 0.1 level of significance (54 at the α = 0.01 level). For kappa, 61 and 50 hypotheses are significant, respectively.
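The number of pairwise hypotheses follows directly from the number of algorithms compared, since every unordered pair of algorithms yields one hypothesis. A short Python check (the function name is hypothetical):

```python
from itertools import combinations


def pairwise_hypotheses(n_algorithms):
    """Number of unordered pairs among n algorithms: n * (n - 1) / 2."""
    return n_algorithms * (n_algorithms - 1) // 2


# Stage 1 compares 18 algorithms; Stage 2 compares 7 fuzzy + 7 crisp = 14.
print(pairwise_hypotheses(18))  # 153
print(pairwise_hypotheses(14))  # 91

# The same count obtained by enumerating the pairs explicitly.
print(len(list(combinations(range(18), 2))))  # 153
```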

For each method, Table 6 shows the number of algorithms that it significantly improves (the "+" column) and the number of algorithms that it significantly improves or equals (the "+=" column).
An XLS sheet containing the Friedman ranks and the 153 hypotheses with their associated p-values can be downloaded here.
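One possible way to derive "+" and "+=" counts from the Friedman ranks and the adjusted pairwise p-values is sketched below in Python. The counting convention is an assumption made for this sketch: a method "improves" another when their difference is significant and its own rank is lower, and it "improves or equals" another whenever that other method does not significantly improve it. The exact convention behind the published counts may differ, and the names, ranks and p-values in the example are hypothetical.

```python
def count_wins(ranks, adjusted_p, alpha):
    """Count significant wins for every algorithm.

    ranks:       dict mapping algorithm name -> Friedman rank (lower is better)
    adjusted_p:  dict mapping frozenset({a, b}) -> adjusted p-value of the pair
    alpha:       level of significance
    Returns a dict mapping algorithm name -> (plus, plus_equal).
    """
    result = {}
    for a in ranks:
        plus = 0
        plus_equal = 0
        for b in ranks:
            if a == b:
                continue
            # Assumption: a hypothesis is rejected when p <= alpha.
            significant = adjusted_p[frozenset((a, b))] <= alpha
            if significant and ranks[a] < ranks[b]:
                plus += 1        # a significantly improves b
            if not (significant and ranks[b] < ranks[a]):
                plus_equal += 1  # b does not significantly improve a
        result[a] = (plus, plus_equal)
    return result


# Toy example with three hypothetical algorithms.
toy_ranks = {"A": 1.2, "B": 2.0, "C": 2.8}
toy_p = {
    frozenset(("A", "B")): 0.30,
    frozenset(("A", "C")): 0.01,
    frozenset(("B", "C")): 0.20,
}
print(count_wins(toy_ranks, toy_p, alpha=0.1))
# {'A': (1, 2), 'B': (0, 2), 'C': (0, 1)}
```

Under this convention, "+" can never exceed "+=", because every algorithm that a method significantly improves is also one that does not significantly improve it.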

Table 6. Full results of Friedman and Shaffer tests (Accuracy and Kappa) - Stage 1: Fuzzy nearest neighbor classifiers

  Accuracy (α level = 0.1) Accuracy (α level = 0.01)   Kappa (α level = 0.1) Kappa (α level = 0.01)
Algorithm Rank + += + += Algorithm Rank + += + +=
IT2FKNN 4.9659 10 18 9 18 FuzzyKNN 5.4091 9 18 8 18
FuzzyKNN 5.3409 10 18 8 18 IT2FKNN 5.4659 9 18 8 18
GAFuzzyKNN 5.3523 10 18 8 18 GAFuzzyKNN 5.5909 8 18 8 18
D-SKNN 6.6818 6 18 5 18 D-SKNN 7.3977 5 18 5 18
IF-KNN 7.0909 6 18 5 18 IF-KNN 7.4091 5 18 5 18
FENN 7.2614 5 18 5 18 PosIBL 8.1364 5 18 4 18
PFKNN 7.9318 5 18 4 18 FENN 8.3636 4 18 4 18
PosIBL 8.6023 4 18 3 18 PFKNN 8.4432 4 18 4 18
FRNN-FRS 9.2386 3 15 2 18 FRNN-FRS 8.6591 4 18 2 18
VWFuzzyKNN 9.7386 2 15 2 17 FRNN-VQRS 9.2500 4 16 2 18
FRNN-VQRS 9.9091 2 15 2 15 IFSKNN 10.0341 2 15 0 15
IFSKNN 10.1591 2 15 1 15 VWFuzzyKNN 10.0795 2 15 0 15
FRNN 10.8977 1 13 0 15 FuzzyNPC 10.7273 0 15 0 15
FuzzyNPC 12.1250 0 12 0 12 CFKNN 12.0000 0 12 0 13
FRKNNA 12.8409 0 11 0 11 FRKNNA 12.9773 0 8 0 10
CFKNN 13.2841 0 9 0 10 FRNN 13.0682 0 8 0 10
FCMKNN 14.5114 0 6 0 7 FCMKNN 13.9318 0 6 0 8
IFV-NP 15.0682 0 5 0 6 IFV-NP 14.0568 0 6 0 8
 

Comparison with crisp nearest neighbor approaches

In this second stage, the 7 best performing fuzzy nearest neighbor classifiers are compared with 7 state-of-the-art crisp nearest neighbor classifiers. The 7 fuzzy nearest neighbor classifiers selected are FuzzyKNN and the best performing method of each family (GAFuzzyKNN, IT2FKNN, D-SKNN, IF-KNN, FRNN-FRS and FENN).

Table 7 summarizes the results obtained, considering the following performance measures: accuracy and kappa in the training phase (considering the best K value/configuration for each method), accuracy and kappa in the test phase (considering the best K value/configuration for each method), accuracy and kappa in the test phase (considering a fixed K value/configuration for each method), and running time.

Table 7. Summary results of Stage 2: Comparison with crisp nearest neighbor approaches

Accuracy (Training) Accuracy (Test, Best K) Accuracy (Test, Fixed K) Kappa (Training) Kappa (Test, Best K) Kappa (Test, Fixed K) Running time
Method Average Method Average Method Average K value Method Average Method Average Method Average K value Method Time (s)
NSC 0.9617 GAFuzzyKNN 0.8204 GAFuzzyKNN 0.8130 5 NSC 0.9250 GAFuzzyKNN 0.6558 GAFuzzyKNN 0.6415 5 FRNN-FRS 3.0145
KSNN 0.8651 FuzzyKNN 0.8190 IT2FKNN 0.8111 7 KSNN 0.7438 FuzzyKNN 0.6524 FuzzyKNN 0.6366 7 D-SKNN 3.0955
GAFuzzyKNN 0.8517 IT2FKNN 0.8181 FuzzyKNN 0.8110 5 GAFuzzyKNN 0.7142 IT2FKNN 0.6484 IT2FKNN 0.6354 7 KNN 3.3452
IDIBL 0.8408 D-SKNN 0.8136 D-SKNN 0.7985 5 IDIBL 0.6906 D-SKNN 0.6468 D-SKNN 0.6167 5 NSC 4.8859
FENN 0.8388 KSNN 0.8098 IF-KNN 0.7972 3 FENN 0.6858 NSC 0.6379 KSNN 0.6160 3 ENN 5.0116
FuzzyKNN 0.8362 IF-KNN 0.8062 KSNN 0.7970 5 FuzzyKNN 0.6837 KSNN 0.6359 IF-KNN 0.6157 3 KNNAdaptive 6.1195
IT2FKNN 0.8350 NSC 0.8020 FENN 0.7926 5 IT2FKNN 0.6795 IF-KNN 0.6321 FRNN-FRS 0.6130 3 KSNN 6.2721
IF-KNN 0.8201 FENN 0.8009 IDIBL 0.7902 * PW 0.6638 FENN 0.6150 KNN 0.6028 7 FuzzyKNN 6.5322
PW 0.8151 KNN 0.7933 FRNN-FRS 0.7875 3 IF-KNN 0.6549 KNN 0.6143 FENN 0.5993 5 FENN 6.9731
ENN 0.8113 KNNAdaptive 0.7927 KNNAdaptive 0.7856 3 ENN 0.6396 FRNN-FRS 0.6138 PW 0.5955 * IF-KNN 7.9749
D-SKNN 0.8112 IDIBL 0.7902 KNN 0.7815 7 D-SKNN 0.6387 KNNAdaptive 0.6131 NSC 0.5814 * IT2FKNN 13.1984
KNNAdaptive 0.8024 ENN 0.7901 NSC 0.7801 * KNNAdaptive 0.6377 PW 0.6042 IDIBL 0.5807 * PW 22.7235
KNN 0.7901 FRNN-FRS 0.7880 PW 0.7793 * KNN 0.6089 IDIBL 0.5946 ENN 0.5740 3 IDIBL 409.3493
FRNN-FRS 0.7843 PW 0.7828 ENN 0.7784 5 FRNN-FRS 0.6076 ENN 0.5936 KNNAdaptive 0.5697 3 GAFuzzyKNN 1275.4415
 

Note that the values shown in the table are averages over the 44 data sets of the experimental study. Table 8 provides XLS sheets with the detailed results for each data set and configuration, with one XLS sheet per method. An XLS sheet with the complete results is also provided.

Table 8. Full results of Stage 2: Comparison with crisp nearest neighbor approaches

Method XLS file Method XLS file Method XLS file Method XLS file Method XLS file Method XLS file Method XLS file
D-SKNN iconExcel.jpg ENN iconExcel.jpg FENN iconExcel.jpg FRNN-FRS iconExcel.jpg FuzzyKNN iconExcel.jpg GAFuzzyKNN iconExcel.jpg IDIBL iconExcel.jpg
IF-KNN iconExcel.jpg IT2FKNN iconExcel.jpg KNN iconExcel.jpg KNNAdaptive iconExcel.jpg KSNN iconExcel.jpg NSC iconExcel.jpg PW iconExcel.jpg
Complete results iconExcel.jpg

The next step of the study applies the Friedman and Shaffer tests to contrast the results shown in the preceding tables. For the sake of generality, these statistical analyses consider the accuracy and kappa results obtained with a fixed K value.

Table 9 presents the results of both tests for the accuracy and kappa measures. For each measure, the first column shows the ranks obtained in the Friedman test; the lower the rank, the better the respective algorithm has performed. The p-values obtained by the Friedman test are 1.17E-6 and 5.59E-6 (for accuracy and kappa, respectively), which means that significant differences exist among the algorithms.

The Shaffer test is then conducted to characterize these differences. With 14 algorithms, 91 pairwise hypotheses can be established. For accuracy, 10 of them are significant at the α = 0.1 level of significance (4 at the α = 0.01 level). For kappa, 8 and 3 hypotheses are significant, respectively.

For each method, Table 9 shows the number of algorithms that it significantly improves (the "+" column) and the number of algorithms that it significantly improves or equals (the "+=" column).
An XLS sheet containing the Friedman ranks and the 91 hypotheses with their associated p-values can be downloaded here.

Table 9. Full results of Friedman and Shaffer tests (Accuracy and Kappa) - Stage 2: Comparison with crisp nearest neighbor approaches

  Accuracy (α level = 0.1) Accuracy (α level = 0.01)   Kappa (α level = 0.1) Kappa (α level = 0.01)
Algorithm Rank + += + += Algorithm Rank + += + +=
IT2FKNN 5.5795 4 18 2 18 GAFuzzyKNN 5.4773 4 18 1 18
GAFuzzyKNN 5.8295 3 18 1 18 IT2FKNN 5.6591 2 18 1 18
FuzzyKNN 5.8864 3 18 1 18 FuzzyKNN 5.7614 2 18 1 18
KSNN 6.5455 0 18 0 18 KSNN 6.9318 0 18 0 18
KNNAdaptive 6.5682 0 18 0 18 KNNAdaptive 7.2273 0 18 0 18
D-SKNN 7.3977 0 18 0 18 IF-KNN 7.2386 0 18 0 18
IF-KNN 7.4318 0 18 0 18 D-SKNN 7.5682 0 18 0 18
KNN 7.6591 0 18 0 18 PW 7.8636 0 18 0 18
FENN 7.8409 0 18 0 18 KNN 7.9432 0 18 0 18
IDIBL 8.3295 0 18 0 18 FRNN-FRS 8.1705 0 18 0 18
PW 8.5455 0 17 0 18 IDIBL 8.3977 0 17 0 18
ENN 8.9659 0 15 0 18 FENN 8.4205 0 17 0 18
FRNN-FRS 9.1023 0 15 0 17 NSC 8.8068 0 15 0 18
NSC 9.3182 0 15 0 15 ENN 9.5341 0 15 0 15