Fuzzy Nearest Neighbor Algorithms: Taxonomy, Experimental analysis and Prospects - Complementary Material

This website contains complementary material for the paper:

J. Derrac, S. García and F. Herrera, Fuzzy Nearest Neighbor Algorithms: Taxonomy, Experimental analysis and Prospects. Information Sciences 260 (2014) 98-119, doi: 10.1016/j.ins.2013.10.038

This website is organized as follows:

  1. Abstract
  2. Survey of Fuzzy Nearest Neighbor Methods
  3. Experimental Framework
    1. Data set partitions employed in the study
    2. Algorithms and parameters
    3. Source codes
  4. Experimental Study

Experimental framework

This section describes the experimental framework designed to conduct the experiments. It contains the data set partitions used in the experimental study, a summary of all the configurations considered for every method (both fuzzy nearest neighbor and crisp nearest neighbor classifiers), and the source code of the fuzzy nearest neighbor methods analyzed.

Data set partitions employed in the study

For the experimental study, we selected 44 data sets available at KEEL-datasets. All experiments use 10-fold cross-validation: each data set is split randomly into 10 folds, each containing 10% of its patterns. In each run, nine folds are used for training and the remaining one for testing. Additionally, instances with missing values were discarded before generating the folds.
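The partitioning procedure described above can be illustrated with a minimal Java sketch. It is not the code used to generate the published partitions (those should be downloaded from Table 1 to reproduce the study). The class name `TenFoldSplit`, the method names, the fixed seed, and the encoding of missing values as NaN are all assumptions made for illustration, and the split is a plain random one with no stratification.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class TenFoldSplit {

    /** Drops every row that contains a missing value (encoded here as NaN, an assumption). */
    public static List<double[]> dropMissing(List<double[]> rows) {
        List<double[]> complete = new ArrayList<>();
        for (double[] row : rows) {
            boolean ok = true;
            for (double v : row) {
                if (Double.isNaN(v)) { ok = false; break; }
            }
            if (ok) complete.add(row);
        }
        return complete;
    }

    /** Shuffles the indices 0..n-1 and deals them round-robin into `folds` disjoint groups. */
    public static List<List<Integer>> makeFolds(int n, int folds, long seed) {
        List<Integer> idx = new ArrayList<>();
        for (int i = 0; i < n; i++) idx.add(i);
        Collections.shuffle(idx, new Random(seed));
        List<List<Integer>> out = new ArrayList<>();
        for (int f = 0; f < folds; f++) out.add(new ArrayList<>());
        for (int i = 0; i < n; i++) out.get(i % folds).add(idx.get(i));
        return out;
    }

    public static void main(String[] args) {
        int n = 150; // e.g. a data set with 150 complete instances
        List<List<Integer>> folds = makeFolds(n, 10, 42L);
        // Each fold serves once as the test set; the other nine form the training set.
        for (int test = 0; test < folds.size(); test++) {
            int testSize = folds.get(test).size();
            System.out.println("run " + test + ": train=" + (n - testSize) + ", test=" + testSize);
        }
    }
}
```

When the number of instances is not a multiple of 10, the round-robin dealing leaves fold sizes differing by at most one instance.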

Table 1 summarizes the properties of the selected data sets. It shows, for each data set, the number of instances (#Ins.), the number of attributes (#At.), and the number of classes (#Cl.). The last column contains a link for downloading the 10-fold cross-validation partitions of each data set in KEEL format. All data sets may be downloaded by clicking here.

Table 1. Summary description of the data sets

Data set #Ins. #At. #Cl. Download Data set #Ins. #At. #Cl. Download
Appendicitis 106 7 2 zip Penbased 10992 16 10 zip
Balance 625 4 3 zip Phoneme 5404 5 2 zip
Banana 5300 2 2 zip Pima 768 8 2 zip
Bands 539 19 2 zip Ring 7400 20 2 zip
Bupa 345 6 2 zip Satimage 6435 36 7 zip
Cleveland 297 13 5 zip Segment 2310 19 7 zip
Dermatology 358 34 6 zip Sonar 208 60 2 zip
Ecoli 336 7 8 zip Spambase 4597 57 2 zip
Glass 214 9 7 zip Spectfheart 267 44 2 zip
Haberman 306 3 2 zip Tae 151 5 3 zip
Hayes-roth 160 4 3 zip Texture 5500 40 11 zip
Heart 270 13 2 zip Thyroid 7200 21 3 zip
Hepatitis 80 19 2 zip Titanic 2201 3 2 zip
Ionosphere 351 33 2 zip Twonorm 7400 20 2 zip
Iris 150 4 3 zip Vehicle 946 18 4 zip
Led7Digit 500 7 10 zip Vowel 990 13 11 zip
Mammographic 830 5 2 zip Wdbc 569 30 2 zip
Marketing 6876 13 9 zip Wine 178 13 3 zip
Monk-2 432 6 2 zip Winequality-red 1599 11 11 zip
Movement 360 90 15 zip Winequality-white 4898 11 11 zip
NewThyroid 215 5 3 zip Wisconsin 683 9 2 zip
Page-blocks 5472 10 5 zip Yeast 1484 8 10 zip

Algorithms and parameters

Eighteen fuzzy nearest neighbor classifiers have been included in the experimental study. Table 2 lists them and summarizes the parameter configurations considered (check this link for a full description of each method).

Table 2. Parameter specification of the fuzzy nearest neighbor classifiers.

Algorithm Parameters
  CFKNN    k = {3,5,7,9}, α = 0.6
  D-SKNN    k = {3,5,7,9}, α = 0.95, β = 1.0
  FCMKNN    k = {3,5,7,9}, M = 2.0, Iterations = 50, δ = 0.01
  FENN    k = {3,5,7,9}, k (edition) = 5, kInit = 3
  FRKNNA    k = {3,5,7,9}, kInit = 3
  FRNN    -
  FRNN-FRS    k = {3,5,7,9}
  FRNN-VQRS    k = {3,5,7,9}
  FuzzyKNN    k = {3,5,7,9}, M = 2.0, kInit = 3
  FuzzyNPC    M = 2.0
  GAFuzzyKNN    k = {3,5,7,9}, M = 2.0, kInit = 3, Population size = 50, Generations = 10, Crossover probability = 0.8, Mutation probability = 0.01   
  IF-KNN    k = {3,5,7,9}, mA = 0.6, vA = 0.4, mR = 0.3, vR = 0.7, kInit = 3
  IFSKNN    k = {3,5,7,9}
  IFV-NP    Threshold = {0.5, 0.6, 0.7, 0.8}
  IT2FKNN    k = {3,5,7,9}, M = 2.0, kMax (maximum kInit) = 9
  PFKNN    k = {3,5,7,9}
  PosIBL    β = {0.1,0.2,0.5,0.7}
  VWFuzzyKNN    k = {3,5,7,9}, kInit = 3
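To make the role of the parameters k, M and kInit more concrete, the following Java sketch shows a classical fuzzy k-NN scheme in the style of Keller et al. It assumes, without confirmation from this page, that the FuzzyKNN entry of Table 2 follows that scheme: kInit neighbors are used to initialize soft class memberships for the training instances, k neighbors vote on a query, and M is the exponent that controls how strongly the votes are weighted by distance. The class name `FuzzyKnnSketch` and its methods are hypothetical and are not taken from the KEEL package.

```java
import java.util.Arrays;
import java.util.Comparator;

public class FuzzyKnnSketch {

    /** Euclidean distance between two points of equal length. */
    public static double dist(double[] a, double[] b) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) s += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.sqrt(s);
    }

    /** Indices of the k training points closest to q; index `skip` is excluded (pass -1 to keep all). */
    public static Integer[] nearest(double[][] x, double[] q, int k, int skip) {
        Integer[] idx = new Integer[x.length - (skip >= 0 ? 1 : 0)];
        int p = 0;
        for (int i = 0; i < x.length; i++) if (i != skip) idx[p++] = i;
        Arrays.sort(idx, Comparator.comparingDouble(i -> dist(x[i], q)));
        return Arrays.copyOf(idx, Math.min(k, idx.length));
    }

    /** Soft memberships: 0.51 for the instance's own class plus 0.49 shared among its kInit neighbors' classes. */
    public static double[][] initMemberships(double[][] x, int[] y, int classes, int kInit) {
        double[][] u = new double[x.length][classes];
        for (int i = 0; i < x.length; i++) {
            Integer[] nn = nearest(x, x[i], kInit, i);
            for (int j : nn) u[i][y[j]] += 0.49 / nn.length;
            u[i][y[i]] += 0.51;
        }
        return u;
    }

    /** Class memberships of query q from its k neighbors, weighted by 1 / d^(2/(m-1)). */
    public static double[] classify(double[][] x, double[][] u, double[] q, int k, double m) {
        Integer[] nn = nearest(x, q, k, -1);
        double[] out = new double[u[0].length];
        double norm = 0.0;
        for (int j : nn) {
            double d = Math.max(dist(x[j], q), 1e-12); // guard against a zero distance
            double w = 1.0 / Math.pow(d, 2.0 / (m - 1.0));
            norm += w;
            for (int c = 0; c < out.length; c++) out[c] += w * u[j][c];
        }
        for (int c = 0; c < out.length; c++) out[c] /= norm;
        return out;
    }

    public static void main(String[] args) {
        // Toy data: two well-separated clusters, labelled 0 and 1.
        double[][] x = {{0, 0}, {0, 1}, {1, 0}, {1, 1}, {5, 5}, {5, 6}, {6, 5}, {6, 6}};
        int[] y = {0, 0, 0, 0, 1, 1, 1, 1};
        double[][] u = initMemberships(x, y, 2, 3);                        // kInit = 3
        double[] memberships = classify(x, u, new double[]{0.5, 0.5}, 3, 2.0); // k = 3, M = 2.0
        System.out.println(Arrays.toString(memberships));
    }
}
```

The predicted class is the one with the largest membership. With M = 2.0 the weights reduce to inverse squared distances, and larger values of M flatten the weighting so that all k neighbors contribute more evenly.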

Additionally, seven crisp nearest neighbor classifiers have been included in the experimental study for comparison purposes. Table 3 lists them and summarizes the parameter configurations considered (a description of each method is provided in the paper).

Table 3. Parameter specification of the crisp nearest neighbor classifiers.

Algorithm Parameters
  k-NN    k = {3,5,7,9}
  ENN    k = {3,5,7,9}
  IDIBL    MaxK = 30, First stage iterations = 4, Second stage iterations = 8  
  KNNAdaptive    k = {3,5,7,9}
  KSNN    k = {3,5,7,9}
  NSC    Variance = {0.01,0.1,1,10}, Q = 1, K = 3, NoChange = 100, Epoch = 100  
  PW    Beta = {0.5, 2, 8, 32}, ρ = 0.001, ε = 0.001  

Source codes

This section includes the source code of the algorithms used in the study. Please note that the code is provided as is, with no guarantees whatsoever, in the hope that it may prove useful in future research.

The methods have been developed in Java, following the guidelines of the KEEL software framework. You can download the Fuzzy Instance Based Learning package here, ready to be included in the standard distribution of the KEEL Software Tool (see the README.txt file for installation instructions).

You can download the open source version of the KEEL Software Tool here. Note that both the KEEL Software Tool and the Fuzzy Instance Based Learning package are released under the terms of the GPLv3 license, meaning that both the tool and the package are open source.