Fuzzy Nearest Neighbor Algorithms: Taxonomy, Experimental analysis and Prospects - Complementary Material

This website contains complementary material for the paper:

J. Derrac, S. García and F. Herrera, Fuzzy Nearest Neighbor Algorithms: Taxonomy, Experimental analysis and Prospects. Information Sciences 260 (2014) 98-119, doi: 10.1016/j.ins.2013.10.038

The material is organized into the sections below.

Experimental framework

This section describes the contents of the experimental framework designed to conduct the experiments. It contains the data set partitions used in the experimental study, a summary of all the configurations considered for every method (both fuzzy nearest neighbor and crisp nearest neighbor classifiers), and the source code of the fuzzy nearest neighbor methods analyzed.

Data set partitions employed in the paper

For the experimental study, we selected 44 data sets available at KEEL-datasets. All experiments follow a 10-fold cross-validation model: each data set is split randomly into 10 folds, each containing 10% of its patterns, so that nine folds are used for training and one for testing. Additionally, instances with missing values were discarded before generating the folds.
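
The partitioning procedure described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code used to generate the published partitions: the class name TenFoldSplit, the encoding of missing values as null, the fixed seed and the plain (non-stratified) shuffle are all hypothetical choices made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of the KEEL package.
class TenFoldSplit {

    /** Keeps only the instances without missing values (assumed here to be encoded as null). */
    static List<Double[]> dropMissing(List<Double[]> instances) {
        List<Double[]> complete = new ArrayList<>();
        for (Double[] row : instances) {
            boolean hasMissing = false;
            for (Double value : row) {
                if (value == null) {
                    hasMissing = true;
                    break;
                }
            }
            if (!hasMissing) {
                complete.add(row);
            }
        }
        return complete;
    }

    /** Shuffles the instance indices and deals them into folds of near-equal size. */
    static List<List<Integer>> split(int numInstances, int folds, long seed) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < numInstances; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, new Random(seed));

        List<List<Integer>> result = new ArrayList<>();
        for (int f = 0; f < folds; f++) {
            result.add(new ArrayList<>());
        }
        for (int i = 0; i < numInstances; i++) {
            result.get(i % folds).add(indices.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        int numInstances = 150; // e.g. the size of Iris in Table 1
        List<List<Integer>> folds = split(numInstances, 10, 42L);
        // Each fold is used once for testing; the other nine form the training set.
        for (int test = 0; test < folds.size(); test++) {
            int testSize = folds.get(test).size();
            System.out.println("fold " + test + ": train=" + (numInstances - testSize)
                    + ", test=" + testSize);
        }
    }
}
```

When the number of instances is not a multiple of 10, the folds produced by this sketch differ in size by at most one instance.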

Table 1 summarizes the properties of the selected data sets. It shows, for each data set, the number of instances (#Ins.), the number of attributes (#At.), and the number of classes (#Cl.). The last column of this table contains a link for downloading the 10-fold cross-validation partitions of each data set in KEEL format. All data sets may be downloaded by clicking here.
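
For readers who want to load a partition outside of KEEL, a minimal reader might look like the sketch below. It assumes the usual KEEL layout: header lines starting with "@" (such as @relation, @attribute and @data) followed by comma-separated rows whose last value is the class label. It also assumes that a missing value appears as "?" or "<null>". The class KeelDataSketch and its method are hypothetical, and the KEEL tool's own parser should be preferred whenever it is available.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader, not part of the KEEL package.
class KeelDataSketch {

    /** Returns the data rows of a KEEL-style file, skipping headers and incomplete rows. */
    static List<String[]> parseRows(String content) {
        List<String[]> rows = new ArrayList<>();
        for (String line : content.split("\\R")) {
            String trimmed = line.trim();
            // Header lines (assumed to start with '@') and blank lines carry no instances.
            if (trimmed.isEmpty() || trimmed.startsWith("@")) {
                continue;
            }
            String[] values = trimmed.split("\\s*,\\s*");
            boolean hasMissing = false;
            for (String value : values) {
                // Assumed missing-value markers.
                if (value.equals("?") || value.equals("<null>")) {
                    hasMissing = true;
                }
            }
            if (!hasMissing) {
                rows.add(values);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        String content = "@relation toy\n"
                + "@attribute a real [0.0, 1.0]\n"
                + "@attribute b real [0.0, 1.0]\n"
                + "@attribute class {yes, no}\n"
                + "@data\n"
                + "0.1, 0.2, yes\n"
                + "?, 0.4, no\n"
                + "0.7, 0.8, no\n";
        for (String[] row : parseRows(content)) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

Since instances with missing values were discarded before the folds were generated, the missing-value filter should only matter when reading raw data sets rather than the partitions offered here.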

Table 1. Summary description of the data sets

Data set	#Ins.	#At.	#Cl.	Data set	#Ins.	#At.	#Cl.
Appendicitis	106	7	2	Penbased	10992	16	10
Balance	625	4	3	Phoneme	5404	5	2
Banana	5300	2	2	Pima	768	8	2
Bands	539	19	2	Ring	7400	20	2
Bupa	345	6	2	Satimage	6435	36	7
Cleveland	297	13	5	Segment	2310	19	7
Dermatology	358	34	6	Sonar	208	60	2
Ecoli	336	7	8	Spambase	4597	57	2
Glass	214	9	7	Spectfheart	267	44	2
Haberman	306	3	2	Tae	151	5	3
Hayes-roth	160	4	3	Texture	5500	40	11
Heart	270	13	2	Thyroid	7200	21	3
Hepatitis	80	19	2	Titanic	2201	3	2
Ionosphere	351	33	2	Twonorm	7400	20	2
Iris	150	4	3	Vehicle	946	18	4
Led7Digit	500	7	10	Vowel	990	13	11
Mammographic	830	5	2	Wdbc	569	30	2
Marketing	6876	13	9	Wine	178	13	3
Monk-2	432	6	2	Winequality-red	1599	11	11
Movement	360	90	15	Winequality-white	4898	11	11
NewThyroid	215	5	3	Wisconsin	683	9	2
Page-blocks	5472	10	5	Yeast	1484	8	10

Algorithms and parameters

Eighteen fuzzy nearest neighbor classifiers have been included in the experimental study. Table 2 lists them and summarizes the parameter configurations considered (check this link for a full description of each method).

Table 2. Parameter specification of the fuzzy nearest neighbor classifiers.

Algorithm	Parameters
CFKNN	k = {3,5,7,9}, α = 0.6
D-SKNN	k = {3,5,7,9}, α = 0.95, β = 1.0
FCMKNN	k = {3,5,7,9}, M = 2.0, Iterations = 50, δ = 0.01
FENN	k = {3,5,7,9}, k (edition) = 5, kInit = 3
FRKNNA	k = {3,5,7,9}, kInit = 3
FRNN	-
FRNN-FRS	k = {3,5,7,9}
FRNN-VQRS	k = {3,5,7,9}
FuzzyKNN	k = {3,5,7,9}, M = 2.0, kInit = 3
FuzzyNPC	M = 2.0
GAFuzzyKNN	k = {3,5,7,9}, M = 2.0, kInit = 3, Population size = 50, Generations = 10, Crossover probability = 0.8, Mutation probability = 0.01
IF-KNN	k = {3,5,7,9}, mA = 0.6, vA = 0.4, mR = 0.3, vR = 0.7, kInit = 3
IFSKNN	k = {3,5,7,9}
IFV-NP	Threshold = {0.5, 0.6, 0.7, 0.8}
IT2FKNN	k = {3,5,7,9}, M = 2.0, kMax (maximum kInit) = 9
PFKNN	k = {3,5,7,9}
PosIBL	β = {0.1,0.2,0.5,0.7}
VWFuzzyKNN	k = {3,5,7,9}, kInit = 3
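
To illustrate how the parameters k, M and kInit listed for FuzzyKNN typically interact, the following sketch follows the classical fuzzy k-NN formulation: class memberships of the training instances are initialized from their kInit nearest neighbors, and a query is then classified by a distance-weighted average of the memberships of its k nearest neighbors, with weights 1/d^(2/(M-1)). The class FuzzyKnnSketch, the Euclidean distance, the 0.51/0.49 initialization constants and the toy data are assumptions made for the example. This is not the code of the Fuzzy Instance Based Learning package.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration of a fuzzy k-NN rule, not the KEEL implementation.
class FuzzyKnnSketch {

    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += (a[i] - b[i]) * (a[i] - b[i]);
        }
        return Math.sqrt(sum);
    }

    /** Indices of the k training instances nearest to the query; 'skip' (or -1) is excluded. */
    static Integer[] nearest(double[][] train, double[] query, int k, int skip) {
        Integer[] order = new Integer[train.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble((Integer i) ->
                i == skip ? Double.POSITIVE_INFINITY : distance(train[i], query)));
        int available = skip >= 0 ? train.length - 1 : train.length;
        return Arrays.copyOf(order, Math.min(k, available));
    }

    /** Initial memberships: 0.51 for the own class plus a 0.49 share spread over neighbor classes. */
    static double[][] initMemberships(double[][] train, int[] labels, int numClasses, int kInit) {
        double[][] u = new double[train.length][numClasses];
        for (int i = 0; i < train.length; i++) {
            Integer[] nn = nearest(train, train[i], kInit, i);
            int[] counts = new int[numClasses];
            for (int j : nn) {
                counts[labels[j]]++;
            }
            for (int c = 0; c < numClasses; c++) {
                double share = 0.49 * counts[c] / nn.length;
                u[i][c] = (c == labels[i]) ? 0.51 + share : share;
            }
        }
        return u;
    }

    /** Class memberships of a query as a distance-weighted average over its k nearest neighbors. */
    static double[] memberships(double[][] train, double[][] u, double[] query, int k, double m) {
        Integer[] nn = nearest(train, query, k, -1);
        double[] result = new double[u[0].length];
        double weightSum = 0.0;
        for (int j : nn) {
            double d = Math.max(distance(train[j], query), 1e-12); // guard against zero distance
            double w = 1.0 / Math.pow(d, 2.0 / (m - 1.0));
            weightSum += w;
            for (int c = 0; c < result.length; c++) {
                result[c] += u[j][c] * w;
            }
        }
        for (int c = 0; c < result.length; c++) {
            result[c] /= weightSum;
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] train = {{0, 0}, {0, 1}, {1, 0}, {5, 5}, {5, 6}, {6, 5}};
        int[] labels = {0, 0, 0, 1, 1, 1};
        double[][] u = initMemberships(train, labels, 2, 3);          // kInit = 3
        double[] result = memberships(train, u, new double[]{0.5, 0.5}, 3, 2.0); // k = 3, M = 2.0
        System.out.println(Arrays.toString(result));
    }
}
```

The predicted class is the one with the largest membership. The memberships of a query sum to one, so they can also be read as a degree of confidence in each class.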

Additionally, seven crisp nearest neighbor classifiers have been included in the experimental study for comparison purposes. Table 3 lists them and summarizes the parameter configurations considered (a description of each method is provided in the paper).

Table 3. Parameter specification of the crisp nearest neighbor classifiers.

Algorithm	Parameters
k-NN	k = {3,5,7,9}
ENN	k = {3,5,7,9}
IDIBL	MaxK = 30, First stage iterations = 4, Second stage iterations = 8
KNNAdaptive	k = {3,5,7,9}
KSNN	k = {3,5,7,9}
NSC	Variance = {0.01,0.1,1,10}, Q = 1, K = 3, NoChange = 100, Epoch = 100
PW	Beta = {0.5, 2, 8, 32}, ρ = 0.001, ε = 0.001

Source code

This section includes the source code for the algorithms used in the study. Please note that the code is provided as-is, with no guarantees whatsoever, but in the hope that it may prove useful in future research.

The methods have been developed in Java, under the guidelines of the KEEL software framework. You can download the Fuzzy Instance Based Learning package here, ready to be included into the standard distribution of the KEEL Software Tool (see the README.txt file for installation instructions).

You can download the open source version of the KEEL Software Tool here. Note that both the KEEL Software Tool and the Fuzzy Instance Based Learning package are released under the terms of the GPLv3 license, meaning that both the tool and the package are open source.
