Integrating Instance Selection, Instance Weighting and Feature Weighting for Nearest Neighbor Classifiers by Co-evolutionary Algorithms - Complementary Material
This Website contains complementary material to the paper:
J. Derrac, I.Triguero, S. García and F.Herrera, Integrating Instance Selection, Instance Weighting and Feature Weighting for Nearest Neighbor Classifiers by Co-evolutionary Algorithms. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 42:5 (2012) 1383-1397, doi: 10.1109/TSMCB.2012.2191953
This website is organized according to the following summary:
Experimental framework
Data-sets partitions employed in the paper
For the experimental study, we have selected thirty small data sets and eight large data sets from KEEL-datasets. In all the experiments, we have adopted a 10-fold cross-validation model: each data set is split randomly into 10 folds, each containing 10% of its patterns. Nine folds are used for training and the remaining one for test. The results reported have been obtained through 5 rounds of the complete 10-fold cross-validation per data set (i.e., five runs for each possible test partition).
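As a quick illustration of this validation scheme, the sketch below (not the KEEL partitioning code; the partitions actually used in the paper are the pre-generated ones linked below) deals the instance indices of a data set into 10 random folds, each of which acts once as test set while the other nine form the training set. Class and method names are ours.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Illustrative sketch only: shuffle the n instance indices and deal them into 10 folds.
// Repeating the procedure with 5 different seeds gives the 5 rounds used in the paper.
public class TenFoldSketch {

    public static List<List<Integer>> tenFolds(int n, long seed) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, new Random(seed));

        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < 10; f++) {
            folds.add(new ArrayList<Integer>());
        }
        for (int i = 0; i < n; i++) {
            folds.get(i % 10).add(indices.get(i));
        }
        return folds;
    }
}
```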
Tables 1 and 2 summarize the properties of the selected data sets. They show, for each data set, the number of examples (#Ex.), the number of attributes (#Atts.), and the number of classes (#Cl.). The last column contains a link for downloading the 10-fold cross-validation partitions of each data set in KEEL format. All data sets may also be downloaded at once by clicking here.
Table 1. Summary Description for Small data sets
Table 2. Summary Description for Large data sets
Algorithms and parameters
Several classification methods, evolutionary and non-evolutionary, have been selected to perform an exhaustive study of the capabilities of CIW-NN. We briefly describe them here:
1-NN
The 1-NN rule is used as a performance baseline that the rest of the methods are expected to improve upon.
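For reference, a minimal, illustrative Java sketch of the 1-NN rule with Euclidean distance (not the exact implementation used in the experiments):

```java
// Minimal 1-NN sketch: the query is assigned the class of its closest training
// instance under (squared) Euclidean distance.
public class OneNN {

    public static int classify(double[][] train, int[] labels, double[] query) {
        int nearest = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < train.length; i++) {
            double dist = 0.0;
            for (int j = 0; j < query.length; j++) {
                double diff = train[i][j] - query[j];
                dist += diff * diff;            // squared distance preserves the ranking
            }
            if (dist < bestDist) {
                bestDist = dist;
                nearest = i;
            }
        }
        return labels[nearest];
    }
}
```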
IS-CHC, FW-SSGA and IW-SSGA
These methods follow exactly the same setup as the corresponding populations of CIW-NN, except that the Accuracy Rate component of their fitness function is computed separately.
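For the IS population (IS-CHC), this fitness typically balances accuracy and reduction through the α parameter of Table 3. A minimal sketch, assuming the usual formulation fitness = α · accuracy + (1 − α) · reduction (the method and variable names are ours):

```java
// Hedged sketch of an alpha-weighted IS fitness: a trade-off between the accuracy
// rate obtained with the selected subset and the reduction rate achieved.
public class ISFitnessSketch {

    public static double fitness(double accuracyRate, int selectedInstances,
                                 int totalInstances, double alpha) {
        double reductionRate = 1.0 - ((double) selectedInstances / totalInstances);
        return alpha * accuracyRate + (1.0 - alpha) * reductionRate;
    }
}
```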
SSMA
S. García, J.R. Cano, F. Herrera, “A Memetic Algorithm for Evolutionary Prototype Selection: A Scaling Up Approach”, Pattern Recognition, vol. 41, no. 8, pp. 2693-2709, 2008.
A Steady-State Memetic Algorithm specifically designed for prototype selection. This evolutionary IS method includes a meme optimization mechanism (a local search procedure) able to improve the accuracy achieved by the SSGA and to avoid premature convergence. Moreover, it offers a high reduction capability and a good behaviour when tackling large problems.
PW, CW and CPW
R. Paredes and E. Vidal, “Learning weighted metrics to minimize nearest-neighbor classification error”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 7, pp. 1100–1110, 2006.
Three gradient descent based algorithms developed with the aim of minimizing a performance index that approximates the leave-one-out error over the training set. Weights may be specified for each instance (Prototype Weighting, PW), for each combination of feature and class (Class Weighting, CW), or both (Class and Prototype Weighting, CPW).
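As an illustration of the class-weighting scheme (CW), the sketch below computes a weighted Euclidean distance where each weight depends on the class of the stored prototype and on the feature index; PW would instead use one weight per prototype, and CPW combines both. The code and its names are ours, not the authors' implementation.

```java
// Illustrative class-weighted (CW) distance: w[c][j] is the weight of feature j
// when comparing the query against a prototype of class c.
public class ClassWeightedDistance {

    public static double distance(double[] query, double[] prototype,
                                  int prototypeClass, double[][] w) {
        double sum = 0.0;
        for (int j = 0; j < query.length; j++) {
            double diff = w[prototypeClass][j] * (query[j] - prototype[j]);
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }
}
```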
WDNN
M. Z. Jahromi, E. Parvinnia, and R. John, "A method of learning weighted similarity function to improve the performance of nearest neighbor", Information Sciences, vol. 179, no. 17, pp. 2964–2973, 2009.
A novel IW method which iteratively searches, for each training instance, for the best weight that minimizes the leave-one-out error over the training set. A weight of 0.0 can be assigned to any instance, which means that it is discarded from the data set. Thus, this method can be regarded as a simultaneous IS and IW method.
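The sketch below illustrates, under our own simplified formulation, how such instance weights can act in nearest neighbor classification: the distance to each prototype is scaled by the inverse of its weight, so a weight of 0.0 effectively removes the instance.

```java
// Illustrative instance-weighted 1-NN in the spirit of WDNN (simplified formulation):
// the distance to prototype i is divided by w[i]; instances with weight 0.0 are skipped.
public class InstanceWeightedNN {

    public static int classify(double[][] train, int[] labels, double[] w, double[] query) {
        int nearest = -1;
        double bestScore = Double.POSITIVE_INFINITY;
        for (int i = 0; i < train.length; i++) {
            if (w[i] <= 0.0) {
                continue;                           // weight 0.0: instance discarded
            }
            double dist = 0.0;
            for (int j = 0; j < query.length; j++) {
                double diff = train[i][j] - query[j];
                dist += diff * diff;
            }
            double score = Math.sqrt(dist) / w[i];  // lower weight -> larger effective distance
            if (score < bestScore) {
                bestScore = score;
                nearest = i;
            }
        }
        return nearest < 0 ? -1 : labels[nearest];
    }
}
```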
TS/KNN
M. A. Tahir, A. Bouridane, and F. Kurugollu, "Simultaneous feature selection and feature weighting using hybrid tabu search/k-nearest neighbor classifier", Pattern Recognition Letters, vol. 28, no. 4, pp. 438–446, 2007.
A Tabu Search based method for simultaneous Feature Selection and Feature Weighting, which encodes in its solutions the current set of selected features (binary codification), the current set of weights assigned to the features, and the best value of k found for the k-NN classifier. Furthermore, this method uses the Fuzzy k-NN classifier to avoid ties in the classification process.
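A possible, purely illustrative encoding of a TS/K-NN solution as described above (the field names are ours):

```java
// Illustrative TS/K-NN solution encoding: a binary feature mask, a real-valued
// weight per feature, and the value of k used by the (fuzzy) k-NN classifier.
public class TsKnnSolution {
    boolean[] selectedFeatures;   // binary codification of the selected features
    double[] featureWeights;      // weights assigned to the features
    int k;                        // best k found for the k-NN classifier
}
```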
ReliefF
I. Kononenko, "Estimating attributes: Analysis and extensions of RELIEF", Proceedings of the European Conference on Machine Learning (ECML'94), Lecture Notes in Computer Science, vol. 784, pp. 171–182, 1994.
The first Relief-based method adapted to perform FW. The weights computed by the original Relief algorithm are not binarized to {0,1}; instead, they are employed directly as the final weights of the k-NN classifier.
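The sketch below contrasts the two uses of Relief weights mentioned above: binarizing them against a threshold (feature selection) versus plugging the real-valued weights directly into a weighted distance (the FW variant used here). The threshold and the names are illustrative.

```java
// Illustrative contrast: binarized Relief weights (feature selection) vs. real-valued
// weights used directly inside the k-NN distance (feature weighting).
public class ReliefWeightUsage {

    public static double[] binarize(double[] reliefWeights, double threshold) {
        double[] mask = new double[reliefWeights.length];
        for (int j = 0; j < reliefWeights.length; j++) {
            mask[j] = reliefWeights[j] >= threshold ? 1.0 : 0.0;
        }
        return mask;
    }

    public static double weightedDistance(double[] a, double[] b, double[] weights) {
        double sum = 0.0;
        for (int j = 0; j < a.length; j++) {
            double diff = weights[j] * (a[j] - b[j]);
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }
}
```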
MI
D. Wettschereck, D. W. Aha, and T. Mohri, "A review and empirical evaluation of feature weighting methods for a class of lazy learning algorithms", Artificial Intelligence Review, vol. 11, pp. 273–314, 1997.
Mutual Information between each feature and the class can be used successfully as a weighting factor for k-NN based algorithms. In our implementation, we have followed the indications given by Wettschereck et al., using Fayyad and Irani's discretizer to allow this method to handle continuous-valued problems.
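A hedged sketch of the weighting step follows: assuming the features have already been discretized (the discretization step itself is not shown), the mutual information between a feature and the class is estimated from frequency counts and used directly as that feature's weight.

```java
// Hedged sketch: estimate I(feature; class) from a discretized feature and use it as
// the feature's weight. The prior Fayyad-Irani discretization is assumed to be done.
public class MutualInformationWeight {

    // featureBins[i] and classLabels[i] hold the discretized value and class of instance i.
    public static double mutualInformation(int[] featureBins, int[] classLabels,
                                           int numBins, int numClasses) {
        int n = featureBins.length;
        double[][] joint = new double[numBins][numClasses];
        double[] pFeature = new double[numBins];
        double[] pClass = new double[numClasses];
        for (int i = 0; i < n; i++) {
            joint[featureBins[i]][classLabels[i]] += 1.0 / n;
            pFeature[featureBins[i]] += 1.0 / n;
            pClass[classLabels[i]] += 1.0 / n;
        }
        double mi = 0.0;
        for (int b = 0; b < numBins; b++) {
            for (int c = 0; c < numClasses; c++) {
                if (joint[b][c] > 0.0) {
                    mi += joint[b][c] * Math.log(joint[b][c] / (pFeature[b] * pClass[c]));
                }
            }
        }
        return mi;   // in nats; used as the weight of this feature
    }
}
```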
GOCBR
H. Ahn and K. Kim, “Bankruptcy prediction modeling with hybrid case-based reasoning and genetic algorithms approach”, Applied Soft Computing, vol. 9, pp. 599–607, 2009.
A genetic algorithm designed to perform simultaneous Instance Selection and Feature Weighting within the same chromosome. Weights are represented by binary chains, thus preserving the binary codification of the chromosomes. It has been applied successfully in several real-world applications.
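To illustrate this binary codification of weights, a small, hypothetical decoder that maps a fixed-length bit chain to a real weight in [0, 1]:

```java
// Illustrative decoding of a binary weight chain (as used by GOCBR-style chromosomes):
// the bits are read as an unsigned integer and scaled to the interval [0, 1].
public class BinaryWeightDecoder {

    public static double decode(boolean[] bits) {
        int value = 0;
        for (boolean bit : bits) {
            value = (value << 1) | (bit ? 1 : 0);
        }
        int maxValue = (1 << bits.length) - 1;
        return (double) value / maxValue;
    }
}
```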
The configuration parameters for all the algorithms of this study are shown in Table 3. We have selected common values for all problems, relying on the parameter values recommended by the respective authors of each method, under the assumption that these recommended values are a near-optimal choice.
Table 3. Parameter specification for the algorithms employed in the experimentation.
Algorithm | Parameters |
---|---|
CIW-NN | Evaluations: 10000, Population size (IS, FW and IW): 50, α: 0.5, prob0to1: 0.25, prob1: 0.25, Crossover operator: 2BLX0.3-4BLX0.5-2BLX0.7, Mutation probability: 0.05 per chromosome, Epoch length (FW population): 40 evaluations, Epoch length (IW population): 40 evaluations, γ: 0.8 |
IS-CHC | Evaluations: 10000, Population size: 50, α: 0.5, prob0to1: 0.25, prob1: 0.25 |
FW-SSGA | Evaluations: 10000, Population size: 50, Crossover operator: 2BLX0.3-4BLX0.5-2BLX0.7, Mutation probability: 0.05 per chromosome |
IW-SSGA | Evaluations: 10000, Population size: 50, Crossover operator: 2BLX0.3-4BLX0.5-2BLX0.7, Mutation probability: 0.05 per chromosome |
SSMA | Evaluations: 10000, Population size: 30, Crossover probability: 0.5 per bit, Mutation probability: 0.001 per bit |
PW | β: Best in [0.125,128], ρ: Best in [0.1,0.001], ε: 0.001, Max iterations: 1000 |
CW | β: Best in [0.125,128], μ: Best in [0.1,0.001], ε: 0.001, Max iterations: 1000 |
CPW | β: Best in [0.125,128], μ: Best in [0.1,0.001], ρ: Best in [0.1,0.001], ε: 0.001, Max iterations: 1000 |
WDNN | Maximum number of iterations: 10 |
TS/K-NN | Evaluations: 10000, M: 10, N: 2, P: ceil(Sqrt(#Features)) |
ReliefF | K value for contributions: Best in [1,20] |
MI | It has no parameters to be fixed |
GOCBR | Evaluations: 10000, Population size: 100, Crossover probability: 0.7, Mutation Probability: 0.1 |
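Regarding the crossover operator 2BLX0.3-4BLX0.5-2BLX0.7 listed above, we read it as generating offspring with the BLX-α operator for α = 0.3, 0.5 and 0.7 (two, four and two offspring, respectively). A hedged sketch of a single BLX-α crossover over real-coded weight chromosomes follows; the clipping to [0, 1] and the names used are our assumptions.

```java
import java.util.Random;

// Hedged sketch of the BLX-alpha crossover for real-coded chromosomes: each offspring
// gene is drawn uniformly from the interval spanned by the parents' genes, enlarged on
// both sides by alpha times its length. Weights are assumed to live in [0, 1].
public class BlxAlphaCrossover {

    public static double[] cross(double[] parent1, double[] parent2, double alpha, Random rnd) {
        double[] child = new double[parent1.length];
        for (int j = 0; j < parent1.length; j++) {
            double cMin = Math.min(parent1[j], parent2[j]);
            double cMax = Math.max(parent1[j], parent2[j]);
            double range = cMax - cMin;
            double low = cMin - alpha * range;
            double high = cMax + alpha * range;
            double gene = low + rnd.nextDouble() * (high - low);
            child[j] = Math.min(1.0, Math.max(0.0, gene));   // keep weights in [0, 1]
        }
        return child;
    }
}
```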
Java implementation
Below we offer the Java source code implementing the CIW-NN model, ready to use with any of the data sets considered in the experimental framework. This implementation is fully compliant with the standards of the KEEL project, and will be considered for inclusion in the KEEL Software Tool.
This implementation of CIW-NN can be executed on a standard Java Virtual Machine (version 1.5 or higher). To do so, it is only necessary to write a configuration script which contains all the relevant parameters of the algorithm (see the KEEL reference manual, Section 3, located under the KEEL Software Tool item of the main menu).
Essentially, to run CIW-NN, the following steps must be performed:
- Unzip the CIW-NN-source.zip file
- Execute CIW-NN with the command java -jar CIW_NN.jar script.txt
- The results of the classification process will be found in the current folder.