Integrating Instance Selection, Instance Weighting and Feature Weighting for Nearest Neighbor Classifiers by Co-evolutionary Algorithms - Complementary Material
This Website contains complementary material to the paper:
J. Derrac, I.Triguero, S. García and F.Herrera, Integrating Instance Selection, Instance Weighting and Feature Weighting for Nearest Neighbor Classifiers by Co-evolutionary Algorithms. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 42:5 (2012) 1383-1397, doi: 10.1109/TSMCB.2012.2191953
This website is organized according to the following summary:
Experimental framework
Data-sets partitions employed in the paper
For the experimental study, we have selected thirty small data sets and eight large data sets from KEEL-datasets. In all the experiments, we have adopted a 10-fold cross-validation model: each data set is split randomly into 10 folds, each containing 10% of its patterns. Nine folds are used for training and the remaining one for test. The results reported have been obtained through 5 rounds of the complete 10-fold cross-validation per data set (i.e., five runs for each possible test partition).
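As a quick illustration of this validation scheme, the sketch below (not the KEEL partitioning code; the partitions actually used in the paper are the pre-generated ones linked below) deals the instance indices of a data set into 10 random folds, each of which acts once as test set while the other nine form the training set. Class and method names are ours.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Illustrative sketch only: shuffle the n instance indices and deal them into 10 folds.
// Repeating the procedure with 5 different seeds gives the 5 rounds used in the paper.
public class TenFoldSketch {

    public static List<List<Integer>> tenFolds(int n, long seed) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, new Random(seed));

        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < 10; f++) {
            folds.add(new ArrayList<Integer>());
        }
        for (int i = 0; i < n; i++) {
            folds.get(i % 10).add(indices.get(i));
        }
        return folds;
    }
}
```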
Tables 1 and 2 summarize the properties of the selected data sets. They show, for each data set, the number of examples (#Ex.), the number of attributes (#Atts.), and the number of classes (#Cl.). The last column contains a link for downloading the 10-fold cross-validation partitions of each data set in KEEL format. All data sets may also be downloaded at once by clicking here.
Table 1. Summary Description for Small data sets
Table 2. Summary Description for Large data sets
Algorithms and parameters
Several classification methods, evolutionary and non-evolutionary, have been selected to perform an exhaustive study of the capabilities of CIW-NN. We briefly describe them here:
1-NN
The 1-NN rule is used as a performance baseline that the rest of the methods are expected to improve upon.
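For reference, a minimal, illustrative Java sketch of the 1-NN rule with Euclidean distance (not the exact implementation used in the experiments):

```java
// Minimal 1-NN sketch: the query is assigned the class of its closest training
// instance under (squared) Euclidean distance.
public class OneNN {

    public static int classify(double[][] train, int[] labels, double[] query) {
        int nearest = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < train.length; i++) {
            double dist = 0.0;
            for (int j = 0; j < query.length; j++) {
                double diff = train[i][j] - query[j];
                dist += diff * diff;            // squared distance preserves the ranking
            }
            if (dist < bestDist) {
                bestDist = dist;
                nearest = i;
            }
        }
        return labels[nearest];
    }
}
```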
IS-CHC, FW-SSGA and IW-SSGA
These methods follow exactly the same setup as the corresponding populations of CIW-NN, except that the Accuracy Rate component of their fitness function is computed separately.
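For the IS population (IS-CHC), this fitness typically balances accuracy and reduction through the α parameter of Table 3. A minimal sketch, assuming the usual formulation fitness = α · accuracy + (1 − α) · reduction (the method and variable names are ours):

```java
// Hedged sketch of an alpha-weighted IS fitness: a trade-off between the accuracy
// rate obtained with the selected subset and the reduction rate achieved.
public class ISFitnessSketch {

    public static double fitness(double accuracyRate, int selectedInstances,
                                 int totalInstances, double alpha) {
        double reductionRate = 1.0 - ((double) selectedInstances / totalInstances);
        return alpha * accuracyRate + (1.0 - alpha) * reductionRate;
    }
}
```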
SSMA
S. García, J.R. Cano, F. Herrera, “A Memetic Algorithm for Evolutionary Prototype Selection: A Scaling Up Approach”, Pattern Recognition, vol. 41, no. 8, pp. 2693-2709, 2008.
A Steady-State Memetic Algorithm specifically designed for prototype selection. This evolutionary IS method includes a meme optimization mechanism (a local search procedure) able to improve the accuracy achieved by the SSGA and to avoid premature convergence. Moreover, it offers a high reduction capability and a good behaviour when tackling large problems.
PW, CW and CPW
R. Paredes and E. Vidal, “Learning weighted metrics to minimize nearest-neighbor classification error”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 7, pp. 1100–1110, 2006.
Three gradient descent based algorithms developed with the aim of minimizing a performance index that approximates the leave-one-out error over the training set. Weights may be specified for each instance (Prototype Weighting, PW), for each combination of feature and class (Class Weighting, CW), or both (Class and Prototype Weighting, CPW).
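As an illustration of the class-weighting scheme (CW), the sketch below computes a weighted Euclidean distance where each weight depends on the class of the stored prototype and on the feature index; PW would instead use one weight per prototype, and CPW combines both. The code and its names are ours, not the authors' implementation.

```java
// Illustrative class-weighted (CW) distance: w[c][j] is the weight of feature j
// when comparing the query against a prototype of class c.
public class ClassWeightedDistance {

    public static double distance(double[] query, double[] prototype,
                                  int prototypeClass, double[][] w) {
        double sum = 0.0;
        for (int j = 0; j < query.length; j++) {
            double diff = w[prototypeClass][j] * (query[j] - prototype[j]);
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }
}
```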
WDNN
M. Z. Jahromi, E. Parvinnia, and R. John, "A method of learning weighted similarity function to improve the performance of nearest neighbor", Information Sciences, vol. 179, no. 17, pp. 2964–2973, 2009.
A novel IW method which iteratively searches, for each training instance, for the best weight that minimizes the leave-one-out error over the training set. A weight of 0.0 can be assigned to any instance, which means that it is discarded from the data set. Thus, this method can be regarded as a simultaneous IS and IW method.
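The sketch below illustrates, under our own simplified formulation, how such instance weights can act in nearest neighbor classification: the distance to each prototype is scaled by the inverse of its weight, so a weight of 0.0 effectively removes the instance.

```java
// Illustrative instance-weighted 1-NN in the spirit of WDNN (simplified formulation):
// the distance to prototype i is divided by w[i]; instances with weight 0.0 are skipped.
public class InstanceWeightedNN {

    public static int classify(double[][] train, int[] labels, double[] w, double[] query) {
        int nearest = -1;
        double bestScore = Double.POSITIVE_INFINITY;
        for (int i = 0; i < train.length; i++) {
            if (w[i] <= 0.0) {
                continue;                           // weight 0.0: instance discarded
            }
            double dist = 0.0;
            for (int j = 0; j < query.length; j++) {
                double diff = train[i][j] - query[j];
                dist += diff * diff;
            }
            double score = Math.sqrt(dist) / w[i];  // lower weight -> larger effective distance
            if (score < bestScore) {
                bestScore = score;
                nearest = i;
            }
        }
        return nearest < 0 ? -1 : labels[nearest];
    }
}
```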
TS/KNN
M. A. Tahir, A. Bouridane, and F. Kurugollu, "Simultaneous feature selection and feature weighting using hybrid tabu search/k-nearest neighbor classifier", Pattern Recognition Letters, vol. 28, no. 4, pp. 438–446, 2007.
A Tabu Search based method for simultaneous Feature Selection and Feature Weighting, which encodes in its solutions the current set of selected features (binary codification), the current set of weights assigned to the features, and the best value of k found for the k-NN classifier. Furthermore, this method uses the Fuzzy k-NN classifier to avoid ties in the classification process.
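A possible, purely illustrative encoding of a TS/K-NN solution as described above (the field names are ours):

```java
// Illustrative TS/K-NN solution encoding: a binary feature mask, a real-valued
// weight per feature, and the value of k used by the (fuzzy) k-NN classifier.
public class TsKnnSolution {
    boolean[] selectedFeatures;   // binary codification of the selected features
    double[] featureWeights;      // weights assigned to the features
    int k;                        // best k found for the k-NN classifier
}
```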
ReliefF
I. Kononenko, "Estimating attributes: Analysis and extensions of RELIEF", Proceedings of the European Conference on Machine Learning (ECML'94), Lecture Notes in Computer Science, vol. 784, pp. 171–182, 1994.
The first Relief-based method adapted to perform FW. The weights computed by the original Relief algorithm are not binarized to {0,1}; instead, they are employed directly as the final weights of the k-NN classifier.
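The sketch below contrasts the two uses of Relief weights mentioned above: binarizing them against a threshold (feature selection) versus plugging the real-valued weights directly into a weighted distance (the FW variant used here). The threshold and the names are illustrative.

```java
// Illustrative contrast: binarized Relief weights (feature selection) vs. real-valued
// weights used directly inside the k-NN distance (feature weighting).
public class ReliefWeightUsage {

    public static double[] binarize(double[] reliefWeights, double threshold) {
        double[] mask = new double[reliefWeights.length];
        for (int j = 0; j < reliefWeights.length; j++) {
            mask[j] = reliefWeights[j] >= threshold ? 1.0 : 0.0;
        }
        return mask;
    }

    public static double weightedDistance(double[] a, double[] b, double[] weights) {
        double sum = 0.0;
        for (int j = 0; j < a.length; j++) {
            double diff = weights[j] * (a[j] - b[j]);
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }
}
```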
MI
D. Wettschereck, D. W. Aha, and T. Mohri, "A review and empirical evaluation of feature weighting methods for a class of lazy learning algorithms", Artificial Intelligence Review, vol. 11, pp. 273–314, 1997.
Mutual Information between each feature and the class can be used successfully as a weighting factor for k-NN based algorithms. In our implementation, we have followed the indications given by Wettschereck et al., using Fayyad and Irani's discretizer to allow this method to handle continuous-valued problems.
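A hedged sketch of the weighting step follows: assuming the features have already been discretized (the discretization step itself is not shown), the mutual information between a feature and the class is estimated from frequency counts and used directly as that feature's weight.

```java
// Hedged sketch: estimate I(feature; class) from a discretized feature and use it as
// the feature's weight. The prior Fayyad-Irani discretization is assumed to be done.
public class MutualInformationWeight {

    // featureBins[i] and classLabels[i] hold the discretized value and class of instance i.
    public static double mutualInformation(int[] featureBins, int[] classLabels,
                                           int numBins, int numClasses) {
        int n = featureBins.length;
        double[][] joint = new double[numBins][numClasses];
        double[] pFeature = new double[numBins];
        double[] pClass = new double[numClasses];
        for (int i = 0; i < n; i++) {
            joint[featureBins[i]][classLabels[i]] += 1.0 / n;
            pFeature[featureBins[i]] += 1.0 / n;
            pClass[classLabels[i]] += 1.0 / n;
        }
        double mi = 0.0;
        for (int b = 0; b < numBins; b++) {
            for (int c = 0; c < numClasses; c++) {
                if (joint[b][c] > 0.0) {
                    mi += joint[b][c] * Math.log(joint[b][c] / (pFeature[b] * pClass[c]));
                }
            }
        }
        return mi;   // in nats; used as the weight of this feature
    }
}
```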
GOCBR
H. Ahn and K. Kim, “Bankruptcy prediction modeling with hybrid case-based reasoning and genetic algorithms approach”, Applied Soft Computing, vol. 9, pp. 599–607, 2009.
A genetic algorithm designed to perform simultaneous Instance Selection and Feature Weighting within the same chromosome. Weights are represented by binary chains, thus preserving the binary codification of the chromosomes. It has been applied successfully in several real-world applications.
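To illustrate this binary codification of weights, a small, hypothetical decoder that maps a fixed-length bit chain to a real weight in [0, 1]:

```java
// Illustrative decoding of a binary weight chain (as used by GOCBR-style chromosomes):
// the bits are read as an unsigned integer and scaled to the interval [0, 1].
public class BinaryWeightDecoder {

    public static double decode(boolean[] bits) {
        int value = 0;
        for (boolean bit : bits) {
            value = (value << 1) | (bit ? 1 : 0);
        }
        int maxValue = (1 << bits.length) - 1;
        return (double) value / maxValue;
    }
}
```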
The configuration parameters for all the algorithms of this study are shown in Table 3. We have selected common values for all problems, relying on the parameter values recommended by the respective authors of each method, under the assumption that these recommended values are a near-optimal choice.
Table 3. Parameter specification for the algorithms employed in the experimentation.
Algorithm | Parameters |
---|---|
CIW-NN | Evaluations: 10000, Population size (IS, FW and IW): 50, α: 0.5, prob0to1: 0.25, prob1: 0.25, Crossover operator: 2BLX0.3-4BLX0.5-2BLX0.7, Mutation probability: 0.05 per chromosome, Epoch length (FW population): 40 evaluations, Epoch length (IW population): 40 evaluations, γ: 0.8 |
IS-CHC | Evaluations: 10000, Population size: 50, α: 0.5, prob0to1: 0.25, prob1: 0.25 |
FW-SSGA | Evaluations: 10000, Population size: 50, Crossover operator: 2BLX0.3-4BLX0.5-2BLX0.7, Mutation probability: 0.05 per chromosome |
IW-SSGA | Evaluations: 10000, Population size: 50, Crossover operator: 2BLX0.3-4BLX0.5-2BLX0.7, Mutation probability: 0.05 per chromosome |
SSMA | Evaluations: 10000, Population size: 30, Crossover probability: 0.5 per bit, Mutation probability: 0.001 per bit |
PW | β: Best in [0.125,128], ρ: Best in [0.1,0.001], ε: 0.001, Max iterations: 1000 |
CW | β: Best in [0.125,128], μ: Best in [0.1,0.001], ε: 0.001, Max iterations: 1000 |
CPW | β: Best in [0.125,128], μ: Best in [0.1,0.001], ρ: Best in [0.1,0.001], ε: 0.001, Max iterations: 1000 |
WDNN | Maximum number of iterations: 10 |
TS/K-NN | Evaluations: 10000, M: 10, N: 2, P: ceil(Sqrt(#Features)) |
ReliefF | K value for contributions: Best in [1,20] |
MI | It has no parameters to be fixed |
GOCBR | Evaluations: 10000, Population size: 100, Crossover probability: 0.7, Mutation Probability: 0.1 |
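Regarding the crossover operator 2BLX0.3-4BLX0.5-2BLX0.7 listed above, we read it as generating offspring with the BLX-α operator for α = 0.3, 0.5 and 0.7 (two, four and two offspring, respectively). A hedged sketch of a single BLX-α crossover over real-coded weight chromosomes follows; the clipping to [0, 1] and the names used are our assumptions.

```java
import java.util.Random;

// Hedged sketch of the BLX-alpha crossover for real-coded chromosomes: each offspring
// gene is drawn uniformly from the interval spanned by the parents' genes, enlarged on
// both sides by alpha times its length. Weights are assumed to live in [0, 1].
public class BlxAlphaCrossover {

    public static double[] cross(double[] parent1, double[] parent2, double alpha, Random rnd) {
        double[] child = new double[parent1.length];
        for (int j = 0; j < parent1.length; j++) {
            double cMin = Math.min(parent1[j], parent2[j]);
            double cMax = Math.max(parent1[j], parent2[j]);
            double range = cMax - cMin;
            double low = cMin - alpha * range;
            double high = cMax + alpha * range;
            double gene = low + rnd.nextDouble() * (high - low);
            child[j] = Math.min(1.0, Math.max(0.0, gene));   // keep weights in [0, 1]
        }
        return child;
    }
}
```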
Java implementation
Below we offer the Java source code implementing the CIW-NN model, ready to use with any of the data sets considered in the experimental framework. This implementation is fully compliant with the standards of the KEEL project, and will be considered for inclusion in the KEEL Software Tool.
This implementation of CIW-NN can be executed on a standard Java Virtual Machine (version 1.5 or higher). To do so, it is only necessary to write a configuration script which contains all the relevant parameters of the algorithm (see the KEEL reference manual, Section 3, located under the KEEL Software Tool item of the main menu).
Essentially, to run CIW-NN, the following steps must be performed:
- Unzip the CIW-NN-source.zip file
- Execute CIW-NN with the command java -jar CIW_NN.jar script.txt
- The results of the classification process will be found in the current folder.