IPADE: Iterative Prototype Adjustment for Nearest Neighbor Classification - Complementary Material
I. Triguero, S. García, and F. Herrera, IPADE: Iterative Prototype Adjustment for Nearest Neighbor Classification.
- Abstract
- Experimental Framework
- Statistical Tests used
- Results obtained
- JAVA code for the methods used
Abstract
Nearest prototype methods have proven successful in many pattern classification tasks. However, they present several shortcomings, such as response time, noise sensitivity, and storage requirements. Data reduction techniques are suitable to alleviate these drawbacks. Prototype generation is an appropriate process for data reduction, fitting a reduced set of prototypes to a data set for nearest neighbor classification.
This concise paper presents a methodology to iteratively learn the positioning of prototypes using real-parameter optimization procedures. Specifically, we propose an iterative prototype adjustment technique based on differential evolution (IPADE). The results obtained are contrasted with non-parametric statistical tests and show that our proposal consistently outperforms previously proposed methods, making it a suitable tool for enhancing the performance of the nearest neighbor classifier.
Experimental framework
Data-sets partitions employed in the paper
For the experimental study, we have selected fifty data sets from the KEEL-datasets repository. In all the experiments, we have adopted a 10-fold cross-validation model, i.e., we have split each data set randomly into 10 folds, each containing 10% of the patterns of the data set. Thus, nine folds have been used for training and one for testing.
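The fold construction just described can be sketched as follows (a minimal Java illustration; it does not reproduce the partitioning code actually shipped with KEEL, and the class name is ours):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class CrossValidation {
    /** Splits pattern indices 0..n-1 into numFolds random folds of (nearly) equal size.
     *  Fold f serves as the test set while the remaining folds form the training set. */
    public static List<List<Integer>> makeFolds(int n, int numFolds, long seed) {
        List<Integer> idx = new ArrayList<>();
        for (int i = 0; i < n; i++) idx.add(i);
        Collections.shuffle(idx, new Random(seed));       // random assignment of patterns
        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < numFolds; f++) folds.add(new ArrayList<>());
        for (int i = 0; i < n; i++)
            folds.get(i % numFolds).add(idx.get(i));      // deal indices round-robin
        return folds;
    }
}
```

With n = 100 patterns and 10 folds, each fold holds exactly 10% of the data, as in the experimental setup above.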
Table 1 summarizes the properties of the selected data sets. It shows, for each data set, the number of examples (#Ex.), the number of attributes (#Atts.), and the number of classes (#Cl.). The last column of this table contains a link for downloading the 10-fold cross-validation partitions for each data set in KEEL format. All data sets may be downloaded by clicking here.
Table 1. Summary Description for the data sets used
Algorithms and parameters
Several prototype generation methods have been selected to perform an exhaustive study of the capabilities of IPADE. We briefly describe them here:
1-NN
The 1-NN rule is used as a baseline, a performance limit that most of the remaining methods are expected to overcome.
RSP3
J. S. Sanchez, ''High training set size reduction by space partitioning and prototype abstraction,'' Pattern Recognition, vol. 37, no. 7, pp. 1561--1564, 2004.
This technique is based on Chen's algorithm, and it tries to avoid the drastic changes in the form of the decision boundaries associated with the training set, which are the main shortcoming of Chen's algorithm.
MixtGauss
M. Lozano, J. M. Sotoca, J. S. Sanchez, F. Pla, E. Pekalska, and R. P. W. Duin, ''Experimental study on prototype optimisation algorithms for prototype-based classification in vector spaces,'' Pattern Recognition, vol. 39, no. 10, pp. 1827--1838, 2006.
It is an adaptive PG method framed as mixture modeling by Gaussian distributions, assuming statistical independence of features.
ENPC
F. Fernandez and P. Isasi, ''Evolutionary design of nearest prototype classifiers,'' Journal of Heuristics, vol. 10, no. 4, pp. 431--454, 2004.
Genetic algorithms are usually very dependent on several crucial parameters that have to be found by a trial-and-error process. One of the most important characteristics of this method is its lack of initial conditions: ENPC does not need any parameter.
Initially, this method selects a random prototype from TR and introduces it into GS. Then, until the stopping condition is achieved, five different operators are applied to GS. These operators focus their attention on defining regions in the search space.
First of all, ENPC obtains information about the search, i.e., it checks the quality of GS with regard to TR using different measures based on the NN rule.
Then, GS undergoes a mutation phase in which prototypes are re-labeled with the most populated class in each region of the search space. In order to refine GS, a reproduction operator is introduced to generate new prototypes. Each prototype p from GS has the opportunity of introducing a new prototype to increase its own quality, with the objective of defining different regions. Next, a fight operator is able to refine the regions. This step does not modify the prototypes of GS; it is only used to redefine the different groups that can be found in GS. When the different regions have been established, ENPC executes the move operator, which consists of relocating each prototype to the best expected place, i.e., the centroid of its region. Finally, a die operator is used to remove from GS those prototypes that are not relevant.
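As an illustration of the move operator described above, the following Java sketch relocates each prototype to the centroid of its region, i.e., of the training patterns for which it is the nearest prototype (class and method names are ours, not those of the original ENPC implementation):

```java
public class MoveOperator {
    /** Relocates each prototype to the centroid of the training patterns
     *  whose nearest prototype it is (its region). Prototypes with an
     *  empty region are left unchanged. */
    public static void move(double[][] prototypes, double[][] train) {
        int d = train[0].length;
        double[][] sums = new double[prototypes.length][d];
        int[] counts = new int[prototypes.length];
        for (double[] x : train) {
            int nearest = 0;
            double best = Double.MAX_VALUE;
            for (int p = 0; p < prototypes.length; p++) {
                double dist = 0.0;                    // squared Euclidean distance
                for (int j = 0; j < d; j++) {
                    double diff = x[j] - prototypes[p][j];
                    dist += diff * diff;
                }
                if (dist < best) { best = dist; nearest = p; }
            }
            counts[nearest]++;
            for (int j = 0; j < d; j++) sums[nearest][j] += x[j];
        }
        for (int p = 0; p < prototypes.length; p++)
            if (counts[p] > 0)
                for (int j = 0; j < d; j++)
                    prototypes[p][j] = sums[p][j] / counts[p];
    }
}
```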
AMPSO
A. Cervantes, I. M. Galvan, and P. Isasi, ''AMPSO: A new particle swarm method for nearest neighborhood classification,'' IEEE Trans- actions on Systems, Man, and Cybernetics - Part B: Cybernetics, vol. 39, no. 5, pp. 1082--1091, 2009
AMPSO is a PSO algorithm with a particular way of encoding the solution into the system. AMPSO builds the population from the prototypes of the solution, i.e., each particle is a prototype. Next, PSO rules guide the optimization of the positioning. At each iteration, the position of each prototype is updated with the movement equations, which are neighborhood-based.
This method has a particular way of understanding the neighborhood. Specifically, for each class two different sets of prototypes are established, the competing and non-competing sets, which are defined as follows:
For each prototype of class Ci, non-competing prototypes are all the prototypes of classes Cj != Ci that currently classify at least one pattern of TR of class Ci.
For each prototype of class Ci, competing prototypes are all the prototypes of class Ci that currently classify at least one pattern of TR of class Ci.
The position of each prototype is attracted by the closest non-competing prototype, and repelled by the closest competing prototype. Other parameters also determine the movement.
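The two sets above can be computed directly from the current nearest-prototype assignment. The following Java sketch obtains the indices of the non-competing prototypes for a given class (a minimal illustration with Euclidean distance; names are ours, not those of the AMPSO authors):

```java
import java.util.HashSet;
import java.util.Set;

public class AmpsoSets {
    /** Indices of non-competing prototypes for prototypes of class c:
     *  prototypes of a different class that currently classify (are nearest to)
     *  at least one training pattern of class c. */
    public static Set<Integer> nonCompeting(double[][] protos, int[] protoClass,
                                            double[][] train, int[] trainClass, int c) {
        Set<Integer> result = new HashSet<>();
        for (int i = 0; i < train.length; i++) {
            if (trainClass[i] != c) continue;          // only patterns of class c matter
            int nearest = 0;
            double best = Double.MAX_VALUE;
            for (int p = 0; p < protos.length; p++) {
                double dist = 0.0;                     // squared Euclidean distance
                for (int j = 0; j < train[i].length; j++) {
                    double diff = train[i][j] - protos[p][j];
                    dist += diff * diff;
                }
                if (dist < best) { best = dist; nearest = p; }
            }
            if (protoClass[nearest] != c)              // a wrong-class prototype wins
                result.add(nearest);
        }
        return result;
    }
}
```

The competing set is obtained analogously by keeping the prototypes whose class equals c instead.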
LVQPRU
J. Li, M. T. Manry, C. Yu, and D. R. Wilson, ''Prototype classifier design with pruning.'' International Journal on Artificial Intelligence Tools, vol. 14, no. 1-2, pp. 261--280, 2005.
LVQPRU extends LVQ by using a pruning step to remove noisy instances. Concretely, it uses LVQ2.1 in order to refine the positioning of the prototypes, and it prunes those prototypes whose removal causes the least increase in classification error.
HYB
S.-W. Kim and B. J. Oommen, ''Enhancing prototype reduction schemes with LVQ3-type algorithms,'' Pattern Recognition, vol. 36, no. 5, pp. 1083--1093, 2003.
It constitutes a hybridization of several prototype reduction techniques. Concretely, HYB combines Support Vector Machines with LVQ3 and executes a search in order to find the most appropriate parameters of LVQ3.
LVQ3
T. Kohonen, ''The self-organizing map,'' Proceedings of the IEEE, vol. 78, no. 9, pp. 1464--1480, 1990.
Learning Vector Quantization can be understood as a special case of artificial neural network in which each neuron corresponds to a prototype, and a weight-based competition is carried out in order to locate each neuron at a concrete place of the m-dimensional space and increase the classification accuracy.
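As an illustration of this competition, the following Java sketch performs one LVQ3 update for a single training pattern, using the two nearest prototypes and the usual window test (variable names and the squared-distance formulation are ours; α is the learning rate, ε the within-class rate factor, w the window width):

```java
public class Lvq3Step {
    /** One LVQ3 update for pattern x with label y. Assumes at least two prototypes. */
    public static void update(double[][] protos, int[] labels,
                              double[] x, int y,
                              double alpha, double epsilon, double w) {
        int a = -1, b = -1;                                 // two nearest prototypes
        double da = Double.MAX_VALUE, db = Double.MAX_VALUE;
        for (int p = 0; p < protos.length; p++) {
            double dist = 0.0;                              // squared Euclidean distance
            for (int j = 0; j < x.length; j++) {
                double diff = x[j] - protos[p][j];
                dist += diff * diff;
            }
            if (dist < da)      { db = da; b = a; da = dist; a = p; }
            else if (dist < db) { db = dist; b = p; }
        }
        double s = (1.0 - w) / (1.0 + w);                   // window threshold
        // Squared distances, so compare the ratio against s^2.
        if (Math.min(da / db, db / da) <= s * s) return;    // x outside the window
        if (labels[a] == y && labels[b] == y) {             // both correct: small attraction
            for (int j = 0; j < x.length; j++) {
                protos[a][j] += epsilon * alpha * (x[j] - protos[a][j]);
                protos[b][j] += epsilon * alpha * (x[j] - protos[b][j]);
            }
        } else if ((labels[a] == y) ^ (labels[b] == y)) {   // exactly one correct
            int good = labels[a] == y ? a : b;
            int bad  = labels[a] == y ? b : a;
            for (int j = 0; j < x.length; j++) {
                protos[good][j] += alpha * (x[j] - protos[good][j]);
                protos[bad][j]  -= alpha * (x[j] - protos[bad][j]);
            }
        }
    }
}
```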
The configuration parameters for all the algorithms of this study are shown in Table 2. We have selected common values for all problems, focusing this experimentation on the parameter values recommended by their respective authors, under the assumption that those values were chosen close to optimally.
Table 2. Parameter specification for the algorithms employed in the experimentation.
Algorithm | Parameters |
---|---|
IPADE | Iterations of basic DE = 300/500/1000, iterSFGSS = 8, iterSFHC = 20, Fl = 0.1, Fu = 0.9 |
RSP3 | Subset Choice = Diameter |
MixtGauss | Reduction Rate = 0.95 |
ENPC | Iterations = 300/500/1000 |
AMPSO | Iterations = 300/500/1000, C1 = 1.0, C2 = 1.0, C3 = 0.25, Vmax = 1, W = 0.1, X = 0.5, Pr = 0.1, Pd = 0.1 |
LVQPRU | Iterations = 300/500/1000, α = 0.1, WindowWidth = 0.5 |
HYB | Search Iterations = 300/500/1000, Optimal search Iterations = 1000, α = 0.1, Initial ε = 0, Final ε = 0.5, Initial WindowWidth = 0, Final WindowWidth = 0.5, δ = 0.1, delta WindowWidth = 0.1, Initial Selection = SVM |
LVQ3 | Iterations = 300/500/1000, α = 0.1, WindowWidth = 0.2, ε = 0.1 |
Statistical tests used
In this paper, we use hypothesis testing techniques to provide statistical support for the analysis of the results. Specifically, we use non-parametric tests, because the initial conditions that guarantee the reliability of parametric tests may not be satisfied, causing the statistical analysis to lose credibility if parametric tests were used. The use of these non-parametric tests is recommended in the field of machine learning.
In order to perform multiple comparisons between our proposal and the rest of the techniques considered, we use the Friedman Aligned Ranks test to detect statistical differences among a group of results, and the Holm post-hoc test to find out which algorithms are distinctive among the 1×n comparisons performed.
The Friedman test is based on n sets of ranks, one set for each data set in our case; and the performances of the algorithms analyzed are ranked separately for each data set. Such a ranking scheme allows for intra-set comparisons only, since inter-set comparisons are not meaningful. When the number of algorithms for comparison is small, this may pose a disadvantage. In such cases, comparability among data sets is desirable and we can employ the method of aligned ranks.
In this technique, a value of location is computed as the average performance achieved by all algorithms on each data set. Then, the difference between the performance obtained by an algorithm and this value of location is calculated. This step is repeated for each combination of algorithm and data set. The resulting differences, called aligned observations, which keep their identities with respect to the data set and the algorithm to which they belong, are then ranked from 1 to kn relative to each other. The ranking scheme is then the same as that employed by a multiple comparison procedure over independent samples, such as the Kruskal-Wallis test. The ranks assigned to the aligned observations are called aligned ranks.
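The aligned-rank computation described above can be sketched as follows (a minimal Java illustration; tie handling, which would average the ranks of tied aligned observations, is omitted for brevity, and the class name is ours):

```java
import java.util.Arrays;

public class AlignedRanks {
    /** Returns the aligned ranks (1..k*n) where perf[i][j] is the performance of
     *  algorithm j on data set i. The largest aligned observation gets rank 1,
     *  so smaller ranks indicate better relative performance. */
    public static double[][] compute(double[][] perf) {
        int n = perf.length, k = perf[0].length;
        double[] aligned = new double[n * k];
        for (int i = 0; i < n; i++) {
            double loc = 0.0;                       // value of location: row average
            for (int j = 0; j < k; j++) loc += perf[i][j];
            loc /= k;
            for (int j = 0; j < k; j++)
                aligned[i * k + j] = perf[i][j] - loc;  // aligned observation
        }
        double[] sorted = aligned.clone();
        Arrays.sort(sorted);                        // ascending aligned observations
        double[][] ranks = new double[n][k];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < k; j++) {
                int pos = Arrays.binarySearch(sorted, aligned[i * k + j]);
                ranks[i][j] = n * k - pos;          // largest value -> rank 1
            }
        return ranks;
    }
}
```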
The Friedman aligned ranks test statistic can be written as:
$$T=\frac{(k-1)\left[\sum_{j=1}^{k}\widehat{R}_{.j}^{2}-(kn^{2}/4)(kn+1)^{2}\right]}{\{[kn(kn+1)(2kn+1)]/6\}-(1/k)\sum_{i=1}^{n}\widehat{R}_{i.}^{2}}$$
where $\widehat{R}_{i.}$ is the rank total of the ith data set and $\widehat{R}_{.j}$ is the rank total of the jth algorithm.
The test statistic T is compared for significance with a chi-square distribution with k - 1 degrees of freedom. Critical values can be found in Table A3 of (Sheskin, D.J.: Handbook of Parametric and Nonparametric Statistical Procedures. CRC Press, Boca Raton (2006)). Furthermore, the p-value can be computed through normal approximations (Abramowitz, M.: Handbook of Mathematical Functions, With Formulas, Graphs, and Mathematical Tables. Dover, New York (1974)). If the null hypothesis is rejected, we can proceed with a post hoc test.
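Given an n x k matrix of aligned ranks, the statistic T can be evaluated directly from the formula above (a minimal Java sketch; class and method names are ours):

```java
public class FriedmanAligned {
    /** Friedman Aligned Ranks statistic T for an n x k matrix of aligned ranks
     *  (n data sets, k algorithms). */
    public static double statistic(double[][] ranks) {
        int n = ranks.length, k = ranks[0].length;
        double[] colTotals = new double[k];   // rank total of each algorithm, R_.j
        double[] rowTotals = new double[n];   // rank total of each data set, R_i.
        for (int i = 0; i < n; i++)
            for (int j = 0; j < k; j++) {
                colTotals[j] += ranks[i][j];
                rowTotals[i] += ranks[i][j];
            }
        double sumCol2 = 0.0, sumRow2 = 0.0;
        for (double c : colTotals) sumCol2 += c * c;
        for (double r : rowTotals) sumRow2 += r * r;
        double kn = (double) k * n;
        double num = (k - 1) * (sumCol2 - (k * n * n / 4.0) * (kn + 1) * (kn + 1));
        double den = kn * (kn + 1) * (2 * kn + 1) / 6.0 - sumRow2 / k;
        return num / den;
    }
}
```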
We focus on the comparison between a control method, which is usually the proposed method, and a set of algorithms used in the empirical study. This set of comparisons is associated with a set or family of hypotheses, all of which are related to the control method. Any of the post hoc tests is suitable for application to nonparametric tests working over a family of hypotheses. The test statistic for comparing the ith and jth algorithms depends on the main nonparametric procedure used; in this case, it depends on the Friedman Aligned Ranks test:
Since the set of related rankings is converted to absolute rankings, the expression for computing the test statistic in Friedman Aligned Ranks is the same as that used by the Kruskal–Wallis test (Zar, J.H.: Biostatistical Analysis. Prentice Hall, Englewood Cliffs (1999)):
$$z=\frac{\widehat{R}_{i}-\widehat{R}_{j}}{\sqrt{\frac{k(n+1)}{6}}}$$
where $\widehat{R}_{i}$ and $\widehat{R}_{j}$ are the average rankings, computed by the Friedman Aligned Ranks test, of the algorithms compared.
In statistical hypothesis testing, the p-value is the probability of obtaining a result at least as extreme as the one that was actually observed, assuming that the null hypothesis is true. It is a useful and interesting datum for many consumers of statistical analysis. A p-value provides information about whether a statistical hypothesis test is significant or not, and it also indicates something about ‘‘how significant” the result is: the smaller the p-value, the stronger the evidence against the null hypothesis. Most importantly, it does this without committing to a particular level of significance.
When a p-value is considered in a multiple comparison, it reflects the probability error of a certain comparison, but it does not take into account the remaining comparisons belonging to the family. If one is comparing k algorithms and in each comparison the level of significance is α, then the probability of not making a Type I error in a single comparison is (1 - α), and the probability of not making a Type I error in any of the k - 1 comparisons is (1 - α)^(k - 1). Hence, the probability of making one or more Type I errors is 1 - (1 - α)^(k - 1). For instance, with α = 0.05 and k = 9 algorithms, as in our study, this probability grows to 1 - 0.95^8 ≈ 0.34.
One way to solve this problem is to report adjusted p-values (APVs) which take into account that multiple tests are conducted. An APV can be compared directly with any chosen significance level α. We recommend the use of APVs due to the fact that they provide more information in a statistical analysis.
The z value in all cases is used to find the corresponding probability (p-value) from the table of the normal distribution N(0,1), which is then compared with an appropriate level of significance α (Table A1 in Sheskin, D.J.: Handbook of Parametric and Nonparametric Statistical Procedures. CRC Press, Boca Raton (2006)). The post hoc tests differ in the way they adjust the value of α to compensate for multiple comparisons.
Next, we will define a candidate set of post hoc procedures and we will explain how to compute the APVs depending on the post hoc procedure used in the analysis, following the indications given in P.H. Westfall, S.S. Young, Resampling-based Multiple Testing: Examples and Methods for p-Value Adjustment, John Wiley and Sons (2004). The notation used in the computation of the APVs is as follows:
- Indexes i and j each correspond to a concrete comparison or hypothesis in the family of hypotheses, according to an incremental order of their p-values. Index i always refers to the hypothesis in question whose APV is being computed and index j refers to another hypothesis in the family.
- pj is the p-value obtained for the jth hypothesis.
- k is the number of classifiers being compared.
The Holm procedure (S. Holm, ''A simple sequentially rejective multiple test procedure,'' Scandinavian Journal of Statistics 6 (1979) 65--70) adjusts the value of α in a step-down manner. Let p1, p2, ..., pk-1 be the ordered p-values (smallest to largest), so that p1 ≤ p2 ≤ ... ≤ pk-1, and let H1, H2, ..., Hk-1 be the corresponding hypotheses. The Holm procedure rejects H1 to Hi-1 if i is the smallest integer such that pi > α/(k - i). Holm's step-down procedure starts with the most significant p-value. If p1 is below α/(k - 1), the corresponding hypothesis is rejected and we are allowed to compare p2 with α/(k - 2). If the second hypothesis is rejected, the test proceeds with the third, and so on. As soon as a certain null hypothesis cannot be rejected, all the remaining hypotheses are retained as well. Holm APVi: min{v, 1}, where v = max{(k - j)pj : 1 ≤ j ≤ i}.
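The Holm APV formula above can be implemented in a few lines (a minimal Java sketch; the p-values are assumed to be already sorted in ascending order, and the class name is ours):

```java
import java.util.Arrays;

public class HolmAPV {
    /** Holm adjusted p-values for the k-1 ordered hypotheses of a 1*n comparison.
     *  pValues must be sorted ascending; k is the number of classifiers compared.
     *  APV_i = min{ max{(k - j) * p_j : 1 <= j <= i}, 1 }. */
    public static double[] adjust(double[] pValues, int k) {
        double[] apv = new double[pValues.length];
        double running = 0.0;
        for (int j = 0; j < pValues.length; j++) {
            // zero-based j corresponds to hypothesis j+1, hence factor k - 1 - j
            running = Math.max(running, (k - 1 - j) * pValues[j]);
            apv[j] = Math.min(running, 1.0);
        }
        return apv;
    }
}
```

For example, with k = 4 classifiers and ordered p-values 0.01, 0.02, 0.04, the APVs are 0.03, 0.04, 0.04, each directly comparable with the chosen α.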
More information about these tests and other statistical procedures can be found at SICIDM.
Results obtained
In this section, we report the full results obtained in the experimental study. For each data set/algorithm pair, we report the average performance results (accuracy, reduction rate, and Acc*Red) and standard deviations. Furthermore, we have carried out a preliminary study for each method that depends on the number of iterations performed, using 300, 500, and 1000 iterations on all the data sets. This parameter can be very sensitive to the problem tackled: an excessive number of iterations may produce overfitting on some problems, while a lower number of iterations may not be enough to tackle other data sets. Finally, we present the best-performing number of iterations for each method and data set.
Accuracy Training
Table too large to be displayed here.
Accuracy Test
Table too large to be displayed here.
Reduction Rate
Table too large to be displayed here.
Acc*Red
Table too large to be displayed here.
Runtime
In this section, we analyze the average time elapsed (in seconds) by each PG method in a complete execution (no times are given for 1-NN, since it does not perform a data reduction phase). The characteristics of the machine used in the experimental study are:
Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz
Cache size : 8192 KB
Cpu cores : 4.
MemTotal : 12322668 kB.
Table too large to be displayed here.
Summary
In this section, we present a comparison between the different algorithms, using the best results obtained for each data set depending on the number of iterations.
Table 8. Accuracy test of the best results obtained for each method with the different data sets.
1NN | IPADE best | RSP3 | MixtGauss | ENPC best | AMPSO best | LVQPRU best | HYB best | LVQ3 best | |
---|---|---|---|---|---|---|---|---|---|
Data set | Acc. | Acc. | Acc. | Acc. | Acc. | Acc. | Acc. | Acc. | Acc. |
appendicitis | 0.7936 | 0.8873 | 0.8018 | 0.8318 | 0.7945 | 0.8609 | 0.8400 | 0.7645 | 0.8973 |
australian | 0.8145 | 0.8652 | 0.8290 | 0.8304 | 0.7797 | 0.8420 | 0.8232 | 0.7203 | 0.8275 |
bal | 0.7904 | 0.8528 | 0.8432 | 0.8735 | 0.7150 | 0.6032 | 0.8367 | 0.7025 | 0.7985 |
bands | 0.6309 | 0.6921 | 0.6940 | 0.6810 | 0.7292 | 0.6606 | 0.7032 | 0.7126 | 0.6700 |
bre | 0.6535 | 0.7159 | 0.6317 | 0.6293 | 0.5943 | 0.6951 | 0.6628 | 0.6083 | 0.7098 |
bupa | 0.6108 | 0.7039 | 0.5772 | 0.6126 | 0.6170 | 0.6044 | 0.6086 | 0.5943 | 0.5896 |
car | 0.8565 | 0.8253 | 0.8466 | 0.7188 | 0.8096 | 0.7089 | 0.8629 | 0.8356 | 0.8056 |
cleveland | 0.5314 | 0.5678 | 0.5311 | 0.5349 | 0.5380 | 0.5681 | 0.5880 | 0.5116 | 0.5509 |
contraceptive | 0.4277 | 0.5479 | 0.4406 | 0.4195 | 0.4271 | 0.4474 | 0.4508 | 0.4033 | 0.4446 |
crx | 0.7957 | 0.8536 | 0.8348 | 0.8116 | 0.7754 | 0.8464 | 0.8290 | 0.7333 | 0.8319 |
dermatology | 0.9535 | 0.9620 | 0.9536 | 0.9590 | 0.9372 | 0.8993 | 0.9566 | 0.9454 | 0.9563 |
ecoli | 0.8070 | 0.8010 | 0.8157 | 0.7294 | 0.7650 | 0.7117 | 0.7406 | 0.7769 | 0.7715 |
flare-solar | 0.5554 | 0.6679 | 0.5517 | 0.6679 | 0.6614 | 0.6548 | 0.6209 | 0.6108 | 0.6304 |
german | 0.7050 | 0.7360 | 0.6990 | 0.6750 | 0.6430 | 0.6390 | 0.6560 | 0.6590 | 0.6990 |
glass | 0.7361 | 0.6552 | 0.6711 | 0.4812 | 0.7185 | 0.5534 | 0.6613 | 0.7425 | 0.5725 |
haberman | 0.6697 | 0.7413 | 0.6666 | 0.7545 | 0.6277 | 0.7414 | 0.7215 | 0.5230 | 0.6930 |
hayes-roth | 0.3570 | 0.8216 | 0.2648 | 0.5536 | 0.3653 | 0.5648 | 0.4943 | 0.3327 | 0.5220 |
heart | 0.7704 | 0.8481 | 0.7259 | 0.8000 | 0.7741 | 0.7963 | 0.8148 | 0.7333 | 0.7778 |
hepatitis | 0.8075 | 0.8400 | 0.7546 | 0.7804 | 0.7550 | 0.8121 | 0.7946 | 0.7488 | 0.7804 |
housevotes | 0.9216 | 0.9378 | 0.9053 | 0.8872 | 0.9008 | 0.8891 | 0.9145 | 0.9056 | 0.9057 |
iris | 0.9333 | 0.9600 | 0.9400 | 0.9333 | 0.9400 | 0.9467 | 0.9533 | 0.9333 | 0.9000 |
led7digit | 0.4020 | 0.7380 | 0.5680 | 0.7360 | 0.6380 | 0.7140 | 0.6920 | 0.6760 | 0.6880 |
lym | 0.7387 | 0.7915 | 0.7730 | 0.6696 | 0.7177 | 0.6977 | 0.7630 | 0.7779 | 0.7633 |
mammographic | 0.7368 | 0.8200 | 0.7409 | 0.7867 | 0.7243 | 0.7857 | 0.7503 | 0.5807 | 0.7170 |
monks | 0.7791 | 0.9681 | 0.7092 | 0.4883 | 0.7455 | 0.7224 | 0.7558 | 0.7713 | 0.7649 |
newthyroid | 0.9723 | 0.9675 | 0.9541 | 0.9258 | 0.9673 | 0.9307 | 0.9494 | 0.9582 | 0.9216 |
pima | 0.7033 | 0.7618 | 0.7214 | 0.7358 | 0.6577 | 0.7241 | 0.7240 | 0.6538 | 0.7150 |
saheart | 0.6449 | 0.7208 | 0.6492 | 0.6279 | 0.6099 | 0.6902 | 0.6559 | 0.5973 | 0.6948 |
sonar | 0.8555 | 0.7783 | 0.8264 | 0.7071 | 0.9086 | 0.6674 | 0.8219 | 0.8698 | 0.7543 |
spectfheart | 0.6970 | 0.7940 | 0.7422 | 0.7640 | 0.7568 | 0.7983 | 0.7534 | 0.7608 | 0.7793 |
tae | 0.4050 | 0.5763 | 0.4654 | 0.3258 | 0.4388 | 0.5254 | 0.4379 | 0.5188 | 0.5308 |
tic-tac-toe | 0.7307 | 0.7495 | 0.7662 | 0.6711 | 0.7640 | 0.6900 | 0.7400 | 0.7922 | 0.7109 |
wine | 0.9552 | 0.9552 | 0.9324 | 0.9660 | 0.9608 | 0.9275 | 0.9663 | 0.9663 | 0.9663 |
wisconsin | 0.9557 | 0.9699 | 0.9485 | 0.9714 | 0.9357 | 0.9599 | 0.9642 | 0.9414 | 0.9571 |
yeast | 0.5047 | 0.5782 | 0.5405 | 0.5209 | 0.4832 | 0.4866 | 0.5263 | 0.5135 | 0.5223 |
zoo | 0.9281 | 0.9767 | 0.8892 | 0.7406 | 0.7561 | 0.9139 | 0.8617 | 0.9364 | 0.9264 |
abalone | 0.1991 | 0.2132 | 0.2350 | 0.1907 | 0.1979 | 0.2041 | 0.2298 | 0.2010 | 0.2020 |
banana | 0.8751 | 0.8345 | 0.8755 | 0.6260 | 0.8566 | 0.8106 | 0.8789 | 0.8591 | 0.8657 |
chess | 0.8470 | 0.9014 | 0.8786 | 0.6152 | 0.8454 | 0.7728 | 0.8292 | 0.8279 | 0.7819 |
coil2000 | 0.8963 | 0.9400 | 0.8954 | 0.9391 | 0.8692 | 0.9384 | 0.9059 | 0.7819 | 0.9256 |
magic | 0.8059 | 0.7968 | 0.8082 | 0.7799 | 0.7734 | 0.7968 | 0.7843 | 0.6513 | 0.7768 |
marketing | 0.2738 | 0.2918 | 0.2786 | 0.2859 | 0.2560 | 0.2562 | 0.2704 | 0.2353 | 0.2472 |
page-blocks | 0.9576 | 0.9322 | 0.8107 | 0.8915 | 0.9534 | 0.9102 | 0.9424 | 0.9006 | 0.9238 |
ring | 0.7524 | 0.8600 | 0.9896 | 0.7577 | 0.8811 | 0.7331 | 0.7051 | 0.7277 | 0.9218 |
satimage | 0.9058 | 0.8159 | 0.8719 | 0.8418 | 0.9005 | 0.8001 | 0.8828 | 0.8914 | 0.8632 |
segment | 0.9662 | 0.9056 | 0.7547 | 0.8506 | 0.9580 | 0.7664 | 0.9485 | 0.9545 | 0.9035 |
splice | 0.7495 | 0.7536 | 0.9506 | 0.6266 | 0.8006 | 0.6209 | 0.7107 | 0.8135 | 0.8481 |
thyroid | 0.9258 | 0.9401 | 0.7997 | 0.9096 | 0.9031 | 0.9205 | 0.9182 | 0.7482 | 0.9081 |
titanic | 0.6075 | 0.7892 | 0.9838 | 0.7556 | 0.7792 | 0.7889 | 0.7044 | 0.7756 | 0.8885 |
twonorm | 0.9468 | 0.9777 | 0.9154 | 0.9761 | 0.9450 | 0.9461 | 0.9469 | 0.9423 | 0.9508 |
AVERAGE | 0.7368 | 0.7916 | 0.7451 | 0.7170 | 0.7370 | 0.7309 | 0.7511 | 0.7224 | 0.7551 |
Table 9. Reduction rate of the best results obtained for each method with the different data sets.
1NN | IPADE best | RSP3 | MixtGauss | ENPC best | AMPSO best | LVQPRU best | HYB best | LVQ3 best | |
---|---|---|---|---|---|---|---|---|---|
Data set | Red. | Red. | Red. | Red. | Red. | Red. | Red. | Red. | Red. |
appendicitis | 0.0000 | 0.9738 | 0.8344 | 0.9581 | 0.6823 | 0.9371 | 0.9161 | 0.5996 | 0.9476 |
australian | 0.0000 | 0.9955 | 0.8341 | 0.9517 | 0.6976 | 0.9485 | 0.9034 | 0.6462 | 0.9501 |
bal | 0.0000 | 0.9909 | 0.8158 | 0.9520 | 0.7312 | 0.9467 | 0.9042 | 0.1340 | 0.9502 |
bands | 0.0000 | 0.9932 | 0.5510 | 0.9505 | 0.7401 | 0.9464 | 0.9027 | 0.3129 | 0.9505 |
bre | 0.0000 | 0.9872 | 0.6857 | 0.9534 | 0.5381 | 0.9456 | 0.9064 | 0.2952 | 0.9534 |
bupa | 0.0000 | 0.9878 | 0.5014 | 0.9549 | 0.5839 | 0.8520 | 0.9549 | 0.3955 | 0.9517 |
car | 0.0000 | 0.9946 | 0.7359 | 0.9511 | 0.8968 | 0.9486 | 0.9022 | 0.1697 | 0.9505 |
cleveland | 0.0000 | 0.9725 | 0.7712 | 0.9633 | 0.5024 | 0.8759 | 0.9039 | 0.6124 | 0.9490 |
contraceptive | 0.0000 | 0.9955 | 0.6544 | 0.9502 | 0.3644 | 0.8548 | 0.9011 | 0.3553 | 0.9502 |
crx | 0.0000 | 0.9955 | 0.8419 | 0.9517 | 0.6958 | 0.9354 | 0.9031 | 0.1767 | 0.9501 |
dermatology | 0.0000 | 0.9794 | 0.8673 | 0.9636 | 0.9359 | 0.9396 | 0.9523 | 0.5533 | 0.9493 |
ecoli | 0.0000 | 0.9663 | 0.7440 | 0.9735 | 0.6644 | 0.8925 | 0.9031 | 0.7046 | 0.9372 |
flare-solar | 0.0000 | 0.9976 | 0.9628 | 0.9512 | 0.9268 | 0.9491 | 0.9015 | 0.3059 | 0.9506 |
german | 0.0000 | 0.9954 | 0.6203 | 0.9511 | 0.7237 | 0.9435 | 0.9004 | 0.9251 | 0.9500 |
glass | 0.0000 | 0.9548 | 0.6948 | 0.9688 | 0.6122 | 0.9377 | 0.9045 | 0.4068 | 0.9377 |
haberman | 0.0000 | 0.9855 | 0.7509 | 0.9564 | 0.5225 | 0.8629 | 0.9546 | 0.0083 | 0.9528 |
hayes-roth | 0.0000 | 0.9520 | 0.6271 | 0.9723 | 0.6406 | 0.9136 | 0.9200 | 0.4855 | 0.9571 |
heart | 0.0000 | 0.9868 | 0.7626 | 0.9506 | 0.7119 | 0.9362 | 0.9091 | 0.4554 | 0.9506 |
hepatitis | 0.0000 | 0.9713 | 0.7699 | 0.9570 | 0.7914 | 0.8658 | 0.9111 | 0.5572 | 0.9534 |
housevotes | 0.0000 | 0.9903 | 0.8631 | 0.9540 | 0.9111 | 0.9489 | 0.9017 | 0.5469 | 0.9515 |
iris | 0.0000 | 0.9770 | 0.8785 | 0.9556 | 0.8837 | 0.9333 | 0.9600 | 0.8077 | 0.9556 |
led7digit | 0.0000 | 0.9751 | 0.8882 | 0.9556 | 0.8487 | 0.8570 | 0.9562 | 0.0807 | 0.9511 |
lym | 0.0000 | 0.9662 | 0.6872 | 0.9700 | 0.7635 | 0.9200 | 0.8972 | 0.8193 | 0.9399 |
mammographic | 0.0000 | 0.9957 | 0.8001 | 0.9514 | 0.6319 | 0.8844 | 0.9015 | 0.5023 | 0.9503 |
monks | 0.0000 | 0.9892 | 0.5144 | 0.9537 | 0.7739 | 0.9486 | 0.9028 | 0.5752 | 0.9511 |
newthyroid | 0.0000 | 0.9835 | 0.8491 | 0.9535 | 0.8630 | 0.8742 | 0.9530 | 0.4092 | 0.9535 |
pima | 0.0000 | 0.9959 | 0.6793 | 0.9508 | 0.6567 | 0.8760 | 0.9013 | 0.4010 | 0.9508 |
saheart | 0.0000 | 0.9899 | 0.6792 | 0.9519 | 0.6308 | 0.8830 | 0.9026 | 0.3378 | 0.9519 |
sonar | 0.0000 | 0.9770 | 0.6010 | 0.9573 | 0.8478 | 0.9466 | 0.9054 | 0.3333 | 0.9519 |
spectfheart | 0.0000 | 0.9863 | 0.6979 | 0.9501 | 0.8177 | 0.9417 | 0.9051 | 0.5481 | 0.9501 |
tae | 0.0000 | 0.9603 | 0.7005 | 0.9558 | 0.4629 | 0.9338 | 0.9169 | 0.6151 | 0.9558 |
tic-tac-toe | 0.0000 | 0.9935 | 0.7610 | 0.9513 | 0.7866 | 0.9490 | 0.9016 | 0.3054 | 0.9501 |
wine | 0.0000 | 0.9800 | 0.8489 | 0.9625 | 0.9245 | 0.8612 | 0.9089 | 0.3347 | 0.9501 |
wisconsin | 0.0000 | 0.9965 | 0.9488 | 0.9523 | 0.9258 | 0.9357 | 0.9030 | 0.4332 | 0.9507 |
yeast | 0.0000 | 0.9897 | 0.6602 | 0.9551 | 0.4894 | 0.9476 | 0.9024 | 0.8461 | 0.9491 |
zoo | 0.0000 | 0.9121 | 0.8490 | 0.9229 | 0.8672 | 0.9229 | 0.9143 | 0.0213 | 0.9130 |
abalone | 0.0000 | 0.9916 | 0.7610 | 0.9561 | 0.2567 | 0.8294 | 0.9001 | 0.6083 | 0.9475 |
banana | 0.0000 | 0.9984 | 0.8962 | 0.9501 | 0.7485 | 0.9799 | 0.9004 | 0.0000 | 0.9501 |
chess | 0.0000 | 0.9985 | 0.6362 | 0.9506 | 0.8920 | 0.8791 | 0.9004 | 0.1049 | 0.9503 |
coil2000 | 0.0000 | 0.9997 | 0.9396 | 0.9500 | 0.8341 | 0.9429 | 0.9802 | 0.8139 | 0.9500 |
magic | 0.0000 | 0.9998 | 0.7675 | 0.9501 | 0.7695 | 0.9468 | 0.9001 | 0.7988 | 0.9501 |
marketing | 0.0000 | 0.9981 | 0.7828 | 0.9511 | 0.3658 | 0.9800 | 0.9005 | 0.5369 | 0.9501 |
page-blocks | 0.0000 | 0.9984 | 0.5510 | 0.9503 | 0.8914 | 0.9797 | 0.9007 | 0.2363 | 0.9500 |
ring | 0.0000 | 0.9983 | 0.9054 | 0.9502 | 0.9130 | 0.9799 | 0.9003 | 0.8372 | 0.9500 |
satimage | 0.0000 | 0.9983 | 0.7982 | 0.9575 | 0.8952 | 0.9824 | 0.9006 | 0.4742 | 0.9501 |
segment | 0.0000 | 0.9946 | 0.7181 | 0.9529 | 0.8399 | 0.9798 | 0.9028 | 0.5408 | 0.9505 |
splice | 0.0000 | 0.9982 | 0.8589 | 0.9509 | 0.9185 | 0.9791 | 0.9005 | 0.5018 | 0.9502 |
thyroid | 0.0000 | 0.9992 | 0.5110 | 0.9500 | 0.8593 | 0.9796 | 0.9003 | 0.6342 | 0.9500 |
titanic | 0.0000 | 0.9985 | 0.8417 | 0.9505 | 0.9936 | 0.9798 | 0.9006 | 0.8593 | 0.9500 |
twonorm | 0.0000 | 0.9994 | 0.9210 | 0.9502 | 0.9529 | 0.9799 | 0.9003 | 0.9935 | 0.9500 |
AVERAGE | 0.0000 | 0.9861 | 0.7564 | 0.9541 | 0.7436 | 0.9279 | 0.9115 | 0.4791 | 0.9493 |
Table 10. Accuracy*Reduction rate of the best results obtained for each method with the different data sets.
1NN | IPADE best | RSP3 | MixtGauss | ENPC best | AMPSO best | LVQPRU best | HYB best | LVQ3 best | |
---|---|---|---|---|---|---|---|---|---|
Data set | Acc*Red. | Acc*Red. | Acc*Red. | Acc*Red. | Acc*Red. | Acc*Red. | Acc*Red. | Acc*Red. | Acc*Red. |
appendicitis | 0.0000 | 0.8641 | 0.6690 | 0.7969 | 0.5421 | 0.8067 | 0.7695 | 0.4584 | 0.8503 |
australian | 0.0000 | 0.8613 | 0.6915 | 0.7903 | 0.5439 | 0.7986 | 0.7437 | 0.4655 | 0.7862 |
bal | 0.0000 | 0.8450 | 0.6879 | 0.8316 | 0.5228 | 0.5710 | 0.7565 | 0.0941 | 0.7587 |
bands | 0.0000 | 0.6874 | 0.3824 | 0.6473 | 0.5397 | 0.6252 | 0.6348 | 0.2230 | 0.6368 |
bre | 0.0000 | 0.7067 | 0.4332 | 0.6000 | 0.3198 | 0.6573 | 0.6008 | 0.1796 | 0.6767 |
bupa | 0.0000 | 0.6953 | 0.2894 | 0.5850 | 0.3603 | 0.5149 | 0.5812 | 0.2350 | 0.5611 |
car | 0.0000 | 0.8208 | 0.6230 | 0.6837 | 0.7260 | 0.6725 | 0.7785 | 0.1418 | 0.7657 |
cleveland | 0.0000 | 0.5522 | 0.4096 | 0.5153 | 0.2703 | 0.4976 | 0.5315 | 0.3133 | 0.5228 |
contraceptive | 0.0000 | 0.5454 | 0.2883 | 0.3986 | 0.1556 | 0.3824 | 0.4062 | 0.1433 | 0.4225 |
crx | 0.0000 | 0.8498 | 0.7028 | 0.7724 | 0.5395 | 0.7917 | 0.7487 | 0.1296 | 0.7904 |
dermatology | 0.0000 | 0.9422 | 0.8271 | 0.9241 | 0.8771 | 0.8450 | 0.9110 | 0.5231 | 0.9078 |
ecoli | 0.0000 | 0.7740 | 0.6069 | 0.7101 | 0.5083 | 0.6352 | 0.6688 | 0.5474 | 0.7230 |
flare-solar | 0.0000 | 0.6663 | 0.5312 | 0.6353 | 0.6130 | 0.6215 | 0.5597 | 0.1868 | 0.5993 |
german | 0.0000 | 0.7326 | 0.4336 | 0.6420 | 0.4653 | 0.6029 | 0.5907 | 0.6096 | 0.6641 |
glass | 0.0000 | 0.6256 | 0.4663 | 0.4662 | 0.4399 | 0.5189 | 0.5981 | 0.3020 | 0.5368 |
haberman | 0.0000 | 0.7306 | 0.5005 | 0.7216 | 0.3280 | 0.6398 | 0.6887 | 0.0043 | 0.6603 |
hayes-roth | 0.0000 | 0.7822 | 0.1661 | 0.5383 | 0.2340 | 0.5160 | 0.4548 | 0.1615 | 0.4996 |
heart | 0.0000 | 0.8369 | 0.5536 | 0.7605 | 0.5511 | 0.7455 | 0.7407 | 0.3339 | 0.7394 |
hepatitis | 0.0000 | 0.8159 | 0.5810 | 0.7468 | 0.5975 | 0.7031 | 0.7240 | 0.4172 | 0.7440 |
housevotes | 0.0000 | 0.9287 | 0.7814 | 0.8464 | 0.8207 | 0.8437 | 0.8246 | 0.4953 | 0.8618 |
iris | 0.0000 | 0.9379 | 0.8258 | 0.8919 | 0.8307 | 0.8836 | 0.9152 | 0.7538 | 0.8600 |
led7digit | 0.0000 | 0.7196 | 0.5045 | 0.7033 | 0.5415 | 0.6119 | 0.6617 | 0.0546 | 0.6544 |
lym | 0.0000 | 0.7647 | 0.5312 | 0.6495 | 0.5480 | 0.6419 | 0.6846 | 0.6373 | 0.7174 |
mammographic | 0.0000 | 0.8165 | 0.5928 | 0.7485 | 0.4577 | 0.6949 | 0.6764 | 0.2917 | 0.6814 |
monks | 0.0000 | 0.9576 | 0.3648 | 0.4657 | 0.5769 | 0.6853 | 0.6823 | 0.4437 | 0.7275 |
newthyroid | 0.0000 | 0.9515 | 0.8101 | 0.8828 | 0.8348 | 0.8136 | 0.9048 | 0.3921 | 0.8787 |
pima | 0.0000 | 0.7587 | 0.4900 | 0.6996 | 0.4319 | 0.6343 | 0.6525 | 0.2622 | 0.6798 |
saheart | 0.0000 | 0.7135 | 0.4409 | 0.5977 | 0.3847 | 0.6094 | 0.5920 | 0.2018 | 0.6614 |
sonar | 0.0000 | 0.7604 | 0.4967 | 0.6769 | 0.7703 | 0.6318 | 0.7441 | 0.2899 | 0.7180 |
spectfheart | 0.0000 | 0.7831 | 0.5180 | 0.7259 | 0.6188 | 0.7518 | 0.6819 | 0.4170 | 0.7404 |
tae | 0.0000 | 0.5534 | 0.3260 | 0.3114 | 0.2031 | 0.4906 | 0.4015 | 0.3191 | 0.5073 |
tic-tac-toe | 0.0000 | 0.7446 | 0.5831 | 0.6384 | 0.6010 | 0.6548 | 0.6672 | 0.2419 | 0.6754 |
wine | 0.0000 | 0.9361 | 0.7915 | 0.9298 | 0.8883 | 0.7988 | 0.8783 | 0.3234 | 0.9181 |
wisconsin | 0.0000 | 0.9665 | 0.8999 | 0.9251 | 0.8663 | 0.8982 | 0.8707 | 0.4078 | 0.9099 |
yeast | 0.0000 | 0.5722 | 0.3568 | 0.4975 | 0.2365 | 0.4611 | 0.4749 | 0.4345 | 0.4957 |
zoo | 0.0000 | 0.8908 | 0.7549 | 0.6835 | 0.6557 | 0.8434 | 0.7879 | 0.0199 | 0.8458 |
abalone | 0.0000 | 0.2114 | 0.1788 | 0.1823 | 0.0508 | 0.1693 | 0.2068 | 0.1223 | 0.1914 |
banana | 0.0000 | 0.8332 | 0.7846 | 0.5948 | 0.6412 | 0.7943 | 0.7914 | 0.0000 | 0.8225 |
chess | 0.0000 | 0.9000 | 0.5590 | 0.5848 | 0.7541 | 0.6794 | 0.7466 | 0.0868 | 0.7430 |
coil2000 | 0.0000 | 0.9397 | 0.8413 | 0.8921 | 0.7250 | 0.8848 | 0.8880 | 0.6364 | 0.8793 |
magic | 0.0000 | 0.7966 | 0.6203 | 0.7410 | 0.5951 | 0.7544 | 0.7059 | 0.5203 | 0.7380 |
marketing | 0.0000 | 0.2912 | 0.2181 | 0.2719 | 0.0936 | 0.2511 | 0.2435 | 0.1263 | 0.2349 |
page-blocks | 0.0000 | 0.9307 | 0.4467 | 0.8472 | 0.8499 | 0.8917 | 0.8488 | 0.2128 | 0.8776 |
ring | 0.0000 | 0.8585 | 0.8960 | 0.7200 | 0.8044 | 0.7184 | 0.6348 | 0.6092 | 0.8757 |
satimage | 0.0000 | 0.8145 | 0.6960 | 0.8060 | 0.8061 | 0.7860 | 0.7950 | 0.4227 | 0.8201 |
segment | 0.0000 | 0.9007 | 0.5420 | 0.8105 | 0.8046 | 0.7509 | 0.8563 | 0.5162 | 0.8588 |
splice | 0.0000 | 0.7522 | 0.8165 | 0.5958 | 0.7354 | 0.6079 | 0.6400 | 0.4082 | 0.8059 |
thyroid | 0.0000 | 0.9393 | 0.4086 | 0.8641 | 0.7760 | 0.9017 | 0.8267 | 0.4745 | 0.8627 |
titanic | 0.0000 | 0.7880 | 0.8281 | 0.7182 | 0.7742 | 0.7730 | 0.6344 | 0.6665 | 0.8441 |
twonorm | 0.0000 | 0.9771 | 0.8431 | 0.9275 | 0.9005 | 0.9271 | 0.8525 | 0.9362 | 0.9033 |
AVERAGE | 0.0000 | 0.7805 | 0.5718 | 0.6839 | 0.5742 | 0.6797 | 0.6852 | 0.3439 | 0.7167 |
Comparison of IPADE with PSO and LPD techniques
In this section, we provide a comparison with the PSO [1] and LPD [2] techniques. Concretely, we use the results reported in Table 8 of [1], where the authors compare PSO, LPD, and 1-NN in terms of classification performance:
Table 11. Comparison between the IPADE, PSO and LPD techniques.
PSO(5) | LPD(5) | 1-NN | IPADE | |||||
---|---|---|---|---|---|---|---|---|
Data set | Acc. | Std. | Acc. | Std. | Acc. | Std. | Acc. | Std. |
Heart | 0.8400 | 0.0200 | 0.7300 | 0.0310 | 0.7840 | 0.0290 | 0.8481 | 0.0785 |
Pima | 0.7510 | 0.0060 | 0.7360 | 0.0010 | 0.6950 | 0.0290 | 0.7592 | 0.0388 |
Wisconsin | 0.9660 | 0.0060 | 0.9650 | 0.0090 | 0.9560 | 0.0060 | 0.9685 | 0.0210 |
Wine | 0.9600 | 0.0080 | 0.9550 | 0.0150 | 0.9470 | 0.0130 | 0.9493 | 0.0632 |
Spectfheart | 0.7940 | 0.0140 | 0.7900 | 0.0350 | 0.6960 | 0.0290 | 0.7940 | 0.0631 |
AVERAGE | 0.8622 | 0.0108 | 0.8352 | 0.0182 | 0.8156 | 0.0212 | 0.8638 | 0.0529 |
JAVA code for the methods used
In this section, we provide the source code of IPADE and the rest of techniques employed in the experimental framework.
Source code
The source code of the IPADE package can be downloaded from here.
It is written in the Java programming language. Although it was developed to be run inside the KEEL environment, it can also be executed on a standard Java Virtual Machine. To do this, it is only necessary to place the training data sets in the same folder and to write a script containing all the relevant parameters of the algorithms (see the KEEL reference manual, Section 3, located under the KEEL Software Tool item of the main menu).
KEEL version with Prototype Generation package
The KEEL version with the Prototype Generation package can be downloaded from here. Note that the source code is available in Section ''KEEL Source code''.
The complete KEEL user manual can be downloaded from here.
Essentially, to generate an experiment, the following steps need to be performed:
- Execute the KEEL GUI with the command: java -jar GraphInterKeel.jar.
- Select Experiments.
- Select Classification (the use of 10-fold cross-validation, as set by default, is recommended).
- Select the desired data sets and click on the main window to place them.
- Click on the third icon of the left bar, select the desired PG method, and click on the main window to place it.
- Click on the last icon of the left bar, and create arcs joining all the nodes of the experiment.
- Click on the Run Experiment icon, located at the top bar.
This way, it is possible to generate an experiment ready to be executed on any machine with a Java Virtual Machine installed.
References
[1] L. Nanni and A. Lumini, "Particle swarm optimization for prototype reduction," Neurocomputing, vol. 72, no. 4-6, pp. 1092-1097, 2009.
[2] R. Paredes and E. Vidal, "Learning prototypes and distances: a prototype reduction technique based on nearest neighbor error minimization," Pattern Recognition, vol. 39, pp. 180-188, 2006.