Prototype Selection for Nearest Neighbor Classification: Taxonomy and Empirical Study - Complementary Material

This Website contains additional material to the SCI2S research paper on Prototype Selection

S. García, J. Derrac, J.R. Cano and F. Herrera, Prototype Selection for Nearest Neighbor Classification: Taxonomy and Empirical Study. IEEE Transactions on Pattern Analysis and Machine Intelligence 34:3 (2012) 417-435 doi: 10.1109/TPAMI.2011.142 PDF Icon

Summary:

S. García, J. Derrac, J.R. Cano and F.Herrera, Prototype Selection for Nearest Neighbor Classification: Survey of Methods, Technical Report T-4-2010-PSMethods. PDF Icon

Abstract

Prototype selection is a research field which has been active for more than four decades. As a result, a great number of methods tackling the prototype selection problem have been proposed in the literature.

This technical report provides a survey of the most representative algorithms developed so far. A widely used categorization (edition, condensation and hybrid methods) has been employed to present them and describe their main characteristics, thus providing a first insight into the prototype selection field which may be useful for every practitioner who needs a quick reference about the existing techniques and their particularities.

Taxonomy of methods

  1. Condensation algorithms
    1. Incremental
      1. Condensed Nearest Neighbor Rule (CNN)
      2. Ullman algorithm (Ullmann)
      3. Tomek Condensed Nearest Neighbor (TCNN)
      4. Mutual Neighborhood value Algorithm (MNV)
      5. Modified Condensed Nearest Neighbor (MCNN)
      6. Generalized Condensed Nearest Neighbor (GCNN)
      7. Fast Condensed Neighbor algorithms family (FCNN)
      8. Prototype Selection based on Clustering (PSC)
    2. Decremental
      1. Reduced Nearest Neighbor (RNN)
      2. Selective Nearest Neighbor (SNN)
      3. Shrink (Shrink)
      4. Minimal Consistent Set (MCS)
      5. Modified Selective Algorithm (MSS)
      6. PEBS Algorithm (PEBS)
    3. Batch
      1. Improved KNN (IKNN)
      2. Patterns by Ordered Projections (POP)
      3. Max Nearest Centroid Neighbor (Max-NCN)
      4. Reconsistent (Reconsistent)
      5. Template Reduction KNN (TRKNN)
  2. Edition algorithms
    1. Decremental
      1. Edited Nearest Neighbor (ENN)
      2. Repeated-ENN (RENN)
      3. Multiedit (Multiedit)
      4. Relative Neighborhood Graph Edition (RNGE)
      5. Modified Edited Nearest Neighbor (MENN)
      6. Nearest Centroid Neighbor Edition (NCNEdit)
      7. Edited Normalized Radial Basis Function (ENRBF)
      8. Edited Normalized Radial Basis Function 2 (ENRBF2)
      9. Edited Nearest Neighbor Estimating Class Probabilistic (ENNProb)
      10. Edited Nearest Neighbor Estimating Class Probabilistic and Threshold (ENNTh)
    2. Batch
      1. All k-NN (AllKNN)
      2. Model Class Selection (MoCS)
  3. Hybrid algorithms
    1. Incremental
      1. Instance-Based Learning Algorithms Family (IB3)
    2. Decremental
      1. Filter
        1. Variable Similarity Metric (VSM)
        2. Polyline Functions (PF)
        3. Decremental Reduction Optimization Procedure Algorithms Family (DROP3)
        4. Decremental Encoding Length (DEL)
        5. Prototype Selection by Relative Certainty Gain (PSRCG)
        6. C-Pruner (CPruner)
        7. Support Vector Based Prototype Selection (SVBPS)
        8. Noise Removing based on Minimal Consistent Set (NRMCS)
        9. Class Conditional Instance Selection (CCIS)
      2. Wrapper
        1. Backward Sequential Edition (BSE)
    3. Batch
      1. Iterative Case Filtering (ICF)
      2. Hit-Miss Network Algorithms (HMN)
    4. Mixed + Wrapper
      1. Explore (Explore)
      2. Generational Genetic Algorithm (GGA)
      3. Steady-State Genetic Algorithm (SSGA)
      4. Population Based Incremental Learning (PBIL)
      5. Cerveron’s Tabu Search (CerveronTS)
      6. Estimation of Distribution Algorithm (EDA)
      7. Intelligent Genetic Algorithm (IGA)
      8. Zhang’s Tabu Search (ZhangTS)
      9. CHC (CHC)
      10. Genetic Algorithm based on Mean Squared Error, Clustered Crossover and Fast Smart Mutation (GAMSE-CC-PSM)
      11. Steady-state memetic algorithm (SSMA)
      12. COoperative COevolutionary Instance Selection (CoCoIS)
    5. Fixed + Wrapper
      1. Monte Carlo 1 (MC1)
      2. Random Mutation Hill Climbing (RMHC)