Concepts

Wrapper vs filter selection

Feature-selection methods fall into two broad families:

  • Filter methods rank features by a statistic (mutual information, correlation, ANOVA F-value) computed independently of any model. They are fast but ignore feature interactions and are not tailored to the model you will actually use.

  • Wrapper methods evaluate candidate subsets by training and cross-validating the target model on them. They are more expensive but account for interactions and optimise the subset for the specific estimator.

evo-gafs is a wrapper: the fitness of a feature subset is a cross-validated score of your estimator.

Genetic representation

Each individual in the population is a binary mask of length n_features: a 1 keeps the feature, a 0 drops it. The genetic algorithm evolves this population with selection, uniform crossover and bit-flip mutation. A repair operator guarantees every individual keeps at least min_features active features, so infeasible (empty) subsets never reach evaluation.

The alpha trade-off (single-objective)

In the default single-objective mode, fitness combines performance and compression into one scalar:

fitness     = alpha * cv_score + (1 - alpha) * compression
compression = 1 - n_selected / n_total
  • alpha = 1.0 → a pure wrapper: maximise performance, ignore size.

  • alpha 0.7 → a balanced default, good for edge/embedded deployment.

  • lower alpha → aggressively smaller subsets, trading some accuracy.

This makes the performance/size decision — which every engineer faces — an explicit, tunable parameter.

Multi-objective (NSGA-II)

When you do not want to commit to a single alpha, multi-objective mode uses NSGA-II to optimise performance and compression simultaneously, returning the entire Pareto front of non-dominated trade-offs. You then pick a point on the front after inspecting it. See Multi-objective.

Reproducibility & efficiency

  • random_seed makes runs deterministic.

  • An internal cache memoises fitness per genome, so repeated subsets are not re-evaluated.

  • Cross-validation folds are clamped to the smallest class count, keeping the selector robust on small datasets.