python feature-selection particle-swarm

Is there any way to choose how many features are selected in Binary Particle Swarm Optimization?


I implemented BPSO as a feature selection approach using the pyswarms library. I followed this tutorial.

Is there a way to limit the maximum number of features? If not, are there other particle swarm (or genetic/simulated annealing) Python implementations that have this functionality?


Solution

  • An easy way is to introduce a penalty based on the number of features used. In the following code, an objective is defined:

        # Perform classification and store performance in P
        classifier.fit(X_subset, y)
        P = (classifier.predict(X_subset) == y).mean()
        # Compute for the objective function
        j = (alpha * (1.0 - P)
            + (1.0 - alpha) * (1 - (X_subset.shape[1] / total_features)))
    
        return j
    

    What you could do is add a penalty if the number of features is above max_num_features, e.g.

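        features_count = np.count_nonzero(m)
        # Number of selected features beyond the limit, capped at 10
        features_overflow = np.clip(features_count - max_num_features, 0, 10)
    
        # Scale the overflow to the range [0, 1]
        feature_overflow_penalty = features_overflow / 10
    

    and define a new objective that adds the penalty (pyswarms minimizes j, so a larger value is worse):

        j = (alpha * (1.0 - P)
            + (1.0 - alpha) * (1 - (X_subset.shape[1] / total_features))) + feature_overflow_penalty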
    

    This is not tested, and it will take some work to find the right penalty. An alternative is to never suggest or evaluate feature sets whose size is above a certain threshold.
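
As a rough illustration of the two ideas above, here is a minimal sketch in plain Python. It assumes the cost is minimized, so the penalty is added to it. The helper names `overflow_penalty`, `penalized_cost`, and `cap_features`, and the `scale` parameter, are hypothetical and are not part of pyswarms. The masks can be any sequence of 0/1 values, such as one row of a binary swarm.

```python
import random


def overflow_penalty(mask, max_num_features, scale=10):
    """Return a penalty in [0, 1] that grows with each feature above the limit.

    `scale` is an assumed tuning knob: the overflow is capped at `scale`
    features and then divided by it.
    """
    features_count = sum(1 for bit in mask if bit)
    overflow = min(max(features_count - max_num_features, 0), scale)
    return overflow / scale


def penalized_cost(base_cost, mask, max_num_features):
    """Add the overflow penalty to an already computed cost (lower is better)."""
    return base_cost + overflow_penalty(mask, max_num_features)


def cap_features(mask, max_num_features, rng=random):
    """Return a copy of `mask` with at most `max_num_features` bits set.

    Surplus features are switched off at random. This is one possible way
    to never evaluate more features than the threshold allows.
    """
    selected = [i for i, bit in enumerate(mask) if bit]
    capped = [1 if bit else 0 for bit in mask]
    excess = len(selected) - max_num_features
    if excess > 0:
        for i in rng.sample(selected, excess):
            capped[i] = 0
    return capped
```

With the penalty approach, `penalized_cost` would be called on the value of `j` inside the per-particle objective. With the capping approach, `cap_features` would be applied to the mask before subsetting `X`. Because the surplus features are dropped at random, the best mask reported by the optimizer would probably need to be capped and re-evaluated afterwards.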