scikit-learn cluster-analysis logistic-regression

Clustering logistic regression models using scikit-learn

I have a bunch of logistic regression models and I want to see how well they cluster, effectively making a few models to represent the whole group.

However, many of the models don't have the same parameters, and it seems weird to cluster on betas when not all models will have all the parameters.

Solution

I would recommend clustering on the log of the odds ratio for each explanatory variable (for a logistic regression, this is the fitted coefficient itself). For models that lack a given regressor, you can fill in the missing value with 0.0, which corresponds to an odds ratio of 1, i.e. no effect. This can be done quite easily with pandas.

Assume you have a list of all the models in this form:

models = [{'beta1': m1_b1, 'beta2': m1_b2}, {'beta1': m2_b1, 'beta3': m2_b3}]

The nomenclature above is such that m1_b1 means model 1, beta 1. You'll notice these two don't have the same betas.

You can put them into a data frame like so:

import pandas as pd
df = pd.DataFrame(models).fillna(0.0)  # one row per model; missing betas become 0.0
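
From there, the rows of the data frame can be handed to any scikit-learn clusterer. The following is only a minimal sketch: the coefficient values are made up for illustration, and the choice of KMeans with two clusters is an assumption, not something the question specifies.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical coefficient values, for illustration only.
models = [
    {"beta1": 0.9, "beta2": -0.4},
    {"beta1": 1.1, "beta3": 0.3},
    {"beta1": -0.8, "beta2": 0.5, "beta3": -0.2},
    {"beta1": -1.0, "beta3": -0.1},
]

# One row per model, one column per coefficient; absent coefficients become 0.0.
df = pd.DataFrame(models).fillna(0.0)

# Assumed clusterer and cluster count; pick whatever suits your data.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(df)

# Each cluster center can be read as a "representative" set of coefficients.
centers = pd.DataFrame(kmeans.cluster_centers_, columns=df.columns)
print(labels)
print(centers)
```

If the regressors are on very different scales, the coefficients will be too, so it may be worth standardizing the columns before clustering.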
