I have two questions about this example:
Q1: In theory, the more data you have, the less regularization you should need. But "C=50.0 / train_samples" seems to apply stronger regularization as the amount of data grows. Is this a defect in the code, or my misunderstanding?
clf = LogisticRegression(C=50.0 / train_samples, penalty="l1", solver="saga", tol=0.1)
Q2: Why is it possible to draw the final digit images using only the coef matrix? What is the idea behind it?
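C is the inverse of regularization strength: higher C means weaker regularization, lower C means stronger regularization. Given formula: as train_samples increases, C = 50.0 / train_samples decreases, but that does not mean the model is regularized more heavily. Reasoning: scikit-learn's LogisticRegression minimizes penalty(w) + C * (sum of the per-sample losses), and that sum grows roughly in proportion to the number of samples. Dividing by train_samples cancels this growth, so the objective is equivalent to penalty(w) + 50.0 * (mean per-sample loss), and the balance between the penalty and the data fit stays about the same for any dataset size. So it is neither a defect in the code nor a contradiction of the theory: with a fixed C, more data would automatically make the penalty relatively weaker, and the example scales C so that the amount of regularization stays comparable when train_samples changes.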
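To make the scaling argument concrete, here is a minimal sketch in plain Python. It assumes the L1-penalized objective in its sum-of-losses form with labels in {-1, +1} and no intercept; the function names, the weight vector, and the synthetic data are all hypothetical and only serve to compare the two ways of writing the objective.

```python
import math
import random

def per_sample_losses(w, X, y):
    """Logistic loss log(1 + exp(-y_i * w.x_i)) for each sample, labels in {-1, +1}."""
    losses = []
    for xi, yi in zip(X, y):
        margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
        losses.append(math.log1p(math.exp(-margin)))
    return losses

def sum_form_objective(w, X, y, C):
    # L1 penalty plus C times the SUM of per-sample losses
    return sum(abs(wj) for wj in w) + C * sum(per_sample_losses(w, X, y))

def mean_form_objective(w, X, y, k):
    # L1 penalty plus k times the MEAN per-sample loss
    losses = per_sample_losses(w, X, y)
    return sum(abs(wj) for wj in w) + k * sum(losses) / len(losses)

random.seed(0)
w = [0.5, -0.25, 0.1]  # hypothetical weights, fixed for the comparison
comparison = []
for n in (100, 10_000):
    # Synthetic data: n samples with random features and random labels
    X = [[random.gauss(0, 1) for _ in w] for _ in range(n)]
    y = [random.choice((-1, 1)) for _ in range(n)]
    scaled = sum_form_objective(w, X, y, C=50.0 / n)   # C scaled as in the example
    fixed = sum_form_objective(w, X, y, C=0.5)         # C held constant
    mean_form = mean_form_objective(w, X, y, k=50.0)
    comparison.append((n, scaled, mean_form, fixed))
    print(f"n={n}: scaled C={scaled:.3f}, mean form={mean_form:.3f}, fixed C={fixed:.3f}")
```

With C = 50.0 / n the sum form and the mean form give the same value for every n, while with a fixed C the data term keeps growing with n and increasingly outweighs the penalty.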
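Coefficient matrix (coef): it holds one row of weights per digit class and one weight per pixel (feature), so each row has the same length as a flattened image. The model's score for a class is the weighted sum of the pixel values using that class's row, plus an intercept. Visualization: reshaping a row into the image's dimensions shows which pixels push the score toward that digit (positive weights) and which push against it (negative weights); with the L1 penalty, many weights are exactly zero. The picture is therefore not a reconstructed digit but a map of the model's per-pixel weights, which often resembles a rough template of the digit as the model sees it.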
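The idea can be checked on a small scale. The sketch below is an illustration under stated assumptions, not the example's own code: it uses scikit-learn's bundled 8x8 digits dataset as a stand-in for MNIST and a LogisticRegression with default penalty settings, then verifies that each class score is the pixel-wise product of a reshaped coefficient row with the image, plus the intercept.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression

# Assumption: the bundled 8x8 digits dataset stands in for MNIST here
X, y = load_digits(return_X_y=True)
X = X / 16.0  # pixel values in this dataset range from 0 to 16

# Default penalty settings, chosen only to keep the sketch short
clf = LogisticRegression(max_iter=1000)
clf.fit(X, y)

coef = clf.coef_                    # shape (n_classes, n_features) = (10, 64)
templates = coef.reshape(10, 8, 8)  # one 8x8 weight map per digit class

# Score of the first image for each class, computed by hand from the weight maps
image = X[0].reshape(8, 8)
scores_by_hand = np.array(
    [(templates[k] * image).sum() + clf.intercept_[k] for k in range(10)]
)
scores_sklearn = clf.decision_function(X[:1])[0]
print(np.allclose(scores_by_hand, scores_sklearn))
```

Each templates[k] can then be passed to an image-plotting function such as matplotlib's imshow to display the weight map for digit k.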