K-means clustering and nsamples for KernelExplainer

I have a dataset of roughly 50,000 observations. After fitting an ElasticNet regression, I want to compute Shapley values using KernelExplainer. Is there any reference or rule for choosing K and nsamples? Thank you very much.

I tried K=10 and nsamples=100, but the plot of Shapley values for each feature is usually an upward- or downward-sloping line. On some occasions, there are only two points in the plot.

Solution

You would typically use K = 20-100 centroids as background data.

A good value for nsamples is on the order of $p(p+1) + 200$, where $p$ is the number of features. KernelExplainer is implemented so that it first lists all of the $p(p+1)$ most important on-off (masking) combinations. The 200 additional on-off samples then cover the less important part of the KernelSHAP distribution.
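As a rough sketch of the rule above, the following computes the suggested nsamples for a given $p$. It also checks one possible reading of the $p(p+1)$ term, namely that it equals the number of masks with 1, 2, $p-2$, or $p-1$ features switched on. That reading is an assumption, not something stated in the answer, and it only holds for $p \geq 5$. The function names are hypothetical.

```python
from math import comb


def suggested_nsamples(p: int, extra: int = 200) -> int:
    """Heuristic from the answer: p(p+1) + extra, with p the number of features."""
    return p * (p + 1) + extra


def count_extreme_masks(p: int) -> int:
    """Count on-off masks with 1, 2, p-2 or p-1 features switched on.

    Assumption: these are the 'important' combinations the answer refers to.
    The set removes duplicate sizes, so the count differs from p(p+1) when p < 5.
    """
    sizes = {1, 2, p - 2, p - 1}
    return sum(comb(p, s) for s in sizes if 0 < s < p)


p = 10  # example number of features
print(suggested_nsamples(p))   # 10 * 11 + 200 = 310
print(count_extreme_masks(p))  # 10 + 45 + 45 + 10 = 110 = 10 * 11
```

The resulting value could then be passed as the nsamples argument of explainer.shap_values(...), with shap.kmeans(X, K) supplying the K background centroids.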
