SHAP
SHAP (SHapley Additive exPlanations) is a framework for explaining the predictions of machine learning models. It assigns each feature a SHAP value representing that feature's contribution to a specific prediction, grounded in Shapley values from cooperative game theory. The explanations are additive: for a given instance x, the model's prediction f(x) can be written as f(x) = phi_0 + sum_i phi_i, where phi_0 is the base value (the expected model output over the background or training data) and the phi_i are the per-feature SHAP values. This formulation satisfies three properties: local accuracy (the attributions sum exactly to the prediction), missingness (features with no impact receive zero attribution), and consistency (if a model changes so that a feature's marginal contribution increases or stays the same, that feature's SHAP value does not decrease).
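The additive decomposition and the local accuracy property can be checked directly by computing exact Shapley values over all feature coalitions for a small model. The function f, the baseline point, and the single-reference value function below are illustrative assumptions, not part of any SHAP library:

```python
from itertools import combinations
from math import factorial

# Hypothetical toy model of three features (an illustrative assumption).
def f(x):
    return 2.0 * x[0] + x[1] * x[2]

baseline = [1.0, 1.0, 1.0]   # assumed background point, standing in for the data mean
x = [3.0, 2.0, 4.0]          # instance to explain
n = len(x)

def value(S):
    """Coalition value: features in S take the instance's values;
    the rest are held at the baseline (a single-reference approximation)."""
    z = [x[i] if i in S else baseline[i] for i in range(n)]
    return f(z)

def shapley(i):
    """Exact Shapley value of feature i by enumerating every coalition
    of the other features, weighted by the Shapley formula."""
    others = [j for j in range(n) if j != i]
    phi = 0.0
    for k in range(n):
        for S in combinations(others, k):
            w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
            phi += w * (value(set(S) | {i}) - value(set(S)))
    return phi

phi_0 = value(set())              # base value: model output at the baseline
phi = [shapley(i) for i in range(n)]

# Local accuracy: base value plus attributions reproduces f(x).
assert abs(phi_0 + sum(phi) - f(x)) < 1e-9
```

Exhaustive enumeration costs O(2^n) model evaluations, which is why the practical SHAP variants below rely on sampling or model structure instead.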
There are several SHAP variants designed for different model types. Kernel SHAP is model-agnostic and uses a weighted linear regression over feature coalitions, with weights given by the Shapley kernel, to estimate SHAP values from model evaluations alone. Tree SHAP computes exact SHAP values efficiently for tree ensembles such as XGBoost, LightGBM, and random forests by exploiting the tree structure. Deep SHAP approximates SHAP values for deep neural networks by combining DeepLIFT-style backpropagation rules with the Shapley framework.
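Kernel SHAP's weighted-regression idea can be sketched from scratch on the same kind of toy setup: enumerate (or in practice, sample) feature coalitions, weight each by the Shapley kernel, and solve a weighted least-squares problem constrained so the attributions sum to f(x) minus the base value. The model, baseline, and instance below are illustrative assumptions, not a real library API:

```python
from itertools import combinations
from math import comb
import numpy as np

# Hypothetical toy model of three features (an illustrative assumption).
def f(x):
    return 2.0 * x[0] + x[1] * x[2]

baseline = np.array([1.0, 1.0, 1.0])  # assumed background point
x = np.array([3.0, 2.0, 4.0])         # instance to explain
n = len(x)

def v(mask):
    """Coalition value: features 'on' in the mask take x's values,
    the rest stay at the baseline."""
    return f(np.where(mask, x, baseline))

phi_0 = v(np.zeros(n, dtype=bool))    # base value
delta = f(x) - phi_0                  # total amount to distribute

# All proper, nonempty coalitions with their Shapley-kernel weights
# pi(S) = (n-1) / (C(n,|S|) * |S| * (n-|S|)).
masks, weights, targets = [], [], []
for k in range(1, n):
    for S in combinations(range(n), k):
        m = np.zeros(n, dtype=bool)
        m[list(S)] = True
        masks.append(m.astype(float))
        weights.append((n - 1) / (comb(n, k) * k * (n - k)))
        targets.append(v(m) - phi_0)

Z = np.array(masks)
w = np.array(weights)
y = np.array(targets)

# Enforce local accuracy (attributions sum to delta) by eliminating the
# last coefficient, then solve the weighted least-squares problem.
A = Z[:, :-1] - Z[:, [-1]]
b = y - Z[:, -1] * delta
sw = np.sqrt(w)
phi_rest, *_ = np.linalg.lstsq(A * sw[:, None], b * sw, rcond=None)
phi = np.append(phi_rest, delta - phi_rest.sum())
```

With every coalition enumerated, as here, the regression recovers the exact Shapley values; real Kernel SHAP samples a manageable number of coalitions instead, trading exactness for tractability as the feature count grows.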
Applications include interpreting individual predictions, ranking global feature importance, debugging model behavior, and supporting fairness audits and regulatory compliance reviews.