2023-04-14

SHAP - Understanding How This Method for Explainable AI Works

Discover how the SHAP method can help you understand the important factors behind your model's predictions in a simple, intuitive way.

For a data scientist, one of the biggest challenges in deploying machine learning models is explaining how the model makes its decisions. Explainability matters not only for legal and ethical reasons; it also helps build trust in the model and supports informed decision-making. The SHapley Additive exPlanations (SHAP) method is a powerful technique that provides a unified framework for interpreting any model. In this blog post, I will explain the SHAP method, its mathematical foundation, and how it can be applied to interpret machine learning models.

What is SHAP?

The SHAP method is a game-theoretic approach to explain the output of any machine learning model. It is based on the concept of Shapley values. Here is a timeline for the SHAP method:

  • 1953: Introduction of Shapley values by Lloyd Shapley for game theory
  • 2010: First use of Shapley values for explaining machine learning predictions by Strumbelj and Kononenko
  • 2017: SHAP paper by Lundberg and Lee, together with the shap Python package

Shapley values were introduced by Lloyd Shapley in 1953 as a way to fairly distribute the gains of a cooperative game among its players. In the context of machine learning, the players are the input features, and the gain is the difference between the model's prediction for a given instance and the average (expected) prediction. The SHAP method calculates a Shapley value for each input feature, which gives us a measure of that feature's contribution to the model output.

The SHapley Additive exPlanations (SHAP) method we are using today was introduced in a paper titled "A Unified Approach to Interpreting Model Predictions" by Scott Lundberg and Su-In Lee, published in the Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS 2017). The paper is available on the arXiv preprint server at https://arxiv.org/abs/1705.07874.

How does SHAP work?

The SHAP method works by computing the Shapley values for each feature in the input space. The Shapley value of feature \(i\), denoted \(\phi_i\), is defined as the average marginal contribution of feature \(i\) across all possible coalitions of features. Mathematically, it can be expressed as follows:

$$\phi_i(f) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N|-|S|-1)!}{|N|!}\left(f(S \cup \{i\}) - f(S)\right)$$

where \(N\) is the set of all input features, \(S\) is a coalition of features that does not include feature \(i\), \(|S|\) is the size of the coalition, \(f(S \cup \{i\})\) is the output of the model when the features in \(S\) plus feature \(i\) are present, and \(f(S)\) is the output when only the features in \(S\) are present. The Shapley value is therefore the marginal contribution of feature \(i\) averaged over all possible coalitions, weighted by how many orderings of the features produce each coalition.
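To make the formula concrete, here is a small brute-force sketch in Python with an invented toy coalition function (an assumption for illustration only; real models additionally need a strategy for "removing" features, typically by averaging predictions over a background dataset):

```python
# Brute-force evaluation of the Shapley formula for a toy coalition function.
from itertools import combinations
from math import factorial

def shapley_value(f, n_features, i):
    """Exact Shapley value of feature i for a coalition function f."""
    others = [j for j in range(n_features) if j != i]
    phi = 0.0
    for size in range(len(others) + 1):
        for subset in combinations(others, size):
            S = set(subset)
            # weight |S|! * (|N| - |S| - 1)! / |N|! from the formula above
            weight = factorial(len(S)) * factorial(n_features - len(S) - 1) / factorial(n_features)
            phi += weight * (f(S | {i}) - f(S))
    return phi

# Toy coalition function: the "prediction" is the sum of the present features' values.
x = [3.0, 1.0, 2.0]
f = lambda S: sum(x[j] for j in S)

print([shapley_value(f, 3, i) for i in range(3)])  # -> [3.0, 1.0, 2.0]
```

Because the toy function is purely additive, each feature's Shapley value is exactly its own contribution, which makes it a handy sanity check.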

To compute the Shapley values using the above formula, we would need to evaluate the model output for every possible coalition of features, and the number of coalitions grows exponentially with the number of features. In addition, most models cannot simply be evaluated with features removed, so the output for a coalition is approximated by averaging predictions over a background dataset for the absent features. The SHAP method therefore relies on efficient approximations: Kernel SHAP estimates the Shapley values with a weighted linear regression over a sample of coalitions, while model-specific explainers such as Tree SHAP and Deep SHAP exploit the structure of tree ensembles and neural networks to compute the values much faster.
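One widely used sampling idea, sketched here from scratch rather than taken from the shap library, draws random feature orderings and averages the marginal contribution of feature \(i\) when it joins the features that precede it (a permutation-sampling approach in the spirit of Štrumbelj and Kononenko):

```python
# Hedged sketch of Monte Carlo Shapley estimation via random permutations
# (illustrative only, not shap's actual implementation).
import random

def shapley_estimate(f, n_features, i, n_samples=2000, seed=0):
    """Approximate the Shapley value of feature i by sampling feature orderings."""
    rng = random.Random(seed)
    features = list(range(n_features))
    total = 0.0
    for _ in range(n_samples):
        perm = features[:]
        rng.shuffle(perm)
        preceding = set(perm[:perm.index(i)])       # coalition formed before i joins
        total += f(preceding | {i}) - f(preceding)  # marginal contribution of i
    return total / n_samples

# Reusing the toy coalition function `f` from the previous sketch:
# shapley_estimate(f, 3, 0)  # ≈ 3.0
```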

How to apply SHAP?

To apply the SHAP method, we first compute the Shapley values for each feature of the input. In practice this is done with the shap Python package, which provides explainers for models built with popular libraries such as scikit-learn, XGBoost, and TensorFlow. Once we have the Shapley values, we can visualize them in several ways to gain insight into the model's decision-making process, as in the sketch below.
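Here is a minimal end-to-end sketch, assuming the shap package and scikit-learn are installed; the random-forest model and the diabetes dataset are illustrative choices, not requirements:

```python
# Train a model, then compute SHAP values for it with the shap package.
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# TreeExplainer computes Shapley values efficiently for tree-based models.
explainer = shap.TreeExplainer(model)
shap_values = explainer(X)  # an Explanation object: one value per feature per sample

print(shap_values.values.shape)  # (n_samples, n_features)
```

For models that are not tree ensembles, the generic shap.Explainer or the model-agnostic shap.KernelExplainer can be used instead.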

One popular visualization is the beeswarm summary plot, which shows the contribution of each feature to the model output for every individual data point. The vertical axis lists the features and the horizontal axis shows the SHAP value. Each data point appears as one dot per feature, positioned by the magnitude and sign of its SHAP value and colored by the feature's value, where red represents high feature values and blue represents low feature values. The plot helps in identifying which features matter most for individual predictions and in which direction they push the output.
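Continuing the sketch above, this plot is a single call on the Explanation object:

```python
# Beeswarm summary plot: per-sample SHAP values, colored by feature value.
shap.plots.beeswarm(shap_values)
```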

Another technique is the bar summary plot, which shows the average contribution of each feature across all data points. The vertical axis lists the features and the horizontal axis shows the mean absolute SHAP value, so the length of each bar measures the feature's overall importance to the model.
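Again continuing the same sketch:

```python
# Bar summary plot: mean absolute SHAP value per feature (global importance).
shap.plots.bar(shap_values)
```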

In addition to visualization, the SHAP method can help identify instances where the model makes biased or unfair decisions. By quantifying how much each feature contributes to the predictions for a certain group or class, it can point to the root cause of a bias and support corrective measures that make the model's decisions fairer and more equitable.
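As a rough illustration of this idea (my own sketch, not an official shap workflow), one simple check is to compare the average attribution of a sensitive feature between subgroups; continuing the diabetes example above, where sex happens to be one of the features:

```python
# Illustrative only: compare mean SHAP attribution of one feature across two groups.
import numpy as np

sex_idx = list(X.columns).index("sex")
group_a = X["sex"].to_numpy() > 0   # the dataset encodes sex as a standardized numeric value
sex_contrib = shap_values.values[:, sex_idx]

print("mean attribution (group A):", np.mean(sex_contrib[group_a]))
print("mean attribution (group B):", np.mean(sex_contrib[~group_a]))
```

A systematic difference between the two means suggests the feature pushes predictions in different directions for the two groups, which is a starting point for a deeper fairness analysis.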

Conclusion

The SHapley Additive exPlanations (SHAP) method provides a powerful, unified framework for interpreting any machine learning model. It builds on Shapley values, which fairly distribute the gain of a cooperative game among its players, and offers efficient ways to compute them for each input feature as a measure of that feature's contribution to the model output. The resulting values can be visualized, used to identify the most important features, and used to quantify a model's bias towards certain groups or classes. By making models easier to interpret, SHAP helps build trust in them and supports informed decisions.

Any comments or suggestions? Let me know.

To cite this article:

@article{Saf2023SHAP,
    author  = {Krystian Safjan},
    title   = {SHAP - Understanding How This Method for Explainable AI Works},
    journal = {Krystian's Safjan Blog},
    year    = {2023},
}