“Why Should I Trust You?” Explaining the Predictions of Any Classifier

Link: https://www.kdd.org/kdd2016/papers/files/rfp0573-ribeiroA.pdf

DOI: http://dx.doi.org/10.1145/2939672.2939778



Despite widespread adoption, machine learning models remain mostly black boxes. Understanding the reasons behind predictions is, however, quite important in assessing trust, which is fundamental if one plans to take action based on a prediction, or when choosing whether to deploy a new model. Such understanding also provides insights into the model, which can be used to transform an untrustworthy model or prediction into a trustworthy one.
In this work, we propose LIME, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction. We also propose a method to explain models by presenting representative individual predictions and their explanations in a non-redundant way, framing the task as a submodular optimization problem. We demonstrate the flexibility of these methods by explaining different models for text (e.g. random forests) and image classification (e.g. neural networks). We show the utility of explanations via novel experiments, both simulated and with human subjects, on various scenarios that require trust: deciding if one should trust a prediction, choosing between models, improving an untrustworthy classifier, and identifying why a classifier should not be trusted.
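
The core idea in the abstract, "learning an interpretable model locally around the prediction", can be illustrated with a rough sketch. This is not the authors' implementation or the API of the `lime` package: it assumes numeric features, Gaussian perturbations, an exponential proximity kernel, and a weighted ridge regression as the interpretable surrogate (the paper itself works with interpretable binary representations and a sparse linear model). The names `explain_locally`, `solve_linear_system`, and `black_box`, and all parameter defaults, are hypothetical.

```python
import math
import random


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def explain_locally(predict, instance, num_samples=500, noise=0.5,
                    kernel_width=0.75, ridge=1e-3, seed=0):
    """Fit a proximity-weighted linear surrogate around `instance`.

    Returns one coefficient per feature; larger magnitude means the
    black box is locally more sensitive to that feature.
    """
    rng = random.Random(seed)
    rows, targets, weights = [], [], []
    for _ in range(num_samples):
        # Perturb the instance and query the black box on the perturbed point.
        z = [x + rng.gauss(0.0, noise) for x in instance]
        dist2 = sum((a - b) ** 2 for a, b in zip(z, instance))
        weights.append(math.exp(-dist2 / kernel_width ** 2))  # proximity kernel
        rows.append([1.0] + [a - b for a, b in zip(z, instance)])  # intercept + offsets
        targets.append(predict(z))
    # Weighted ridge normal equations: (X^T W X + ridge * I) beta = X^T W y
    k = len(instance) + 1
    A = [[sum(w * r[i] * r[j] for w, r in zip(weights, rows))
          + (ridge if i == j else 0.0) for j in range(k)] for i in range(k)]
    b = [sum(w * r[i] * y for w, r, y in zip(weights, rows, targets))
         for i in range(k)]
    return solve_linear_system(A, b)[1:]  # drop the intercept


def black_box(z):
    """Toy classifier probability; the third feature is deliberately ignored."""
    return 1.0 / (1.0 + math.exp(-(3.0 * z[0] - 1.0 * z[1])))


coefs = explain_locally(black_box, [0.0, 0.0, 0.0])
print(coefs)
```

In this toy setup the surrogate should assign a positive coefficient to the first feature, a smaller negative one to the second, and a coefficient near zero to the third, mirroring how the black box actually uses its inputs near the chosen point.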

Author(s): Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin

Publication Date: 2016

Publication Site: KDD, Association for Computing Machinery

A Unified Approach to Interpreting Model Predictions

Link: https://papers.nips.cc/paper/2017/file/8a20a8621978632d76c43dfd28b67767-Paper.pdf



Understanding why a model makes a certain prediction can be as crucial as the
prediction’s accuracy in many applications. However, the highest accuracy for large
modern datasets is often achieved by complex models that even experts struggle to
interpret, such as ensemble or deep learning models, creating a tension between
accuracy and interpretability. In response, various methods have recently been
proposed to help users interpret the predictions of complex models, but it is often
unclear how these methods are related and when one method is preferable over
another. To address this problem, we present a unified framework for interpreting
predictions, SHAP (SHapley Additive exPlanations). SHAP assigns each feature
an importance value for a particular prediction. Its novel components include: (1)
the identification of a new class of additive feature importance measures, and (2)
theoretical results showing there is a unique solution in this class with a set of
desirable properties. The new class unifies six existing methods, notable because
several recent methods in the class lack the proposed desirable properties. Based
on insights from this unification, we present new methods that show improved
computational performance and/or better consistency with human intuition than
previous approaches.
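
To make "assigns each feature an importance value for a particular prediction" concrete, here is a small sketch that computes exact Shapley values by enumerating feature subsets. It is not the `shap` library and not one of the paper's approximation methods: it assumes a "missing" feature can be simulated by replacing it with a fixed baseline value, and it is only practical for a handful of features because the enumeration is exponential. The names `shapley_values`, `value_fn`, `model`, `instance`, and `baseline` are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, num_features):
    """Exact Shapley values for a set function `value_fn(frozenset) -> float`."""
    n = num_features
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to coalition s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


def model(x):
    """Toy model: one additive term plus one interaction term."""
    return 2.0 * x[0] + x[1] * x[2]


instance = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]  # assumed reference values for "missing" features


def value_fn(present):
    """Model output with absent features replaced by their baseline values."""
    masked = [instance[j] if j in present else baseline[j]
              for j in range(len(instance))]
    return model(masked)


phi = shapley_values(value_fn, len(instance))
print(phi)
print(sum(phi), model(instance) - model(baseline))
```

The two numbers on the last line should agree: the attributions sum to the difference between the model output at the instance and at the baseline, which is the additive property the abstract refers to. The interaction term `x[1] * x[2]` is split evenly between the two features involved.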

Author(s): Scott M. Lundberg, Su-In Lee

Publication Date: 2017

Publication Site: Conference on Neural Information Processing Systems

Idea Behind LIME and SHAP

Link: https://towardsdatascience.com/idea-behind-lime-and-shap-b603d35d34eb



In machine learning, there has long been a trade-off between model complexity and model performance. Complex models such as deep neural networks, which often outperform interpretable models such as linear regression, have been treated as black boxes. The research paper by Ribeiro et al. (2016), titled “Why Should I Trust You?”, aptly encapsulates the issue with ML black boxes. Model interpretability is a growing field of research. This blog discusses the idea behind LIME and SHAP.

Author(s): Ashutosh Nayak

Publication Date: 22 December 2019

Publication Site: Towards Data Science