Explaining Anomalies with Isolation Forest and SHAP

Isolation Forest is an unsupervised, tree-based anomaly detection method. See how both KernelSHAP and TreeSHAP can be used to explain its output.

Photo by Fabrice Villard on Unsplash

Isolation Forest has become a staple in anomaly detection systems [1]. Its advantage is being able to find complex anomalies in large datasets with many features. However, when it comes to explaining those anomalies, this advantage quickly becomes a weakness.

To take action on an anomaly, we often need to understand the reasons it was classified as one. This insight is particularly valuable in real-world applications, such as fraud detection, where knowing the reason behind an anomaly is often as important as detecting it.

Unfortunately, with Isolation Forest, these explanations are hidden within the complex model structure. To uncover them, we turn to SHAP.

We'll apply SHAP to IsolationForest and interpret its output. We'll see that, although this is an unsupervised model, we can still use SHAP to explain its anomaly scores. That is, to understand:

  • how features have contributed to the scores of individual instances
  • and which features are important in general (a code sketch of this workflow follows).
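
As a preview, here is a minimal sketch of what this looks like in code. The dataset and feature names (f1 to f4) are placeholders, not from the article; the sketch assumes shap's TreeExplainer, which accepts a fitted scikit-learn IsolationForest directly.

```python
# Minimal sketch: attribute IsolationForest anomaly scores to features
# with TreeSHAP. Data and feature names are hypothetical placeholders.
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest
import shap

# Placeholder dataset: 1,000 instances with four numeric features
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(1000, 4)),
                 columns=["f1", "f2", "f3", "f4"])

# Fit the unsupervised model; no labels are needed
model = IsolationForest(n_estimators=100, random_state=0).fit(X)

# TreeSHAP exploits the tree structure for fast, exact attributions
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Local explanation: feature contributions for a single instance
print(shap_values[0])

# Global importance: mean absolute SHAP value per feature
print(np.abs(shap_values).mean(axis=0))
```

A single row of `shap_values` gives the local explanation for one instance, while averaging the absolute values across all instances gives a simple measure of global feature importance.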