Chicheng's Blog

PhD := Pathetic Human Dying, But Enjoyable.


[Paper Note] This Looks Like That: Deep Learning for Interpretable Image Recognition (ProtoPNet)

image

Before Reading:

  1. Although ProtoPNet can use different backbones, it is still an intrinsic approach to explainability: it requires a very specific architecture to generate its explanations.

ProtoPNet

The authors give the figure below to explain the overall architecture of ProtoPNet, but I found it confusing until I had read through the whole paper 😂. That said, the ideas and formulas they propose make this paper a pleasure to read.

image

Expressed in the way I'm more familiar with, the architecture works like this:

image

  1. The input image is passed into a CNN backbone to get the feature map.
  2. The feature map is viewed as a set of patches, each of size 1 x 1 x D.
  3. Each patch is compared against each learned prototype, producing a similarity map (activation map) per prototype.
  4. Global max-pooling over each similarity map yields one similarity score per prototype.
  5. The similarity scores are then passed into the fully connected layer to get the final prediction.
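The five steps above can be sketched in PyTorch. This is my own minimal reconstruction, not the authors' code: the backbone, the number of prototypes, and the feature depth `D` are all placeholder choices, and the distance-to-similarity mapping `log((d + 1) / (d + eps))` is the one used in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProtoPNetSketch(nn.Module):
    """Minimal sketch of the ProtoPNet forward pass (hypothetical sizes)."""

    def __init__(self, backbone, num_prototypes=20, depth=128, num_classes=2):
        super().__init__()
        self.backbone = backbone  # any CNN feature extractor (step 1)
        # P learnable prototypes, each a 1 x 1 x D patch, stored as (P, D, 1, 1)
        self.prototypes = nn.Parameter(torch.rand(num_prototypes, depth, 1, 1))
        self.fc = nn.Linear(num_prototypes, num_classes, bias=False)  # step 5

    def forward(self, x):
        z = self.backbone(x)  # (B, D, H, W) feature map (steps 1-2)
        # squared L2 distance between every 1x1xD patch and every prototype
        d = (z.unsqueeze(1) - self.prototypes.unsqueeze(0)).pow(2).sum(dim=2)  # (B, P, H, W)
        # distance -> similarity, giving one activation map per prototype (step 3)
        sim = torch.log((d + 1) / (d + 1e-4))
        # global max-pool each map to one score per prototype (step 4)
        scores = F.max_pool2d(sim, kernel_size=sim.shape[-2:]).flatten(1)  # (B, P)
        return self.fc(scores)  # class logits (step 5)
```

With a toy backbone such as `nn.Conv2d(3, 128, kernel_size=3)`, a batch of shape `(B, 3, 32, 32)` comes out as logits of shape `(B, num_classes)`.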

Mathematically

image

image
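Since the equations above only survive as images, here is my reconstruction of the core similarity function from the paper; $\tilde{z}$ ranges over the $1 \times 1 \times D$ patches of the feature map $z$, $\mathbf{p}_j$ is a prototype, and $\epsilon$ is a small constant:

```latex
g_{\mathbf{p}_j}(z)
  = \max_{\tilde{z} \in \mathrm{patches}(z)}
    \log \frac{\|\tilde{z} - \mathbf{p}_j\|_2^2 + 1}
              {\|\tilde{z} - \mathbf{p}_j\|_2^2 + \epsilon}
```

This is large when some patch is close to the prototype and tends to $\log(1) = 0$ when every patch is far away, which is what makes the max-pooled score act as a "this looks like that" evidence signal.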

Result

image

References

[1] This Looks Like That: Deep Learning for Interpretable Image Recognition (https://arxiv.org/abs/1806.10574)