Monday, January 14, 2019

DDoC #01: Grad-CAM: Gradient-weighted Class Activation Mapping

Daily dose of creativity (DDoC) is my attempt to learn something innovative or creative on a daily basis. Today's paper is "Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization".

Why?
Having a CNN classify input images correctly is not enough. We also need an intuitive explanation of the reasons behind each classification decision. Does the CNN learn the right cues, or does it pick up unrelated cues because of bias in the training data (e.g., unrelated background patterns)? Are its cues similar to the ones humans use to recognize objects in images?

What?
[Figure: Grad-CAM for cat]
We need to know which regions of an input image the CNN uses to classify it into a certain class. We also need to know which discriminative characteristics (e.g., patterns) in those regions contributed most to the classification (e.g., the stripes of a tiger cat).

How?
Grad-CAM estimates how important each neuron is to a particular class decision. To do so, it computes the gradient of the class score with respect to the feature maps of a convolutional layer (the paper uses the last one). The gradients flowing back are global-average-pooled over the spatial dimensions, which gives one weight per feature map. Each weight represents the importance of that activation map to the class of interest. ReLU is then applied to the weighted combination of the forward activation maps, so that only the image regions with a positive influence on that class remain. The resulting coarse map is upsampled to the input size and overlaid as a heat map on the original image.
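To make the weighting step concrete, here is a minimal sketch in plain Python. It assumes the forward activations and the gradients of the class score have already been extracted from the network (in practice a deep learning framework's autodiff would supply them). The names grad_cam, normalize, activations and gradients are my own placeholders, not from the paper, and the toy numbers are made up for illustration. Upsampling the map to the input size is left out.

```python
def grad_cam(activations, gradients):
    """Return (weights, cam) for K feature maps of size H x W.

    activations[k][i][j]: forward activation of feature map k at (i, j).
    gradients[k][i][j]:   d(class score) / d(activations[k][i][j]).
    Both are assumed to be precomputed nested lists of floats.
    """
    num_maps = len(activations)
    height = len(activations[0])
    width = len(activations[0][0])

    # Global average pooling of the gradients: one weight per feature map.
    weights = [
        sum(sum(row) for row in gradients[k]) / (height * width)
        for k in range(num_maps)
    ]

    # Weighted sum of the activation maps, followed by ReLU.
    cam = [
        [
            max(0.0, sum(weights[k] * activations[k][i][j] for k in range(num_maps)))
            for j in range(width)
        ]
        for i in range(height)
    ]
    return weights, cam


def normalize(cam):
    """Scale the map to [0, 1] so it can be rendered as a heat map."""
    peak = max(max(row) for row in cam)
    if peak == 0:
        return cam
    return [[value / peak for value in row] for row in cam]


# Toy example: two 2x2 feature maps (made-up numbers).
activations = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 3.0], [1.0, 0.0]],
]
gradients = [
    [[0.5, 0.5], [0.5, 0.5]],      # map 0 supports the class
    [[-1.0, -1.0], [-1.0, -1.0]],  # map 1 works against it
]

weights, cam = grad_cam(activations, gradients)
heat_map = normalize(cam)
```

In this toy case the second feature map gets a negative weight, so the locations where only that map fires are zeroed out by the ReLU and do not show up in the heat map.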

More information can be found in the paper.
