Why?
It is not enough for a CNN to classify the given input images correctly. We also need an intuitive explanation of the reasons behind each classification decision. Does the CNN learn the right cues, or does it pick up unrelated ones because of biased information in the training data (e.g., unrelated background patterns)? Are the cues it uses similar to the ones humans rely on to recognize objects in images?
What?
[Figure: Grad-CAM for cat]
How?
Grad-CAM measures how important each feature map of a convolutional layer is to a particular class decision. To do so, it computes the gradient of the class score with respect to the feature maps of that layer (typically the last convolutional layer). The gradients flowing back are global-average-pooled over the spatial dimensions, which gives one weight per feature map representing its importance to the class label. ReLU is then applied to the weighted combination of the forward activation maps, so that only the image regions with a positive influence on the class of interest remain. The result is upsampled to the input size and overlaid as a heat map on the original image.
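The steps above can be sketched in a few lines of NumPy. This is only a minimal illustration, not the authors' implementation: the function name `grad_cam` and its arguments are hypothetical, and it assumes you already have the feature maps of the chosen layer and the gradients of the class score with respect to them (in practice these come from your deep learning framework's automatic differentiation). The final scaling to [0, 1] is a display convenience added here, and upsampling to the image size is left out.

```python
import numpy as np


def grad_cam(activations, gradients):
    """Toy Grad-CAM on precomputed arrays (hypothetical helper).

    activations: feature maps of one conv layer, shape (K, H, W)
    gradients:   d(class score) / d(activations), same shape
    Returns (cam, weights): an (H, W) map and the K per-map weights.
    """
    activations = np.asarray(activations, dtype=float)
    gradients = np.asarray(gradients, dtype=float)
    if activations.ndim != 3 or activations.shape != gradients.shape:
        raise ValueError("expected two arrays of identical shape (K, H, W)")

    # Global average pooling of the gradients: one importance weight per map.
    weights = gradients.mean(axis=(1, 2))             # shape (K,)

    # Weighted combination of the forward activation maps.
    cam = np.tensordot(weights, activations, axes=1)  # shape (H, W)

    # ReLU keeps only regions with a positive influence on the class.
    cam = np.maximum(cam, 0.0)

    # Assumption for display only: scale to [0, 1] when the map is non-zero.
    peak = cam.max()
    if peak > 0:
        cam = cam / peak
    return cam, weights


if __name__ == "__main__":
    # Two 2x2 feature maps: the first supports the class, the second opposes it.
    acts = np.array([[[1.0, 0.0], [0.0, 0.0]],
                     [[0.0, 0.0], [0.0, 1.0]]])
    grads = np.array([[[0.5, 0.5], [0.5, 0.5]],
                      [[-1.0, -1.0], [-1.0, -1.0]]])
    cam, weights = grad_cam(acts, grads)
    print(weights)
    print(cam)
```

In this toy input the second feature map has a negative weight, so the location it activates is zeroed out by the ReLU and only the top-left cell stays highlighted.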
More information can be found in the original Grad-CAM paper.