man lifting a dumbbell

Concept

An incorrect caption produced by an image-captioning model, used to illustrate how attention maps can reveal what the model was focusing on when it generated the caption (the arms, not a mug).
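
As a rough illustration of how an attention map can be read, the sketch below takes one attention score per image region for a single generated word and reports the region with the highest weight. The region labels, the scores, and the helper names (`softmax`, `most_attended`) are all hypothetical and chosen only to mirror the arms-versus-mug example; they do not come from any real model or from the video.

```python
import math

# Hypothetical labels for four image regions (illustrative only).
REGIONS = ["background", "arms", "mug", "torso"]


def softmax(scores):
    """Convert raw attention scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def most_attended(scores, regions=REGIONS):
    """Return the region with the largest attention weight, and that weight."""
    weights = softmax(scores)
    best = max(range(len(weights)), key=weights.__getitem__)
    return regions[best], weights[best]


# Made-up scores for the decoding step that emitted the word "dumbbell".
dumbbell_scores = [0.1, 2.5, 0.4, 0.8]

region, weight = most_attended(dumbbell_scores)
print(f"'dumbbell' attended most to: {region} (weight {weight:.2f})")
```

With these made-up scores the highest weight falls on the "arms" region rather than the "mug" region, which is the kind of pattern an attention map makes visible when a caption turns out to be wrong.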

Mentioned in 1 video