Computer vision: CNN vs. transformer model differences in image processing

Let's start with an interesting experiment. Above is a Gemini-generated cow with macaw feathers as its texture. Is it a cow or a macaw? We used two vision models, ResNet and ViT, to figure out what it is.

Let's start with ResNet: I am not