Transformers

Depend on the self-attention layer as the only mechanism for comparing input vectors (see the sketch below)
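
A minimal sketch of how self-attention compares input vectors, assuming plain scaled dot-product attention with NumPy; the learned query/key/value projection matrices of a real transformer layer are omitted here for clarity, so queries, keys, and values are all taken directly from the inputs.

```python
import numpy as np

def self_attention(X):
    """Scaled dot-product self-attention over a sequence of input vectors.

    X: array of shape (seq_len, d_model). In a real transformer layer,
    X would first be projected by learned matrices W_q, W_k, W_v;
    this sketch skips those projections.
    """
    d = X.shape[-1]
    # Pairwise comparisons between every input vector and every other one
    scores = X @ X.T / np.sqrt(d)
    # Softmax over each row turns the comparison scores into weights
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output vector is a weighted mix of all the input vectors
    return weights @ X

# Example: 4 input vectors of dimension 8
X = np.random.randn(4, 8)
out = self_attention(X)
print(out.shape)  # (4, 8)
```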
