Adding to the other great answers here: you can say the same about the head-specific matrices W_V and W_O. They also only ever act together, since a head's output depends on them only through the product W_O W_V. In fact, Anthropic recommends thinking of W_O W_V and W_Q^T W_K as the basic primitives in their transformer interpretability framework: https://transformer-circuits.pub/2021/framework/index.html
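If it helps to see this concretely, here is a minimal sketch in plain Python. The toy sizes (d_model = 3, d_head = 2), the matrix values, and the invertible matrix M are all made up for illustration, and I am assuming the column-vector convention from the linked framework (W_Q, W_K, W_V are d_head x d_model and W_O is d_model x d_head). It re-parameterises the individual matrices inside the head dimension and checks that the products W_O W_V and W_Q^T W_K do not change:

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]

# Hypothetical toy weights: d_model = 3, d_head = 2 (integers keep the check exact).
W_V = [[1, 0, 2], [0, 1, -1]]      # d_head x d_model
W_O = [[1, 2], [0, 1], [3, -1]]    # d_model x d_head
W_Q = [[2, 1, 0], [1, 0, 1]]       # d_head x d_model
W_K = [[0, 1, 1], [1, -1, 2]]      # d_head x d_model

# Any invertible d_head x d_head matrix works; this one has an integer inverse.
M = [[1, 2], [0, 1]]
M_inv = [[1, -2], [0, 1]]

# Change basis inside the head: the individual matrices change...
W_V2 = matmul(M, W_V)
W_O2 = matmul(W_O, M_inv)
W_Q2 = matmul(transpose(M_inv), W_Q)
W_K2 = matmul(M, W_K)

# ...but the combined d_model x d_model matrices do not.
W_OV = matmul(W_O, W_V)
W_QK = matmul(transpose(W_Q), W_K)

same_OV = matmul(W_O2, W_V2) == W_OV
same_QK = matmul(transpose(W_Q2), W_K2) == W_QK
print(same_OV, same_QK)  # prints: True True
```

So W_V and W_O (and likewise W_Q and W_K) are only determined up to such a change of basis, which is one way to see why the products are the more natural objects to look at.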