Yes, not always. But it is fun.
The difference is negligible, though, so I chose the xor swap because I like it.
The generic version uses a temp, because you can't assume T is an ordinal.
That the xor swap is slower is mostly an Intel/AMD issue; ARM, for example, has a very fast xor.
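For anyone who wants to see the two variants side by side, here is a rough sketch. It assumes C++, since the thread doesn't show the actual code, and treats "ordinal" as "integral type". The names `xor_swap`, `temp_swap` and `demo` are made up for illustration and are not from the original code.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical xor swap: only meaningful for integral ("ordinal") types.
template <typename T>
void xor_swap(T& a, T& b) {
    static_assert(std::is_integral<T>::value,
                  "xor swap needs an integral type");
    if (&a == &b) return;  // xor-swapping a variable with itself would zero it
    a ^= b;  // a now holds a0 ^ b0
    b ^= a;  // b now holds a0
    a ^= b;  // a now holds b0
}

// Hypothetical generic swap: uses a temp, so it works for any movable T.
template <typename T>
void temp_swap(T& a, T& b) {
    T tmp = std::move(a);
    a = std::move(b);
    b = std::move(tmp);
}

// Small usage check for both variants.
void demo() {
    int x = 3, y = 5;
    xor_swap(x, y);
    assert(x == 5 && y == 3);

    std::string s = "left", t = "right";
    temp_swap(s, t);  // xor_swap(s, t) would not compile: not integral
    assert(s == "right" && t == "left");
}
```

The aliasing check in `xor_swap` is the one trap worth remembering: without it, swapping a variable with itself leaves it at zero, which the temp version never does.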