No, I don't get what you are trying to say, but let's wait for the code.
BTW, do not overestimate micro-optimisations such as BSRDword. First, it may not be portable (which may not matter to you), and second, it is not necessarily faster in the context of a real-world program.
I tried it for the bit-reversal algorithm required in the fast Fourier transform, but at the level of the whole program it proved no more efficient on any machine than the plain Pascal code repeat l := l shr 1 until (k + l) <= nn;
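
For anyone who has not seen that loop in context, here is a minimal sketch of how it can drive a bit-reversed counter. Only the repeat ... until line comes from my post above. The rest is an assumption for illustration: the names BitReversedOrder and TIndexArray are made up, nn is taken to be n - 1 with n a power of two, and the update k := (k and (l - 1)) + l is one common way to finish the step, not necessarily what any particular FFT does. Written for Free Pascal in objfpc mode.

```pascal
program BitReverseSketch;
{$mode objfpc}

type
  TIndexArray = array of LongInt;  { hypothetical helper type }

{ Returns the indices 0..n-1 in bit-reversed order.
  Assumption: n is a power of two. }
function BitReversedOrder(n: LongInt): TIndexArray;
var
  i, k, l, nn: LongInt;
begin
  Result := nil;
  SetLength(Result, n);
  nn := n - 1;
  k := 0;                      { k is the bit-reversed counter }
  for i := 0 to n - 1 do
  begin
    Result[i] := k;
    if i < n - 1 then          { no successor after the last index }
    begin
      l := n;
      { Scan from the top bit down to the highest bit of k that is clear. }
      repeat
        l := l shr 1
      until (k + l) <= nn;
      { Keep the bits below l, clear everything above, set bit l. }
      k := (k and (l - 1)) + l;
    end;
  end;
end;

var
  order: TIndexArray;
  i: LongInt;
begin
  order := BitReversedOrder(8);
  for i := 0 to High(order) do
    Write(order[i], ' ');
  WriteLn;                     { prints 0 4 2 6 1 5 3 7 }
end.
```

The loop runs once for every other index, and the average number of shifts per index stays below two, which may be part of why replacing it with a bit-scan instruction does not show up at program level.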