OK, I got a 64-bit Win7 VM up and am stepping through combine3, comparing the registers between Linux and Win7.
The vectors load fine on both OSes.
Load F1 into xmm2
1.5 in Linux
{v4_float = {1.5, 0, 0, 0}, v2_double = {5.2842668622670356e-315, 0}, v16_int8 = {0, 0, -64, 63, 0 <repeats 12 times>}, v8_int16 = {0, 16320, 0, 0, 0, 0, 0, 0}, v4_int32 = {1069547520, 0, 0, 0}, v2_int64 = {1069547520, 0}, uint128 = 1069547520}
Windows
{v4_float = {2.40490394e-038, 0, 0, 0}, v2_double = {8.3840924311424506e-317, 0}, v16_int8 = {120, -17, 2, 1, 0 <repeats 12 times>}, v8_int16 = {-4232, 258, 0, 0, 0, 0, 0, 0}, v4_int32 = {16969592, 0, 0, 0}, v2_int64 = {16969592, 0}, uint128 = 16969592}
Load F2 into xmm3
Linux
{v4_float = {5.5, 0, 0, 0}, v2_double = {5.3619766690650802e-315, 0}, v16_int8 = {0, 0, -80, 64, 0 <repeats 12 times>}, v8_int16 = {0, 16560, 0, 0, 0, 0, 0, 0}, v4_int32 = {1085276160, 0, 0, 0}, v2_int64 = {1085276160, 0}, uint128 = 1085276160}
Windows
{v4_float = {2.4049017e-038, 0, 0, 0}, v2_double = {8.3840884786172839e-317, 0}, v16_int8 = {112, -17, 2, 1, 0 <repeats 12 times>}, v8_int16 = {-4240, 258, 0, 0, 0, 0, 0, 0}, v4_int32 = {16969584, 0, 0, 0}, v2_int64 = {16969584, 0}, uint128 = 16969584}
Load F3 into xmm5
Linux
{v4_float = {6.5999999, 0, 0, 0}, v2_double = {5.3733741064073288e-315, 0}, v16_int8 = {51, 51, -45, 64, 0 <repeats 12 times>}, v8_int16 = {13107, 16595, 0, 0, 0, 0, 0, 0}, v4_int32 = {1087583027, 0, 0, 0}, v2_int64 = {1087583027, 0}, uint128 = 1087583027}
Windows
{v4_float = {2.40489946e-038, 0, 0, 0}, v2_double = {8.3840845260921172e-317, 0}, v16_int8 = {104, -17, 2, 1, 0 <repeats 12 times>}, v8_int16 = {-4248, 258, 0, 0, 0, 0, 0, 0}, v4_int32 = {16969576, 0, 0, 0}, v2_int64 = {16969576, 0}, uint128 = 16969576}
So no wonder you are getting wrong answers. At the moment I haven't got a clue what is going on, but I will think about it and play some more.
Peter
Update: Removing constref from the Single parameters and passing them by value on the stack makes it work. That is probably why the compiler gave the 64-bit message rather than a 32-bit one. Singles are probably better off passed by value anyway, as 32 bits is smaller than a 64-bit pointer.
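
For anyone wanting to check the dumps above, here is a small Python sketch (not part of the original code; the helper name int32_as_float and the two dictionaries are just for illustration) that reinterprets the v4_int32 values from gdb as singles. My reading, and it is only a guess, is that the Windows values look like addresses rather than floats: they sit 8 bytes apart, which would fit a pointer to each constref Single ending up in the register instead of the value itself.

```python
import struct


def int32_as_float(bits):
    """Reinterpret a 32-bit integer bit pattern as an IEEE-754 single."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# v4_int32[0] values copied from the gdb dumps above.
linux = {"F1": 1069547520, "F2": 1085276160, "F3": 1087583027}
windows = {"F1": 16969592, "F2": 16969584, "F3": 16969576}

for name in ("F1", "F2", "F3"):
    print(
        name,
        "linux as float:", int32_as_float(linux[name]),
        "| windows as hex:", hex(windows[name]),
        "| windows as float:", int32_as_float(windows[name]),
    )

# Spacing between consecutive Windows values. A constant 8 would be
# consistent with adjacent 8-byte slots (an assumption, not a proof).
gaps = [windows["F1"] - windows["F2"], windows["F2"] - windows["F3"]]
print("gaps between Windows values:", gaps)
```

The Linux patterns decode to the expected 1.5, 5.5 and roughly 6.6, while the Windows patterns decode to tiny numbers around 2.4e-38 and are spaced exactly 8 apart, which is what you would expect if they were consecutive addresses.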