Vectorcall and records
Madoc:
It appears that if I define a vector type like this:
--- Code: Pascal ---
TVec4f = record
  case integer of
    0: (v: array[0..3] of single);
    1: (xyz: array[0..2] of single);
    2: (xy: TA2f; zw: array[0..1] of single);
    3: (x, y, z, w: single);
end;

The vectorcall convention does not work with this as a parameter. Am I doing something wrong? Is there a workaround? Is this behaviour due to change?
Keep in mind that the above is just an example. The actual type has a bunch of operators and functions, and the embedded vector array types are predefined with their own mechanics; this is just to illustrate the problem. Simply passing "v" to a function won't do.
Thanks
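For readers following along, here is a minimal sketch of the layout of the record above. It only illustrates that all four variants overlay the same storage, which is why one might expect the record to fit a single XMM register. TA2f is not shown in the post, so the two-element array below is an assumption, as is the program name.

```pascal
program Vec4fLayout;
{$mode objfpc}

type
  // Assumption: TA2f is not defined in the post; a two-element array is a guess.
  TA2f = array[0..1] of single;

  TVec4f = record
    case integer of
      0: (v: array[0..3] of single);
      1: (xyz: array[0..2] of single);
      2: (xy: TA2f; zw: array[0..1] of single);
      3: (x, y, z, w: single);
  end;

var
  a: TVec4f;
begin
  // Write through the x/y/z/w variant...
  a.x := 1.0; a.y := 2.0; a.z := 3.0; a.w := 4.0;

  // ...and read the same storage back through the other variants.
  WriteLn('SizeOf(TVec4f) = ', SizeOf(TVec4f), ' bytes');
  WriteLn('v[2] = ', a.v[2]:0:1, ', xyz[2] = ', a.xyz[2]:0:1, ', zw[1] = ', a.zw[1]:0:1);
end.
```

With default record packing this should report 16 bytes, the width of one XMM register. It says nothing about how the compiler classifies the record for vectorcall; that is the open question in this thread.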
Madoc:
I've noticed another possibly more serious issue. If I declare a simple function like this:
--- Code: Pascal ---
type
  TA3f = array[0..2] of single;

function Sub(const a, b: TA3f): TA3f; assembler; nostackframe; vectorcall;
asm
  subps xmm0, xmm1
end;

I would expect a and b to be passed in xmm0 and xmm1, and the result to be returned in xmm0, but instead of being properly loaded into these registers it seems that this happens:
--- Code: Pascal ---
movq xmm3, [rax]
movq xmm4, [rax+8]
...
movq xmm1, [rax]
movq xmm2, [rax+8]

And the result is not using xmm registers.
Obviously a 3-component vector is not recognised, so I'm seeing this weird 2-1, 2-1 split, which also uses registers 1-4 instead of 0-3, and no result in an xmm register.
Is this intended behaviour?
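As a point of comparison, here is a plain Pascal sketch of what the assembler routine is meant to compute. SubRef is a hypothetical helper, not the vectorcall function from the post, and can serve as a reference when checking results. The program also prints the size of TA3f, which is three singles rather than a full four-lane register; that may be related to the split loads described above, though this is a guess.

```pascal
program SubReference;
{$mode objfpc}

type
  TA3f = array[0..2] of single;

// Hypothetical plain-Pascal reference; not the vectorcall routine from the post.
function SubRef(const a, b: TA3f): TA3f;
var
  i: integer;
begin
  // Component-wise subtraction, one lane at a time.
  for i := 0 to 2 do
    Result[i] := a[i] - b[i];
end;

var
  a, b, c: TA3f;
begin
  a[0] := 5.0; a[1] := 7.0; a[2] := 9.0;
  b[0] := 1.0; b[1] := 2.0; b[2] := 3.0;
  c := SubRef(a, b);

  // Three singles occupy 12 bytes, four bytes short of an XMM register.
  WriteLn('SizeOf(TA3f) = ', SizeOf(TA3f), ' bytes');
  WriteLn(c[0]:0:1, ' ', c[1]:0:1, ' ', c[2]:0:1);
end.
```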
jamie:
Using 3.2.2, I don't see the same.
The register order here is 0, 4, 1, 2 of the xmm registers, and 0 is moved to 3 before the call.
RCX is also being loaded with a stack address. I'm not sure about that; maybe it is the address for the returned value?
EDIT:
As for the record issue, you are correct: the compiler simply passes the addresses of the records via a standard register.
However, on return, the xmm registers are being used to set a record return type.
Strange. Maybe this should be reported?
Madoc:
Sounds the same to me. You're also getting 2 vectors in xmm1-4 instead of 0-1. The code before the call should look something like this:
--- Code: Pascal ---
movq xmm0, [rax]
movss xmm1, [rax+8]
movlhps xmm0, xmm1
...
movq xmm1, [rax]
movss xmm2, [rax+8]
movlhps xmm1, xmm2

And obviously the result should be expected in xmm0. If the vectors don't remain resident in registers between calls then the calling convention is pointless. That's the optimisation.
Once the register is done with, it should be written back to memory with the same split addresses; using a different load and store pattern won't hit the same cache lines and can incur a massive performance hit.
--- Code: Pascal ---
movhlps xmm1, xmm0
movq [rax], xmm0
movss [rax+8], xmm1
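To make the lane movement in the load and store sequences easier to follow, here is a small software model in plain Pascal. It is only a sketch: TXmm, LoadVec3, SubPS and StoreVec3 are hypothetical names, and the model mimics what movq, movss, movlhps, movhlps and subps do to the four single-precision lanes without executing any SSE instruction.

```pascal
program LaneModel;
{$mode objfpc}

type
  TA3f = array[0..2] of single;
  TXmm = array[0..3] of single;  // hypothetical model of one XMM register

// Models: movq xmm0, [rax] / movss xmm1, [rax+8] / movlhps xmm0, xmm1
function LoadVec3(const m: TA3f): TXmm;
var
  lo, hi: TXmm;
begin
  // movq: lanes 0..1 from memory, lanes 2..3 zeroed
  lo[0] := m[0]; lo[1] := m[1]; lo[2] := 0; lo[3] := 0;
  // movss: lane 0 from memory, lanes 1..3 zeroed
  hi[0] := m[2]; hi[1] := 0; hi[2] := 0; hi[3] := 0;
  // movlhps: low half of hi goes to the high half of lo
  lo[2] := hi[0]; lo[3] := hi[1];
  Result := lo;
end;

// Models: subps xmm0, xmm1 (all four lanes)
function SubPS(const a, b: TXmm): TXmm;
var
  i: integer;
begin
  for i := 0 to 3 do
    Result[i] := a[i] - b[i];
end;

// Models: movhlps xmm1, xmm0 / movq [rax], xmm0 / movss [rax+8], xmm1
procedure StoreVec3(const r: TXmm; var m: TA3f);
var
  tmp: TXmm;
begin
  // movhlps: high half of r goes to the low half of tmp.
  // The real instruction leaves tmp's high half unchanged; it is zeroed here
  // only to keep the model deterministic.
  tmp[0] := r[2]; tmp[1] := r[3]; tmp[2] := 0; tmp[3] := 0;
  // movq store: lanes 0..1 to the first 8 bytes
  m[0] := r[0]; m[1] := r[1];
  // movss store: lane 0 of tmp to the last 4 bytes
  m[2] := tmp[0];
end;

var
  a, b, c: TA3f;
begin
  a[0] := 5.0; a[1] := 7.0; a[2] := 9.0;
  b[0] := 1.0; b[1] := 2.0; b[2] := 3.0;
  StoreVec3(SubPS(LoadVec3(a), LoadVec3(b)), c);
  WriteLn(c[0]:0:1, ' ', c[1]:0:1, ' ', c[2]:0:1);
end.
```

In this model the fourth lane is always zero after a load, and the store touches only the 12 bytes that belong to the 3-component vector.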
jamie:
I don't think the compiler is fully compatible with Delphi, or at least it is buggy in some cases.