Vectorcall and records

Madoc:
It appears that if I define a vector type like this:



--- Code: Pascal ---
TVec4f = record
    case integer of
      0: (v: array[0..3] of single);
      1: (xyz: array[0..2] of single);
      2: (xy: TA2f; zw: array[0..1] of single);
      3: (x,y,z,w: single);
    end;

The vectorcall convention does not work with this as a parameter. Am I doing something wrong? Is there a workaround? Is this behaviour due to change?

Keep in mind that the above is just an example. The actual type has a bunch of operators and functions and the embedded vector array types are predefined with their own mechanics, this is just to illustrate the problem. Simply passing "v" to a function won't do.
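
A minimal sketch (not from the original post) for checking that all variants of the record overlay the same 16 bytes, which is the precondition for the value fitting in a single XMM register. TA2f is not shown above, so it is assumed here to be an array of two singles. The program only checks the memory layout; it says nothing about how the compiler passes the parameter.

```pascal
program Vec4Layout;

{$mode objfpc}

type
  // Assumption: TA2f is not defined in the post; two singles is a guess.
  TA2f = array[0..1] of single;

  TVec4f = record
    case integer of
      0: (v: array[0..3] of single);
      1: (xyz: array[0..2] of single);
      2: (xy: TA2f; zw: array[0..1] of single);
      3: (x, y, z, w: single);
  end;

var
  a: TVec4f;
begin
  a.v[0] := 1; a.v[1] := 2; a.v[2] := 3; a.v[3] := 4;

  // Every variant should alias the same four singles.
  if SizeOf(TVec4f) <> 16 then Halt(1);
  if (a.x <> 1) or (a.w <> 4) then Halt(2);
  if (a.xy[1] <> 2) or (a.zw[0] <> 3) then Halt(3);
  if a.xyz[2] <> 3 then Halt(4);

  WriteLn('SizeOf(TVec4f) = ', SizeOf(TVec4f));
end.
```

If the program exits with a non-zero code, the layout differs from what the post assumes, and the register question would be secondary.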

Thanks

Madoc:
I've noticed another possibly more serious issue. If I declare a simple function like this:

--- Code: Pascal ---
type
  TA3f = array[0..2] of single;

function Sub(const a, b: TA3f): TA3f; assembler; nostackframe; vectorcall;
asm
          subps    xmm0, xmm1
end;

I would expect a and b to be passed in xmm0 and xmm1, and the result to be returned in xmm0, but instead of being properly loaded into these registers, this seems to happen:


--- Code: Pascal ---
movq xmm3, [rax]
movq xmm4, [rax+8]
...
movq xmm1, [rax]
movq xmm2, [rax+8]

And the result is not using xmm registers.

Obviously a 3-component vector is not recognised, so I'm seeing this weird 2-1, 2-1 split and no result, and the split is also using registers 1-4 instead of 0-3.

Is this intended behaviour?

jamie:
Using 3.2.2, I don't see the same.

The call order here is 0, 4, 1, 2 of the xmm? regs, and 0 is moved to 3 before the call.

RCX is also being loaded with a stack address. I'm not sure about that; maybe it is to be used as the return address?


EDIT:
As for the RECORD issue, you are correct: the compiler simply passes the addresses of the records via a standard register.
However, on return, the XMM registers are being used to set a record return type.

Strange, maybe this should be reported?


Madoc:
Sounds the same to me. You're also getting 2 vectors in xmm1-4 instead of 0-1. The code before the call should look something like this:

--- Code: Pascal ---
        movq     xmm0, [rax]
        movss    xmm1, [rax+8]
        movlhps  xmm0, xmm1
        ...
        movq     xmm1, [rax]
        movss    xmm2, [rax+8]
        movlhps  xmm1, xmm2

And obviously the result should be expected in xmm0. If the vectors don't remain resident in registers between calls then the calling convention is pointless. That's the optimisation.

Once the register is done with, it should be written back to memory with the same split addresses; using a different load and store pattern won't hit the same cache lines and can incur a massive performance hit.

--- Code: Pascal ---
        movhlps  xmm1, xmm0
        movq     [rax], xmm0
        movss    [rax+8], xmm1
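
For checking the output of an assembler routine like Sub, a plain Pascal reference may be handy. The sketch below models the 2+1 load and store pattern and the subps lane by lane. TLanes, Load3, Store3, SubPS and SubRef are hypothetical names introduced for illustration only; the code shows the intended arithmetic and says nothing about which registers the compiler actually uses.

```pascal
program SubReference;

{$mode objfpc}

type
  TA3f = array[0..2] of single;
  // Hypothetical stand-in for the four single lanes of one XMM register.
  TLanes = array[0..3] of single;

// Models: movq xmm0, [rax]; movss xmm1, [rax+8]; movlhps xmm0, xmm1
function Load3(const a: TA3f): TLanes;
begin
  Result[0] := a[0];  // movq loads lanes 0 and 1
  Result[1] := a[1];
  Result[2] := a[2];  // movss loads one single, movlhps moves it to lane 2
  Result[3] := 0;     // movss from memory zeroes the remaining lanes
end;

// Models: movhlps xmm1, xmm0; movq [rax], xmm0; movss [rax+8], xmm1
function Store3(const r: TLanes): TA3f;
begin
  Result[0] := r[0];  // movq stores lanes 0 and 1
  Result[1] := r[1];
  Result[2] := r[2];  // movss stores lane 2; lane 3 is dropped
end;

// Models: subps xmm0, xmm1
function SubPS(const x, y: TLanes): TLanes;
var
  i: integer;
begin
  for i := 0 to 3 do
    Result[i] := x[i] - y[i];
end;

// Scalar reference for what Sub is expected to compute.
function SubRef(const a, b: TA3f): TA3f;
begin
  Result := Store3(SubPS(Load3(a), Load3(b)));
end;

const
  a: TA3f = (5, 7, 9);
  b: TA3f = (1, 2, 3);
var
  r: TA3f;
begin
  r := SubRef(a, b);
  if (r[0] <> 4) or (r[1] <> 5) or (r[2] <> 6) then Halt(1);
  WriteLn(r[0]:0:1, ' ', r[1]:0:1, ' ', r[2]:0:1);
end.
```

Comparing the assembler version against SubRef on a few inputs would show whether the arguments really arrive in the registers the routine assumes.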

jamie:
I don't think the compiler is fully compatible with Delphi, or at least it is buggy in some cases.
