
Topic: Is there any Free Pascal intrinsic for the asm adc (add with carry) instruction?

uart

  • Jr. Member
  • **
  • Posts: 58
I was trying to make some branch free algorithms for a few small simple functions (that I wanted to inline), and I found that for one of them an add with carry would be useful. Just wondering if there were any intrinsics for doing that or something similar.

As a small example of the usage, consider making a "mask" that is all zeros if an integer is zero, and all ones otherwise. In asm you can do that branch free; I was just wondering whether anything like the code below is possible.

Code: Pascal
// NOT REAL CODE. Just an example of the type of thing I would like to be able to do.
var x, mask : Longint;
begin
  x := x-1;                    // note that this must be coded as "sub" not "dec" in x86/x64
  mask := adc($FFFFFFFF,0);    // if (x=0) then mask := $00000000 else mask:=$FFFFFFFF
  ....
end;

Or perhaps there is another branch free way of making a mask like that?
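One carry-free possibility is the classic "x or -x" trick: for any non-zero x, either x or its two's-complement negation has the top bit set, so an arithmetic shift right by 31 smears that bit across the whole word. What follows is only a minimal sketch, assuming 32-bit values and FPC's SarLongint; the name NonZeroMask is made up for illustration, and whether the compiler really emits jump-free code should be checked in the generated assembly.

```pascal
program NonZeroMaskDemo;
{$mode objfpc}
{$inline on}
{$Q-}{$R-} // wrap-around arithmetic is intended here

// Hypothetical helper: $00000000 if x = 0, $FFFFFFFF otherwise.
function NonZeroMask(x: LongWord): LongWord; inline;
var
  neg: LongWord;
begin
  neg := LongWord(0 - Int64(x));  // two's-complement negation, modulo 2^32
  // (x or neg) has bit 31 set exactly when x <> 0; the arithmetic
  // shift copies that bit into all 32 positions.
  Result := LongWord(SarLongint(LongInt(x or neg), 31));
end;

begin
  WriteLn(HexStr(NonZeroMask(0), 8));
  WriteLn(HexStr(NonZeroMask(1), 8));
  WriteLn(HexStr(NonZeroMask($80000000), 8));
end.
```

Unlike the adc version, this does not modify x and does not depend on the flags surviving between two statements, so it can sit in an inline function.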
« Last Edit: February 25, 2020, 06:58:36 am by uart »

Thaddy

  • Hero Member
  • *****
  • Posts: 14213
  • Probably until I exterminate Putin.
Well, Free Pascal has an implicit add with carry: it expands the size if necessary. E.g., two 32-bit variables that have the potential to overflow become a 64-bit result (the compiler even tells you it is doing that).
The adc instruction itself can also be used from inline assembler - in this case Intel/Amd and not portable.
Specialize a type, not a var.

PascalDragon

  • Hero Member
  • *****
  • Posts: 5446
  • Compiler Developer
I was trying to make some branch free algorithms for a few small simple functions (that I wanted to inline), and I found that for one of them an add with carry would be useful. Just wondering if there were any intrinsics for doing that or something similar.

Providing a useful intrinsic for this would not be easy. Just take the example you provided: currently the compiler would optimize the x := x - 1 to dec which would not help. Also the compiler might optimize other parts as well (e.g. if instruction scheduling should decide for a better order). Not to mention that things might behave differently on other CPU targets. So if you have dependencies between instructions it is better to directly code them as assembly.

uart

  • Jr. Member
  • **
  • Posts: 58
Thanks everyone.

Providing a useful intrinsic for this would not be easy. ..

Yes, I was thinking that too. To be honest this was just a random thought experiment, and I figured it might be interesting to discuss.

I had some short functions that were pretty heavily used in some code I'm running, and I wanted to make them inline and as efficient as possible. So I was just playing around with ideas for making branch free code to do some simple stuff like conditional addition.

I came up with the following which does work really well. (This was just to implement: if x<0 then x:=x+a; )
Code: Pascal
var x,a,maskx : Longword;
begin
  ...
  maskx  := Longword(sarLongint(Longint(x),31));  // all ones if x is negative as a signed value, else zero
  Result := Result + (a and maskx);               // if x<0 then Result := Result + a;
end;
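For reference, the fragment above can be fleshed out into a complete function. This is only a sketch: the name AddIfNegative is hypothetical, and it assumes the elided part simply starts from x, which the original snippet does not show.

```pascal
program AddIfNegativeDemo;
{$mode objfpc}
{$inline on}
{$Q-}{$R-} // wrap-around arithmetic is intended here

// Hypothetical helper: returns x + a when x, read as a signed 32-bit
// value, is negative; otherwise returns x unchanged.
function AddIfNegative(x, a: LongWord): LongWord; inline;
var
  maskx: LongWord;
begin
  // The arithmetic shift copies the sign bit into every position:
  // $FFFFFFFF for negative x, $00000000 otherwise.
  maskx := LongWord(SarLongint(LongInt(x), 31));
  Result := LongWord(x + (a and maskx));
end;

begin
  WriteLn(AddIfNegative($FFFFFFFD, 10)); // $FFFFFFFD is -3 as a signed value
  WriteLn(AddIfNegative(5, 10));
end.
```

With these inputs the first call wraps around to 7 (-3 + 10) and the second leaves 5 untouched.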

But then in another function I also needed to do a similar conditional addition, but on the condition that (x<>0), and that one was a bit more difficult. I can implement it in asm (similar to what I did in the opening post), but that precludes making it inline, which kind of defeats the purpose (of having short branch free code).
« Last Edit: February 25, 2020, 10:39:01 am by uart »

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11383
  • FPC developer.
I was trying to make some branch free algorithms for a few small simple functions (that I wanted to inline), and I found that for one of them an add with carry would be useful. Just wondering if there were any intrinsics for doing that or something similar.

Providing a useful intrinsic for this would not be easy. Just take the example you provided: currently the compiler would optimize the x := x - 1 to dec which would not help. Also the compiler might optimize other parts as well (e.g. if instruction scheduling should decide for a better order). Not to mention that things might behave differently on other CPU targets. So if you have dependencies between instructions it is better to directly code them as assembly.

Are carries and condition registers modeled by the register allocator at all?

jamie

  • Hero Member
  • *****
  • Posts: 6091
If you move a LongWord into a UInt64 and do the adding there, you can then use Hi(MyUint64) to test for a 0 value.

If the value isn't 0, then an overflow took place, and it can be added in on the next addition.

The same can be done with subtracting: if the upper half isn't 0, then a borrow was done, etc.
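A minimal sketch of that widening idea, assuming 32-bit limbs and a carry that is always 0 or 1. The helper name AddCarry32 is made up for illustration, and the upper half is read with shr 32 rather than Hi() purely to keep the example explicit.

```pascal
program WideCarryDemo;
{$mode objfpc}

// Hypothetical helper: 32-bit add with carry-in and carry-out, emulated
// by doing the sum in 64 bits and reading the upper half.
function AddCarry32(a, b, carryIn: LongWord; out carryOut: LongWord): LongWord;
var
  wide: QWord;
begin
  wide := QWord(a) + QWord(b) + QWord(carryIn);
  carryOut := LongWord(wide shr 32);       // upper half: non-zero means overflow
  Result := LongWord(wide and $FFFFFFFF);  // lower half: the wrapped 32-bit sum
end;

var
  sumLo, sumHi, carry, carry2: LongWord;
begin
  // 64-bit addition built from two 32-bit limbs:
  // $00000001_FFFFFFFF + $00000000_00000001
  sumLo := AddCarry32($FFFFFFFF, 1, 0, carry);
  sumHi := AddCarry32(1, 0, carry, carry2);
  WriteLn(HexStr(sumHi, 8), HexStr(sumLo, 8));
end.
```

The low limb overflows and hands a carry of 1 to the high limb, so the combined result is $00000002_00000000. Whether the compiler turns this into an actual add/adc pair depends on the target and the optimization level.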

ASerge

  • Hero Member
  • *****
  • Posts: 2223
But then in another function I also needed to do a similar conditional addition, but on the condition that (x<>0), and that one was a bit more difficult. I can implement it in asm (similar to what I did in the opening post), but that precludes making it inline, which kind of defeats the purpose (of having short branch free code).
The compiler converts comparisons to short setcc instructions, i.e. without jumps. Therefore, the universal solution is:
Code: Pascal
function Test(A, X: LongWord): LongWord;
const
  CMask: array[Boolean] of LongWord = (0, $FFFFFFFF);
begin
  Result := 1;
  Inc(Result, A and CMask[LongInt(X) < 0]);
end;
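The same table lookup covers the (x<>0) case by changing only the comparison. A sketch under that assumption; AddIfNonZero and its Base parameter are hypothetical names, and whether the comparison really compiles without a jump depends on the target and optimization settings, so the generated assembly is worth checking.

```pascal
program AddIfNonZeroDemo;
{$mode objfpc}
{$Q-}{$R-} // wrap-around arithmetic is intended here

// Hypothetical helper: returns Base + A when X <> 0, otherwise Base.
function AddIfNonZero(Base, A, X: LongWord): LongWord;
const
  CMask: array[Boolean] of LongWord = (0, $FFFFFFFF);
begin
  // The Boolean result of the comparison indexes the mask table:
  // False selects 0, True selects $FFFFFFFF.
  Result := LongWord(Base + (A and CMask[X <> 0]));
end;

begin
  WriteLn(AddIfNonZero(1, 10, 0)); // X = 0: A is masked out
  WriteLn(AddIfNonZero(1, 10, 7)); // X <> 0: A is added
end.
```

The first call yields 1 and the second 11.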

uart

  • Jr. Member
  • **
  • Posts: 58
Nice solution ASerge.  :)

Thanks.

PascalDragon

  • Hero Member
  • *****
  • Posts: 5446
  • Compiler Developer
Are carries and condition registers modeled by the register allocator at all?

The register allocator doesn't really care about them. The optimizer on the other hand handles them (and some optimizations also change code to use e.g. ADC if appropriate). At least that's the case for x86. On other targets things might be different.

 
