
Author Topic: Question on portable code  (Read 4398 times)

MathMan

  • Full Member
  • ***
  • Posts: 214
Question on portable code
« on: August 11, 2015, 08:35:06 am »
Hi all,

I am currently working with FPC 2.6.4.

For the math library I am developing, I have a question about portable code. Naturally the lib contains a lot of statements like

Code: [Select]
  X := X + a;

Using Inc instead produces significantly faster executables with my version of FPC. As I want to produce a lib that can be compiled with as many Pascal compilers as possible (the "portable" aspect), I have the following questions:

* Is Inc/Dec a common Pascal language extension (or should I stay with the above for portability)?
* If Inc/Dec is not portable, can I expect better optimization in the upcoming FPC 3.x.x versions (so that I can stay with the above)?

If the answers to both questions are "no", then I see two options to make the high-speed version available to FPC users:

* I could define a compile-time switch and integrate both variants in one source. However, this will render the sources unreadable, as they will then mainly consist of "{$IFDEF ...} ... {$ENDIF}" instead of anything useful.
* I could translate the portable version into an optimized FPC version and include both in the package. However, this will require me to maintain coherence between the two files (which are already more than 20,000 lines and will double by the end of this year).

Comments, suggestions anyone?

Kind regards,
MathMan

Blaazen

  • Hero Member
  • *****
  • Posts: 3029
  • POKE 54296,15
    • Eye-Candy Controls
Re: Question on portable code
« Reply #1 on: August 11, 2015, 08:43:28 am »
It should be safe. There are no limitations mentioned in docs: http://www.freepascal.org/docs-html/rtl/system/inc.html
Lazarus 2.1.0 r64546 FPC 3.3.1 r40507 x86_64-linux-qt Chakra, Qt 4.8.7/5.13.2, Plasma 5.17.3
Lazarus 1.8.2 r57369 FPC 3.0.4 i386-win32-win32/win64 Wine 3.21

Try Eye-Candy Controls: https://sourceforge.net/projects/eccontrols/files/

taazz

  • Hero Member
  • *****
  • Posts: 5365
Re: Question on portable code
« Reply #2 on: August 11, 2015, 08:49:59 am »
Third solution: create your own Inc procedure with the ifdef code and use that instead (make it inlined too).
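A minimal sketch of what such a wrapper might look like. The name AddTo, the LongWord operand type, and the choice of the FPC define as the switch are assumptions made for illustration; none of them come from MathMan's library.

```pascal
program IncWrapperSketch;

{$IFDEF FPC}
  {$INLINE ON}   { allow FPC to honour the inline modifier }
{$ENDIF}

{ Hypothetical helper: the name AddTo is an assumption, not from the thread.
  The conditional code lives in this one place instead of at every call site. }
procedure AddTo(var X: LongWord; A: LongWord); {$IFDEF FPC}inline;{$ENDIF}
begin
{$IFDEF FPC}
  Inc(X, A);     { FPC: use the Inc intrinsic }
{$ELSE}
  X := X + A;    { other compilers: plain assignment }
{$ENDIF}
end;

var
  V: LongWord;
begin
  V := 40;
  AddTo(V, 2);
  WriteLn(V);    { prints 42 }
end.
```

Whether the inlined call ends up as fast as a direct Inc depends on the compiler version and optimization level, so it would need to be measured.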
Good judgement is the result of experience … Experience is the result of bad judgement.

OS : Windows 7 64 bit
Laz: Lazarus 1.4.4 FPC 2.6.4 i386-win32-win32/win64

MathMan

  • Full Member
  • ***
  • Posts: 214
Re: Question on portable code
« Reply #3 on: August 11, 2015, 09:09:24 am »
Quote
It should be safe. There are no limitations mentioned in docs: http://www.freepascal.org/docs-html/rtl/system/inc.html

With "safe" you mean it's a common Pascal RTL function?

MathMan

  • Full Member
  • ***
  • Posts: 214
Re: Question on portable code
« Reply #4 on: August 11, 2015, 09:10:30 am »
Quote
Third solution: create your own Inc procedure with the ifdef code and use that instead (make it inlined too).

Good one - I hadn't thought of that. I will consider it if there is no other solution available.

Blaazen

  • Hero Member
  • *****
  • Posts: 3029
  • POKE 54296,15
    • Eye-Candy Controls
Re: Question on portable code
« Reply #5 on: August 11, 2015, 09:19:21 am »
Quote
With "safe" you mean it's a common Pascal RTL function?

Yes. It is the FPC equivalent of i++ in C/C++. IMO writing your own function with IFDEF is an over-engineered solution. If there's any target platform which doesn't have an instruction for inc(), then the same is very probably already done inside the compiler.

Leledumbo

  • Hero Member
  • *****
  • Posts: 8310
  • Programming + Glam Metal + Tae Kwon Do = Me
Re: Question on portable code
« Reply #6 on: August 11, 2015, 09:32:41 am »
Quote
Using Inc instead will produce significantly faster executables on my version of FPC
...
* If Inc/Dec is not portable, can I expect better optimization in the upcoming FPC 3.x.x versions (so that I can stay with the above)?

Since early FPC days, both should generate the same opcode if you use optimization (I'm not sure about the minimum level; 2, I believe).

Quote
* Is Inc/Dec a common Pascal language extension (or should I stay with the above for portability)?

To date, active Pascal compilers are usually Turbo Pascal compatible, so Inc/Dec should be available in all of them.

Thaddy

  • Hero Member
  • *****
  • Posts: 10706
Re: Question on portable code
« Reply #7 on: August 11, 2015, 11:00:30 am »
Even w/o optimization the code for inc(x,a) and x := x + a is identical for Intel x86 processors...

It used to be the case that on x86 family processors it was directly converted to the asm inc instruction.
But processors have evolved and this is no longer always the case.

You should never optimize with a specific processor in mind.
If you really want your code to compile with as many pascal dialects as possible, simply optimize your algorithms.

But your question was not very clear: do you mean compilers other than Free Pascal, or do you mean Free Pascal compilers for different platforms and CPUs?

If the latter is the case, FPC will optimize as best as it can given the available instruction set.

As I recall, even UCSD Pascal supported inc (I taught UCSD Pascal on Apple ]['s at university to my fellow students in the early 80's), but it could have been a function instead of an intrinsic.
« Last Edit: August 11, 2015, 11:07:27 am by Thaddy »

Leledumbo

  • Hero Member
  • *****
  • Posts: 8310
  • Programming + Glam Metal + Tae Kwon Do = Me
Re: Question on portable code
« Reply #8 on: August 11, 2015, 11:33:02 am »
Quote
Even w/o optimization the code for inc(x,a) and x := x + a is identical for Intel x86 processors...
Not really, the following program:
Code: [Select]
program test;

procedure p;
var
  x: longword;
begin
  x := x + 1;
end;

procedure q;
var
  x: longword;
begin
  Inc(x);
end;

begin
  p;q
end.
generates this asm when compiled without optimization (relevant parts only):
Code: [Select]
# P
# Var x located at ebp-4, size=OS_32
# [7] x := x + 1;
movl -4(%ebp),%eax
leal 1(%eax),%eax
movl %eax,-4(%ebp)
# [8] end;

# Q
# Var x located at ebp-4, size=OS_32
# [14] Inc(x);
addl $1,-4(%ebp)
# [15] end;
Recompile with -O1:
Code: [Select]
# P
# Var x located at ebp-4, size=OS_32
# [7] x := x + 1;
movl -4(%ebp),%eax
addl $1,%eax

# Q
# Var x located at ebp-4, size=OS_32
# [14] Inc(x);
addl $1,-4(%ebp)
# [15] end;
and -O2:
Code: [Select]
# P
# [7] x := x + 1;
addl $1,%eax
# Var x located in register eax
# [8] end;

# Q
# Var x located in register eax
# [13] begin
# [14] Inc(x);
addl $1,%eax
# [15] end;
As can be seen, they're only equal at -O2. Note that I'm cross-compiling from x86_64 to i386 and x must be a 32-bit integer type; on x86_64 they're never equal, even when the integer type matches the processor's native pointer size (64-bit).

Thaddy

  • Hero Member
  • *****
  • Posts: 10706
Re: Question on portable code
« Reply #9 on: August 11, 2015, 11:36:04 am »
My bad: -O2 is debuggable and is my default. The rest of my comments are valid, though.

MathMan

  • Full Member
  • ***
  • Posts: 214
Re: Question on portable code
« Reply #10 on: August 11, 2015, 11:52:42 am »

Quote
...

As can be seen, they're only equal at -O2. Note that I'm cross-compiling from x86_64 to i386 and x must be a 32-bit integer type; on x86_64 they're never equal, even when the integer type matches the processor's native pointer size (64-bit).

Thanks - I should have been more specific. My target platform is x86_64 and I do see differences in the asm when compiling with -O3. Speed varies by up to 50% between the standard and Inc/Dec variants. I have never compiled for a 32-bit target ...

However, after reading your response and Thaddy's, I'll double-check this evening - maybe I made a simple mistake in the compiler setup.

Kind regards,
MathMan

MathMan

  • Full Member
  • ***
  • Posts: 214
Re: Question on portable code
« Reply #11 on: August 12, 2015, 09:59:39 am »

To wrap things up: I converted my sources to Inc/Dec yesterday and did some testing and benchmarking. The compiler target is x86-64 with maximum optimization (-O3).

Execution speed increased across the board, by between 15% and 50% on basic primitives. The computation of 1 million digits of e (with a very crude algorithm, so only consider the relative timing) went down from 80 seconds to 63 seconds.

Pure Pascal is now only 5 times slower than pure asm, which I find is a very good ratio considering that I had to emulate all the carry handling in Pascal.

Kind regards,
Jens
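
For readers wondering what "emulating the carry handling" can involve: the library's code is not shown in this thread, so the following is only an illustrative sketch of one common technique, detecting the carry of an unsigned addition by comparing the wrapped sum with an operand. The names TLimb and AddWithCarry and the 32-bit limb size are assumptions.

```pascal
program CarrySketch;

{$Q-}{$R-}  { the technique relies on unsigned wrap-around, so checks must be off }

type
  TLimb = LongWord;  { 32-bit unsigned limb; an assumption for this sketch }

{ Hypothetical helper: computes A + B + CarryIn, stores the low 32 bits
  in Sum and returns the carry-out (0 or 1). CarryIn must be 0 or 1. }
function AddWithCarry(A, B, CarryIn: TLimb; var Sum: TLimb): TLimb;
var
  T, Carry: TLimb;
begin
  Carry := 0;
  T := A + B;            { wraps modulo 2^32 when stored into T }
  if T < A then
    Carry := 1;          { a wrapped sum is smaller than either operand }
  Sum := T + CarryIn;
  if Sum < T then
    Carry := 1;          { adding the incoming carry wrapped }
  AddWithCarry := Carry;
end;

var
  S, C: TLimb;
begin
  C := AddWithCarry($FFFFFFFF, 1, 0, S);
  WriteLn(S, ' ', C);    { prints 0 1 }
end.
```

In assembly the same step is a single add-with-carry instruction, which is one plausible reason a pure Pascal version stays several times slower.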

 
