
Author Topic: [SOLVED] Any cross-platform problems to be aware of while using ASM?  (Read 1492 times)

Eugene Loza

  • Hero Member
  • *****
  • Posts: 729
    • My games in Pascal
Once upon a time I had great fun writing games in Turbo Pascal, eventually ending up with >50% of the code written in Assembly. But all of those targeted my specific computer. Now that I'm thinking about reviving some low-level bottleneck optimizations (2D/3D raycasting and pathfinding are my first guesses), I wonder how reliable these things are in a cross-platform context (especially on mobile targets, e.g. Android).

First of all, the FreePascal compiler is far superior to Turbo Pascal's. Back in the late 90s I'd get +18% performance just by rewriting "the same thing" in Assembly; now the gain is barely noticeable (I managed to get +4% on a random number generator, and only thanks to reusing register contents, which is definitely not worth the effort and the loss of code readability). In other words, writing Assembly matters far less now.

But for me the bigger issue is: how well do Assembly snippets in FPC play in a cross-platform context? I understand that may be a very basic and dumb question :D I mean, there is obviously a difference between 32 and 64 bits. But is there something I have to know on top of that? Can different targets (AMD/Intel/ARM, etc.) differ enough in Assembly code to force several separate implementations? (In that case I'm not even considering using Assembly; I don't have the capacity for that.) Or are those only minor things, like "check whether the CPU supports a specific feature before you use it", e.g. through the Cpu unit or compiler directives?
« Last Edit: May 13, 2024, 03:20:45 pm by Eugene Loza »
My FOSS games in FreePascal&CastleGameEngine: https://decoherence.itch.io/ (Sources: https://gitlab.com/EugeneLoza)

Laksen

  • Hero Member
  • *****
  • Posts: 801
    • J-Software
Re: Any cross-platform problems to be aware of while using ASM?
« Reply #1 on: May 13, 2024, 01:36:52 pm »
I would not bother with it unless profiling reveals a hot path that you can clearly see could be done better.

What you are asking about is not only the bit-ness, but also the ABI and the architecture. These range from slightly different (i386 fastcall vs x86_64 win64/sysv psabi) to completely different (x86 vs ARM).
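
To make those levels of variation concrete, here is a minimal sketch, assuming FPC's predefined symbols CPUX86_64 and WIN64. AddInts is a hypothetical helper invented for illustration only: it has hand-written code just for x86_64, with one branch per ABI, and falls back to plain Pascal on every other target (ARM, AArch64, i386, ...).

```pascal
program AsmDispatch;
{$mode objfpc}

{$ifdef CPUX86_64}
{$asmmode intel}
{ Hypothetical helper, x86_64 only: one branch per calling convention. }
function AddInts(a, b: LongInt): LongInt; assembler; nostackframe;
asm
  {$ifdef WIN64}
  mov eax, ecx      { Win64 ABI: a arrives in ECX, b in EDX }
  add eax, edx
  {$else}
  mov eax, edi      { System V ABI: a arrives in EDI, b in ESI }
  add eax, esi
  {$endif}
end;
{$else}
{ Any other CPU: leave it to the compiler. }
function AddInts(a, b: LongInt): LongInt;
begin
  Result := a + b;
end;
{$endif}

begin
  WriteLn(AddInts(2, 3));
end.
```

Even for a two-instruction body there are already three variants to maintain, and an ARM version would share nothing with the x86_64 one.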
« Last Edit: May 13, 2024, 01:43:41 pm by Laksen »

Thaddy

  • Hero Member
  • *****
  • Posts: 18372
  • Here stood a man who saw the Elbe and jumped it.
Re: Any cross-platform problems to be aware of while using ASM?
« Reply #2 on: May 13, 2024, 02:10:45 pm »
ARM requires completely different assembler code. As a rule, assume assembler code is not portable, even between the very similar x86_64-win64 and x86_64-linux targets. On those two you can sometimes get away with very simple and clear snippets, but if you are not familiar with the different ABIs, leave it to the compiler to generate the assembly code.
Contrary to popular belief, the compiler often generates far better code, unless you are an absolute guru for that particular assembler and the different ABI requirements.
IOW: nowadays 99.9% of programmers, including very good ones, should not attempt assembler. It bites you.

What I mostly do is write the code in pure Pascal, compile with -al, examine the assembler output, and if I see obvious optimizations, do those by hand. It is not often worth the trouble.
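
A minimal sketch of that workflow, with a hypothetical candidate routine (SumBytes and the file name sumdemo.pas are made up for illustration). The -al switch keeps the generated assembler file and interleaves the source lines into it; the -O3 level is just an assumption about which optimization setting you want to inspect.

```pascal
program SumDemo;
{$mode objfpc}

{ Compile with:  fpc -O3 -al sumdemo.pas
  then read the generated sumdemo.s to see what the compiler
  made of the loop below before deciding to hand-tune anything. }

{ Hypothetical hot-loop candidate, written in plain Pascal first. }
function SumBytes(p: PByte; count: PtrUInt): PtrUInt;
var
  i: PtrUInt;
begin
  Result := 0;
  for i := 1 to count do
  begin
    Inc(Result, p^);  { add the current byte }
    Inc(p);           { advance to the next byte }
  end;
end;

var
  data: array[0..3] of Byte = (1, 2, 3, 4);
begin
  WriteLn(SumBytes(@data[0], Length(data)));
end.
```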
« Last Edit: May 13, 2024, 02:16:37 pm by Thaddy »
Due to censorship, I changed this to "Nelly the Elephant". Keeps the message clear.

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 12536
  • FPC developer.
Re: Any cross-platform problems to be aware of while using ASM?
« Reply #3 on: May 13, 2024, 03:01:55 pm »
Yes. For instance the win64 and linux 64-bit calling conventions are also different, so even within the same architecture it might not be the same.

I do use assembler in my work, but very sparingly, and only in large enough blocks (the code contains loops that run many times), typically processing a whole image, or in the case of very complicated transformations a single line of an image.

I never found much use for the typical very short peephole-like fragments written for old compilers like TP and old Delphi. They mostly just get in the compiler's way.

Usually, if it wasn't written to perform well on, say, Sandy Bridge (2nd generation Core) or better, it probably needs a rewrite.
« Last Edit: May 13, 2024, 03:03:58 pm by marcov »

Eugene Loza

Re: Any cross-platform problems to be aware of while using ASM?
« Reply #4 on: May 13, 2024, 03:20:34 pm »
Thank you all! Yeah, that pretty much answers my questions perfectly: too much hassle (including upgrading my Assembly knowledge from the late 90s) for my use case :)

Thaddy

Well, and that goes for both you and me: our original 8/16/32 bit 8086/8087/6502/6510/Z80, maybe 80386 or, God forbid, 80286 assembler knowledge is soooooo outdated that we would have to use one of the FreePascal esoteric (cross-)compilers to show off our knowledge  :o.
Even if you or I still wrote assembler from scratch, we wouldn't know about cache stalls, locked cores etc. So our code becomes slower instead of faster on modern multi-core CPUs.
Many people have that pretension, and it is seldom proper to write assembler, except for the likes of Marcov, but he knows what he is doing and knows about the limitations. The world and the compilers have moved on; in Indonesian, Tempo dulu, that time has gone by.
(The latter is just to ensure our friend Handoko reads the thread  ;) )

I don't want to discourage beginners who still think: "Hey, look, this assembler MUST be faster than the compiler-generated code!". Just let them dream... and pursue the deeper knowledge.
Remember, beginners do not know about profilers...
We both started like that I guess...
« Last Edit: May 13, 2024, 04:50:50 pm by Thaddy »

Handoko

  • Hero Member
  • *****
  • Posts: 5493
  • My goal: build my own game engine using Lazarus
I abandoned Assembly Language long ago. I was obsessed with assembly because I found that some functions, rewritten in Assembly, ran even faster than the computer's BIOS functions.

I still keep some of the code, for example this is my Write(Color, Row, Col, Text), which works by directly mapping the data to the VGA memory.
Code:
; PROCEDURE WRITE (W, B, K : BYTE; S : STRING);
;
PUSH  DS
PUSH  SS
POP   DS
MOV   BX, SP
MOV   AX, 0028
MUL   BYTE PTR [BX+0A]
MOV   CX, [BX+08]
XOR   CH, CH
ADD   AX, CX
ADD   AX, AX
MOV   DX, B800
MOV   ES, DX
MOV   DI, AX
MOV   AH, [BX+0C]
LDS   SI, [BX+04]
XOR   CH, CH
MOV   CL, [SI]
CMP   CL, 0
JZ    130
INC   SI
CLD
LODSB
STOSW
LOOP  012C
POP   DS
RET   000A

I used BASIC, which was very slow. That forced me to write code using DEBUG.COM to produce obj files for inclusion in Turbo Basic. That worked. But soon after I found Pascal, there was no reason for me to use BASIC or Assembly anymore.
« Last Edit: May 13, 2024, 07:45:58 pm by Handoko »

marcov

Quote from: Handoko
"I still keep some of the code, for example this is my Write(Color, Row, Col, Text), which works by direct mapping the data to the vga memory."

I raise you basic date-time routines in FPC/go32v2 0.99.5/0.99.8  :)

PascalDragon

  • Hero Member
  • *****
  • Posts: 6195
  • Compiler Developer
Re: Any cross-platform problems to be aware of while using ASM?
« Reply #8 on: May 14, 2024, 10:48:44 pm »
Quote from: Thaddy
"ARM requires completely different assembler code. Usually you must assume assembler code is not portable, even between the very similar x86_64-win64 and x86_64-Linux64 generated assembler code."

To be fair, the ABIs of x86_64-win64 and x86_64 SYS-V are close enough that some ifdefs for the prologue are enough. Just take a look at the assembly code in the rtl/x86_64/x86_64.inc which is shared between Win64 and non-Win64.
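
A minimal sketch of that "ifdef only the prologue" idea, in the spirit of the shared RTL code but not copied from it. SumBytes is a hypothetical routine; the sketch assumes FPC's predefined CPUX86_64 and WIN64 symbols and uses only registers that are volatile in both ABIs (RAX, RCX, RDX, R8, R9). On non-Win64 targets the System V argument registers are first copied into the ones Win64 uses, so a single loop body serves both.

```pascal
program SumAsm;
{$mode objfpc}

{$ifdef CPUX86_64}
{$asmmode intel}
{ Hypothetical routine: returns the sum of count bytes starting at p. }
function SumBytes(p: PByte; count: PtrUInt): PtrUInt; assembler; nostackframe;
asm
{$ifndef WIN64}
  { System V passes p in RDI and count in RSI; move them into the
    registers Win64 uses (RCX, RDX) so the body below is shared. }
  mov rcx, rdi
  mov rdx, rsi
{$endif}
  xor r9d, r9d                { running sum = 0 }
  test rdx, rdx
  jz @done                    { nothing to do for count = 0 }
@next:
  movzx r8d, byte ptr [rcx]   { load one byte, zero-extended }
  add r9, r8
  inc rcx
  dec rdx
  jnz @next
@done:
  mov rax, r9                 { result is returned in RAX on both ABIs }
end;
{$else}
{ Every other target: plain Pascal, no assembler at all. }
function SumBytes(p: PByte; count: PtrUInt): PtrUInt;
var
  i: PtrUInt;
begin
  Result := 0;
  for i := 1 to count do
  begin
    Inc(Result, p^);
    Inc(p);
  end;
end;
{$endif}

var
  data: array[0..3] of Byte = (1, 2, 3, 4);
begin
  WriteLn(SumBytes(@data[0], Length(data)));
end.
```

This only works because the routine is a leaf that touches nothing but volatile registers; anything that calls other code or needs callee-saved registers has more ABI differences to handle than the prologue.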

 
