Does anyone know whether it is possible to use processor-specific instructions in Pascal code, ideally via an inlined function call?
Modern x86 CPUs have PREFETCH, LFENCE/MFENCE, CMOV, SETcc, all the SSE instructions, etc. In Free Pascal, if I write a function that contains inline assembly and then try to inline that function, I get an error specifically saying that the two cannot be combined (inline assembly in an inline function). Perhaps the instruction scheduler or register allocator can't deal with the complexity.
Another possibility is that the machine-specific parts of the compiler back end, the ones that select instructions, could be taught about additional instructions and expose them to the developer in some way. I think I've seen config files with lists of instructions before. If this would require extending the compiler, I'm willing to look into that.
I know the standard response to someone who is looking for performance is "why do you need speed? CPUs are fast enough, blah blah". Please understand that my interest is in understanding and demonstrating how to expose the underlying performance of the CPU to the user of a high-level language, in library code such as memory allocators and parallel data structures.
Another way to state the problem: the instruction set of a modern x86 CPU is much broader than what you can express with typical language statements. In C/gcc the problem is kludged with inline assembly statements that declare their operands and the registers they clobber, wrapped in functions that are then inlined. In C++/gcc these can be wrapped in templated functions and classes to create fairly elegant solutions (hopefully). I really like Pascal and would hope to do the same kind of thing using generics.
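To make the gcc approach concrete, here is a minimal sketch of the kind of wrappers I mean. It assumes gcc or clang; the inline assembly is used only on x86-64, with a plain C or builtin fallback elsewhere. The names (max_cmov, full_fence, prefetch_read, sum_with_prefetch) are made up for illustration, and the prefetch distance of 16 elements is an arbitrary guess, not a tuned value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Branch-free maximum using CMOV. The constraints tell the compiler which
   operands are read/written ("+r", "r") and that the flags are clobbered. */
static inline int64_t max_cmov(int64_t a, int64_t b)
{
#if defined(__x86_64__)
    int64_t result = a;
    __asm__ ("cmp   %[b], %[r]\n\t"   /* flags := result - b            */
             "cmovl %[b], %[r]"       /* if result < b then result := b */
             : [r] "+r" (result)
             : [b] "r" (b)
             : "cc");
    return result;
#else
    return a > b ? a : b;             /* portable fallback */
#endif
}

/* Full memory fence. The "memory" clobber also stops the compiler from
   reordering loads and stores across this point. */
static inline void full_fence(void)
{
#if defined(__x86_64__)
    __asm__ __volatile__ ("mfence" ::: "memory");
#else
    __sync_synchronize();             /* gcc/clang builtin fallback */
#endif
}

/* Hint that the cache line containing *p will be read soon. */
static inline void prefetch_read(const void *p)
{
#if defined(__x86_64__)
    __asm__ __volatile__ ("prefetcht0 %0" : : "m" (*(const char *)p));
#else
    __builtin_prefetch(p, 0, 3);      /* gcc/clang builtin fallback */
#endif
}

/* Example use: sum an array while prefetching a few elements ahead. */
static inline int64_t sum_with_prefetch(const int64_t *data, size_t n)
{
    assert(data != NULL || n == 0);
    int64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + 16 < n)
            prefetch_read(&data[i + 16]);
        total += data[i];
    }
    return total;
}
```

Because each wrapper is a static inline function, the compiler can fold the instruction directly into the caller and still allocate registers around it. That is the property I am hoping to get in Pascal.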