Topic: Processor affinity

MarkMLl

  • Hero Member
  • *****
  • Posts: 7999
Processor affinity
« on: November 06, 2024, 10:36:57 pm »
Back in about 2011 there was a mailing-list thread on processor affinity in which I was involved, since I had some medium-scale (8-16 CPU) SPARC-based systems where processor and interrupt affinity were relevant.

There is still no direct support for this in the RTL, but in case it's useful to anybody I offer the fragment below, which largely builds on a contribution to that thread.

Code: Pascal
(* Set the processor affinity. The parameter is normally a +ve integer
  representing a bitmap with CPU 0 as the LSB, but if -ve it will instead fill
  in bits from the highest numbered available (i.e. -1 is the highest available,
  -2 is the second-highest available, -3 is the two highest available and so on).
  The number of bits set is returned, or -1 on error.

  See discussion at https://lists.freepascal.org/fpc-pascal/2011-January/028026.html
*)
function setProcessorAffinity(affinity: Int64): integer;

const
  CPU_SETSIZE_BITS= 1024;
  CPU_SETSIZE_QWORDS= CPU_SETSIZE_BITS DIV 64;

var
  cpu_set: array[0..CPU_SETSIZE_QWORDS - 1] of qword;
  r, i: integer;
  bit, unBit: qword;


  function bitsSet(q: qword): integer;

  var
    i: integer;

  begin
    result := 0;
    for i := 0 to 63 do begin
      if Odd(q) then result += 1;
      q := q >> 1
    end
  end { bitsSet } ;


begin
  FillByte(cpu_set, SizeOf(cpu_set), 0);
{$push }{$R- }{ Needed for i386 but not x86_64 }
  r := do_Syscall(syscall_nr_sched_getaffinity, fpgetpid(), SizeOf(cpu_set), ptruint(@cpu_set));
{$pop }
  Assert(r >= 0, 'sched_getaffinity() -> error ' + IntToStr(fpGetErrNo) + ', "' + StrError(fpGetErrNo) + '"');

(* The behaviour of the syscalls and C library routines differ, this describes  *)
(* the former. A +ve return value from sched_getaffinity() indicates the number *)
(* of bytes set to a known state in the cpu_set bitmap, which will be at least  *)
(* as many as are required to enumerate the available CPUs (i.e. cores and/or   *)
(* threads as determined by the CPU design).                                    *)
(*                                                                              *)
(* This implementation is adequate for no more than 64 CPUs, since it only      *)
(* looks at a single qword.                                                     *)

  case Sign(r) of
    -1: exit(-1);                       (* Syscall error                        *)
     0: exit(1)                         (* No error, but no CPU count           *)
  otherwise
    case Sign(affinity) of
      -1: begin
            FillByte(cpu_set, SizeOf(cpu_set), 0);

(* Get the bitmap representing the entire population of available CPUs, up to a *)
(* maximum of 64.                                                               *)

            cpu_set[0] := High(qword);
{$push }{$R- }{ Possibly needed for i386 but not x86_64 }
            r := do_Syscall(syscall_nr_sched_setaffinity, fpgetpid(), SizeOf(cpu_set), ptruint(@cpu_set));
{$pop }
            Assert(r = 0, 'sched_setaffinity() -> error ' + IntToStr(fpGetErrNo) + ', "' + StrError(fpGetErrNo) + '"');
            if r <> 0 then
              exit(-1);                 (* Syscall error                        *)
{$push }{$R- }{ Possibly needed for i386 but not x86_64 }
            r := do_Syscall(syscall_nr_sched_getaffinity, fpgetpid(), SizeOf(cpu_set), ptruint(@cpu_set));
{$pop }
            Assert(r >= 0, 'sched_getaffinity() -> error ' + IntToStr(fpGetErrNo) + ', "' + StrError(fpGetErrNo) + '"');
            if r < 0 then
              exit(-1);                 (* Syscall error                        *)

(* Working from the top down, find the highest available CPU. Don't assume that *)
(* shift distances > 31 are reliable.                                           *)

            i := 63;
            bit := qword($8000000000000000);
            unBit := qword($7fffffffffffffff);
            while ((cpu_set[0] and bit) = 0) and (i >= 0) do begin
              bit := bit div 2;
              unBit := (unBit div 2) + qword($8000000000000000);
              i -= 1
            end;

(* Mark the CPUs we want to use. This doesn't handle CPUs which aren't already  *)
(* marked as active specially, since this is how the +ve case (below) works.    *)

            affinity := Abs(affinity);
            while i >= 0 do begin
              if Odd(affinity) then
                cpu_set[0] := cpu_set[0] or bit
              else
                cpu_set[0] := cpu_set[0] and unBit;
              affinity := affinity >> 1;
              bit := bit div 2;
              unBit := (unBit div 2) + qword($8000000000000000);
              i -= 1
            end
          end;
       0: exit(bitsSet(cpu_set[0]))     (* No change requested                  *)
    otherwise
      FillByte(cpu_set, SizeOf(cpu_set), 0);
      cpu_set[0] := affinity
    end;

(* The pattern of bits in the set indicates the CPUs we want to use, starting   *)
(* either at zero or at the highest available CPU depending on the sign of the  *)
(* affinity parameter.                                                          *)

{$push }{$R- }{ Possibly needed for i386 but not x86_64 }
    r := do_Syscall(syscall_nr_sched_setaffinity, fpgetpid(), SizeOf(cpu_set), ptruint(@cpu_set));
{$pop }
    Assert(r = 0, 'sched_setaffinity() -> error ' + IntToStr(fpGetErrNo) + ', "' + StrError(fpGetErrNo) + '"');
    FillByte(cpu_set, SizeOf(cpu_set), 0);
{$push }{$R- }{ Possibly needed for i386 but not x86_64 }
    r := do_Syscall(syscall_nr_sched_getaffinity, fpgetpid(), SizeOf(cpu_set), ptruint(@cpu_set));
{$pop }
    case Sign(r) of
      -1: result := -1;                 (* Syscall error                        *)
       0: result := 1                   (* No error, but no CPU count           *)
    otherwise
      result := bitsSet(cpu_set[0])
    end
  end
end { setProcessorAffinity } ;

The size of the initial bitset representing the entire CPU population is, I believe, derived from work contributed by SGI (in their decline) to the Linux kernel: they really did have 1024-CPU systems with high-bandwidth connections between their L2 (?) caches.

The Linux sched_getaffinity syscall returns the number of bytes within this bitset which the kernel has filled in. It is important to appreciate that this differs from the GNU libc sched_getaffinity() wrapper, which returns only zero on success. The above code is good for up to 64 cores/threads, which is what I see on an x86_64 system.
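The raw-syscall return value described above can be inspected with a small sketch like the one below. It also counts bits across the whole 1024-bit set rather than just the first qword. CountBits and CountCpus are hypothetical helper names (not RTL routines), the program assumes Linux with the BaseUnix and SysCall units, and it should be treated as an untested illustration rather than a definitive implementation.

```pascal
program ShowAffinity;
{$mode objfpc}{$H+}

uses
  BaseUnix, SysCall;

const
  MaskBits = 1024;                      (* Same size as cpu_set in the post *)
  MaskQWords = MaskBits div 64;

type
  TCpuMask = array[0..MaskQWords - 1] of QWord;

(* Hypothetical helper: count the bits set in a single qword. *)
function CountBits(q: QWord): Integer;
begin
  Result := 0;
  while q <> 0 do begin
    if Odd(q) then
      Inc(Result);
    q := q shr 1
  end
end;

(* Hypothetical helper: count the bits set in the first byteCount bytes of the
   mask, rounded up to whole qwords. This assumes the mask was zero-filled
   before the syscall, so any rounding only adds zero bits. *)
function CountCpus(const mask: TCpuMask; byteCount: Int64): Integer;
var
  i, qwords: Integer;
begin
  Result := 0;
  qwords := Integer((byteCount + 7) div 8);
  if qwords > Length(mask) then
    qwords := Length(mask);
  for i := 0 to qwords - 1 do
    Result := Result + CountBits(mask[i])
end;

var
  mask: TCpuMask;
  r: Int64;

begin
  FillChar(mask, SizeOf(mask), 0);
{$push}{$R-}
  r := do_SysCall(syscall_nr_sched_getaffinity, fpGetPid, SizeOf(mask), PtrUInt(@mask));
{$pop}
  if r < 0 then begin
    WriteLn('sched_getaffinity failed, errno ', fpGetErrNo);
    Halt(1)
  end;
  (* r is the number of bytes the kernel wrote, not a success flag. *)
  WriteLn('Kernel filled ', r, ' bytes; ', CountCpus(mask, r), ' CPUs in the affinity set')
end.
```

Summing over every qword the kernel filled in is one way of lifting the 64-CPU limit when only reading the affinity; setting it for more than 64 CPUs would need a wider parameter than an Int64.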

The function above, tested only on Linux, takes a CPU bitmask as its parameter and returns the number of CPUs within the resultant affinity. I wrote it after encountering a suspected race condition which behaves differently on different systems, and it works to the extent that I can select either "lowest of" or "highest of" CPUs.
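The "highest of" selection can be illustrated without touching the scheduler at all. The sketch below is a pure function, under the hypothetical name MapFromTop, which mirrors the -ve parameter handling: bit 0 of the absolute value lands on the highest available CPU, bit 1 on the next one down, and so on. Like the original, it does not check that the lower CPUs are actually available. It is an illustration of the mapping under those assumptions, not a replacement for the function above.

```pascal
program MapFromTopDemo;
{$mode objfpc}{$H+}

(* Hypothetical helper: given a bitmap of available CPUs (CPU 0 as the LSB)
   and a -ve pattern, return the bitmap that results from laying the bits of
   Abs(pattern) downwards from the highest available CPU. *)
function MapFromTop(available: QWord; pattern: Int64): QWord;
var
  bit, p: QWord;
begin
  Result := 0;
  if available = 0 then
    Exit;                               (* Nothing to anchor the pattern to *)

  (* Find the highest available CPU by walking a single bit downwards. *)
  bit := QWord($8000000000000000);
  while (available and bit) = 0 do
    bit := bit shr 1;

  (* Lay the pattern down, LSB first, from that CPU towards CPU 0. *)
  p := QWord(Abs(pattern));
  while (bit <> 0) and (p <> 0) do begin
    if Odd(p) then
      Result := Result or bit;
    p := p shr 1;
    bit := bit shr 1
  end
end;

begin
  (* With eight CPUs available (bitmap $FF): *)
  WriteLn(HexStr(MapFromTop($FF, -1), 16));   (* Highest CPU only: bit 7      *)
  WriteLn(HexStr(MapFromTop($FF, -2), 16));   (* Second-highest only: bit 6   *)
  WriteLn(HexStr(MapFromTop($FF, -3), 16))    (* Two highest: bits 7 and 6    *)
end.
```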

I challenge partisans of other OSes to contribute equivalent code, so that the community can agree on something which could be usefully added to the RTL.

Updated: testing on i386 highlighted the need for a $R- around one syscall. I've tentatively done the same for the others.

MarkMLl
« Last Edit: November 08, 2024, 11:56:48 am by MarkMLl »
MT+86 & Turbo Pascal v1 on CCP/M-86, multitasking with LAN & graphics in 128Kb.
Logitech, TopSpeed & FTL Modula-2 on bare metal (Z80, '286 protected mode).
Pet hate: people who boast about the size and sophistication of their computer.
GitHub repositories: https://github.com/MarkMLl?tab=repositories

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11929
  • FPC developer.
Re: Processor affinity
« Reply #1 on: November 08, 2024, 07:59:17 pm »
(Afaik a symmetrical bitset is a simplification. Due to NUMA and other forms of connecting CPUs, afaik CPUs are first grouped, and then masked)

MarkMLl

Re: Processor affinity
« Reply #2 on: November 08, 2024, 09:35:00 pm »
Quote from: marcov on November 08, 2024, 07:59:17 pm
(Afaik a symmetrical bitset is a simplification. Due to NUMA and other forms of connecting CPUs, afaik CPUs are first grouped, and then masked)

Yes, I agree, and the Linux kernel has advanced in leaps and bounds since I first started playing with this sort of thing, which was (goodness) around version 2.4.

I had a SPARCServer (8x 80MHz CPUs IIRC, in a layout designed by Xerox PARC and also used by Cray) and found that Dave Miller had got the bus initialisation wrong. Once everything was working that was a surprisingly fast machine for e.g. kernel compilations, which parallelised well, but it was also an effective building heater. I didn't manage to port my hacks to later versions: I forget the detail by now.

So the kernel's good for SSI up to some hundreds of CPUs, but that either has to be heavily NUMAed or to have complex multilevel caches like the IBM Zs. But you'll have problems with hotplugging if you stick to open code: SunOS and IBM's large-scale OSes had an edge that Linux hasn't really eroded.

Once you get beyond that and try clustering there are inherent problems endemic to all unixes, relating to too many things which can't be moved around because the doctrine which has grown up around them implies that they're either pointers or indexes into a system-wide table: and that to some extent is why we've got the current emphasis on virtualisation.

However, in practical terms I think what I've got there is moderately useful, particularly when viewed in a "this behaves differently on different computers" context. I've got reservations about how useful the "-ve parameter" is, since in principle at least the kernel shouldn't ascribe any particular significance to either the lowest or highest CPU/core. In practice, though, I think it's a useful "let's try something different" facility.

Although obviously, anything like this has the potential of being misleading or downright destructive if it clashes badly with the kernel's idea of the hardware organisation. Which is largely your point ;-)
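For anybody wanting to check a mask against the kernel's grouping before applying it, the sketch below reads the CPU list that Linux publishes for NUMA node 0 and converts it to the same LSB-first bitmap used above. The path /sys/devices/system/node/node0/cpulist is only present on kernels that expose NUMA information through sysfs, CpuListToMask is a hypothetical helper name, and CPUs numbered 64 or above are simply ignored. It is an untested illustration, not a definitive implementation.

```pascal
program NodeCpus;
{$mode objfpc}{$H+}

uses
  SysUtils;

const
  (* Assumption: sysfs exposes NUMA node 0 here; absent on some systems. *)
  NodePath = '/sys/devices/system/node/node0/cpulist';

(* Hypothetical helper: convert a "cpulist" string such as '0-3,8' into a
   bitmap with CPU 0 as the LSB. CPUs numbered 64 or above are ignored. *)
function CpuListToMask(list: string): QWord;
var
  item: string;
  comma, dash, first, last, cpu: Integer;
begin
  Result := 0;
  list := Trim(list);
  while list <> '' do begin
    (* Peel off the next comma-separated item. *)
    comma := Pos(',', list);
    if comma = 0 then
      comma := Length(list) + 1;
    item := Copy(list, 1, comma - 1);
    Delete(list, 1, comma);

    (* An item is either a single CPU number or a first-last range. *)
    dash := Pos('-', item);
    if dash = 0 then begin
      first := StrToInt(item);
      last := first
    end else begin
      first := StrToInt(Copy(item, 1, dash - 1));
      last := StrToInt(Copy(item, dash + 1, Length(item)))
    end;

    for cpu := first to last do
      if cpu < 64 then
        Result := Result or (QWord(1) shl cpu)
  end
end;

var
  f: TextFile;
  line: string;

begin
  if not FileExists(NodePath) then begin
    WriteLn('No NUMA information at ', NodePath);
    Exit
  end;
  AssignFile(f, NodePath);
  Reset(f);
  ReadLn(f, line);
  CloseFile(f);
  WriteLn('node0 CPUs: ', line, ' -> mask $', HexStr(CpuListToMask(line), 16))
end.
```

ANDing a requested affinity with such a per-node mask is one possible way of keeping a process within a single group, though whether that is desirable depends entirely on the machine.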

MarkMLl