Recent

Author Topic: PopCnt of a buffer  (Read 623 times)

LemonParty

  • Sr. Member
  • ****
  • Posts: 361
PopCnt of a buffer
« on: October 13, 2025, 11:01:26 am »
Hello.

I suggest to add this function to standart units. Maybe to Math.
Code: Pascal  [Select][+][-]
  1. function PopCnt(constref Buf; Count: SizeUInt): SizeUInt;
  2. var
  3.   p: PByte;
  4.   i: SizeUInt;
  5. begin
  6.   Result:= 0;
  7.   if Count = 0 then
  8.     Exit;
  9.   i:= Min(Count, SizeOf(SizeUInt) - 1 - (PtrUInt(@Buf) and (SizeOf(SizeUInt) - 1)));
  10.   p:= @Buf;
  11.  
  12.   {proceed unaligned begin}
  13.   if (i and 1) <> 0 then begin
  14.     inc(Result, System.PopCnt(p^));
  15.     inc(p);
  16.   end;
  17. {$If Defined(CPU32) or Defined(CPU64)}
  18.   if (i and 2) <> 0 then begin
  19.     inc(Result, System.PopCnt(PWord(p)^));
  20.     inc(p, SizeOf(Word));
  21.   end;
  22. {$EndIf}
  23. {$IfDef CPU64}
  24.   if (i and 4) <> 0 then begin
  25.     inc(Result, System.PopCnt(PDWord(p)^));
  26.     inc(p, SizeOf(DWord));
  27.   end;
  28. {$EndIf}
  29.   dec(Count, i);
  30.  
  31.   {aligned reads in next loop}
  32.   for i:= 1 to Count div SizeOf(SizeUInt) do begin
  33.     inc(Result, System.PopCnt(PSizeUInt(p)^));
  34.     inc(p, SizeOf(SizeUInt));
  35.   end;
  36.  
  37.   {proceed tail}
  38. {$IfDef CPU64}
  39.   if (Count and 4) <> 0 then begin
  40.     inc(Result, System.PopCnt(PDWord(p)^));
  41.     inc(p, SizeOf(DWord));
  42.   end;
  43. {$EndIf}
  44. {$If Defined(CPU32) or Defined(CPU64)}
  45.   if (Count and 2) <> 0 then begin
  46.     inc(Result, System.PopCnt(PWord(p)^));
  47.     inc(p, SizeOf(Word));
  48.   end;
  49. {$EndIf}
  50.   if (Count and 1) <> 0 then begin
  51.     inc(Result, System.PopCnt(p^));
  52.   end;
  53. end;
  54.  
Lazarus v. 4.99. FPC v. 3.3.1. Windows 11

Thaddy

  • Hero Member
  • *****
  • Posts: 18327
  • Here stood a man who saw the Elbe and jumped it.
Re: PopCnt of a buffer
« Reply #1 on: October 13, 2025, 12:46:18 pm »
No, because the code is not good enough.
Also, the defines make no sense. The compiler will optimize for the cpu's.
Much simpler and not much slower is this:
Code: Pascal  [Select][+][-]
  1. {$mode objfpc}
  2. uses sysutils;
  3.  
  4. function PopCnt(const Buf:Tbytes): integer;overload;inline;
  5. var
  6.   a:byte;
  7. begin
  8.   Result:= 0;
  9.   for a in buf do
  10.      inc(result, popcnt(a));
  11. end;
  12.  
  13. function PopCnt(const Buf;len:cardinal): integer;overload;inline;
  14. var
  15.   a:integer;
  16.   p:PByte;
  17. begin
  18.   p:=@Buf;
  19.   Result:= 0;
  20.    for a := 0 to len-1 do
  21.      inc(result,popcnt(p[a]));
  22. end;
  23.  
  24. var
  25.   test:string = 'ábcdefghij';
  26. begin
  27.   writeln(popcnt(BytesOf(test)));
  28.   writeln(popcnt(test[1],length(test)));
  29. end.
The use case for an optimized version takes a lot of bytes for not much gain.
« Last Edit: October 13, 2025, 02:36:38 pm by Thaddy »
Due to censorship, I changed this to "Nelly the Elephant". Keeps the message clear.

LemonParty

  • Sr. Member
  • ****
  • Posts: 361
Re: PopCnt of a buffer
« Reply #2 on: October 13, 2025, 03:19:36 pm »
No, because the code is not good enough.
Also, the defines make no sense. The compiler will optimize for the cpu's.
Much simpler and not much slower is this:
Code: Pascal  [Select][+][-]
  1. {$mode objfpc}
  2. uses sysutils;
  3.  
  4. function PopCnt(const Buf:Tbytes): integer;overload;inline;
  5. var
  6.   a:byte;
  7. begin
  8.   Result:= 0;
  9.   for a in buf do
  10.      inc(result, popcnt(a));
  11. end;
  12.  
  13. function PopCnt(const Buf;len:cardinal): integer;overload;inline;
  14. var
  15.   a:integer;
  16.   p:PByte;
  17. begin
  18.   p:=@Buf;
  19.   Result:= 0;
  20.    for a := 0 to len-1 do
  21.      inc(result,popcnt(p[a]));
  22. end;
  23.  
  24. var
  25.   test:string = 'ábcdefghij';
  26. begin
  27.   writeln(popcnt(BytesOf(test)));
  28.   writeln(popcnt(test[1],length(test)));
  29. end.
The use case for an optimized version takes a lot of bytes for not much gain.

I just completed a simple benchmark that look like this:
Code: Pascal  [Select][+][-]
  1.   SetLength(ArrA, 1024*1024);
  2.        
  3.   sw.Start;
  4.   i:= PopCnt(ArrA[0], Length(ArrA));
  5.   sw.Stop;
  6.   Writeln(sw.ElapsedTicks);
  7.  
  8.   sw.Reset; sw.Start;
  9.   i:= PopCntS(ArrA[0], Length(ArrA));
  10.   sw.Stop;
  11.   Writeln(sw.ElapsedTicks);
  12.  
PopCntS is your version. I got results:
Quote
12333
15077

12310
15707

12310
14814
Almost 25 % performance boost. Those defines are not useless, compiler can't optimize such code by himself.
Lazarus v. 4.99. FPC v. 3.3.1. Windows 11

runewalsh

  • Full Member
  • ***
  • Posts: 106
Re: PopCnt of a buffer
« Reply #3 on: October 13, 2025, 05:35:22 pm »
Can you please describe the chain of events that led you to popcnting the buffer?

Thaddy

  • Hero Member
  • *****
  • Posts: 18327
  • Here stood a man who saw the Elbe and jumped it.
Re: PopCnt of a buffer
« Reply #4 on: October 13, 2025, 05:44:55 pm »
Yes, curious, especially on larger populations it makes almost no sense.
Due to censorship, I changed this to "Nelly the Elephant". Keeps the message clear.

Thaddy

  • Hero Member
  • *****
  • Posts: 18327
  • Here stood a man who saw the Elbe and jumped it.
Re: PopCnt of a buffer
« Reply #5 on: October 13, 2025, 05:48:10 pm »
Almost 25 % performance boost. Those defines are not useless, compiler can't optimize such code by himself.
You can leave the defines out! test on both CPU types...
Due to censorship, I changed this to "Nelly the Elephant". Keeps the message clear.

LemonParty

  • Sr. Member
  • ****
  • Posts: 361
Re: PopCnt of a buffer
« Reply #6 on: October 13, 2025, 07:05:45 pm »
Can you please describe the chain of events that led you to popcnting the buffer?
I am working with bitpacked arrays like Classes.TBits. There needed this function.

Quote
You can leave the defines out!
Nope. I getting wrong results on 32-bit target without defines.
Lazarus v. 4.99. FPC v. 3.3.1. Windows 11

runewalsh

  • Full Member
  • ***
  • Posts: 106
Re: PopCnt of a buffer
« Reply #7 on: October 13, 2025, 07:27:05 pm »
What would you even need PopCnt for? For instance, if you're using a bit field to count unique elements in a sequence (say, all elements are between 0 and 999, soo 1000-bit bitfield will do), you don't need PopCnt, you just read and write individual bits, and increment a counter each time you changed a bit from 0 to 1.

LemonParty

  • Sr. Member
  • ****
  • Posts: 361
Re: PopCnt of a buffer
« Reply #8 on: October 13, 2025, 07:38:51 pm »
I use it for removing elements in given range. Instead of checking every bit I can take a chunk of bits, and than use PopCnt to find exact amount of setted bits in the chunk, then I zero this chunk and move to next one.
Lazarus v. 4.99. FPC v. 3.3.1. Windows 11

LemonParty

  • Sr. Member
  • ****
  • Posts: 361
Re: PopCnt of a buffer
« Reply #9 on: October 13, 2025, 07:45:09 pm »
Also the same method is used to find the amount of elements that lie in a given range.
Lazarus v. 4.99. FPC v. 3.3.1. Windows 11

 

TinyPortal © 2005-2018