Author Topic: Sorting and Counting  (Read 8438 times)

440bx

  • Hero Member
  • *****
  • Posts: 1267
Re: Sorting and Counting
« Reply #75 on: July 21, 2019, 08:50:36 pm »
Benchmark is compiled with a 32-bit compiler and runs on 64-bit Windows 7.
Thank you.

If it isn't too much trouble, I'd like to see the results when compiled for 64-bit.
using FPC v3.0.4 and Lazarus 1.8.2 on Windows 7 64bit.

VTwin

  • Hero Member
  • *****
  • Posts: 793
  • Former Turbo Pascal 3 user
Re: Sorting and Counting
« Reply #76 on: July 21, 2019, 10:06:38 pm »
I would like to get in on this adventure myself  :D

However, you guys wouldn't like my version because it would involve using Assembler code.  >:D

I'd be interested, as long as it is simple and cross-platform. :D
“Talk is cheap. Show me the code.” -Linus Torvalds

macOS 10.13.6: Lazarus 2.0.7 fixes svn 62300 (64 bit Cocoa)
Ubuntu 18.04.3: Lazarus 2.0.6 (64 bit on VBox)
Windows 7 Pro SP1: Lazarus 2.0.6 (64 bit on VBox)
fpc 3.0.4

ASerge

  • Hero Member
  • *****
  • Posts: 1420
Re: Sorting and Counting
« Reply #77 on: July 22, 2019, 12:04:16 am »
@ASerge, I'm not sure about rtl-generics, but LGenerics is definitely incompatible with FPC 3.0.4.
OK. What about a project with FPC 3.3.1?

avk

  • Full Member
  • ***
  • Posts: 163
    • my self-education project
Re: Sorting and Counting
« Reply #78 on: July 22, 2019, 02:44:36 am »
OK. What about a project with FPC 3.3.1?
And what's wrong with the project for 3.3.1?

@440bx
In the 64-bit version, it seems you won  :):
Code: Text  [Select]
RandomRange = 5
Julkas1's time: 0.7330  #unique: 490717 #total: 2000000
Julkas2's time: 0.7490  #unique: 490717 #total: 2000000
  Akira's time: 0.9050  #unique: 490717 #total: 2000000
 Howard's time: 1.1230  #unique: 490717 #total: 2000000
   Avk1's time: 0.8420  #unique: 490717 #total: 2000000
   Avk2's time: 0.5780  #unique: 490717 #total: 2000000
  440bx's time: 0.4520  #unique: 490717 #total: 2000000

RandomRange = 6
Julkas1's time: 0.7640  #unique: 578502 #total: 2000000
Julkas2's time: 0.7490  #unique: 578502 #total: 2000000
  Akira's time: 0.9510  #unique: 578502 #total: 2000000
 Howard's time: 1.0770  #unique: 578502 #total: 2000000
   Avk1's time: 0.8890  #unique: 578502 #total: 2000000
   Avk2's time: 0.5930  #unique: 578502 #total: 2000000
  440bx's time: 0.5150  #unique: 578502 #total: 2000000

RandomRange = 7
Julkas1's time: 0.7800  #unique: 659891 #total: 2000000
Julkas2's time: 0.7800  #unique: 659891 #total: 2000000
  Akira's time: 0.9830  #unique: 659891 #total: 2000000
 Howard's time: 1.1890  #unique: 659891 #total: 2000000
   Avk1's time: 0.9530  #unique: 659891 #total: 2000000
   Avk2's time: 0.6490  #unique: 659891 #total: 2000000
  440bx's time: 0.4960  #unique: 659891 #total: 2000000

RandomRange = 8
Julkas1's time: 0.7990  #unique: 734164 #total: 2000000
Julkas2's time: 0.8200  #unique: 734164 #total: 2000000
  Akira's time: 1.0450  #unique: 734164 #total: 2000000
 Howard's time: 1.2610  #unique: 734164 #total: 2000000
   Avk1's time: 1.1300  #unique: 734164 #total: 2000000
   Avk2's time: 0.6810  #unique: 734164 #total: 2000000
  440bx's time: 0.4940  #unique: 734164 #total: 2000000

RandomRange = 9
Julkas1's time: 0.9130  #unique: 802348 #total: 2000000
Julkas2's time: 1.0760  #unique: 802348 #total: 2000000
  Akira's time: 1.3740  #unique: 802348 #total: 2000000
 Howard's time: 1.2440  #unique: 802348 #total: 2000000
   Avk1's time: 1.0330  #unique: 802348 #total: 2000000
   Avk2's time: 0.6590  #unique: 802348 #total: 2000000
  440bx's time: 0.5400  #unique: 802348 #total: 2000000

RandomRange = 10
Julkas1's time: 0.8330  #unique: 864249 #total: 2000000
Julkas2's time: 0.8140  #unique: 864249 #total: 2000000
  Akira's time: 1.1120  #unique: 864249 #total: 2000000
 Howard's time: 1.1760  #unique: 864249 #total: 2000000
   Avk1's time: 1.0670  #unique: 864249 #total: 2000000
   Avk2's time: 0.6760  #unique: 864249 #total: 2000000
  440bx's time: 0.5330  #unique: 864249 #total: 2000000

repeatMillionsCount = 10
Julkas1's time: 4.4350  #unique: 5709324 #total: 10000000
Julkas2's time: 4.4840  #unique: 5709324 #total: 10000000
  Akira's time: 6.5020  #unique: 5709324 #total: 10000000
 Howard's time: 6.8220  #unique: 5709324 #total: 10000000
   Avk1's time: 6.0550  #unique: 5709324 #total: 10000000
   Avk2's time: 4.4340  #unique: 5709324 #total: 10000000
  440bx's time: 3.8650  #unique: 5709324 #total: 10000000

repeatMillionsCount = 12
Julkas1's time: 5.2520  #unique: 6216581 #total: 12000000
Julkas2's time: 5.3630  #unique: 6216581 #total: 12000000
  Akira's time: 7.6030  #unique: 6216581 #total: 12000000
 Howard's time: 7.6950  #unique: 6216581 #total: 12000000
   Avk1's time: 7.4680  #unique: 6216581 #total: 12000000
   Avk2's time: 5.1700  #unique: 6216581 #total: 12000000
  440bx's time: 4.4730  #unique: 6216581 #total: 12000000

repeatMillionsCount = 14
Julkas1's time: 5.9630  #unique: 6609319 #total: 14000000
Julkas2's time: 5.9450  #unique: 6609319 #total: 14000000
  Akira's time: 8.7800  #unique: 6609319 #total: 14000000
 Howard's time: 8.9000  #unique: 6609319 #total: 14000000
   Avk1's time: 8.1640  #unique: 6609319 #total: 14000000
   Avk2's time: 5.8020  #unique: 6609319 #total: 14000000
  440bx's time: 5.0030  #unique: 6609319 #total: 14000000

repeatMillionsCount = 16
Julkas1's time: 6.6950  #unique: 6917359 #total: 16000000
Julkas2's time: 6.6010  #unique: 6917359 #total: 16000000
  Akira's time: 9.5630  #unique: 6917359 #total: 16000000
 Howard's time: 9.8050  #unique: 6917359 #total: 16000000
   Avk1's time: 8.9960  #unique: 6917359 #total: 16000000
   Avk2's time: 6.3830  #unique: 6917359 #total: 16000000
  440bx's time: 5.5600  #unique: 6917359 #total: 16000000

repeatMillionsCount = 18
Julkas1's time: 7.2430  #unique: 7157162 #total: 18000000
Julkas2's time: 7.2440  #unique: 7157162 #total: 18000000
  Akira's time: 10.1980 #unique: 7157162 #total: 18000000
 Howard's time: 11.2310 #unique: 7157162 #total: 18000000
   Avk1's time: 9.6480  #unique: 7157162 #total: 18000000
   Avk2's time: 7.1490  #unique: 7157162 #total: 18000000
  440bx's time: 5.9770  #unique: 7157162 #total: 18000000

repeatMillionsCount = 20
Julkas1's time: 8.1910  #unique: 7343071 #total: 20000000
Julkas2's time: 7.9810  #unique: 7343071 #total: 20000000
  Akira's time: 11.0300 #unique: 7343071 #total: 20000000
 Howard's time: 12.1460 #unique: 7343071 #total: 20000000
   Avk1's time: 10.5860 #unique: 7343071 #total: 20000000
   Avk2's time: 7.9840  #unique: 7343071 #total: 20000000
  440bx's time: 6.5130  #unique: 7343071 #total: 20000000

Please excuse me, but for some time I will not be able to visit the forum.

440bx

Re: Sorting and Counting
« Reply #79 on: July 22, 2019, 03:04:54 am »
@440bx
In the 64-bit version, it seems you won  :):
but only in 64-bit. In 32-bit, you win :). The cost of pushing and popping parameters is just too high.

ASerge

Re: Sorting and Counting
« Reply #80 on: July 22, 2019, 12:09:02 pm »
And what's wrong with the project for 3.3.1?
Attach it, please.

julkas

  • Sr. Member
  • ****
  • Posts: 430
  • KISS principle / Lazarus 2.0.0 / FPC 3.0.4
Re: Sorting and Counting
« Reply #81 on: July 23, 2019, 02:57:05 pm »
I replaced the fcl-stl TVector with a generics.collections TList in my algo. I don't know why TList gives such poor performance.
So my fcl-stl TVector solution is better than Akira's generics.collections TDictionary and better than generics.collections TList.
Strange... Who can explain this? :-(

Also generics.collections is way faster and more modern ...
Is that true?
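For anyone wanting to reproduce the comparison, here is a minimal micro-benchmark sketch. It is an illustration under stated assumptions, not the poster's actual code: it assumes an FPC version where both fcl-stl's gvector unit and Generics.Collections are available, and N is an arbitrary size roughly matching the thread's data sets.

```pascal
{$mode delphi}
program VecVsList;
// Hypothetical micro-benchmark: append N integers to an fcl-stl TVector
// and to a Generics.Collections TList, timing each loop separately.
uses
  SysUtils, gvector, Generics.Collections;
const
  N = 2000000; // arbitrary size, roughly matching the thread's data sets
var
  v: TVector<Integer>;
  l: TList<Integer>;
  i: Integer;
  t0: QWord;
begin
  v := TVector<Integer>.Create;
  t0 := GetTickCount64;
  for i := 1 to N do
    v.PushBack(i);
  WriteLn('TVector.PushBack: ', GetTickCount64 - t0, ' ms');
  v.Free;

  l := TList<Integer>.Create;
  t0 := GetTickCount64;
  for i := 1 to N do
    l.Add(i);
  WriteLn('TList.Add:        ', GetTickCount64 - t0, ' ms');
  l.Free;
end.
```

Isolating the container operations this way at least shows whether the slowdown is in the container itself or elsewhere in the algorithm.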
« Last Edit: July 23, 2019, 03:40:05 pm by julkas »
procedure mulu64(a, b: QWORD; out clo, chi: QWORD); assembler;
asm
  mov rax, a
  mov rdx, b
  mul rdx
  mov [clo], rax
  mov [chi], rdx
end;

Thaddy

  • Hero Member
  • *****
  • Posts: 9279
Re: Sorting and Counting
« Reply #82 on: July 23, 2019, 06:58:52 pm »
Usually, yes, but mind the remarks I made in the other thread: it is about some internals. In this thread somebody used those low-level adjustments. See if you can spot who did... :-X
The uses clause is usually a dead giveaway.... Analyze the inheritance...
« Last Edit: July 23, 2019, 07:01:43 pm by Thaddy »
also related to equus asinus.

julkas

Re: Sorting and Counting
« Reply #83 on: July 23, 2019, 08:32:25 pm »
Usually, yes, but mind the remarks I made in the other thread: it is about some internals. In this thread somebody used those low-level adjustments. See if you can spot who did... :-X
The uses clause is usually a dead giveaway.... Analyze the inheritance...
In this case I don't want low-level tricks, fast I/O, and so on. I want a solution that is as short, clean, and fast as possible, using well-known data structures (classes) and algorithms. So I am trying to understand the pros and cons of the different Pascal generics implementations.
In real life I can't even use fcl-stl. 80% of my Pascal code (data structures, algorithms) is written from scratch.

440bx

Re: Sorting and Counting
« Reply #84 on: July 23, 2019, 09:10:19 pm »
I want a solution that is as short, clean, and fast as possible, using well-known data structures (classes) and algorithms.
You can legitimately say that about the algorithms you used to solve the problem. I _cannot_ say that about the implementations I presented; I traded cleanliness for speed.

At least theoretically, it seems an optimized version (still clean) of Howard's customized radix sort should usually be the fastest.  Its downside is that, in some cases, it can take far more memory than desirable.

When everything is taken into account, I believe Avk implementation number 2 is the best one. 
using FPC v3.0.4 and Lazarus 1.8.2 on Windows 7 64bit.

VTwin

Re: Sorting and Counting
« Reply #85 on: July 23, 2019, 09:43:56 pm »
In this case I don't want low-level tricks, fast I/O, and so on. I want a solution that is as short, clean, and fast as possible, using well-known data structures (classes) and algorithms. So I am trying to understand the pros and cons of the different Pascal generics implementations.
In real life I can't even use fcl-stl. 80% of my Pascal code (data structures, algorithms) is written from scratch.

I don't have (ready) access to a generics library, so I just replaced one line of code in Avk2. It's not a speed record or a new suggestion, but it runs pretty quickly. I believe Avk2 uses an introsort.

Code: Pascal  [Select]
procedure Quicksort(var a: IVector; left, right: integer);
var
  i, j: integer;
  x: integer;
begin
  if (right <= left) or (left < 0) then
    exit;
  if (right - left + 1) < MinQSortElem then begin
    Insertionsort(a, left, right);
    exit;
  end;
  i := left;
  j := right;
  x := a[(left + right) div 2];
  repeat
    while (Compare(a[i], x) = -1) do
      i := i + 1;
    while (Compare(x, a[j]) = -1) do
      j := j - 1;
    if i <= j then begin
      Swap(a[i], a[j]);
      i := i + 1;
      j := j - 1;
    end;
  until i > j;
  if left < j then
    Quicksort(a, left, j);
  if i < right then
    Quicksort(a, i, right);
end;

Of course the Insertionsort is unnecessary here, and Compare can be replaced with "<".
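As a sketch of that simplification, here is an illustrative self-contained rewrite (not the thread's actual unit): a plain dynamic array of Integer, "<" in place of Compare, the Swap inlined, and the insertion-sort cutoff dropped.

```pascal
{$mode objfpc}
program QSDemo;
// Middle-pivot quicksort on a plain Integer array, with Compare
// replaced by "<" and Swap inlined, as suggested above.
type
  TIntArray = array of Integer;

procedure Quicksort(var a: TIntArray; left, right: Integer);
var
  i, j, x, t: Integer;
begin
  if right <= left then
    Exit;
  i := left;
  j := right;
  x := a[(left + right) div 2]; // middle element as pivot
  repeat
    while a[i] < x do Inc(i);
    while x < a[j] do Dec(j);
    if i <= j then
    begin
      t := a[i]; a[i] := a[j]; a[j] := t; // swap in place
      Inc(i);
      Dec(j);
    end;
  until i > j;
  if left < j then Quicksort(a, left, j);
  if i < right then Quicksort(a, i, right);
end;

const
  src: array[0..5] of Integer = (5, 3, 9, 1, 7, 3);
var
  a: TIntArray;
  i: Integer;
begin
  SetLength(a, Length(src));
  for i := 0 to High(src) do
    a[i] := src[i];
  Quicksort(a, 0, High(a));
  for i := 0 to High(a) do
    Write(a[i], ' ');   // prints: 1 3 3 5 7 9
  WriteLn;
end.
```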
« Last Edit: July 23, 2019, 10:00:59 pm by VTwin »

ASerge

Re: Sorting and Counting
« Reply #86 on: July 23, 2019, 09:59:13 pm »
At least theoretically, it seems an optimized version (still clean) of Howard's customized radix sort should usually be the fastest. 
+1
That's why the results seem strange to me: Howard's algorithm is O(n), while the algorithms based on sorting are O(n*lg(n)). But it is impossible to check, since @avk did not provide the project.

howardpc

  • Hero Member
  • *****
  • Posts: 3201
Re: Sorting and Counting
« Reply #87 on: July 23, 2019, 11:30:03 pm »
I want a solution that is as short, clean, and fast as possible, using well-known data structures (classes) and algorithms.
At least theoretically, it seems an optimized version (still clean) of Howard's customized radix sort should usually be the fastest.  Its downside is that, in some cases, it can take far more memory than desirable.
Replacing the Format() in my original implementation with a simple Writeln() gives a slight speed increase.
However, my implementation has two drawbacks apart from the initially high memory usage:
  • it requires two passes over the data (because the min and max values must be found in order to dimension the array)
  • if the spread of values is high, or the scattering of values is particularly sparse, memory usage grows very large - so its performance and memory usage are very data-dependent
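The two-pass scheme those points describe can be sketched in miniature like this (a hypothetical reduction that reads from an in-memory array rather than the thread's data file):

```pascal
{$mode objfpc}
program CountSketch;
// Pass 1 finds min and max; pass 2 tallies values into buckets.
// The bucket array needs (max - min + 1) elements, which is why
// memory usage depends on the spread of the data, not its size.
var
  data: array[0..9] of Integer = (5, 3, 5, 9, 3, 3, 7, 9, 5, 1);
  counts: array of Integer;
  i, mn, mx, unique: Integer;
begin
  mn := data[0];
  mx := data[0];
  for i := 1 to High(data) do          // pass 1: find the range
  begin
    if data[i] < mn then mn := data[i];
    if data[i] > mx then mx := data[i];
  end;
  SetLength(counts, mx - mn + 1);      // zero-filled by SetLength
  for i := 0 to High(data) do          // pass 2: count occurrences
    Inc(counts[data[i] - mn]);
  unique := 0;
  for i := 0 to High(counts) do
    if counts[i] > 0 then
    begin
      WriteLn(i + mn, ' - ', counts[i]);
      Inc(unique);
    end;
  WriteLn('#unique: ', unique, ' #total: ', Length(data));
  // prints the five distinct values with their counts, then
  // #unique: 5 #total: 10
end.
```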


440bx

Re: Sorting and Counting
« Reply #88 on: July 23, 2019, 11:30:22 pm »
That's why the results seem strange to me: Howard's algorithm is O(n), while the algorithms based on sorting are O(n*lg(n)). But it is impossible to check, since @avk did not provide the project.
They do look a bit strange until you carefully analyze the algorithm's implementation and the data it has to handle.

Here are some of the costs his algorithm incurs, some of which could be avoided and others simply can't.
Code: Pascal  [Select]
    min: Integer = High(Integer);
    max: Integer = -1;
  begin
    routineName := {$I %currentroutine%};
    AssignFile(textf, inFilename);
    Reset(textf);
    while not EOF(textf) do
    begin
      ReadLn(textf, i);
      Inc(Total);
      if i < min then
        min := i;
      if i > max then
        max := i;
    end;
He has to walk the entire data file to determine min and max.  If, in the snippet above, both min and max had been initialized to the first element in the data file, then it would not always be necessary to do two comparisons (one against min and another against max).  It could have been coded instead as
Code: Pascal  [Select]
        if i < min then
        begin
          min := i;
          continue;
        end;
        if i > max then
          max := i;
That way, for elements lower than the current min, the now-unnecessary second comparison against max is skipped.  The gains would, of course, depend on the file's data arrangement.

The instruction
Code: Pascal  [Select]
    SetLength(arr, max-min+1);
is very expensive: it has to set all (max - min + 1) elements to zero, and that count - given the wide range of the values in the data file - is significantly greater than the number of elements in the file.

Directly related to the above,
Code: Pascal  [Select]
      case (arr[i] > 0) of
        True:
          begin
            WriteLn(textf, i+min, ' - ', arr[i]);
            Inc(Unique);
          end;
      end;
The number of comparisons needed to "weed out" the empty buckets (= 0) is large, again because of the wide min-max range.

The greater number of comparisons caused by the data range, together with the need to zero out a large memory area, is enough to nullify the gains of his O(n) algorithm.
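To put rough numbers on that trade-off, here is a back-of-the-envelope cost-model sketch. The figures are hypothetical and constants, cache effects, and the per-byte cost of zeroing are all ignored, so it only illustrates the shape of the argument, not actual timings:

```pascal
{$mode objfpc}
program CostSketch;
// Counting sort touches roughly n + k cells, where k = max - min + 1
// is the number of buckets; a comparison sort does roughly n * lg(n)
// comparisons. When k dwarfs n, the O(n) method loses its edge.
uses
  Math;

procedure Show(n, k: Int64);
begin
  WriteLn('n=', n, ' k=', k,
          '  counting ~ ', n + k,
          '  n*lg(n) ~ ', Round(n * Log2(n)));
end;

begin
  Show(2000000, 10000000);    // modest range: counting looks cheap
  Show(2000000, 2000000000);  // huge range: touching k buckets dominates
end.
```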


ETA:

@Howard

Our posts crossed.  You are absolutely right.
« Last Edit: July 23, 2019, 11:44:47 pm by 440bx »

ASerge

Re: Sorting and Counting
« Reply #89 on: July 23, 2019, 11:45:06 pm »
They do look a bit strange until you carefully analyze the algorithm's implementation and the data it has to handle.
But for large n it is better. And the algorithm can be improved: since a lot of memory is allocated anyway and we know the data format, it is better to set the minimum and maximum as constants.