Recent

Author Topic: Sorting and Counting  (Read 8010 times)

440bx

  • Hero Member
  • *****
  • Posts: 1200
Re: Sorting and Counting
« Reply #45 on: July 18, 2019, 08:25:35 pm »
I imagine he used the original "20MB" code of his from an earlier comment for that one.
I will use that. Thank you Akira.
using FPC v3.0.4 and Lazarus 1.8.2 on Windows 7 64bit.

avk

  • Full Member
  • ***
  • Posts: 154
    • my self-education project
Re: Sorting and Counting
« Reply #46 on: July 18, 2019, 08:30:02 pm »
What do you get if you change the ...

just commented:
Code: Text  [Select]
  1. running SortCountAkira:
  2. elapsed time: 1.3890
  3. #total: 1750000, #unique: 710248
  4.  
  5. running SortCountHoward:
  6. elapsed time: 1.5600
  7. #total: 1750000, #unique: 710248
  8.  
  9. running SortCountAvk1:
  10. elapsed time: 0.9980
  11. #total: 1750000, #unique: 710248
  12.  
  13. running SortCountAvk2:
  14. elapsed time: 0.7490
  15. #total: 1750000, #unique: 710248
  16.  
  17. running SortCountJulkas:
  18. elapsed time: 2.6210
  19. #total: 1750000, #unique: 710248
  20.  
  21. running SortCountMangakissa:
  22. elapsed time: 342.3730
  23. #total: 1750000, #unique: 710248
  24.  

Map.Capacity := 1750000:
Code: Text  [Select]
  1. running SortCountAkira:
  2. elapsed time: 1.4040
  3. #total: 1750000, #unique: 710248
  4.  
  5. running SortCountHoward:
  6. elapsed time: 1.6130
  7. #total: 1750000, #unique: 710248
  8.  
  9. running SortCountAvk1:
  10. elapsed time: 1.0440
  11. #total: 1750000, #unique: 710248
  12.  
  13. running SortCountAvk2:
  14. elapsed time: 0.7430
  15. #total: 1750000, #unique: 710248
  16.  
  17. running SortCountJulkas:
  18. elapsed time: 2.7800
  19. #total: 1750000, #unique: 710248
  20.  

Also FWIW, I'm quite sure that for Mangakissa's version, it's this part ...
It also seems that the array search has a quadratic complexity.

@440bx
These results was for 20MB file because of SortCountMangakissa.

For current version Akira's InputGenerator:
Code: Text  [Select]
  1. running SortCountAkira:
  2. elapsed time: 5.0910
  3. #total: 10000000, #unique: 799999
  4.  
  5. running SortCountHoward:
  6. elapsed time: 6.3260
  7. #total: 10000000, #unique: 799999
  8.  
  9. running SortCountAvk1:
  10. elapsed time: 4.1010
  11. #total: 10000000, #unique: 799999
  12.  
  13. running SortCountAvk2:
  14. elapsed time: 3.0760
  15. #total: 10000000, #unique: 799999
  16.  
  17. running SortCountJulkas:
  18. elapsed time: 14.3140
  19. #total: 10000000, #unique: 799999
  20.  
« Last Edit: July 19, 2019, 04:12:26 am by avk »

mpknap

  • Jr. Member
  • **
  • Posts: 92
Re: Sorting and Counting
« Reply #47 on: July 18, 2019, 08:41:06 pm »
Gentlemen, I do not have time to deal with testing your algorithms. I will deal with Monday, on holidays;).
I am also glad that the topic is interesting.

Yesterday I wrote that the Engkin function has been working for an hour. She worked longer until the night was blue screen because the laptop was reset. But the fault was the TXT file. There was an empty line between the lines with the time Stamp. Today I made a TXT file without empty lines and after a few minutes the Out.txt file appeared.
I'm happy because that's what I meant :)

Soon I will check your other ideas. I was unsure that I will get such an encyclopedia from you;)

440bx

  • Hero Member
  • *****
  • Posts: 1200
Re: Sorting and Counting
« Reply #48 on: July 18, 2019, 08:54:57 pm »
@440bx
For current version Akira's InputGenerator:
Excellent!.  Thank you Avk.
using FPC v3.0.4 and Lazarus 1.8.2 on Windows 7 64bit.

howardpc

  • Hero Member
  • *****
  • Posts: 3181
Re: Sorting and Counting
« Reply #49 on: July 19, 2019, 12:14:51 am »
To see what effect varying the size of the data file, and varying the number of unique values to be counted might have on the faster algorithms presented here I adapted Akira1364's program as follows.
Code: Pascal  [Select]
  1. program OccurrenceCounter;
  2.  
  3. {$mode delphi}
  4. {$ImplicitExceptions Off}
  5. {$MODESWITCH NESTEDPROCVARS}
  6.  
  7. uses
  8.   SysUtils, DateUtils,
  9.   Generics.Defaults, Generics.Collections,
  10.   LGUtils, LGHashMultiSet, LGArrayHelpers,
  11.   gutil, gmap;
  12.  
  13. type
  14.   TIntPair = TPair<LongInt, LongInt>;
  15.   TProcedureArray = array of procedure;
  16.  
  17.   function ComparePairs(constref L, R: TIntPair): LongInt;
  18.   begin
  19.     if L.Key < R.Key then
  20.       Result := -1
  21.     else if L.Key = R.Key then
  22.       Result := 0
  23.     else
  24.       Result := 1;
  25.   end;
  26.  
  27. var
  28.   Total, Unique, repeatCount, randomrange: Integer;
  29.   Start: TDateTime;
  30.   inFilename: String = 'data.txt';
  31.   outFilename: String = 'sorted.txt';
  32.   routineName: String;
  33.   procedures: TProcedureArray;
  34.   proc: procedure;
  35.  
  36.   procedure GenerateData(randomRange: Integer=8; repeatMillionsCount: Integer=2);
  37.   var
  38.     InFile: Text;
  39.     I: LongInt;
  40.   begin
  41.     Assign(InFile, inFilename);
  42.     Rewrite(InFile);
  43.     for I := 1 to repeatMillionsCount * 1000000 do
  44.       WriteLn(InFile, 1500000000 + Random(randomRange * 100000));
  45.     Close(InFile);
  46.   end;
  47.  
  48.   procedure SortCountAkira;
  49.   var
  50.     I: LongInt;
  51.     InOut: Text;
  52.     Map: TDictionary<LongInt, LongInt>;
  53.     Pair: TIntPair;
  54.     Pairs: TArray<TIntPair>;
  55.   begin
  56.     routineName := {$I %currentroutine%};
  57.     Map := TDictionary<LongInt, LongInt>.Create();
  58.     Map.Capacity := 10000000;
  59.     Assign(InOut, inFilename);
  60.     Reset(InOut);
  61.     while not EOF(InOut) do begin
  62.       ReadLn(InOut, I);
  63.       Inc(Total);
  64.       if not Map.ContainsKey(I) then
  65.         begin
  66.           Map.Add(I, 1);
  67.           Inc(Unique);
  68.         end
  69.       else
  70.         Map[I] := Map[I] + 1;
  71.     end;
  72.     Close(InOut);
  73.     Pairs := Map.ToArray();
  74.     Map.Free();
  75.     TArrayHelper<TIntPair>.Sort(
  76.       Pairs,
  77.       TComparer<TIntPair>.Construct(ComparePairs)
  78.     );
  79.     Assign(InOut, outFilename);
  80.     Rewrite(InOut);
  81.     for Pair in Pairs do with Pair do
  82.       WriteLn(InOut, Key, ' - ', Value);
  83.     Close(InOut);
  84.   end;
  85.  
  86.   procedure SortCountHoward;
  87.   var
  88.     arr: array of Integer;
  89.     textf: TextFile;
  90.     min: Integer = High(Integer);
  91.     max: Integer = -1;
  92.     i: Integer;
  93.   begin
  94.     routineName := {$I %currentroutine%};
  95.     AssignFile(textf, inFilename);
  96.     Reset(textf);
  97.     while not EOF(textf) do
  98.       begin
  99.         ReadLn(textf, i);
  100.         Inc(Total);
  101.         if i < min then
  102.           min := i;
  103.         if i > max then
  104.           max := i;
  105.       end;
  106.     SetLength(arr, max-min+1);
  107.  
  108.     Reset(textf);
  109.     while not EOF(textf) do
  110.       begin
  111.         ReadLn(textf, i);
  112.         Dec(i, min);
  113.         Inc(arr[i]);
  114.       end;
  115.     CloseFile(textf);
  116.  
  117.     AssignFile(textf, outFilename);
  118.     Rewrite(textf);
  119.     for i := Low(arr) to High(arr) do
  120.       case (arr[i] > 0) of
  121.         True:
  122.           begin
  123.             WriteLn(textf, i+min, ' - ', arr[i]);
  124.             Inc(Unique);
  125.           end;
  126.       end;
  127.     CloseFile(textf);
  128.     SetLength(arr, 0);
  129.   end;
  130.  
  131.   procedure SortCountAvk1;
  132.   type
  133.     TCounter  = TGHashMultiSetLP<Integer>;
  134.     TCountRef = TGAutoRef<TCounter>;
  135.     TEntry    = TCounter.TEntry;
  136.  
  137.     function EntryCmp(constref L, R: TEntry): SizeInt;
  138.     begin
  139.       if L.Key > R.Key then
  140.         Result := 1
  141.       else
  142.         if L.Key < R.Key then
  143.           Result := -1
  144.         else
  145.           Result := 0;
  146.     end;
  147.  
  148.   var
  149.     CountRef: TCountRef;
  150.     InOut: Text;
  151.     Counter: TCounter;
  152.     e: TEntry;
  153.     I: Integer;
  154.   begin
  155.     routineName := {$I %currentroutine%};
  156.     Counter := CountRef;
  157.     Counter.LoadFactor := 0.7;
  158.     Assign(InOut, inFilename);
  159.     Reset(InOut);
  160.     while not EOF(InOut) do
  161.       begin
  162.         ReadLn(InOut, I);
  163.         Counter.Add(I);
  164.       end;
  165.     Close(InOut);
  166.     Total := Counter.Count;
  167.     Unique := Counter.EntryCount;
  168.     if Counter.NonEmpty then
  169.       begin
  170.         Assign(InOut, outFilename);
  171.         Rewrite(InOut);
  172.         for e in Counter.Entries.Sorted(EntryCmp) do
  173.           with e do
  174.             WriteLn(InOut, Key, ' - ', Count);
  175.         Close(InOut);
  176.       end;
  177.   end;
  178.  
  179.   procedure SortCountAvk2;
  180.   var
  181.     List: array of Integer;
  182.     InOut: Text;
  183.     I, J, Count, DupCount: Integer;
  184.   begin
  185.     routineName := {$I %currentroutine%};
  186.     Assign(InOut, inFilename);
  187.     Reset(InOut);
  188.     SetLength(List, 4096);
  189.     I := 0;
  190.     while not EOF(InOut) do
  191.       begin
  192.         ReadLn(InOut, J);
  193.         Inc(Total);
  194.         if Length(List) = I then
  195.           SetLength(List, I * 2);
  196.         List[I] := J;
  197.         Inc(I);
  198.       end;
  199.     Close(InOut);
  200.     SetLength(List, I);
  201.     if List = nil then
  202.       exit;
  203.     TGOrdinalArrayHelper<Integer>.Sort(List);
  204.     Count := I;
  205.     DupCount := 0;
  206.     I := 0;
  207.     Assign(InOut, outFilename);
  208.     Rewrite(InOut);
  209.     repeat
  210.       J := List[I];
  211.       while (I < Count) and (List[I] = J) do
  212.         begin
  213.           Inc(DupCount);
  214.           Inc(I);
  215.         end;
  216.       WriteLn(InOut, J, ' - ', DupCount);
  217.       Inc(Unique);
  218.       DupCount := 0;
  219.     until I = Count;
  220.     Close(InOut);
  221.   end;
  222.  
  223.   procedure SortCountJulkas;
  224.   type
  225.     TIntLess = TLess<LongInt>;
  226.     TDict = TMap<LongInt, LongInt, TIntLess>;
  227.   var
  228.     sc: TDict;
  229.     scit: TDict.TIterator;
  230.     InOut: Text;
  231.     key, cnt: LongInt;
  232.   begin
  233.     routineName := {$I %currentroutine%};
  234.     sc := TDict.Create;
  235.     Assign(InOut, inFilename);
  236.     Reset(InOut);
  237.     while not EOF(InOut) do
  238.       begin
  239.         ReadLn(InOut, key);
  240.         Inc(Total);
  241.         cnt := 0;
  242.         sc.TryGetValue(key, cnt);
  243.         sc[key] := cnt + 1;
  244.       end;
  245.     Close(InOut);
  246.     Unique := sc.Size;
  247.     if Unique > 0 then
  248.       begin
  249.         Assign(InOut, outFilename);
  250.         Rewrite(InOut);
  251.         scit := sc.Min;
  252.         repeat
  253.           WriteLn(InOut, scit.Key, ' - ', scit.Value);
  254.         until not scit.Next;
  255.         Close(InOut);
  256.         scit.Free;
  257.       end;
  258.     sc.Free;
  259.   end;
  260.  
  261.   procedure Run(aProc: TProcedure);
  262.   begin
  263.     Total := 0;
  264.     Unique := 0;
  265.     Start := Now;
  266.     try
  267.       aProc();
  268.       WriteLn(Copy(routineName, 10, 20):7,'''s time: ', MilliSecondsBetween(Now(), Start) / 1000.0 : 0 : 4,' #unique: ',Unique,' #total: ',Total);
  269.     except
  270.       on e: Exception do
  271.         WriteLn('crashes with message "', e.Message, '"');
  272.     end;
  273.   end;
  274.  
  275. begin
  276.   Randomize;
  277.  
  278.   procedures := TProcedureArray.Create(SortCountJulkas, SortCountAkira, SortCountHoward, SortCountAvk1, SortCountAvk2);
  279.  
  280.   for randomrange := 1 to 10 do
  281.     begin
  282.       GenerateData(randomrange);
  283.       WriteLn(#10'RandomRange = ',randomrange);
  284.       for proc in procedures do
  285.         Run(proc);
  286.     end;
  287.  
  288.   for repeatCount := 1 to 10 do
  289.     begin
  290.       GenerateData(8, 2*repeatCount);
  291.       WriteLn(#10'repeatMillionsCount = ', 2*repeatCount);
  292.       for proc in procedures do
  293.         Run(proc);
  294.     end;
  295. end.
Some typical output is as follows:
Code: Pascal  [Select]
  1. repeatMillionsCount = 16
  2.  Julkas's time: 15.8210 #unique: 800000 #total: 16000000
  3.  Akira's time: 4.7230 #unique: 800000 #total: 16000000
  4.  Howard's time: 4.4930 #unique: 800000 #total: 16000000
  5.   Avk1's time: 3.9390 #unique: 800000 #total: 16000000
  6.    Avk2's time: 2.4480 #unique: 800000 #total: 16000000
  7.  
  8. repeatMillionsCount = 18
  9. Julkas's time: 17.4640 #unique: 800000 #total: 18000000
  10.   Akira's time: 5.2670 #unique: 800000 #total: 18000000
  11. Howard's time: 5.0560 #unique: 800000 #total: 18000000
  12.    Avk1's time: 4.4020 #unique: 800000 #total: 18000000
  13.   Avk2's time: 2.7890 #unique: 800000 #total: 18000000
  14.  
  15. repeatMillionsCount = 20
  16.  Julkas's time: 19.2770 #unique: 800000 #total: 20000000
  17.  Akira's time: 5.7910 #unique: 800000 #total: 20000000
  18.  Howard's time: 5.6010 #unique: 800000 #total: 20000000
  19.   Avk1's time: 4.8510 #unique: 800000 #total: 20000000
  20.    Avk2's time: 3.0690 #unique: 800000 #total: 20000000
For avk to design an algorithm that can analyse a 120 MB text file of 20,000,000 items, sort and count its data and and write out the result to a second file in 3 seconds on my ageing machine is pretty impressive.

mangakissa

  • Hero Member
  • *****
  • Posts: 944
Re: Sorting and Counting
« Reply #50 on: July 19, 2019, 08:36:47 am »
Can this be done with {$mode objfpc}?
Lazarus 1.84 (32b) / FPC 3.0.4
Windows 10

Thaddy

  • Hero Member
  • *****
  • Posts: 9187
Re: Sorting and Counting
« Reply #51 on: July 19, 2019, 09:20:20 am »
Can this be done with {$mode objfpc}?
Obviously, yes!, See Howards comments above.
also related to equus asinus.

avk

  • Full Member
  • ***
  • Posts: 154
    • my self-education project
Re: Sorting and Counting
« Reply #52 on: July 19, 2019, 09:27:31 am »
@howardpc, nice, very suitable.
Heh, highly likely my machine is even older than yours(comparing runtimes).
But as for my "algorithm", it seems that you are flattering me, it's just sorting an array and then counting duplicates.
Array helpers do about the same to extract unique(distinct) values.
Your counting sort does the same thing, only at the same time.
I'm more glad for the multiset, it seems it shows itself well.

@mangakissa, sure, just changed my version:
Code: Pascal  [Select]
  1. program OccurrenceCounter;
  2.  
  3. {$mode objfpc}{$H+}
  4. {$ImplicitExceptions Off}
  5. {$MODESWITCH NESTEDPROCVARS}
  6.  
  7. uses
  8.   SysUtils, DateUtils,
  9.   Generics.Defaults, Generics.Collections,
  10.   LGUtils, LGHashMultiSet, LGArrayHelpers,
  11.   gutil, gmap;
  12.  
  13. type
  14.   TIntPair = specialize TPair<LongInt, LongInt>;
  15.  
  16.   function ComparePairs(constref L, R: TIntPair): LongInt;
  17.   begin
  18.     if L.Key < R.Key then
  19.       Result := -1
  20.     else if L.Key = R.Key then
  21.       Result := 0
  22.     else
  23.       Result := 1;
  24.   end;
  25.  
  26. var
  27.   Total, Unique: Integer;
  28.   Start: TDateTime;
  29.  
  30.   procedure SortCountAkira;
  31.   var
  32.     I: LongInt;
  33.     InOut: Text;
  34.     Map: specialize TDictionary<LongInt, LongInt>;
  35.     Pair: TIntPair;
  36.     Pairs: specialize TArray<TIntPair>;
  37.   begin
  38.     Map := specialize TDictionary<LongInt, LongInt>.Create();
  39.     Map.Capacity := 10000000;
  40.     Assign(InOut, ParamStr(1));
  41.     Reset(InOut);
  42.     while not EOF(InOut) do begin
  43.       ReadLn(InOut, I);
  44.       Inc(Total);
  45.       if not Map.ContainsKey(I) then
  46.         begin
  47.           Map.Add(I, 1);
  48.           Inc(Unique);
  49.         end
  50.       else
  51.         Map[I] := Map[I] + 1;
  52.     end;
  53.     Close(InOut);
  54.     Pairs := Map.ToArray();
  55.     Map.Free();
  56.     specialize TArrayHelper<TIntPair>.Sort(
  57.       Pairs,
  58.       specialize TComparer<TIntPair>.Construct(@ComparePairs)
  59.     );
  60.     Assign(InOut, ParamStr(2));
  61.     Rewrite(InOut);
  62.     for Pair in Pairs do with Pair do
  63.       WriteLn(InOut, Key, ' - ', Value);
  64.     Close(InOut);
  65.   end;
  66.  
  67.   procedure SortCountHoward;
  68.   var
  69.     arr: array of Integer;
  70.     textf: TextFile;
  71.     min: Integer = High(Integer);
  72.     max: Integer = -1;
  73.     i: Integer;
  74.   begin
  75.     AssignFile(textf, ParamStr(1));
  76.     Reset(textf);
  77.     while not EOF(textf) do
  78.       begin
  79.         ReadLn(textf, i);
  80.         Inc(Total);
  81.         if i < min then
  82.           min := i;
  83.         if i > max then
  84.           max := i;
  85.       end;
  86.     SetLength(arr, max-min+1);
  87.  
  88.     Reset(textf);
  89.     while not EOF(textf) do
  90.       begin
  91.         ReadLn(textf, i);
  92.         Dec(i, min);
  93.         Inc(arr[i]);
  94.       end;
  95.     CloseFile(textf);
  96.  
  97.     AssignFile(textf, ParamStr(2));
  98.     Rewrite(textf);
  99.     for i := Low(arr) to High(arr) do
  100.       case (arr[i] > 0) of
  101.         True:
  102.           begin
  103.             WriteLn(textf, Format('%d - %d',[i+min, arr[i]]));
  104.             Inc(Unique);
  105.           end;
  106.       end;
  107.     CloseFile(textf);
  108.     SetLength(arr, 0);
  109.   end;
  110.  
  111.   procedure SortCountAvk1;
  112.   type
  113.     TCounter  = specialize TGHashMultiSetLP<Integer>;
  114.     TCountRef = specialize TGAutoRef<TCounter>;
  115.     TEntry    = TCounter.TEntry;
  116.  
  117.     function EntryCmp(constref L, R: TEntry): SizeInt;
  118.     begin
  119.       if L.Key > R.Key then
  120.         Result := 1
  121.       else
  122.         if L.Key < R.Key then
  123.           Result := -1
  124.         else
  125.           Result := 0;
  126.     end;
  127.  
  128.   var
  129.     CountRef: TCountRef;
  130.     InOut: Text;
  131.     Counter: TCounter;
  132.     e: TEntry;
  133.     I: Integer;
  134.   begin
  135.     Counter := CountRef;
  136.     //Counter.LoadFactor := 0.7;
  137.     Assign(InOut, ParamStr(1));
  138.     Reset(InOut);
  139.     while not EOF(InOut) do
  140.       begin
  141.         ReadLn(InOut, I);
  142.         Counter.Add(I);
  143.       end;
  144.     Close(InOut);
  145.     Total := Counter.Count;
  146.     Unique := Counter.EntryCount;
  147.     if Counter.NonEmpty then
  148.       begin
  149.         Assign(InOut, ParamStr(2));
  150.         Rewrite(InOut);
  151.         for e in Counter.Entries.Sorted(@EntryCmp) do
  152.           with e do
  153.             WriteLn(InOut, Key, ' - ', Count);
  154.         Close(InOut);
  155.       end;
  156.   end;
  157.  
  158.   procedure SortCountAvk2;
  159.   var
  160.     List: array of Integer = nil;
  161.     InOut: Text;
  162.     I, J, Count, DupCount: Integer;
  163.   begin
  164.     Assign(InOut, ParamStr(1));
  165.     Reset(InOut);
  166.     SetLength(List, 4096);
  167.     I := 0;
  168.     while not EOF(InOut) do
  169.       begin
  170.         ReadLn(InOut, J);
  171.         Inc(Total);
  172.         if Length(List) = I then
  173.           SetLength(List, I * 2);
  174.         List[I] := J;
  175.         Inc(I);
  176.       end;
  177.     Close(InOut);
  178.     SetLength(List, I);
  179.     if List = nil then
  180.       exit;
  181.     specialize TGOrdinalArrayHelper<Integer>.Sort(List);
  182.     Count := I;
  183.     DupCount := 0;
  184.     I := 0;
  185.     Assign(InOut, ParamStr(2));
  186.     Rewrite(InOut);
  187.     repeat
  188.       J := List[I];
  189.       while (I < Count) and (List[I] = J) do
  190.         begin
  191.           Inc(DupCount);
  192.           Inc(I);
  193.         end;
  194.       WriteLn(InOut, J, ' - ', DupCount);
  195.       Inc(Unique);
  196.       DupCount := 0;
  197.     until I = Count;
  198.     Close(InOut);
  199.   end;
  200.  
  201.   procedure SortCountJulkas;
  202.   type
  203.     TIntLess = specialize TLess<LongInt>;
  204.     TDict = specialize TMap<LongInt, LongInt, TIntLess>;
  205.   var
  206.     sc: TDict;
  207.     scit: TDict.TIterator;
  208.     InOut: Text;
  209.     key, cnt: LongInt;
  210.   begin
  211.     sc := TDict.Create;
  212.     Assign(InOut, ParamStr(1));
  213.     Reset(InOut);
  214.     while not EOF(InOut) do
  215.       begin
  216.         ReadLn(InOut, key);
  217.         Inc(Total);
  218.         cnt := 0;
  219.         sc.TryGetValue(key, cnt);
  220.         sc[key] := cnt + 1;
  221.       end;
  222.     Close(InOut);
  223.     Unique := sc.Size;
  224.     if Unique > 0 then
  225.       begin
  226.         Assign(InOut, ParamStr(2));
  227.         Rewrite(InOut);
  228.         scit := sc.Min;
  229.         repeat
  230.           WriteLn(InOut, scit.Key, ' - ', scit.Value);
  231.         until not scit.Next;
  232.         Close(InOut);
  233.         scit.Free;
  234.       end;
  235.     sc.Free;
  236.   end;
  237.  
  238.   procedure SortCountMangakissa;
  239.   type
  240.     TMyTime = packed record
  241.       Unixtime : Integer;
  242.       Counter  : word;
  243.     end;
  244.  
  245.     function Find(var aMyTime : array of TMytime; aLine : Integer) : boolean;
  246.     var index : integer;
  247.     begin
  248.       result := false;
  249.       if length(aMyTime) > 0 then
  250.       begin
  251.         for index := low(aMyTime) to high(aMyTime) do
  252.         begin
  253.           if aMyTime[index].Unixtime = aLine then
  254.           begin
  255.             aMyTime[index].Counter := aMyTime[index].Counter + 1;
  256.             result := true;
  257.             break;
  258.           end;
  259.         end;
  260.       end;
  261.     end;
  262.  
  263.     procedure QuickSort(var A: array of tMytime; iLo, iHi: Integer) ;
  264.      var
  265.        Lo, Hi, Pivot : Integer;
  266.        T             : TMyTime;
  267.      begin
  268.        Lo := iLo;
  269.        Hi := iHi;
  270.        Pivot := A[(Lo + Hi) div 2].Unixtime;
  271.        repeat
  272.          while A[Lo].Unixtime < Pivot do Inc(Lo) ;
  273.          while A[Hi].Unixtime > Pivot do Dec(Hi) ;
  274.          if Lo <= Hi then
  275.          begin
  276.            T := A[Lo];
  277.            A[Lo] := A[Hi];
  278.            A[Hi] := T;
  279.            Inc(Lo) ;
  280.            Dec(Hi) ;
  281.          end;
  282.        until Lo > Hi;
  283.        if Hi > iLo then QuickSort(A, iLo, Hi) ;
  284.        if Lo < iHi then QuickSort(A, Lo, iHi) ;
  285.      end;
  286.   var
  287.     InOut         : Text;
  288.     MyTime        : array of TMyTime = nil;
  289.     Item          : TMyTime;
  290.     myline, index : integer;
  291.   begin
  292.     Assign(InOut, ParamStr(1));
  293.     Reset(InOut);
  294.     index := 0;
  295.     while not EOF(InOut) do
  296.       begin
  297.         ReadLn(InOut, myline);
  298.         Inc(Total);
  299.         if not find(MyTime, myline) then
  300.         begin
  301.           index := index + 1;
  302.           setlength(MyTime,index);
  303.           MyTime[index - 1].Unixtime := myLine;
  304.           MyTime[index - 1].Counter := 1;
  305.         end;
  306.       end;
  307.     Close(InOut);
  308.     if MyTime = nil then
  309.       exit;
  310.     Unique := Length(MyTime);
  311.     QuickSort(Mytime, low(MyTime), high(mytime));
  312.     Assign(InOut, ParamStr(2));
  313.     Rewrite(InOut);
  314.     for Item in Mytime do
  315.       with Item do
  316.         WriteLn(InOut, Unixtime, ' - ', Counter);
  317.     Close(InOut);
  318.   end;
  319.  
  320.   procedure Run(aProc: TProcedure);
  321.   begin
  322.     Total := 0;
  323.     Unique := 0;
  324.     Start := Now;
  325.     try
  326.       aProc();
  327.       WriteLn('elapsed time: ', MilliSecondsBetween(Now(), Start) / 1000.0 : 0 : 4);
  328.       WriteLn('#total: ', Total, ', #unique: ', Unique);
  329.     except
  330.       on e: Exception do
  331.         WriteLn('crashes with message "', e.Message, '"');
  332.     end;
  333.   end;
  334.  
  335. begin
  336.   if ParamCount <> 2 then
  337.     begin
  338.       WriteLn('Usage: OccurrenceCounter infilename outfilename');
  339.       exit;
  340.     end;
  341.   if not FileExists(ParamStr(1)) then
  342.     begin
  343.       WriteLn('Input file "', ParamStr(1), '" not found');
  344.       exit;
  345.     end;
  346.  
  347.   WriteLn('running SortCountAkira:');
  348.   Run(@SortCountAkira);
  349.   WriteLn;
  350.  
  351.   WriteLn('running SortCountHoward:');
  352.   Run(@SortCountHoward);
  353.   WriteLn;
  354.  
  355.  
  356.   WriteLn('running SortCountAvk1:');
  357.   Run(@SortCountAvk1);
  358.   WriteLn;
  359.  
  360.   WriteLn('running SortCountAvk2:');
  361.   Run(@SortCountAvk2);
  362.   WriteLn;
  363.  
  364.   WriteLn('running SortCountJulkas:');
  365.   Run(@SortCountJulkas);
  366.   WriteLn;
  367.  
  368.   WriteLn('running SortCountMangakissa:');
  369.   Run(@SortCountMangakissa);
  370. end.  
  371.  

output:
Code: Text  [Select]
  1. running SortCountAkira:
  2. elapsed time: 1.4130
  3. #total: 1750000, #unique: 710248
  4.  
  5. running SortCountHoward:
  6. elapsed time: 1.5600
  7. #total: 1750000, #unique: 710248
  8.  
  9. running SortCountAvk1:
  10. elapsed time: 1.0770
  11. #total: 1750000, #unique: 710248
  12.  
  13. running SortCountAvk2:
  14. elapsed time: 0.7170
  15. #total: 1750000, #unique: 710248
  16.  
  17. running SortCountJulkas:
  18. elapsed time: 2.6370
  19. #total: 1750000, #unique: 710248
  20.  
  21. running SortCountMangakissa:
  22. elapsed time: 375.5530
  23. #total: 1750000, #unique: 710248
  24.  

440bx

  • Hero Member
  • *****
  • Posts: 1200
Re: Sorting and Counting
« Reply #53 on: July 19, 2019, 09:31:39 am »
Here are a couple of implementations using the Windows API.

The first version does not use Readln.

The second version does not use Readln nor Writeln except for 2 writeln(s) to the console.

NOTE: the input file name is hardcoded to be "infile.txt" for convenience (easy to change that if needed.)
using FPC v3.0.4 and Lazarus 1.8.2 on Windows 7 64bit.

julkas

  • Sr. Member
  • ****
  • Posts: 416
  • KISS principle / Lazarus 2.0.0 / FPC 3.0.4
Re: Sorting and Counting
« Reply #54 on: July 19, 2019, 10:28:35 am »
Here is my faster (more compicated) version with TVector. Output file ~ 147 MB. Uniq count fixed.
Code: Pascal  [Select]
  1. program sc2;
  2. {$mode delphi}
  3. uses gvector, gutil, garrayutils, SysUtils;
  4. const
  5.   keyNum = 10000000;
  6. type
  7.   TIntLess = TLess<LongInt>;
  8.   TIntVect = TVector<LongInt>;
  9.   TOrd = TOrderingArrayUtils<TIntVect, LongInt, TIntLess>;
  10. var
  11.   sc: array[0..21474] of TIntVect;
  12.   i: LongInt;
  13.   pkey, key, cnt, uniq: LongInt;
  14.   offset: LongInt;
  15.   start: QWord;
  16.   outFile: Text;
  17. begin
  18.   start := GetTickCount64();
  19.   for i := Low(sc) to High(sc) do sc[i] := TIntVect.Create;
  20.  
  21.   for i := 0 to keyNum do
  22.   begin
  23.     key := Random(2147483647);
  24.     sc[key div 100000].PushBack(key mod 100000);
  25.   end;
  26.   WriteLn('Populated (ticks) - ', GetTickCount64() - start);
  27.  
  28.   Assign(outFile, 'out.txt');
  29.   Rewrite(outFile);
  30.  
  31.   for i := Low(sc) to High(sc) do if sc[i].Size > 1 then TOrd.Sort(sc[i], sc[i].Size);
  32.  
  33.   offset := -100000;
  34.   uniq := 0;
  35.   for i := Low(sc) to High(sc) do
  36.   begin
  37.     Inc(offset, 100000);
  38.     pkey := -1;
  39.     cnt := 0;
  40.     for key in sc[i] do
  41.     begin
  42.       if pkey <> key then
  43.       begin
  44.         if cnt <> 0 then
  45.         begin
  46.           WriteLn(outFile, offset + pkey, ' - ', cnt);
  47.           Inc(uniq);
  48.         end;
  49.         pkey := key;
  50.         cnt := 0;
  51.       end;
  52.       Inc(cnt);
  53.     end;
  54.     if cnt <> 0 then
  55.     begin
  56.       WriteLn(outFile, offset + pkey, ' - ', cnt);
  57.       Inc(uniq);
  58.     end;
  59.   end;
  60.  
  61.   Close(outFile);
  62.   for i := Low(sc) to High(sc) do
  63.     sc[i].Destroy;
  64.  
  65.   WriteLn('Total (ticks) - ', GetTickCount64() - start);
  66.   WriteLn('Uniq keys - ', uniq, ', out of - ', keyNum);
  67.   ReadLn;
  68. end.
Console output.
Code: Text  [Select]
  1. Populated (ticks) - 1312
  2. Total (ticks) - 4922
  3. Uniq keys - 9976566, out of - 10000000
« Last Edit: July 19, 2019, 01:38:10 pm by julkas »
procedure mulu64(a, b: QWORD; out clo, chi: QWORD); assembler;
asm
  mov rax, a
  mov rdx, b
  mul rdx
  mov [clo], rax
  mov [chi], rdx
end;

julkas

  • Sr. Member
  • ****
  • Posts: 416
  • KISS principle / Lazarus 2.0.0 / FPC 3.0.4
Re: Sorting and Counting
« Reply #55 on: July 19, 2019, 02:43:43 pm »
@Akira1364, @avk can you include my second algo in benchmark and run again? I have 3.0.4 and can't use LGenerics.
Thanks.
procedure mulu64(a, b: QWORD; out clo, chi: QWORD); assembler;
asm
  mov rax, a
  mov rdx, b
  mul rdx
  mov [clo], rax
  mov [chi], rdx
end;

avk

  • Full Member
  • ***
  • Posts: 154
    • my self-education project
Re: Sorting and Counting
« Reply #56 on: July 19, 2019, 08:10:44 pm »
@julkas, done
@440bx:
So, do you offer to compete with the Windows file mapping and ntdll? Okay, why not. :)
To run your SortCount (from SortCount2) inside a Howard's benchmark, I put the SortCount2 code in a separate unit:
Code: Pascal  [Select]
  1. // LONGSTRINGS ON needed for ParamStr to return a null terminated string
  2.  
  3. {$LONGSTRINGS ON}
  4.  
  5. unit WinSortCount2;
  6.  
  7. interface
  8. uses
  9.   windows,
  10.   sysutils,
  11.   dateutils
  12.   ;
  13.  
  14. const
  15.   ntdll            = 'ntdll.dll';
  16.   kernel32         = 'kernel32.dll';
  17.  
  18. var
  19.   // constants to replace ParamStr(1) and ParamStr(2) respectively
  20.  
  21.   InFileName: string = '';
  22.   OutFileName: string = '';
  23.  
  24.   DataCount : ptruint;
  25.   Unique    : ptruint;
  26.  
  27.   procedure SortCount;
  28.  
  29. implementation
  30. // -----------------------------------------------------------------------------
  31. // input file related types
  32.  
  33. type
  34.   // the input file is made up of unixtimes occupying 10 bytes followed by
  35.   // a CR/LF (on Windows)  To avoid comparing strings (which is slow) we define
  36.   // the unixtime characters as being composed on a qword and a word followed
  37.   // the the CR/LF ending.
  38.  
  39.   TINPUT_FILE_ELEMENT = packed record
  40.     case boolean of
  41.     0 : (
  42.          ife_hi                : qword;       // first 8 bytes of unixtime
  43.          ife_lo                : word;        // trailing 2 bytes of unixtime
  44.  
  45.          ife_LineEnding        : word;        // CRLF
  46.         );
  47.     1 : (
  48.           ife_unixtime         : packed array[0..9] of char;
  49.           ife_crlf             : word;
  50.         );
  51.   end;
  52.   PINPUT_FILE_ELEMENT = ^TINPUT_FILE_ELEMENT;
  53.  
  54.  
  55.  
  56. // -----------------------------------------------------------------------------
  57. // kernel32 related types and functions
  58.  
  59. type
  60.   PSECURITY_ATTRIBUTES = ^TSECURITY_ATTRIBUTES;
  61.   TSECURITY_ATTRIBUTES = record
  62.     Length                      : DWORD;
  63.     PointerToSecurityDescriptor : pointer;
  64.     InheritHandle               : boolean32;
  65.   end;
  66.  
  67.  
  68. function GetFileSizeEx(    FileHandle        : THANDLE;
  69.                        var PointerToFileSize : qword)
  70.          : boolean32; stdcall; external kernel32;
  71.  
  72. function CreateFileA(PointerToFileName           : pchar;
  73.                      DesiredAccess               : DWORD;
  74.                      ShareMode                   : DWORD;
  75.                      PointerToSecurityAttributes : PSECURITY_ATTRIBUTES;
  76.                      CreationDisposition         : DWORD;
  77.                      FlagsAndAttributes          : DWORD;
  78.                      TemplateFileHandle          : THANDLE)
  79.          : THANDLE; stdcall; external kernel32;
  80.  
  81.  
  82. // -----------------------------------------------------------------------------
  83. // ntdll related types and functions
  84.  
  85. type
  86.   TCompareFunction = function (key : pointer; data : pointer) : ptrint; cdecl;
  87.  
  88. const
  89.   COMPARE_EQUAL    =  0;
  90.   COMPARE_GREATER  =  1;
  91.   COMPARE_LESS     = -1;
  92.  
  93.  
  94.  
  95. procedure qsort(Base            : pointer;
  96.                 ElementCount    : ptruint;
  97.                 ElementSize     : ptruint;
  98.                 CompareFunction : TCompareFunction);
  99.           cdecl; external ntdll;
  100.  
  101. procedure RtlMoveMemory(Destination : pointer;
  102.                         Source      : pointer;
  103.                         BytesToCopy : ptruint);
  104.           stdcall; external ntdll;
  105.  
  106.  
  107. const
  108.   CRLF                      = #13#10;
  109.   FORMAT            : pchar = '%10.10s  -  %d' + CRLF;
  110.  
  111.   // if there are no duplicates at all, then the output file will be the size
  112.   // of the input file plus 6 additional characters (see FORMAT above)
  113.  
  114.   ADDITIONAL_OUTPUT         = 6;
  115.  
  116. // NOTE : don't user user32.dll wsprintf, it's a paraplegic dog.
  117.  
  118. function sprintf(OutputDestination : pchar;
  119.                  Format            : pchar;
  120.                  UnixTime          : pchar;
  121.                  Count             : integer) : integer; cdecl;    { CDECL !!}
  122.   external ntdll name 'sprintf';
  123.  
  124. // -----------------------------------------------------------------------------
  125.  
  126. function CompareUnixTimes(EntryA, EntryB : PINPUT_FILE_ELEMENT)
  127.          : ptrint; cdecl;
  128. begin
  129.   if EntryA^.ife_hi > EntryB^.ife_hi then exit(COMPARE_GREATER);
  130.   if EntryA^.ife_hi < EntryB^.ife_hi then exit(COMPARE_LESS);
  131.  
  132.   // the first qword of both entries is the same, use the last 2 bytes of the
  133.   // unixtime
  134.  
  135.   if EntryA^.ife_lo > EntryB^.ife_lo then exit(COMPARE_GREATER);
  136.   if EntryA^.ife_lo < EntryB^.ife_lo then exit(COMPARE_LESS);
  137.  
  138.   // they are the same
  139.  
  140.   result := COMPARE_EQUAL;
  141. end;
  142.  
  143. // -----------------------------------------------------------------------------
  144.  
  145. procedure Error(Id : ptruint);
  146. begin
  147.   write('FATAL : ');
  148.   case Id of
  149.     1 : writeln('LoadInputFileIntoMemory failed');
  150.     2 : writeln('WriteOutputFile - failed to create output file');
  151.     3 : writeln('WriteOutputFile - failed to write to output file');
  152.   end;
  153.  
  154.   halt(Id);
  155. end;
  156.  
  157. // -----------------------------------------------------------------------------
  158.  
  159. function LoadInputFileIntoMemory (Filename     : pchar;
  160.                               var Filesize     : qword;
  161.                               var OutputBuffer : pchar)
  162.          : PINPUT_FILE_ELEMENT;
  163.   // maps the input file in memory, determines its size and copies the input
  164.   // file into a separate memory block (because the input file is not supposed
  165.   // to be sorted.)  Also allocates a buffer for the output file.
  166. var
  167.   FileHandle           : THANDLE =   0;
  168.   FileMapping          : THANDLE =   0;
  169.   FileMapAddress       : pointer = nil;
  170.  
  171.   // variables that will be returned upon success
  172.  
  173.   FileData             : pointer = nil;
  174.   Size                 : qword   =   0;
  175.   FileOutBuffer        : pointer = nil;
  176.  
  177.   // to create a local scope (strictly local)
  178.  
  179.   SCOPE                : integer =   0;
  180.  
  181.   UnixTimesCount       : qword   =   0;
  182.  
  183. const
  184.   // constants used by CreateFile
  185.  
  186.   NO_TEMPLATE_FILE     =       0;
  187.  
  188.   // constants used by CreateFileMapping
  189.  
  190.   NO_MAXIMUM_SIZE_HIGH =       0;      // 0 indicates to use the size of the
  191.   NO_MAXIMUM_SIZE_LOW  =       0;      //   file
  192.  
  193.   // constants used by MapViewOfFileEx
  194.  
  195.   FILE_OFFSET_HIGH     =       0;      //   file offset to map from
  196.   FILE_OFFSET_LOW      =       0;
  197.  
  198.   BEGINNING_TO_END     =       0;
  199.  
  200. begin
  201.   // initialize return values
  202.  
  203.   result       := nil;
  204.   Filesize     := 0;
  205.   OutputBuffer := nil;
  206.  
  207.   // map the input file and allocate necessary resources.
  208.  
  209.   for SCOPE := 1 to 1 do       // trick to create a scope one can break out of
  210.   begin
  211.     FileHandle := CreateFileA(Filename,
  212.                               GENERIC_READ,
  213.                               FILE_SHARE_READ,
  214.                               nil,
  215.                               OPEN_EXISTING,
  216.                               FILE_ATTRIBUTE_NORMAL,
  217.                               NO_TEMPLATE_FILE);
  218.  
  219.     if FileHandle = INVALID_HANDLE_VALUE   then break;
  220.  
  221.     if not GetFileSizeEx(FileHandle, Size) then break;
  222.     if Size = 0                            then break;       // empty file
  223.  
  224.     // with the file handle, create a mapping for it
  225.  
  226.     FileMapping := CreateFileMappingA(FileHandle,
  227.                                       nil,
  228.                                       PAGE_READONLY,
  229.                                       NO_MAXIMUM_SIZE_HIGH,  // use file size
  230.                                       NO_MAXIMUM_SIZE_LOW,
  231.                                       nil);
  232.  
  233.     if (FileMapping = 0)                   then break;
  234.  
  235.     FileMapAddress := MapViewOfFileEx(FileMapping,
  236.                                       FILE_MAP_READ,
  237.                                       FILE_OFFSET_HIGH,      // from beginning
  238.                                       FILE_OFFSET_LOW,
  239.                                       BEGINNING_TO_END,      // to end
  240.                                       nil);                  // map anywhere
  241.  
  242.     if FileMapAddress = nil                then break;
  243.  
  244.     // allocate a memory block to hold the file data since the file itself
  245.     // won't be sorted
  246.  
  247.     FileData := HeapAlloc(GetProcessHeap(),
  248.                           0,
  249.                           Size);
  250.  
  251.     if FileData = nil                      then break;
  252.  
  253.  
  254.     // copy the data in the file to the block we just allocated
  255.  
  256.     RtlMoveMemory(FileData, FileMapAddress, Size);
  257.  
  258.     // allocate a buffer for the output file
  259.  
  260.     UnixTimesCount := Size div sizeof(TINPUT_FILE_ELEMENT);
  261.     FileOutBuffer  := HeapAlloc(GetProcessHeap(),
  262.                                 0,
  263.                                 Size + (UnixTimesCount * ADDITIONAL_OUTPUT));
  264.  
  265.     if FileOutBuffer = nil then break;        // just in case additional
  266.                                               // instructions are added at a
  267.                                               // later time.
  268.  
  269.   end;
  270.  
  271.   if (FileHandle     <> INVALID_HANDLE_VALUE) then CloseHandle(FileHandle);
  272.   if (FileMapping    <>                    0) then CloseHandle(FileMapping);
  273.  
  274.   if (FileMapAddress <> nil) then UnmapViewOfFile(FileMapAddress);
  275.  
  276.   if (FileData <> nil) then
  277.   begin
  278.     Filesize     := Size;
  279.     OutputBuffer := FileOutBuffer;
  280.     result       := FileData;
  281.   end;
  282. end;
  283.  
  284. // -----------------------------------------------------------------------------
  285.  
  286. procedure WriteOutputFile(OutputFilename               : pchar;
  287.                           DataOutputBuffer, DataOutPtr : pchar);
  288. const
  289.   FILE_NO_SHARE    = 0;
  290.   NO_TEMPLATE_FILE = 0;
  291.  
  292. var
  293.   FileHandle : THANDLE;
  294.   ByteCount  : ptruint;
  295.  
  296.   Ok           : BOOL  = FALSE;
  297.   BytesWritten : DWORD = 0;
  298.  
  299. begin
  300.   FileHandle := CreateFileA(OutputFilename,
  301.                             GENERIC_READ or GENERIC_WRITE,
  302.                             FILE_NO_SHARE,
  303.                             nil,
  304.                             CREATE_ALWAYS,
  305.                             FILE_ATTRIBUTE_NORMAL,
  306.                             NO_TEMPLATE_FILE);
  307.  
  308.   if FileHandle = INVALID_HANDLE_VALUE then Error(2);
  309.  
  310.   ByteCount := DataOutPtr - DataOutputBuffer;
  311.  
  312.  
  313.   Ok := WriteFile(FileHandle,
  314.                   DataOutputBuffer^,
  315.                   ByteCount,
  316.                   BytesWritten,
  317.                   nil);
  318.  
  319.   if not Ok then Error(3);
  320.  
  321.   CloseHandle(FileHandle);
  322. end;
  323.  
  324. // -----------------------------------------------------------------------------
  325.  
  326. procedure SortCount;
  327. var
  328.   Data             : PINPUT_FILE_ELEMENT = nil;
  329.  
  330.   DataEnd          : PINPUT_FILE_ELEMENT = nil;
  331.   DataOutputBuffer : pchar               = nil;
  332.   DataOutPtr       : pchar               = nil;
  333.   DataOutLength    : integer             =   0;
  334.  
  335.   Filesize         : qword               =   0;  // compiler whines otherwise
  336.  
  337.   InstanceCount    : integer             =   0;
  338.  
  339.   i, j             : PINPUT_FILE_ELEMENT;
  340.  
  341. begin
  342.   Data := LoadInputFileIntoMemory(PChar(InFileName), Filesize, DataOutputBuffer);
  343.  
  344.   if Data = nil then Error(1);    // an empty file is treated as an error
  345.  
  346.   // sort the data
  347.  
  348.   DataCount := Filesize div sizeof(TINPUT_FILE_ELEMENT);
  349.  
  350.   qsort(Data,
  351.         DataCount,
  352.         sizeof(Data^),
  353.         TCompareFunction(@CompareUnixTimes)
  354.        );
  355.  
  356.   // use the same algorithm used by Avk to produce the output file.
  357.  
  358.   Unique        := 0;
  359.   InstanceCount := 0;
  360.  
  361.   i        := Data;
  362.   j        := i;
  363.   DataEnd  := Data + DataCount;
  364.  
  365.   // determine the duplicate counts
  366.  
  367.   DataOutPtr := DataOutputBuffer;
  368.  
  369.   repeat
  370.     while (j < DataEnd)
  371.       and
  372.           ((j^.ife_hi = i^.ife_hi) and (j^.ife_lo = i^.ife_lo))
  373.        do
  374.     begin
  375.       inc(InstanceCount);
  376.       inc(j);
  377.     end;
  378.  
  379.     DataOutLength := sprintf(DataOutPtr,
  380.                              FORMAT,
  381.                              i^.ife_unixtime, InstanceCount);
  382.  
  383.     inc(DataOutPtr, DataOutLength);
  384.  
  385.     InstanceCount := 0;
  386.     inc(Unique);
  387.  
  388.     i := j;
  389.   until j >= DataEnd;
  390.  
  391.   WriteOutputFile(PChar(OutFileName), DataOutputBuffer, DataOutPtr);
  392.  
  393.   //writeln('#unique : ', Unique, '   #Total : ', DataCount);
  394.  
  395.   HeapFree(GetProcessHeap(), 0, Data);
  396.   HeapFree(GetProcessHeap(), 0, DataOutputBuffer);
  397. end;
  398.  
  399. // -----------------------------------------------------------------------------
  400. //
  401. //var
  402. //  Start : TDATETIME;
  403. //
  404. //begin
  405. //  Start := Now;
  406. //
  407. //  SortCount;
  408. //
  409. //  writeln;
  410. //  writeln('elapsed time: ', MilliSecondsBetween(Now(), Start) / 1000.0 : 0 : 4);
  411. end.
  412.  

output:
Code: Text  [Select]
  1.  
  2. RandomRange = 5
  3. Julkas1's time: 2.7450 #unique: 490842 #total: 2000000
  4. Julkas2's time: 0.9680 #unique: 0 #total: 2000000
  5.   Akira's time: 1.4500 #unique: 490842 #total: 2000000
  6.  Howard's time: 1.2950 #unique: 490842 #total: 2000000
  7.    Avk1's time: 0.9830 #unique: 490842 #total: 2000000
  8.    Avk2's time: 0.7330 #unique: 490842 #total: 2000000
  9.   440bx's time: 0.7490 #unique: 490842 #total: 2000000
  10.  
  11. RandomRange = 6
  12. Julkas1's time: 2.8080 #unique: 578734 #total: 2000000
  13. Julkas2's time: 0.9820 #unique: 0 #total: 2000000
  14.   Akira's time: 1.5140 #unique: 578734 #total: 2000000
  15.  Howard's time: 1.3260 #unique: 578734 #total: 2000000
  16.    Avk1's time: 1.0290 #unique: 578734 #total: 2000000
  17.    Avk2's time: 0.7800 #unique: 578734 #total: 2000000
  18.   440bx's time: 0.7180 #unique: 578734 #total: 2000000
  19.  
  20. RandomRange = 7
  21. Julkas1's time: 2.9330 #unique: 659876 #total: 2000000
  22. Julkas2's time: 0.9990 #unique: 0 #total: 2000000
  23.   Akira's time: 1.5750 #unique: 659876 #total: 2000000
  24.  Howard's time: 1.3420 #unique: 659876 #total: 2000000
  25.    Avk1's time: 1.0920 #unique: 659876 #total: 2000000
  26.    Avk2's time: 0.7950 #unique: 659876 #total: 2000000
  27.   440bx's time: 0.7340 #unique: 659876 #total: 2000000
  28.  
  29. RandomRange = 8
  30. Julkas1's time: 3.0110 #unique: 733819 #total: 2000000
  31. Julkas2's time: 1.0450 #unique: 0 #total: 2000000
  32.   Akira's time: 1.6540 #unique: 733819 #total: 2000000
  33.  Howard's time: 1.3880 #unique: 733819 #total: 2000000
  34.    Avk1's time: 1.1230 #unique: 733819 #total: 2000000
  35.    Avk2's time: 0.8270 #unique: 733819 #total: 2000000
  36.   440bx's time: 0.7490 #unique: 733819 #total: 2000000
  37.  
  38. RandomRange = 9
  39. Julkas1's time: 3.0880 #unique: 802323 #total: 2000000
  40. Julkas2's time: 1.0460 #unique: 0 #total: 2000000
  41.   Akira's time: 1.7310 #unique: 802323 #total: 2000000
  42.  Howard's time: 1.3890 #unique: 802323 #total: 2000000
  43.    Avk1's time: 1.2320 #unique: 802323 #total: 2000000
  44.    Avk2's time: 0.8420 #unique: 802323 #total: 2000000
  45.   440bx's time: 0.7650 #unique: 802323 #total: 2000000
  46.  
  47. RandomRange = 10
  48. Julkas1's time: 3.1520 #unique: 864809 #total: 2000000
  49. Julkas2's time: 1.0600 #unique: 0 #total: 2000000
  50.   Akira's time: 1.8260 #unique: 864809 #total: 2000000
  51.  Howard's time: 1.4190 #unique: 864809 #total: 2000000
  52.    Avk1's time: 1.2330 #unique: 864809 #total: 2000000
  53.    Avk2's time: 0.8730 #unique: 864809 #total: 2000000
  54.   440bx's time: 0.7650 #unique: 864809 #total: 2000000
  55.  
  56. repeatMillionsCount = 10
  57. Julkas1's time: 13.8380 #unique: 799999 #total: 10000000
  58. Julkas2's time: 4.2580 #unique: 0 #total: 10000000
  59.   Akira's time: 4.9460 #unique: 799999 #total: 10000000
  60.  Howard's time: 5.8180 #unique: 799999 #total: 10000000
  61.    Avk1's time: 4.0100 #unique: 799999 #total: 10000000
  62.    Avk2's time: 3.0730 #unique: 799999 #total: 10000000
  63.   440bx's time: 3.1670 #unique: 799999 #total: 10000000
  64.  
  65. repeatMillionsCount = 12
  66. Julkas1's time: 16.4420 #unique: 799999 #total: 12000000
  67. Julkas2's time: 5.0550 #unique: 0 #total: 12000000
  68.   Akira's time: 5.7870 #unique: 799999 #total: 12000000
  69.  Howard's time: 6.9110 #unique: 799999 #total: 12000000
  70.    Avk1's time: 4.7110 #unique: 799999 #total: 12000000
  71.    Avk2's time: 3.6350 #unique: 799999 #total: 12000000
  72.   440bx's time: 3.7600 #unique: 799999 #total: 12000000
  73.  
  74. repeatMillionsCount = 14
  75. Julkas1's time: 19.2500 #unique: 800000 #total: 14000000
  76. Julkas2's time: 5.8660 #unique: 0 #total: 14000000
  77.   Akira's time: 6.6300 #unique: 800000 #total: 14000000
  78.  Howard's time: 8.0650 #unique: 800000 #total: 14000000
  79.    Avk1's time: 5.4290 #unique: 800000 #total: 14000000
  80.    Avk2's time: 4.1970 #unique: 800000 #total: 14000000
  81.   440bx's time: 4.3360 #unique: 800000 #total: 14000000
  82.  
  83. repeatMillionsCount = 16
  84. Julkas1's time: 22.1520 #unique: 800000 #total: 16000000
  85. Julkas2's time: 6.6620 #unique: 0 #total: 16000000
  86.   Akira's time: 7.4250 #unique: 800000 #total: 16000000
  87.  Howard's time: 9.1400 #unique: 800000 #total: 16000000
  88.    Avk1's time: 6.1000 #unique: 800000 #total: 16000000
  89.    Avk2's time: 4.7580 #unique: 800000 #total: 16000000
  90.   440bx's time: 4.9300 #unique: 800000 #total: 16000000
  91.  
  92. repeatMillionsCount = 18
  93. Julkas1's time: 24.3990 #unique: 800000 #total: 18000000
  94. Julkas2's time: 7.4720 #unique: 0 #total: 18000000
  95.   Akira's time: 8.2060 #unique: 800000 #total: 18000000
  96.  Howard's time: 10.2340 #unique: 800000 #total: 18000000
  97.    Avk1's time: 6.8170 #unique: 800000 #total: 18000000
  98.    Avk2's time: 5.3040 #unique: 800000 #total: 18000000
  99.   440bx's time: 5.5220 #unique: 800000 #total: 18000000
  100.  
  101. repeatMillionsCount = 20
  102. Julkas1's time: 27.5810 #unique: 800000 #total: 20000000
  103. Julkas2's time: 8.3310 #unique: 0 #total: 20000000
  104.   Akira's time: 9.0160 #unique: 800000 #total: 20000000
  105.  Howard's time: 11.3890 #unique: 800000 #total: 20000000
  106.    Avk1's time: 7.5340 #unique: 800000 #total: 20000000
  107.    Avk2's time: 5.9130 #unique: 800000 #total: 20000000
  108.   440bx's time: 6.0840 #unique: 800000 #total: 20000000
  109.  
« Last Edit: July 19, 2019, 08:18:35 pm by avk »

julkas

  • Sr. Member
  • ****
  • Posts: 416
  • KISS principle / Lazarus 2.0.0 / FPC 3.0.4
Re: Sorting and Counting
« Reply #57 on: July 19, 2019, 08:33:07 pm »
@avk Thanks. Strange, why uniq is zero in my second algorithm.
I have tested it with @Akira1364 benchmark (without LGenerics).
« Last Edit: July 19, 2019, 08:43:41 pm by julkas »
procedure mulu64(a, b: QWORD; out clo, chi: QWORD); assembler;
asm
  mov rax, a
  mov rdx, b
  mul rdx
  mov [clo], rax
  mov [chi], rdx
end;

howardpc

  • Hero Member
  • *****
  • Posts: 3181
Re: Sorting and Counting
« Reply #58 on: July 19, 2019, 08:37:56 pm »
I don't get a zero Unique with julkas' second implementation.
A typical comparison looks like this (I'm on Linux here, so have sadly omitted 440bx)
Code: Pascal  [Select]
  1.  Julkas1's time: 4.1120 #unique: 794594 #total: 4000000
  2.   Akira's time: 1.4990 #unique: 794594 #total: 4000000
  3.   Howard's time: 1.2950 #unique: 794594 #total: 4000000
  4.    Avk1's time: 1.2910 #unique: 794594 #total: 4000000
  5.  Julkas2's time: 1.0920 #unique: 794594 #total: 4000000
  6.    Avk2's time: 0.7510 #unique: 794594 #total: 4000000

julkas

  • Sr. Member
  • ****
  • Posts: 416
  • KISS principle / Lazarus 2.0.0 / FPC 3.0.4
Re: Sorting and Counting
« Reply #59 on: July 19, 2019, 09:10:48 pm »
Hm, I can improve my second edition ;D
procedure mulu64(a, b: QWORD; out clo, chi: QWORD); assembler;
asm
  mov rax, a
  mov rdx, b
  mul rdx
  mov [clo], rax
  mov [chi], rdx
end;