Author Topic: Programming language comparison by implementing the same transit data app  (Read 10685 times)

avk

  • Hero Member
  • *****
  • Posts: 752
Re: Programming language comparison by implementing the same transit data app
« Reply #15 on: November 16, 2022, 10:03:46 am »
Quote from: Leledumbo
> Same Linux x86_64, but on a real machine, i7-7700HQ, DDR4-2400 dual channel, Sandisk Extreme Portable SSD V2 500GB over USB 3.0.

Hmm, then it's even more curious, my virtual Linux is running on an ancient i3-4150, HDD.

Quote from: Leledumbo
> EDIT: Optimized by replacing direct dynamic array with TVector which has quite efficient growth factor, I got additional 250ms.

It seems even more:
Code: Text
parsed 1739278 stop times in 2.124 seconds
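
For readers wondering why the growth factor matters: a container that keeps spare capacity reallocates far less often than code that calls SetLength once per appended element. The sketch below is not taken from the repository. It only counts resize requests for both strategies, and the doubling policy, the initial capacity of 4 and the element count N are assumptions chosen for illustration, not a description of how gvector.TVector is actually implemented.

```pascal
program growthdemo;
{$mode objfpc}{$H+}

const
  N = 50000; // hypothetical element count, kept small so the naive loop stays quick

var
  Naive, Doubling: array of Integer;
  Count, Capacity, i: Integer;
  NaiveResizes, DoublingResizes: Integer;
begin
  // Strategy 1: grow by exactly one element per append.
  NaiveResizes := 0;
  for i := 0 to N - 1 do
  begin
    SetLength(Naive, i + 1); // one resize request per element
    Inc(NaiveResizes);
    Naive[i] := i;
  end;

  // Strategy 2: keep spare capacity and double it whenever it runs out
  // (assumed policy, for illustration only).
  Count := 0;
  Capacity := 0;
  DoublingResizes := 0;
  for i := 0 to N - 1 do
  begin
    if Count = Capacity then
    begin
      if Capacity = 0 then
        Capacity := 4
      else
        Capacity := Capacity * 2;
      SetLength(Doubling, Capacity);
      Inc(DoublingResizes);
    end;
    Doubling[Count] := i;
    Inc(Count);
  end;
  SetLength(Doubling, Count); // trim the unused tail once at the end

  WriteLn('grow-by-one resize requests: ', NaiveResizes);
  WriteLn('doubling resize requests:    ', DoublingResizes);
end.
```

The first strategy issues one resize request per element, while the second needs only a logarithmic number of them, which is the kind of saving a growth factor buys. How much wall-clock time that translates into depends on the heap manager and the element type.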

Thaddy

  • Hero Member
  • *****
  • Posts: 14371
  • Censorship about opinions does not belong here.
« Reply #16 on: November 16, 2022, 11:01:51 am »
I may have missed something, but what are the optimization settings for Go and FPC? And which linker is used?
To me that is valuable information. The latter matters even for FPC, e.g. fpc-llvm should probably give results similar to Go's.

Furthermore, language comparisons are futile, since what really matters is the compiler and linker implementation.
In principle there is no such thing as one language being faster than some other language. It is all about how a compiler generates efficient code and a linker can further optimize.
The above is a constant sorrow and shows not many people get how it really works....

When you read about speed comparisons between compiled languages, all alarm bells should start ringing. https://www.youtube.com/watch?v=KcizmrkNcas
The people who propose such things lack any knowledge. The comparison is always about the compiler and linker.

Any language that is Turing complete should, given a fictitious common compiler/linker, generate executables that have the same speed.
That is not opinion but fact.

(Note: to test this, compare an fpc-llvm executable with a Go executable: they are already very similar in execution speed, even though fpc-llvm is still a moving target with room to optimize the LLVM toolchain. At least that is a level playing field.)
« Last Edit: November 16, 2022, 01:05:38 pm by Thaddy »
Object Pascal programmers should get rid of their "component fetish" especially with the non-visuals.

Leledumbo

  • Hero Member
  • *****
  • Posts: 8757
  • Programming + Glam Metal + Tae Kwon Do = Me
« Reply #17 on: November 16, 2022, 09:51:29 pm »
Quote from: avk
> Hmm, then it's even more curious, my virtual Linux is running on an ancient i3-4150, HDD.

That's sick, how come it beats my 4 generations younger CPU with SSD?

Quote from: avk
> It seems even more:
> Code: Text
> parsed 1739278 stop times in 2.124 seconds

Same reason as above, I think.

Quote from: Thaddy
> I may have missed something, but what are the optimization settings for Go and FPC? And which linker is used?

Go: none; it doesn't have individual optimization switches, only all (the default) or none. FPC uses -CX -XXs -O3 (I tried -O4; it doesn't matter much). The linker is also the default for both.

Quote from: Thaddy
> Furthermore, language comparisons are futile, since what really matters is the compiler and linker implementation.
> In principle there is no such thing as one language being faster than some other language. It is all about how a compiler generates efficient code and a linker can further optimize.
> The above is a constant sorrow and shows not many people get how it really works....

Sure, I do realize this; that's why I said FPC, not Pascal. I still want to know where FPC stands, performance-wise, against implementations of more mainstream languages. I consider only Google's gc for Go, ignoring gccgo and gollvm even though all 3 are officially supported.
« Last Edit: November 16, 2022, 09:56:03 pm by Leledumbo »

avk

« Reply #18 on: November 17, 2022, 08:22:25 am »
Well, nothing out of the ordinary on my end either: the app is built as a Lazarus project in release mode (-CX -Xs -XX -O3).
And Go version:
Code: Text
go build <module_name>

BTW, lines 245-246 of app.pas look suspicious. Seems like it should be:
Code: Pascal
LStopTimesIx.Add(i - 1);
AStopTimes[i - 1] := TStopTime.Create(LTrip, LCSV.Cells[3, i], LCSV.Cells[1, i], LCSV.Cells[2, i]);

There is another question: is it really necessary to fully parse a multi-megabyte CSV document if only its first 4 columns are needed?
For example, this version
Code: Pascal
  ...
type
  ...
  TIntList          = specialize TList<Integer>;
  TStringIntListMap = specialize TObjectDictionary<string, TIntList>;
  TStopTimeDynArr   = specialize TObjectList<TStopTime>;
  ...
procedure GetStopTimes(out AStopTimes: TStopTimeDynArr; out AStopTimesIxByTrip: TStringIntListMap);
type
  TKeySet = array[0..3] of string;

  // Copies the first four comma-terminated fields of the line [p..pEnd] into Keys.
  // A field is only stored when a comma follows it, so the line must have more than 4 columns.
  procedure ParseLine(p, pEnd: PChar; out Keys: TKeySet);
  var
    Idx: Integer;
    pStart: PChar;
  begin
    pStart := p;
    Idx := 0;
    while p <= pEnd do begin
      if p^ = ',' then begin
        SetLength(Keys[Idx], p - pStart);
        if p > pStart then // an empty field has no first character to copy into
          Move(pStart^, Keys[Idx][1], p - pStart);
        if Idx = 3 then exit;
        Inc(Idx);
        pStart := p+1;
      end;
      Inc(p);
    end;
  end;

var
  ms: TMemoryStream;
  LStart,LEnd: TDateTime;
  p, pStart, pStop: PChar;
  LStopTimesIx: TIntList;
  k: TKeySet;
  HeaderOk: Boolean = False;
begin
  ms := TMemoryStream.Create;
  try
    LStart := Now;

    ms.LoadFromFile('../MBTA_GTFS/stop_times.txt');
    p := ms.Memory;
    pStop := p + ms.Size;
    pStart := nil; // nil means "not inside a line yet"
    AStopTimesIxByTrip := TStringIntListMap.Create([doOwnsValues]);
    AStopTimes := TStopTimeDynArr.Create;
    // Note: a final line without a trailing line break is not processed.
    while p < pStop do begin
      if p^ in [#10, #13] then begin
        if pStart <> nil then begin
          ParseLine(pStart, p - 1, k);
          if HeaderOk then begin
            if not AStopTimesIxByTrip.TryGetValue(k[0], LStopTimesIx) then begin
              LStopTimesIx := TIntList.Create;
              AStopTimesIxByTrip.Add(k[0], LStopTimesIx);
            end;
            LStopTimesIx.Add(AStopTimes.Count);
            AStopTimes.Add(TStopTime.Create(k[0], k[3], k[1], k[2]));
          end else begin
            // The first line must be the expected header.
            if (k[0] <> 'trip_id') or (k[3] <> 'stop_id') or (k[1] <> 'arrival_time') or (k[2] <> 'departure_time') then begin
              WriteLn('stop_times.txt not in expected format.');
              Halt(1);
            end;
            HeaderOk := True;
          end;
          pStart := nil;
        end;
      end else if pStart = nil then
        pStart := p;
      Inc(p);
    end;
  finally
    ms.Free;
  end;

  LEnd := Now;
  WriteLn('parsed ', AStopTimes.Count, ' stop times in ', SecondSpan(LStart, LEnd):1:3,' seconds');
end;
...
works out in 1.317 s, and if primitives from LGenerics are used, then in 0.810 s.

Quote from: Thaddy
> Furthermore, language comparisons are futile, since what really matters is the compiler and linker implementation.

Don't forget about libraries too.

Thaddy

« Reply #19 on: November 17, 2022, 09:37:43 am »
Quote from: avk
> Don't forget about libraries too.

Libraries by themselves are by their very nature not part of it: they have nothing to do with code generation.

Leledumbo

« Reply #20 on: November 17, 2022, 11:07:26 am »
Quote from: avk
> BTW, lines 245-246 of app.pas look suspicious. Seems like it should be:
> Code: Pascal
> LStopTimesIx.Add(i - 1);
> AStopTimes[i - 1] := TStopTime.Create(LTrip, LCSV.Cells[3, i], LCSV.Cells[1, i], LCSV.Cells[2, i]);

Nice catch, it might be a problem for correctness, not performance though.
Quote from: avk
> There is another question: is it really necessary to fully parse a multi-megabyte CSV document if only its first 4 columns are needed?
> For example, this version
> <code intentionally skipped for brevity>
> works out in 1.317 s, and if primitives from LGenerics are used, then in 0.810 s.

I'd prefer to use generic CSV loading code (generic in the sense of not being tailored to this particular need), because the Go version also uses its generic encoding/csv package. Hence, the whole file still needs to be parsed. Using TMemoryStream might be a good idea, since FPC doesn't yet optimize array indexing with loop variables (I requested this years ago, but AFAIR FPK only came up with a showcase and never actually committed it to the repo), which I believe Go's gc does. So using an explicit pointer is the way to go.
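
To make the indexing-versus-pointer point concrete, here is a minimal sketch of the two scanning styles applied to the same buffer. It is not code from the repository: the function names are hypothetical and the sample rows are made up; only the header column names follow the check used in this thread. It shows the shape of each loop, not their relative speed, which would have to be measured on real data.

```pascal
program ptrscan;
{$mode objfpc}{$H+}

// Counts line feeds by indexing the string with a loop variable.
function CountLinesIndexed(const S: string): SizeInt;
var
  i: SizeInt;
begin
  Result := 0;
  for i := 1 to Length(S) do
    if S[i] = #10 then
      Inc(Result);
end;

// Counts line feeds by walking an explicit pointer over the same bytes.
function CountLinesPointer(const S: string): SizeInt;
var
  p, pEnd: PChar;
begin
  Result := 0;
  if S = '' then
    Exit;
  p := PChar(S);
  pEnd := p + Length(S);
  while p < pEnd do
  begin
    if p^ = #10 then
      Inc(Result);
    Inc(p);
  end;
end;

const
  // Made-up sample data in the column order checked elsewhere in this thread.
  Sample = 'trip_id,arrival_time,departure_time,stop_id'#10 +
           'T1,08:00:00,08:00:00,S1'#10 +
           'T1,08:05:00,08:05:00,S2'#10;
begin
  WriteLn('indexed: ', CountLinesIndexed(Sample)); // prints "indexed: 3"
  WriteLn('pointer: ', CountLinesPointer(Sample)); // prints "pointer: 3"
end.
```

Both functions return the same count. The pointer version avoids recomputing an address from the index on every iteration, which is the optimization the compiler would otherwise have to perform itself.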

avk

« Reply #21 on: November 17, 2022, 12:04:49 pm »
Quote from: Thaddy
> Libraries by themselves are by their very nature not part of it: they have nothing to do with code generation.

Meanwhile, it looks like the benchmark under discussion is mostly about libraries. Or maybe you believe that when compiled under LLVM, FCL.TCSVDocument will suddenly start flying like a starfighter?

Thaddy

« Last Edit: November 17, 2022, 01:51:27 pm by Thaddy »

Leledumbo

« Reply #23 on: November 17, 2022, 01:59:08 pm »
Interestingly, I tried:
  • Adapting avk's parsing algorithm: nope, slower, back to 5s
  • Using TMemoryStream instead of TFileStream + single Read to a string: nope, slower, about 4.8s
My version is still the fastest at 4.6s.

I don't think the app part needs modification yet, as the bottleneck there is still the HTTP server (I compared download speeds with curl: fphttpserver manages around 4.5MBps while the Go HTTP server reaches 100-160MBps, a huge difference). It is, however, quite fun to test various list/vector data structures. generics.collections.TList is damn slow compared to gvector.TVector, making the whole reading time 20% slower, at 6s. I'll test LGenerics; although only a map benchmark is available, I hope the other data structures (lists especially) are coded with similar performance characteristics in mind.
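
A comparison like the one between the two containers can be reproduced in isolation with a small timing harness. The sketch below is an assumption-laden stand-in, not the loader from the repository: it only times appends of a constant string, N is a hypothetical element count of roughly the data set's size, and it needs an FPC version that ships Generics.Collections. The numbers it prints will depend on the compiler version and settings and need not match the 20% figure above, which was measured on the whole loader.

```pascal
program appendbench;
{$mode objfpc}{$H+}

uses
  SysUtils, Generics.Collections, gvector;

type
  TStrList = specialize TList<string>;   // from Generics.Collections
  TStrVec  = specialize TVector<string>; // from fcl-stl's gvector

const
  N = 2000000; // hypothetical element count

var
  List: TStrList;
  Vec: TStrVec;
  i: Integer;
  T0: QWord;
begin
  // Time N appends to the generics.collections list.
  List := TStrList.Create;
  try
    T0 := GetTickCount64;
    for i := 1 to N do
      List.Add('x');
    WriteLn('TList.Add:        ', GetTickCount64 - T0, ' ms, count = ', List.Count);
  finally
    List.Free;
  end;

  // Time N appends to the fcl-stl vector.
  Vec := TStrVec.Create;
  try
    T0 := GetTickCount64;
    for i := 1 to N do
      Vec.PushBack('x');
    WriteLn('TVector.PushBack: ', GetTickCount64 - T0, ' ms, size = ', Vec.Size);
  finally
    Vec.Free;
  end;
end.
```

Building it with the same switches as the app (for example -O3) keeps the comparison closer to the conditions discussed here.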

BrunoK

  • Sr. Member
  • ****
  • Posts: 452
  • Retired programmer
« Reply #24 on: November 17, 2022, 04:02:45 pm »
Modifying csvutils.TCSVDocument.LoadFromFile(const AFileName: String); to precompute the number of rows and do a single initial SetLength(FCells, lRows) seems to improve GetStopTimes by roughly 20%.
Code: Pascal
procedure TCSVDocument.LoadFromFile(const AFileName: String);
var
  fs: TFileStream;
  s: String;
  n,i,j,r,c: SizeInt;
  lRows : integer;
begin
  // Read the whole file into s; the stream is released even if reading fails.
  fs := TFileStream.Create(AFileName, fmOpenRead);
  try
    n := fs.Size;
    SetLength(s, n);
    if n > 0 then
      fs.Read(s[1], n);
  finally
    fs.Free;
  end;

  i := 1;
  j := i;
  r := 0;
  c := 0;
  if n > 0 then begin
    // First pass: count line feeds so FCells can be sized once.
    lRows := 0;
    while i <= n do begin
      if s[i] = #10 then
        inc(lRows);
      inc(i);
    end;
    // An unterminated last line still needs a row, otherwise FCells[r] goes out of range.
    if s[n] <> #10 then
      inc(lRows);
    // SetLength(FCells, 1);
    SetLength(FCells, lRows);
    // Second pass: split into cells.
    i := 1;
    while i <= n do begin
      case s[i] of
        ',': begin
          SetLength(FCells[r], c + 1);
          FCells[r, c] := Copy(s, j, i - j);
          Inc(c);
          j := i + 1;
        end;
        #10: begin
          SetLength(FCells[r], c + 1);
          FCells[r, c] := Copy(s, j, i - j);
          Inc(r);
          c := 0;
          // SetLength(FCells, r + 1);
          j := i + 1;
        end;
      end;
      Inc(i);
    end;
  end;
  // Keep only the rows that were terminated by a line feed.
  SetLength(FCells, r);
end;

Leledumbo

« Reply #25 on: November 17, 2022, 10:19:07 pm »
Quote from: BrunoK
> Modifying csvutils.TCSVDocument.LoadFromFile(const AFileName: String); to precompute the number of rows and do a single initial SetLength(FCells, lRows) seems to improve GetStopTimes by roughly 20%.

Still slower than the latest commit which is based on TVector:
Code:
parsed 1790905 stop times in 4600.000ms
parsed 71091 trips in 180.000ms
vs:
Code:
parsed 1790905 stop times in 5266.000ms
parsed 71091 trips in 211.000ms

And so I've cloned LGenerics only to find out that its lgList unit contains no simple list class that can hold strings, only objects, so sad.

avk

« Reply #26 on: November 18, 2022, 09:15:18 am »
Probably, lgList is not a very good name, this unit contains some special kind of lists: sorted, hashed. You can look into the lgVector unit.
Although TVector from fcl-stl is quite fast, it seems that this version of CSVDocument
Code: Pascal
unit ucsv;
{$mode objfpc}{$h+}
interface

uses
  SysUtils, lgVector;

type
  TCSVDoc = class
  private
  type
    TStrList   = specialize TGVector<string>;
    TStrList2D = specialize TGObjectVector<TStrList>;
  var
    FCells: TStrList2D;
    function  GetCell(aCol, aRow: SizeInt): string; inline;
    function  GetColCount(aRow: SizeInt): SizeInt; inline;
    function  GetRowCount: SizeInt; inline;
  public
    destructor Destroy; override;
    procedure LoadFromFile(const aFileName: string);
    property  Cells[aCol, aRow: SizeInt]: string read GetCell; default;
    property  ColCount[aRow: SizeInt]: SizeInt read GetColCount;
    property  RowCount: SizeInt read GetRowCount;
  end;

implementation

uses
  Classes, Math, lgUtils;

function TCSVDoc.GetCell(aCol, aRow: SizeInt): string;
begin
  Result := FCells[aRow][aCol];
end;

function TCSVDoc.GetColCount(aRow: SizeInt): SizeInt;
begin
  Result := FCells[aRow].Count;
end;

function TCSVDoc.GetRowCount: SizeInt; inline;
begin
  Result := FCells.Count;
end;

destructor TCSVDoc.Destroy;
begin
  FCells.Free;
  inherited;
end;

procedure TCSVDoc.LoadFromFile(const aFileName: string);
var
  ss: specialize TGAutoRef<TStringStream>;
  I, CellStart, Size: SizeInt;
  Row: TStrList;
  s: string;
begin
  ss.Instance.LoadFromFile(aFileName);
  s := ss.Instance.DataString;
  ss.Clear;
  if FCells = nil then
    FCells := TStrList2D.Create;
  FCells.Clear;
  CellStart := 0; // 0 means no cell is currently open
  Size := -1;     // column count of the first row, passed to TStrList.Create for later rows
  Row := nil;     // nil means no row has been started on the current line

  for I := 1 to Length(s) do
    case s[I] of
      ',':
        begin
          if Row = nil then begin
            Row := TStrList.Create(Max(Size, DEFAULT_CONTAINER_CAPACITY));
            FCells.Add(Row);
          end;
          Row.Add(Copy(s, CellStart, I - CellStart));
          CellStart := I + 1;
        end;
      #13, #10:
        begin
          // Skip empty lines and the second half of a CR/LF pair.
          if CellStart = 0 then continue;
          Row.Add(Copy(s, CellStart, I - CellStart));
          if Size = -1 then
            Size := Row.Count;
          CellStart := 0;
          Row := nil;
        end;
    else
      if CellStart = 0 then
        CellStart := I;
    end;
end;

end.
is 8-9 percent faster than your TVector-based one.
And accordingly, this version of GetStopTimes()
Code: Pascal
  ...
uses
  ...
  lgUtils, lgHashMap, lgVector,
  ...
type
  TIntList          = specialize TGVector<Integer>;
  TStringIntListMap = specialize TGObjHashMapLP<String, TIntList>;
  ...
procedure GetStopTimes(var AStopTimes: TStopTimeDynArr; var AStopTimesIxByTrip: TStringIntListMap);
var
  LCSV: TCSVDoc;
  LStart,LEnd: TDateTime;
  i: Integer;
  LTrip: String;
  LStopTimesIx: ^TIntList;
begin
  LCSV := TCSVDoc.Create;
  try
    LStart := Now;

    LCSV.LoadFromFile('../MBTA_GTFS/stop_times.txt');

    if (LCSV[0, 0] <> 'trip_id') or (LCSV[3, 0] <> 'stop_id') or (LCSV[1, 0] <> 'arrival_time') or (LCSV[2, 0] <> 'departure_time') then begin
      WriteLn('stop_times.txt not in expected format:');
      for i := 0 to LCSV.ColCount[0] - 1 do begin
        WriteLn(i, ' ' + LCSV[i, 0]);
      end;
      Halt(1);
    end;

    SetLength(AStopTimes, LCSV.RowCount - 1);
    AStopTimesIxByTrip := TStringIntListMap.Create([moOwnsValues]);
    for i := 1 to LCSV.RowCount - 1 do begin
      LTrip := LCSV[0, i];
      if not AStopTimesIxByTrip.FindOrAddMutValue(LTrip, LStopTimesIx) then
        LStopTimesIx^ := TIntList.Create;
      LStopTimesIx^.Add(i - 1);
      AStopTimes[i - 1] := TStopTime.Create(LTrip, LCSV[3, i], LCSV[1, i], LCSV[2, i]);
    end;

    LEnd := Now;

    WriteLn('parsed ', Length(AStopTimes), ' stop times in ', SecondSpan(LStart, LEnd):1:3,' seconds');
  finally
    LCSV.Free;
  end;
end;
works out in about 1.640 s.

avk

« Reply #27 on: November 18, 2022, 10:33:42 am »
Quote from: Thaddy
> Starfighters usually crashed.
> https://en.wikipedia.org/wiki/Lockheed_F-104_Starfighter

I remember reading a long time ago that pilots had very mixed feelings about the F-104, something like "I love it and I hate it".

Leledumbo

« Reply #28 on: November 18, 2022, 07:36:32 pm »
Quote from: avk
> Probably, lgList is not a very good name, this unit contains some special kind of lists: sorted, hashed. You can look into the lgVector unit.

Oh, my dear eyes. How the heck did they skip "vector" when it's so clear... maybe I shouldn't code after midnight, haha.

Quote from: avk
> Although TVector from fcl-stl is quite fast, it seems that this version of CSVDocument
> <code intentionally skipped for brevity>
> is 8-9 percent faster than your TVector-based one.
> And accordingly, this version of GetStopTimes()
> <code intentionally skipped for brevity>
> works out in about 1.640 s.

Wow! Almost another full second drop on my system!
Code:
parsed 1790905 stop times in 3.846 seconds
parsed 71091 trips in 155.000ms
This is how the fun in code optimizing should be! Great job, avk!
So basically the loading code on my system is now only about twice as slow as the Go counterpart. The original repo has its Go loading time at around 850ms, which is 0.45x the time needed on my system (about 1900ms). Assuming linear scaling, on the original repo's system it should take only about 1725ms, making it still the slowest among the mainstream languages, but at least still beating Deno and Elixir. Now on to the HTTP server...
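
For reference, the linear-scaling estimate can be written out with the figures quoted in this post, assuming the 3.846 s stop-times figure is the one being scaled (the post does not say so explicitly):

```latex
t_{\text{est}} \approx 3846\,\text{ms} \times \frac{850\,\text{ms}}{1900\,\text{ms}}
               \approx 3846\,\text{ms} \times 0.447
               \approx 1720\,\text{ms}
```

With the ratio rounded to 0.45 the result is about 1730 ms, so both variants agree with the "about 1725ms" stated above.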

avk

« Reply #29 on: November 19, 2022, 07:01:36 am »
Thank you. It seems possible to squeeze a little more out of TCSVDoc, at least my Windows version of this
Code: Pascal
procedure TCSVDoc.LoadFromFile(const aFileName: string);
var
  ms: specialize TGAutoRef<TMemoryStream>;
  p, pCell, pEnd: pChar;
  Size: SizeInt;
  Row: TStrList;
  pItem: TStrList.PItem;
begin
  ms.Instance.LoadFromFile(aFileName);
  if FCells = nil then
    FCells := TStrList2D.Create;
  FCells.Clear;
  p := ms.Instance.Memory;
  pEnd := p + ms.Instance.Size;
  pCell := nil;
  Size := -1;
  Row := nil;

  while p < pEnd do begin
    case p^ of
      ',':
        begin
          if pCell = nil then pCell := p;
          if Row = nil then begin
            Row := TStrList.Create(Max(Size, DEFAULT_CONTAINER_CAPACITY));
            FCells.Add(Row);
          end;
          pItem := Row.UncMutable[Row.Add('')];
          if p - pCell > 0 then begin
            SetLength(pItem^, p - pCell);
            Move(pCell^, PChar(pItem^)^, p - pCell);
          end;
          pCell := p + 1;
        end;
      #10, #13:
        if pCell <> nil then begin
          pItem := Row.UncMutable[Row.Add('')];
          if p - pCell > 0 then begin
            SetLength(pItem^, p - pCell);
            Move(pCell^, PChar(pItem^)^, p - pCell);
          end;
          if Size = -1 then
            Size := Row.Count;
          pCell := nil;
          Row := nil;
        end;
    else
      if pCell = nil then
        pCell := p;
    end;
    Inc(p);
  end;
  if pCell <> nil then begin
    pItem := Row.UncMutable[Row.Add('')];
    SetLength(pItem^, p - pCell);
    Move(pCell^, PChar(pItem^)^, p - pCell);
  end;
end;

shows a 7-8 percent performance improvement.

 
