Author Topic: What is faster string concatenation  (Read 3515 times)

lainz

  • Hero Member
  • *****
  • Posts: 4460
    • https://lainz.github.io/
What is faster string concatenation
« on: December 10, 2019, 01:24:30 am »
Hi, is extending a string many times with str := str + something slower than using TStringList.Add and then joining everything at the end with the TStringList.Text property?

winni

  • Hero Member
  • *****
  • Posts: 3197
Re: What is faster string concatenation
« Reply #1 on: December 10, 2019, 01:50:58 am »
Hi!

I do know that very long strings hurt performance horribly.

How long will the final string be in the end?

Winni
« Last Edit: December 10, 2019, 02:49:17 am by winni »

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 9791
  • Debugger - SynEdit - and more
    • wiki
Re: What is faster string concatenation
« Reply #2 on: December 10, 2019, 02:25:20 am »
Appending to a string often means that the new string needs more memory than the old one. If there is not enough free memory at the end of the string (assuming a smart enough memory manager), it has to be relocated.

So appending over and over again means that the string is copied to new memory over and over.

Collecting all parts in a TStringList only copies the pointers. But in the end one big string must be created.
I am not sure if TStringList optimizes this in the final join.

To do the join by hand, first calculate the total size needed:
SetLength(target, totalsize);

and then, for each string, copy it into place and advance insert_pos by its length:
move(to_be_appended[1], target[insert_pos], length(to_be_appended));

winni

  • Hero Member
  • *****
  • Posts: 3197
Re: What is faster string concatenation
« Reply #3 on: December 10, 2019, 03:32:09 am »
Hi!

To speed it up do it the old school Turbo-trick way:

Code: Pascal
// Needs the Dialogs unit (LCL) for ShowMessage.
// Move copies bytes, so this assumes String is a single-byte AnsiString.
procedure ConcatDemo;
var
  a : array[0..2] of String = (
    'This is a string ',
    'This is the middle ',
    'The End 📉 '
    );
  dest : String;
  idx, k, len : Integer;
begin
  // First pass: total length, so dest is allocated only once.
  len := Length(a[0]) + Length(a[1]) + Length(a[2]);
  SetLength(dest, len);
  idx := 1;
  for k := 0 to 2 do
    if Length(a[k]) > 0 then   // a[k,1] is not valid for an empty string
    begin
      Move(a[k,1], dest[idx], Length(a[k]));
      Inc(idx, Length(a[k]));
    end;
  ShowMessage(dest);
end;

Winni
« Last Edit: December 10, 2019, 03:34:34 am by winni »

PascalDragon

  • Hero Member
  • *****
  • Posts: 5446
  • Compiler Developer
Re: What is faster string concatenation
« Reply #4 on: December 10, 2019, 09:22:07 am »
FPC 3.2 and newer come with the TStringBuilder class, which should handle this in the most performant way (it might not do so currently, but that can be improved then :) ).
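
For readers who have not used the class, here is a minimal sketch of repeated appends through TStringBuilder. It assumes FPC 3.2 or newer; the helper name RepeatWithBuilder and the test values are made up for illustration, and nothing here says how fast the class actually is.

```pascal
program StringBuilderDemo;

{$mode objfpc}{$H+}

uses
  SysUtils; // TStringBuilder is declared in SysUtils in FPC 3.2+

// Hypothetical helper: appends Part to a builder Count times
// and returns the accumulated string.
function RepeatWithBuilder(const Part: string; Count: Integer): string;
var
  SB: TStringBuilder;
  I: Integer;
begin
  SB := TStringBuilder.Create;
  try
    for I := 1 to Count do
      SB.Append(Part);      // grows the internal buffer as needed
    Result := SB.ToString;  // one final copy into a regular string
  finally
    SB.Free;
  end;
end;

begin
  // Prints the length of the joined string.
  WriteLn(Length(RepeatWithBuilder('abc', 1000)));
end.
```

Whether this beats a TStringList or a hand-written Move loop for a given workload is something to measure rather than assume.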

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11382
  • FPC developer.
Re: What is faster string concatenation
« Reply #5 on: December 10, 2019, 10:50:47 am »
Quote from: Martin_fr
I am not sure if TStringList optimizes this in the final join.

Yes. It walks the array twice: first to calculate the length, then to move the strings.

This makes it less codepage aware though, compared to the concatenation, so be careful if "string" is not your most used string type.

BeniBela

  • Hero Member
  • *****
  • Posts: 905
    • homepage
Re: What is faster string concatenation
« Reply #6 on: December 10, 2019, 06:31:49 pm »
Quote from: PascalDragon
FPC 3.2 and newer comes with the TStringBuilder class which should handle this in the most performant way (it might not do currently, but that can be improved then :) ).

OMG, that is completely broken. Do not use it.

Code: Pascal
procedure TStringBuilder.DoAppend(const S: {$IFDEF SBUNICODE}SBString{$ELSE}RawByteString{$ENDIF});

Var
  L,SL : Integer;

begin
  SL:=System.Length(S);
  if SL>0 then
    begin
    L:=Length;
    Length:=L+SL;
    Move(S[1], FData[L],SL*SizeOf(SBChar));
    end;
end;

procedure TStringBuilder.SetLength(AValue: Integer);

begin
  CheckNegative(AValue,'AValue');
  CheckRange(AValue,0,MaxCapacity);
  if AValue>Capacity then
    Grow;
  Flength:=AValue;
end;


procedure TStringBuilder.Grow;

var
  NewCapacity: SizeInt;

begin
  NewCapacity:=Capacity*2;
  if NewCapacity>MaxCapacity then
    NewCapacity:=MaxCapacity;
  Capacity:=NewCapacity;
end;

If the buffer is too small, it only grows by doubling its size, even if the appended string is more than twice as long.
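
One way to avoid that problem is to keep doubling until the requested length fits. The sketch below is only an illustration of that idea, not the code that went into FPC; the function GrownCapacity and its parameters are hypothetical names.

```pascal
program GrowSketch;

{$mode objfpc}{$H+}

// Hypothetical helper, not FPC's implementation: returns a capacity that is
// at least Needed by doubling Current, clamped to MaxCap.
// If Needed > MaxCap the result is MaxCap and the caller must reject the
// request, as the original code does with CheckRange.
function GrownCapacity(Current, Needed, MaxCap: SizeInt): SizeInt;
begin
  Result := Current;
  if Result < 1 then
    Result := 1;               // doubling zero would never make progress
  while Result < Needed do
  begin
    if Result > MaxCap div 2 then
    begin
      Result := MaxCap;        // clamp instead of overflowing past MaxCap
      Break;
    end;
    Result := Result * 2;
  end;
end;

begin
  WriteLn(GrownCapacity(64, 65, High(SizeInt)));   // prints 128
  WriteLn(GrownCapacity(64, 1000, High(SizeInt))); // prints 1024
end.
```

With a single doubling, the second call would stop at 128 even though 1000 characters are about to be written.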

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11382
  • FPC developer.
Re: What is faster string concatenation
« Reply #7 on: December 10, 2019, 06:38:48 pm »
Quote from: BeniBela
If the buffer is too small, it only grows by doubling its size, even if the appended string is more than twice as long.

Worse, the exponential growth is never capped, so at some point it will grow by gigabytes at a time, which only increases fragmentation. Please file a bug.

Added later: Note that SetLength takes the length parameter into account, so that means (oldlength+currentlyaddedlength)*2. Not that bad IMHO.
« Last Edit: December 10, 2019, 07:00:18 pm by marcov »

Thaddy

  • Hero Member
  • *****
  • Posts: 14197
  • Probably until I exterminate Putin.
Re: What is faster string concatenation
« Reply #8 on: December 10, 2019, 06:42:44 pm »
Heey, they are Pascal strings..... ONE READ per string. Length is stored......
Specialize a type, not a var.

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11382
  • FPC developer.
Re: What is faster string concatenation
« Reply #9 on: December 10, 2019, 06:48:27 pm »
Quote from: Thaddy
Heey, they are Pascal strings..... ONE READ per string. Length is stored......

I've no idea how this fits into the context. Please elaborate.

lainz

  • Hero Member
  • *****
  • Posts: 4460
    • https://lainz.github.io/
Re: What is faster string concatenation
« Reply #10 on: December 11, 2019, 12:54:11 am »
Thanks! I also used TStringList because it makes it simpler to create an SQL statement with the DelimitedText property.

The string was not that big, just an SQL INSERT statement with at most 20 or 30 columns.
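
As a rough illustration of that approach, the sketch below builds an INSERT statement from a column list with DelimitedText. The helper BuildInsert, the table and column names, and the use of :name parameters are all assumptions made for the example, not the code actually used here. It also assumes the names are trusted identifiers without commas or quote characters, because DelimitedText would otherwise wrap them in QuoteChar.

```pascal
program DelimitedSqlDemo;

{$mode objfpc}{$H+}

uses
  Classes;

// Hypothetical helper: joins column names and matching :name parameters
// with TStringList.DelimitedText to form a parameterized INSERT statement.
function BuildInsert(const Table: string; const Columns: array of string): string;
var
  Cols, Params: TStringList;
  I: Integer;
begin
  Cols := TStringList.Create;
  Params := TStringList.Create;
  try
    Cols.Delimiter := ',';
    Cols.StrictDelimiter := True;    // only the delimiter and quote char trigger quoting
    Params.Delimiter := ',';
    Params.StrictDelimiter := True;
    for I := Low(Columns) to High(Columns) do
    begin
      Cols.Add(Columns[I]);
      Params.Add(':' + Columns[I]);  // values are bound later, not concatenated
    end;
    Result := 'INSERT INTO ' + Table + ' (' + Cols.DelimitedText +
              ') VALUES (' + Params.DelimitedText + ')';
  finally
    Params.Free;
    Cols.Free;
  end;
end;

begin
  WriteLn(BuildInsert('person', ['id', 'name', 'age']));
end.
```

For 20 or 30 short column names the join cost is negligible either way, so readability is a reasonable deciding factor.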

MvC

  • New Member
  • *
  • Posts: 25
    • Free Pascal Core team member
Re: What is faster string concatenation
« Reply #11 on: December 12, 2019, 04:10:52 pm »
Fixed the bug in stringbuilder.

https://bugs.freepascal.org/view.php?id=36425

Tz

  • Jr. Member
  • **
  • Posts: 54
  • Tz with FPC Pen Cil
Re: What is faster string concatenation
« Reply #12 on: December 26, 2019, 10:21:07 pm »
Setting an initial capacity is much better   :)
sl := TStringList.Create;
sl.Capacity := INIT_SIZE;

 
