Recent

Author Topic: What is the fastest way to make a copy of a TStringList  (Read 16929 times)

stoffman

  • Jr. Member
  • **
  • Posts: 67
Re: What is the fastest way to make a copy of a TStringList
« Reply #15 on: July 16, 2017, 01:39:43 pm »
@wp Thank you very much!

Of course this is an application specific optimization, yet setting
Code: Pascal  [Select][+][-]
  1. SortStyle:= TStringsSortStyle.sslNone;
  2.  


improves the performance on my machine by 20% 
« Last Edit: July 16, 2017, 01:48:30 pm by stoffman »

taazz

  • Hero Member
  • *****
  • Posts: 5368
Re: What is the fastest way to make a copy of a TStringList
« Reply #16 on: July 16, 2017, 02:02:13 pm »
So my question is what is the fastest way to make a copy of a TStringList? did anyone did a benchmark?
There's one in the attachment. It demonstrates that a plain old "for" loop copying string by string is fastest, faster than Assign. It shows also that there is some speed advantage if the StringList Capacity is set to the known size of the source list (of course...).
actually the test with setcapacity and then assign is mute. assign calls clear which in return sets capacity to 0. It would be better to use addstrings in that test.
Good judgement is the result of experience … Experience is the result of bad judgement.

OS : Windows 7 64 bit
Laz: Lazarus 1.4.4 FPC 2.6.4 i386-win32-win32/win64

wp

  • Hero Member
  • *****
  • Posts: 11830
Re: What is the fastest way to make a copy of a TStringList
« Reply #17 on: July 16, 2017, 03:32:12 pm »
actually the test with setcapacity and then assign is mute. assign calls clear which in return sets capacity to 0. It would be better to use addstrings in that test.
I did not investigate what's happening here, but the test tells the opposite: Setting Capacity and then calling AddStrings is slower than setting Capacity and calling Assign...

Setting SortStyle to sslNone does not result in a significant advantage (because it is already set by default).

In the attachment there's a new test with new test cases and more flexibility.

J-G

  • Hero Member
  • *****
  • Posts: 953
Re: What is the fastest way to make a copy of a TStringList
« Reply #18 on: July 16, 2017, 03:45:50 pm »
. It's Sunday and I have a F1 race to watch..
... and what a race !!!    a shame Max couldn't quite catch Kimi
FPC 3.0.0 - Lazarus 1.6 &
FPC 3.2.2  - Lazarus 2.2.0 
Win 7 Ult 64

taazz

  • Hero Member
  • *****
  • Posts: 5368
Re: What is the fastest way to make a copy of a TStringList
« Reply #19 on: July 16, 2017, 04:33:13 pm »
actually the test with setcapacity and then assign is mute. assign calls clear which in return sets capacity to 0. It would be better to use addstrings in that test.
I did not investigate what's happening here, but the test tells the opposite: Setting Capacity and then calling AddStrings is slower than setting Capacity and calling Assign...

Setting SortStyle to sslNone does not result in a significant advantage (because it is already set by default).

In the attachment there's a new test with new test cases and more flexibility.
depends on the run I guess, you need to increase the test time for any meaningful results to start showing, I got any thing from 105% to 125% with the assign with capacity being at 110~120%. Most of the time the difference between the two is negligible I'm runing an 100 iteration test now to see what happens, but yeah to my eyes it seems equal to assign with capacity at the moment. Considering that assing call addstrings .....

EDIT:
last post of the day. I made a small change
Code: Pascal  [Select][+][-]
  1.     testname := 'Write to and load from stream';
  2.     Write(testname + ': ':LEN);
  3.     t := now;
  4.     for j:=1 to Nit do begin
  5.       L2 := TStringList.Create;
  6.       stream := TMemoryStream.Create;
  7.       try
  8.         stream.SetSize(N*Len); //Pre allocate the memory.
  9.         L1.SaveToStream(stream);
  10.         stream.Position := 0;
  11.         L2.LoadFromStream(stream);
  12.       finally
  13.         stream.Free;
  14.         L2.Free;
  15.       end;
  16.     end;
  17.     t := Now - t;
  18.     LResults.AddObject(testname, TDouble.Create(t));
  19.     WriteLn(FormatDateTime('s.zzz" s"', t));
  20.  
last shot shows the results.
« Last Edit: July 16, 2017, 05:00:01 pm by taazz »
Good judgement is the result of experience … Experience is the result of bad judgement.

OS : Windows 7 64 bit
Laz: Lazarus 1.4.4 FPC 2.6.4 i386-win32-win32/win64

wp

  • Hero Member
  • *****
  • Posts: 11830
Re: What is the fastest way to make a copy of a TStringList
« Reply #20 on: July 16, 2017, 05:20:28 pm »
I think putting it all together these are the lessons learned:
  • Copying by adding the individual strings from one stringlist to the other is fastest (use Add).
  • Assign and AddStrings are a bit slower because they internally call AddObject which requires the list to find also the object assigned to an item. If Objects are not used as in most cases this is a waste of time. (I think this could be improved by seeking for the TStringItem record stored internally from which both string and object can be obtained).
  • It is advantageous in most cases to set the Capacity of the destination list to the string count of the source list.
  • Don't copy the Text property.
  • Don't use streams as intermediate storage (Initially, I had thought this would be faster).

Almir.Bispo

  • Jr. Member
  • **
  • Posts: 91
  • CSV Comp DB is the Best NoSQL
    • CSV Comp DB (NoSQL)
Re: What is the fastest way to make a copy of a TStringList
« Reply #21 on: July 16, 2017, 06:12:52 pm »
I've developed a NoSQL type database where iterations in ram before writing to hard disk are essentially done using the class Tstringlist. All kind of perfomance tests have been performed. I leave an example of iteration.

Code: Pascal  [Select][+][-]
  1. procedure TForm1.Button1Click(Sender: TObject);
  2. var s1,s2:Tstringlist;
  3.    i:integer;
  4.     T:Ttime;
  5. begin
  6. s1:=Tstringlist.create;
  7. s2:=Tstringlist.create;
  8. //populate
  9. for i:= 0 to 1000000 do
  10. begin
  11. s1.add(inttostr(i));
  12. end;
  13. //start timer count
  14. T:=Ttime(now);//init time
  15. for i:= 0 to s1.count-1 do
  16. begin
  17. s2.add(s1.strings[i]);
  18. end;
  19. //end timer count
  20. memo1.text:='Time:(ms): '+milisegundo(T);
  21. s1.free;
  22. s2.free;
  23. end;                      
  24.  
« Last Edit: July 16, 2017, 06:15:21 pm by Almir.Bispo »
CSV Comp DB Developer {Pascal Lover}

engkin

  • Hero Member
  • *****
  • Posts: 3112
Re: What is the fastest way to make a copy of a TStringList
« Reply #22 on: July 18, 2017, 07:18:00 am »
I think putting it all together these are the lessons learned:
  • Copying by adding the individual strings from one stringlist to the other is fastest (use Add).
  • Assign and AddStrings are a bit slower because they internally call AddObject which requires the list to find also the object assigned to an item. If Objects are not used as in most cases this is a waste of time. (I think this could be improved by seeking for the TStringItem record stored internally from which both string and object can be obtained).
  • It is advantageous in most cases to set the Capacity of the destination list to the string count of the source list.
  • Don't copy the Text property.
  • Don't use streams as intermediate storage (Initially, I had thought this would be faster).


Nice summary. Using BeginUpdate/EndUpdate might give a little improvement.

To get the fastest possible copy, I hacked the class to get access to L1.FList. Made a copy of it for L2. Then increased the references of all the strings directly. This way was more than three times faster on my test computer.

On the other hand, I think these methods are not producing thread safe copy of TStringList because it shares the same strings with the main list.
« Last Edit: July 19, 2017, 06:35:50 pm by engkin »

Thaddy

  • Hero Member
  • *****
  • Posts: 14159
  • Probably until I exterminate Putin.
« Last Edit: July 18, 2017, 08:28:24 am by Thaddy »
Specialize a type, not a var.

engkin

  • Hero Member
  • *****
  • Posts: 3112
Re: What is the fastest way to make a copy of a TStringList
« Reply #24 on: July 19, 2017, 06:34:37 pm »
'because the reference count is "thread safe" '

Thanks Thaddy for both links.

taazz

  • Hero Member
  • *****
  • Posts: 5368
Re: What is the fastest way to make a copy of a TStringList
« Reply #25 on: July 19, 2017, 06:45:11 pm »
'because the reference count is "thread safe" '
it is, as long as you do it before assigning the reference to your variable and after you empty your variable from the reference, other wise while the reference counting mechanism remains thread safe your code can end up referencing an unallocated string or prematurely releasing a string.
Good judgement is the result of experience … Experience is the result of bad judgement.

OS : Windows 7 64 bit
Laz: Lazarus 1.4.4 FPC 2.6.4 i386-win32-win32/win64

Thaddy

  • Hero Member
  • *****
  • Posts: 14159
  • Probably until I exterminate Putin.
Re: What is the fastest way to make a copy of a TStringList
« Reply #26 on: July 19, 2017, 06:50:54 pm »
The real killer blow in that information is that strings with a refcount > 1 are immutable..... Not that the refcount is threadsafe (although that is true). And the internal use of UniqueString... O, Well..< for you two the old version of Fleetwood Mac's great song [edit] is now even playing on original vinyl>
« Last Edit: July 19, 2017, 06:53:44 pm by Thaddy »
Specialize a type, not a var.

taazz

  • Hero Member
  • *****
  • Posts: 5368
Re: What is the fastest way to make a copy of a TStringList
« Reply #27 on: July 19, 2017, 06:58:52 pm »
The real killer blow in that information is that strings with a refcount > 1 are immutable..... Not that the refcount is threadsafe (although that is true). And the internal use of UniqueString... O, Well..< for you two the old version of Fleetwood Mac's great song>
Technically you are correct, copy on write creates immutable behavior but only when the reference count is >1 which is not the same as immutable by design.

PS
  Being the thick headed I am I give permission to pound in me the facts if I missed them. As for the music reference I'll have to look it up :-[
Good judgement is the result of experience … Experience is the result of bad judgement.

OS : Windows 7 64 bit
Laz: Lazarus 1.4.4 FPC 2.6.4 i386-win32-win32/win64

engkin

  • Hero Member
  • *****
  • Posts: 3112
Re: What is the fastest way to make a copy of a TStringList
« Reply #28 on: July 19, 2017, 07:04:27 pm »
The real killer blow in that information is that strings with a refcount > 1 are immutable..... Not that the refcount is threadsafe (although that is true). And the internal use of UniqueString... O, Well..< for you two the old version of Fleetwood Mac's great song [edit] is now even playing on original vinyl>

You are a beast. But now I have to ask why the changes to the refcount are made threadsafe?

Thaddy

  • Hero Member
  • *****
  • Posts: 14159
  • Probably until I exterminate Putin.
Re: What is the fastest way to make a copy of a TStringList
« Reply #29 on: July 19, 2017, 07:05:57 pm »
That's a lock.... :-X https://www.youtube.com/watch?v=b19PcuJsQbA

They are made threadsafe for the obvious reason that immutables should be copied when they become mutants. We're not programming Turtle.... (well, actually I still do) No, it is simply done because it is a cheap protection with regard to multi-threading. On a very low-level compiler level.
« Last Edit: July 19, 2017, 07:17:47 pm by Thaddy »
Specialize a type, not a var.

 

TinyPortal © 2005-2018