Recent

Author Topic: TList<string> VS. TStringList  (Read 2385 times)

stoffman

  • Jr. Member
  • **
  • Posts: 67
TList<string> VS. TStringList
« on: January 21, 2021, 05:06:45 pm »
Hi,

I'm doing some string processing in my application, it doesn't have to interface with a UI and needs to be compatible with Delphi... So what would you prefer using for handling list of strings TStringList or TList<string> ? and why?

Did someone benched mark operation on them? memory consumption?

Thanks.

howardpc

  • Hero Member
  • *****
  • Posts: 4144
Re: TList<string> VS. TStringList
« Reply #1 on: January 21, 2021, 05:23:16 pm »
You don't specify what "processing" exactly your strings need to undergo.
However, don't overlook a simple dynamic array of strings.

Such arrays are lightweight compared to most list classes, and their memory usage is managed for you by the compiler. How many strings are you dealing with?

ASerge

  • Hero Member
  • *****
  • Posts: 2212
Re: TList<string> VS. TStringList
« Reply #2 on: January 21, 2021, 05:31:53 pm »
So what would you prefer using for handling list of strings TStringList or TList<string> ? and why?
TList<string> will take a little less RAM due to the absence of the Objects field, but much more in the executable file, because in the UI program, TStringList is definitely already used, but TList<string> is not. In addition, if I'm not mistaken, so far FPC does not combine the specialized generics for the same type from different modules.
Both use range checking and data modification events.
In my opinion, generics usually complicates the program. I recommend TStringList.

stoffman

  • Jr. Member
  • **
  • Posts: 67
Re: TList<string> VS. TStringList
« Reply #3 on: January 21, 2021, 06:54:52 pm »
@Aserge

Amazing. Every unit that makes use of TList<string> adds 10k for the executable! wow. That really caught me by surprise.


@howardpc

I really don't like the the ergonomics of dynamic arrays...

JuhaManninen

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 4458
  • I like bugs.
Re: TList<string> VS. TStringList
« Reply #4 on: January 21, 2021, 08:35:11 pm »
I agree that TStringList makes more sense than TList<string>. Generics are useful for types which do not have built-in container- etc. types.

Amazing. Every unit that makes use of TList<string> adds 10k for the executable! wow. That really caught me by surprise.
If you define a new type only in one place, like:
Code: Pascal  [Select][+][-]
  1. TMyStringList = specialize TList<string>;
and use it in other units, I think it will add 10k total.

Anyway, if you only add strings to a list and later read then out, TStringList is good. Often however it is used in wrong places which leads to poor performance.
If you want to search for strings fast, you sort the list. Then Find() and IndexOf() use binary search O(log2(n)). Good.
In some use cases you must add more items to a list and then search again. Adding items to a sorted list is very slow. First it does a binary search and then moves a big chunk of existing items a step forward. Same problem when deleting items.
Many Delphi programs do that because traditionally there were no good container libraries.
Now there are choices. For lots of data a HashMap is superior. It uses more memory than some other containers and resizing is "expensive".
A good compromise is a balanced tree. It does a fast binary search when adding and retrieving, but does not need to move data around.
Unit AvgLvlTree in LazUtils has a nice optimized TStringMap, and others like TStringToPointerTree and similar.
LazUtils also has unit LookupStringList with a hybrid unsorted StringList with a fast lookup feature. I don't know if anybody uses it.
« Last Edit: January 22, 2021, 04:35:19 pm by JuhaManninen »
Mostly Lazarus trunk and FPC 3.2 on Manjaro Linux 64-bit.

stoffman

  • Jr. Member
  • **
  • Posts: 67
Re: TList<string> VS. TStringList
« Reply #5 on: January 22, 2021, 01:27:17 pm »
@JuhaManninen

Thanks for the detailed answer. In my case a simple list is really all I need.  Defining a new type really did the trick and the exe size was reduced.

 

AlexTP

  • Hero Member
  • *****
  • Posts: 2365
    • UVviewsoft
Re: TList<string> VS. TStringList
« Reply #6 on: January 22, 2021, 02:15:36 pm »
>O(log2(n)).

Since the log2(x) is log10(x)*some_const, and for any X: logX is log10*some_const, we can write quoted text shorted:

O(log(n))

JuhaManninen

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 4458
  • I like bugs.
Re: TList<string> VS. TStringList
« Reply #7 on: January 22, 2021, 04:41:32 pm »
Since the log2(x) is log10(x)*some_const, and for any X: logX is log10*some_const, we can write quoted text shorted:
O(log(n))
Yes, I guess that is the right Big-O notation. I don't remember details.
Big-O also includes only the highest degree exponent term and leaves out any constant coefficients.
It is all about scalability.
Mostly Lazarus trunk and FPC 3.2 on Manjaro Linux 64-bit.

 

TinyPortal © 2005-2018