Philosophically speaking: TFPSortedObjectList or TFPStringList?

EganSolo

Sr. Member
Posts: 290

Philosophically speaking: TFPSortedObjectList or TFPStringList?

« on: October 03, 2021, 10:24:32 am »

So I need an object list that can stay sorted.

Start Paren:
In case you were like me at some point and you didn't know about TFPObjectList and TFPStringList, these are cousins to TObjectList and TStringList but without event-related overhead: they don't notify when an action is taken.
End Paren.

I need an object list that can stay sorted. I could be lazy and use TFPStringList and add a field to my objects to act as a unique string, or I could write a descendant to TFPObjectList that has a sorted property and in case sorted is true, IndexOf would rely on a Compare function to do a binary search over the list.

The reason why I'm hesitant to do that is that when I googled "free pascal Lazarus TFPSortedObjectList TSortedFPObjectList TSortedObjectList" Mr. Google came back empty-handed. Am I the _only_ one who needs a sorted object list?

Suddenly I feel like the guy who looks up and finds out that he's alone in a saloon....

So is it then the case that everyone else is just using TStringList?

What am I missing?

Egan

Logged

MarkMLl

Hero Member
Posts: 6686

Re: Philosophically speaking: TFPSortedObjectList or TFPStringList?

« Reply #1 on: October 03, 2021, 10:48:31 am »

So why are you assuming that somebody who'd done the work would use the names you've apparently pulled out of a hat, and that all of those names would appear on the same page (which is what you've asked for)?

Your curiosity is entirely valid, but you're assuming that Google is an oracle that can divine your intention: it's not, it's there to sell advertising services, and there's more than enough fuzziness in it already.

MarkMLl

Logged

MT+86 & Turbo Pascal v1 on CCP/M-86, multitasking with LAN & graphics in 128Kb.
Pet hate: people who boast about the size and sophistication of their computer.
GitHub repositories: https://github.com/MarkMLl?tab=repositories

Thaddy

Hero Member
Posts: 14373
Sensorship about opinions does not belong here.

Re: Philosophically speaking: TFPSortedObjectList or TFPStringList?

« Reply #2 on: October 03, 2021, 12:40:40 pm »

You can maybe use a TFPGMap or TFpgMapObject,
Example:

Code: Pascal [Select][+]

{$mode objfpc}{$H+}
{$mode objfpc}{$H+}{$warn 3123 off}{$warn 6058 off}
Uses fgl;
type
  TMySortedObjectList = specialize TFpgMapObject<string,TObject>;
  
var
  MyList:TMySortedObjectList;  
begin
  MyList := TMySortedObjectList.Create(true);
  MyList.Sorted:=true;
  try
    MyList.AddOrSetData('test',nil); // using this prevents duplicate keys.
    Writeln(MyList.Keys[0]); // writes 'test'
    MyList.AddOrSetData('best',nil);
    Writeln(MyList.Keys[0]); // since it is sorted, now prints 'best'
  finally
    MyList.Free;
  end;
end.

I guess this already does what you want.

« Last Edit: October 03, 2021, 01:32:23 pm by Thaddy »

Logged

Object Pascal programmers should get rid of their "component fetish" especially with the non-visuals.

avk

Hero Member
Posts: 752

Re: Philosophically speaking: TFPSortedObjectList or TFPStringList?

« Reply #3 on: October 03, 2021, 01:45:06 pm »

@EganSolo, just FYI, adding items to an array based sorted list usually has a time complexity of O(N^2).

Logged

Thaddy

Hero Member
Posts: 14373
Sensorship about opinions does not belong here.

Re: Philosophically speaking: TFPSortedObjectList or TFPStringList?

« Reply #4 on: October 03, 2021, 01:46:04 pm »

Slight variation:

Code: Pascal [Select][+]

{$mode objfpc}{$H+}{$warn 3123 off}{$warn 6058 off}
{$macro on}
// do not allow duplicate keys macro}
{$define add:=AddOrSetData}
Uses fgl;
type
  TMySortedObjectList = specialize TFpgMapObject<string,TObject>;
  
var
  MyList:TMySortedObjectList;  
begin
  MyList := TMySortedObjectList.Create(true);
  MyList.Sorted:=true;
  try
    MyList.Add('test',TObject.create); // using this prevents duplicate keys.
    Writeln(MyList.Keys[0]); // writes 'test'
    MyList.Add('best',TObject.create);
    Writeln(MyList.Keys[0]); // since it is sorted, now prints 'best'
  finally
    MyList.Free;
  end;
end.

If you want that...

Logged

Object Pascal programmers should get rid of their "component fetish" especially with the non-visuals.

Thaddy

Hero Member
Posts: 14373
Sensorship about opinions does not belong here.

Re: Philosophically speaking: TFPSortedObjectList or TFPStringList?

« Reply #5 on: October 03, 2021, 01:51:16 pm »

Quote from: avk on October 03, 2021, 01:45:06 pm

@EganSolo, just FYI, adding items to an array based sorted list usually has a time complexity of O(N^2).

That is true for worst case. Best case is O(n*log2(n) ( given a quicksort in both cases).
Average case is O(n log n);

Theory is well explained here: https://iq.opengenus.org/time-and-space-complexity-of-quick-sort

Anyway you can expect a penalty as opposed unsorted.

« Last Edit: October 03, 2021, 01:58:47 pm by Thaddy »

Logged

Object Pascal programmers should get rid of their "component fetish" especially with the non-visuals.

Warfley

Hero Member
Posts: 1499

Re: Philosophically speaking: TFPSortedObjectList or TFPStringList?

« Reply #6 on: October 03, 2021, 02:10:18 pm »

If I need a sorted list I usually take TSet from the GSet unit. It makes use of an Red-Black Tree structure on an Array for storing the data. It has some advantages over most other implementations, first it can give you pointer access to the array (avoiding copies, very neat for large records) and it takes the sorting function as generic parameter, making it inlinable (which the function pointer based approaches of structures like TStringList can't). Also it is very lightweight, meaning it doesn't have so much additional functionality like events and customization options, which of course also take their time to be checked and implemented
But it does not contain functionality to automatically free the objects like TObjectList has.

Also, what exactly do you need it for, maybe some other structure (like a min-heap) can serve you just as well.

Quote from: avk on October 03, 2021, 01:45:06 pm

@EganSolo, just FYI, adding items to an array based sorted list usually has a time complexity of O(N^2).

But thanks to the large caches modern CPUs today have (mine has 40mb), it is usually (that means when the list is rather small, like only a few pages of size) still much faster than linked non continous datastructures.
One could create more complex datastructures that are linked and still make use of cache locality (e.g. LinkedLists in Java usually are implemented as a linked list of arrays whose size is optimized for the CPU to maximize cache locality). But as far as I know there aren't many such implementations for FreePascal around. So an Array based list is probably the best you get.
Asymptotic complexity is only half the story, often an asymptotically bad algorithm can in real world scenarios, thanks to low static/implementation based overhead, outperform a asymptotically better algorithm.

« Last Edit: October 03, 2021, 02:13:34 pm by Warfley »

Logged

GitHub: https://github.com/Warfley

avk

Hero Member
Posts: 752

Re: Philosophically speaking: TFPSortedObjectList or TFPStringList?

« Reply #7 on: October 03, 2021, 03:00:48 pm »

Just wondering at what point the sorted list turned into a sorted set? And the reasoning about processor caches seems to require at least some confirmation.

« Last Edit: October 03, 2021, 04:11:45 pm by avk »

Logged

Warfley

Hero Member
Posts: 1499

Re: Philosophically speaking: TFPSortedObjectList or TFPStringList?

« Reply #8 on: October 03, 2021, 05:21:21 pm »

Quote from: avk on October 03, 2021, 03:00:48 pm

Just wondering at what point the sorted list turned into a sorted set. And the reasoning about processor caches seems to require at least some confirmation.

TSet is part of the FCL-STL package and is basically a reimplementation of the C++ set datatype. It's just the name C++ gave the sorted datastructure. It can not hold duplicate values, for this the multiset type exists. It has nothing to do with the (bit) sets like in pascal.

About the caching, there is this great talk by Bjarne Stroustrup that discusses this: Link
Basically as the linear search time through the list dominates the runtime, arrays which make use of caching are much faster at that and therefore overall faster

And if we are talking about sorted lists, arrays are even better, because each binary search iteration can be done in constant time (as random access on arrays is in constant time), while all the distances need to be traversed on linked lists (and if the linked list isn't double linked, this it always needs to search from the root, making it O(n) rather than O(logn))

Linked lists are usually only found in places where large continous allocations are not possible like in the kernel. When you still find linked lists in the wild they are usually highly optimized (e.g. by having them allocate using a stack allocator with swap delete in the background, or linking small chunks of multiple objects together, e.g. in Java)

Also arrays have another massive advantage, you can preallocate the memory or perform geometric growth, which can result in massive speedups if the heap is highly fragmented, and many small allocations lead to long search times for the memory allocation

« Last Edit: October 03, 2021, 05:27:27 pm by Warfley »

Logged

GitHub: https://github.com/Warfley

avk

Hero Member
Posts: 752

Re: Philosophically speaking: TFPSortedObjectList or TFPStringList?

« Reply #9 on: October 03, 2021, 05:49:00 pm »

Quote from: Warfley on October 03, 2021, 05:21:21 pm

TSet is part of the FCL-STL package and is basically a reimplementation of the C++ set datatype. It's just the name C++ gave the sorted datastructure. It can not hold duplicate values, for this the multiset type exists. It has nothing to do with the (bit) sets like in pascal.
...

So what do you find it looks like a sorted list?

Quote from: Warfley on October 03, 2021, 05:21:21 pm

...
About the caching, there is this great talk by Bjarne Stroustrup that discusses this: Link
Basically as the linear search time through the list dominates the runtime, arrays which make use of caching are much faster at that and therefore overall faster

And if we are talking about sorted lists, arrays are even better, because each binary search iteration can be done in constant time (as random access on arrays is in constant time), while all the distances need to be traversed on linked lists (and if the linked list isn't double linked, this it always needs to search from the root, making it O(n) rather than O(logn))
...

Finding items is great, but I think I was talking about adding, not finding items.

Logged

Warfley

Hero Member
Posts: 1499

Re: Philosophically speaking: TFPSortedObjectList or TFPStringList?

« Reply #10 on: October 03, 2021, 06:02:56 pm »

Quote from: avk on October 03, 2021, 05:49:00 pm

So what do you find it looks like a sorted list?

It is a sorted list in everything but name.

Quote from: avk on October 03, 2021, 05:49:00 pm

Finding items is great, but I think I was talking about adding, not finding items.

But to add you need to first do a search. Let's say I have a linked list and want to add an item at position n, then I need to traverse from the root to the (n-1)th element to insert the newly created element right after it. Adding in linked lists is only faster, if you already have an element and want to add an element after (or before in case of a double linked list) that. If you just want to add an element to an arbitrary position, you first need to find that position.

Logged

GitHub: https://github.com/Warfley

avk

Hero Member
Posts: 752

Re: Philosophically speaking: TFPSortedObjectList or TFPStringList?

« Reply #11 on: October 03, 2021, 06:26:05 pm »

Quote from: Warfley on October 03, 2021, 06:02:56 pm

...
It is a sorted list in everything but name.
...

Do you think list items should have a "name" attribute?

Quote from: Warfley on October 03, 2021, 06:02:56 pm

...
But to add you need to first do a search. Let's say I have a linked list and want to add an item at position n, then I need to traverse from the root to the (n-1)th element to insert the newly created element right after it. Adding in linked lists is only faster, if you already have an element and want to add an element after (or before in case of a double linked list) that. If you just want to add an element to an arbitrary position, you first need to find that position.

Well, and what is it in the end? Without a doubt, searching in a sorted array is O(LogN), but I was talking about adding and this is O(N), and which will win?

Logged

Warfley

Hero Member
Posts: 1499

Re: Philosophically speaking: TFPSortedObjectList or TFPStringList?

« Reply #12 on: October 03, 2021, 06:49:50 pm »

Quote from: avk on October 03, 2021, 06:26:05 pm

Well, and what is it in the end? Without a doubt, searching in a sorted array is O(LogN), but I was talking about adding and this is O(N), and which will win?

Asymptotic notation does not tell you how fast an operation will be, it tells you about the runtime growth with increasing input size. O(N) will for some input size be slower than O(logN) but if that input size threshold is 2^800 bytes, we are a long way before there will be any PC that has even enough memory to have such an input.

Take searching an index in a sorted list. Using binary search you need to do logarithmic many lookups. A lookup on an array is a single operation, while a lookup on a linked list is a traversal from the root to the element. So you have in a linked list a lookup that is in O(logN) the same asymptotic runtime you have in an array, but the static overhead of each access is much larger for the linked list than for the array.
When adding an object to the list you need to potentially move all the elements of the list backwards, which is in O(N).
So the runtime is c1*logN + c2 for adding in the linked list and c3*logN + c4*N for the array.
While for every c1, c2, c3 and c4 there will be an N such that c1*logN + c2 < c3*logN + c4*N, this does not mean that this holds for all N.

The question is not how efficient an algorithm is asymptotically but how efficient it is in a real world scenario. If you only have lists that do not exceed 100 elements, it doesn't matter how well a datastructure performs for arbitrary N, it only matters how efficient it is on N <= 100

« Last Edit: October 03, 2021, 06:52:49 pm by Warfley »

Logged

GitHub: https://github.com/Warfley

avk

Hero Member
Posts: 752

Re: Philosophically speaking: TFPSortedObjectList or TFPStringList?

« Reply #13 on: October 03, 2021, 08:02:39 pm »

Quote from: Warfley on October 03, 2021, 06:49:50 pm

...
The question is not how efficient an algorithm is asymptotically but how efficient it is in a real world scenario. If you only have lists that do not exceed 100 elements, it doesn't matter how well a datastructure performs for arbitrary N, it only matters how efficient it is on N <= 100

Whose question is this? Did someone ask it?

Logged

Warfley

Hero Member
Posts: 1499

Re: Philosophically speaking: TFPSortedObjectList or TFPStringList?

« Reply #14 on: October 03, 2021, 08:52:41 pm »

If the question is what datastructure to use, the question what is the most efficient in a real world scenario is pretty much the most important question. What is most efficient asymptotically is only of relevance with respect of answering the former question

Logged

GitHub: https://github.com/Warfley

Lazarus

Bookstore

Search

Recent

Author Topic: Philosophically speaking: TFPSortedObjectList or TFPStringList? (Read 12595 times)

EganSolo

Philosophically speaking: TFPSortedObjectList or TFPStringList?

MarkMLl

Re: Philosophically speaking: TFPSortedObjectList or TFPStringList?

Thaddy

Re: Philosophically speaking: TFPSortedObjectList or TFPStringList?

avk

Re: Philosophically speaking: TFPSortedObjectList or TFPStringList?

Thaddy

Re: Philosophically speaking: TFPSortedObjectList or TFPStringList?

Thaddy

Re: Philosophically speaking: TFPSortedObjectList or TFPStringList?

Warfley

Re: Philosophically speaking: TFPSortedObjectList or TFPStringList?

avk

Re: Philosophically speaking: TFPSortedObjectList or TFPStringList?

Warfley

Re: Philosophically speaking: TFPSortedObjectList or TFPStringList?

avk

Re: Philosophically speaking: TFPSortedObjectList or TFPStringList?

Warfley

Re: Philosophically speaking: TFPSortedObjectList or TFPStringList?

avk

Re: Philosophically speaking: TFPSortedObjectList or TFPStringList?

Warfley

Re: Philosophically speaking: TFPSortedObjectList or TFPStringList?

avk

Re: Philosophically speaking: TFPSortedObjectList or TFPStringList?

Warfley

Re: Philosophically speaking: TFPSortedObjectList or TFPStringList?

	Computer Math and Games in Pascal (preview)
	Lazarus Handbook