Recent

Author Topic: Sets Vs Arrays. When to use and why.  (Read 1806 times)

OC DelGuy

  • Full Member
  • ***
  • Posts: 197
Sets Vs Arrays. When to use and why.
« on: March 07, 2025, 10:08:07 pm »
I looked on YouTube, the FPC Wiki and Tutorials Point and there's not much on the subject.  Is there a video, website or PDF that compares the two and gives examples on when to use one over the other?
Free Pascal Lazarus Version #: 2.2.4
Date: 24 SEP 2022
FPC Version: 3.2.2
Revision: Lazarus_2_2_4
x86_64-win64-win32/win64

440bx

  • Hero Member
  • *****
  • Posts: 5178
Re: Sets Vs Arrays. When to use and why.
« Reply #1 on: March 07, 2025, 10:29:28 pm »
Is there a video, website or PDF that compares the two and gives examples on when to use one over the other?
I am not aware of one of those but, here are a few "guidelines" for you.

Use a set when:

1. the order in which the elements occur does not matter (because sets don't keep track of element order.)
2. there are no more than 256 elements in the set (this is a current limitation which may change in the future.)
3. the first element has an ordinal of zero and the last an ordinal of 255

Use an array when:

1. the order of elements matter, e.g, country flag colors.
2. there can be more than 256 elements, e.g, the collection of employees that went to work on a particular day in a multinational company.
3. a collection of elements has indexes that straddles the 255 limit, e.g, a number range such as 4,000,000..4,000,100 (commas present for legibility only) This last restriction could also be removed but I don't believe removing it has even been considered.

Off the top of my head, those are the main reasons.

HTH.
« Last Edit: March 07, 2025, 10:31:09 pm by 440bx »
(FPC v3.0.4 and Lazarus 1.8.2) or (FPC v3.2.2 and Lazarus v3.2) on Windows 7 SP1 64bit.

korba812

  • Sr. Member
  • ****
  • Posts: 464
Re: Sets Vs Arrays. When to use and why.
« Reply #2 on: March 07, 2025, 11:17:40 pm »
Another difference is that properties of the set type can be published.

n7800

  • Sr. Member
  • ****
  • Posts: 284
Re: Sets Vs Arrays. When to use and why.
« Reply #3 on: March 07, 2025, 11:18:18 pm »
Also, in sets, each element can be specified only once. It is either present or absent. In essence, it is a set of flags that replaces several individual boolean variables. That is, they are interchangeable:
Code: Pascal  [Select][+][-]
  1. type
  2.   TSet = (enum1, enum2, enum3);
  3. var
  4.   FSet: TSet;
  5.  
  6. { or }
  7.  
  8. var
  9.   FEnum1: boolean;
  10.   FEnum2: boolean;
  11.   FEnum3: boolean;
  12.  

By the way, sets usually take up less memory than individual boolean variables. Their flags are stored in individual bits, while each variable requires at least one byte (in fact, much more due to memory alignment).

n7800

  • Sr. Member
  • ****
  • Posts: 284
Re: Sets Vs Arrays. When to use and why.
« Reply #4 on: March 07, 2025, 11:37:14 pm »
3. the first element has an ordinal of zero and the last an ordinal of 255

This can be controlled, for example, like this (see docs):

Code: Pascal  [Select][+][-]
  1. type
  2.   sign = (negative = -1, none = 0, positive = 1);
  3.  

But it is almost never necessary.

EDITED: Link added.
« Last Edit: March 08, 2025, 12:29:53 am by n7800 »

n7800

  • Sr. Member
  • ****
  • Posts: 284
Re: Sets Vs Arrays. When to use and why.
« Reply #5 on: March 07, 2025, 11:37:54 pm »
Actually, there is a good description in the documentation (here and here). It will be useful not only for beginners. For example, I did not know about the possibility of using the operator ">=".

440bx

  • Hero Member
  • *****
  • Posts: 5178
Re: Sets Vs Arrays. When to use and why.
« Reply #6 on: March 08, 2025, 12:01:22 am »
3. the first element has an ordinal of zero and the last an ordinal of 255

This can be controlled, for example, like this:

Code: Pascal  [Select][+][-]
  1. type
  2.   sign = (negative = -1, none = 0, positive = 1);
  3.  

But it is almost never necessary.
That works but, that opens the door to unexpected situations where things don't work anymore.  It opens the door to the same problem caused by 4,000,000..4,000,100.

I strongly recommend keeping the first and last index in the range 0..255.  Anything else may or may not work and, sometimes it will not be obvious why it doesn't.


 
(FPC v3.0.4 and Lazarus 1.8.2) or (FPC v3.2.2 and Lazarus v3.2) on Windows 7 SP1 64bit.

n7800

  • Sr. Member
  • ****
  • Posts: 284
Re: Sets Vs Arrays. When to use and why.
« Reply #7 on: March 08, 2025, 12:36:53 am »
I strongly recommend keeping the first and last index in the range 0..255.  Anything else may or may not work and, sometimes it will not be obvious why it doesn't.

This causes even more problems than it seems at first glance. There is a large discussion on the compiler bug tracker about what to do with range "gaps" when specifying indices: #33603.

440bx

  • Hero Member
  • *****
  • Posts: 5178
Re: Sets Vs Arrays. When to use and why.
« Reply #8 on: March 08, 2025, 02:06:47 am »
I strongly recommend keeping the first and last index in the range 0..255.  Anything else may or may not work and, sometimes it will not be obvious why it doesn't.

This causes even more problems than it seems at first glance. There is a large discussion on the compiler bug tracker about what to do with range "gaps" when specifying indices: #33603.
I'm not sure I follow what you are really saying but, anytime a set has a starting index other than zero, the potential for unexpected problems is present.

Here is a discussion about some of the problems associated with that.
https://forum.lazarus.freepascal.org/index.php/topic,49640.0.html
I suggest you pay particular attention to PascalDragon's comments and the one about the range specifically.
(FPC v3.0.4 and Lazarus 1.8.2) or (FPC v3.2.2 and Lazarus v3.2) on Windows 7 SP1 64bit.

n7800

  • Sr. Member
  • ****
  • Posts: 284
Re: Sets Vs Arrays. When to use and why.
« Reply #9 on: March 08, 2025, 10:06:16 pm »
I'm not sure I follow what you are really saying but, anytime a set has a starting index other than zero, the potential for unexpected problems is present.

Here is a discussion about some of the problems associated with that.
https://forum.lazarus.freepascal.org/index.php/topic,49640.0.html
I suggest you pay particular attention to PascalDragon's comments and the one about the range specifically.

Yes, another issue is discussed here. But it is also useful to read, thank you.

Leledumbo

  • Hero Member
  • *****
  • Posts: 8798
  • Programming + Glam Metal + Tae Kwon Do = Me
Re: Sets Vs Arrays. When to use and why.
« Reply #10 on: March 09, 2025, 07:04:00 am »
I looked on YouTube, the FPC Wiki and Tutorials Point and there's not much on the subject.  Is there a video, website or PDF that compares the two and gives examples on when to use one over the other?
Sets are more specialized than arrays, in a sense that, you can make a set out of an array but not vice versa. Built-in sets are either implemented as a single 32-bit integer (for a maximum of 32 elements) or array of [0 .. 7] of 32-bt integer (for 33-256 elements), which is of course, hidden from the user. Sets operations are implemented as bitwise operations, so it's very fast and efficient. Therefore, whenever your problem requires sets to solve, then using them will give the best performance and efficiency (and safety) compared to DIY approach using arrays.

The documentation has an example of implementing days as sets, that should be simple enough to understand. In a more complex problem, say you're building a game whose characters can be inflicted with various status, sets are the perfect data structure to store them.
  • Character gets poisoned: Include(Chara.Status, csPoisoned);
  • The poisoned status gets cured: Exclude(Chara.Status, csPoisoned);
  • On character turn during poisoned: if csPoisoned in Chara.Status then Dec(Chara.HP, SomeCalculationResultToDetermineHowMuchHPTheyLost);
Feel free to imagine how you would convert those one liners using arrays.
« Last Edit: March 09, 2025, 07:20:31 am by Leledumbo »

Zvoni

  • Hero Member
  • *****
  • Posts: 2961
Re: Sets Vs Arrays. When to use and why.
« Reply #11 on: March 10, 2025, 09:43:15 am »
Another difference is the "Size" of Set vs. Array
Code: Pascal  [Select][+][-]
  1. Type
  2.    TMyEnum = (enOne=1, enTwo, enThree, enFour, enFive, enSix, enSeven, enEight, enNine, enTen);
  3.  
  4. Var
  5.   MySet:Set Of TMyEnum;
  6.   MyArray:Array[1..10] Of Integer; //Always Size = 10
  7.  
  8. Begin
  9.   MySet:=[enTwo, enFive];  //Size = 2
  10. End;

Another is the use of the "In"-operator for Sets (as shown by Leledumbo)
« Last Edit: March 10, 2025, 09:45:57 am by Zvoni »
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad

440bx

  • Hero Member
  • *****
  • Posts: 5178
Re: Sets Vs Arrays. When to use and why.
« Reply #12 on: March 10, 2025, 10:37:53 am »
Yes, there is definitely a difference in size.

Note that the size of the MyArray in the above code is not 10, it is 40 (in 32 bit, 10 elements @ 4 bytes = 40)
the size of the set is 4 bytes not 2.



More "parallel" (set/array) code would be something along the lines of:
Code: Pascal  [Select][+][-]
  1.  
  2. {$SCOPEDENUMS ON}
  3.  
  4. Type
  5.    TMyEnum1 = (enOne=1, enTwo, enThree, enFour, enFive, enSix, enSeven, enEight, enNine, enTen);
  6.  
  7.    TMyEnum0 = (enZero, enOne, enTwo, enThree, enFour, enFive, enSix, enSeven, enEight, enNine);
  8.  
  9. Var
  10.   MySet    : set of TMyEnum1;
  11.  
  12.   { unfortunately, FPC does not allow the array to use TMyEnum1 in spite of   }
  13.   { the fact that there are no gaps in the sequence.  It would be a nice      }
  14.   { improvement for the compiler to allow that.                               }
  15.  
  16.   MyArray1 : array[1..10] of Integer; //Always Size = 10 (elements, not bytes)
  17.  
  18.   MyArray2 :           array[TMyEnum0] of boolean;  { closer  to the set      }
  19.   MyArray3 : bitpacked array[TMyEnum0] of boolean;  { closest to the set      }
  20.  
  21. begin
  22.   { unfortunately FPC does not allow "with TMyEnum1 do MySet := (....         }
  23.  
  24.   MySet:=[TMyEnum1.enTwo, TMyEnum1.enFive];  // Size = 2  (size is 4 bytes, not 2)
  25.   writeln(sizeof(MySet));                    // size is 4 not 2
  26.   writeln(sizeof(MyArray1));                 // size is 40 bytes
  27.  
  28.   writeln(sizeof(MyArray2));                 // size is 10
  29.   writeln(sizeof(MyArray3));                 // size is  2
  30.  
  31.   readln;
  32. end.
  33.  
Note: all sizes are for bitness 32 or greater (IOW, not 8 or 16 bit.)



On a completely different note, it would be nice if FPC allowed array ranges to be specified using gapless enumerations that do not start at zero.

Another nice improvement would be for FPC to allow using "with" to specify enumeration scopes.
(FPC v3.0.4 and Lazarus 1.8.2) or (FPC v3.2.2 and Lazarus v3.2) on Windows 7 SP1 64bit.

Zvoni

  • Hero Member
  • *****
  • Posts: 2961
Re: Sets Vs Arrays. When to use and why.
« Reply #13 on: March 10, 2025, 11:22:11 am »
Yes, there is definitely a difference in size.

Note that the size of the MyArray in the above code is not 10, it is 40 (in 32 bit, 10 elements @ 4 bytes = 40)
the size of the set is 4 bytes not 2.
Agreed, though in my Code "Size" was more for "Count of Elements" than any size in Bits/Bytes

Though not having looked at any implementation details, i'd hazard a guess that a "Set" is basically "operated" with bitwise "and"'s, "or"'s and "not"'s,
So a Set can be represented by a single byte (=8 Bits), which would also explain the limitation to 256 "Elements" for a Set
which would make it quadrillions faster than any Array-Lookup
« Last Edit: March 10, 2025, 11:28:15 am by Zvoni »
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad

440bx

  • Hero Member
  • *****
  • Posts: 5178
Re: Sets Vs Arrays. When to use and why.
« Reply #14 on: March 10, 2025, 11:40:41 am »
Agreed, though in my Code "Size" was more for "Count of Elements" than any size in Bits/Bytes
I figured that's what you meant, I just wanted to ensure there was no room for confusion.

Though not having looked at any implementation details, i'd hazard a guess that a "Set" is basically "operated" with bitwise "and"'s, "or"'s and "not"'s,
So a Set can be represented by a single byte (=8 Bits), which would also explain the limitation to 256 "Elements" for a Set
which would make it quadrillions faster than any Array-Lookup
if the set has 32 elements or less then determining membership will be faster than accessing a boolean array element because the entire set fits in a register and an "and" operation can determine the presence or absence of an element.  How much faster the set will be in this case depends on the access pattern, for completely random pattern, the access speed will be very close to equal, for sequential access, the set will be (should be) in the ballpark of 32 times faster.

It's a different story when the set has more than 32 elements.  In those cases, accessing a boolean array element will be faster than accessing the set's target bit because before accessing the target bit, the target dword must be accessed and "and-ed" with the appropriate mask to test the target bit.  In that case the access pattern usually won't make much of a difference.
(FPC v3.0.4 and Lazarus 1.8.2) or (FPC v3.2.2 and Lazarus v3.2) on Windows 7 SP1 64bit.

 

TinyPortal © 2005-2018