### Bookstore

 Computer Math and Games in Pascal (preview) Lazarus Handbook

### Author Topic: Large or huge sets  (Read 8144 times)

• Hero Member
• Posts: 12621
##### Large or huge sets
« on: September 10, 2019, 10:15:11 am »
Large or huge sets are often requested. Just today I needed one. Wrote it in ten minutes, but maybe useful?
If the critics are not too harsh, I will add it to to the wiki.
Note it can be written *way* more efficient.
Code: Pascal  [Select][+][-]
1. {\$mode objfpc}{\$inline on}
2. { A simple way to implement large or huge sets.
4.  re-licensable: free for all to use.
6.  You can not claim license nor copyrigths }
7. unit largesets;
8. { Operator      Action
9.   ---------------------------------------------
10.   +             Union
11.   -             Difference
12.   *             Intersection
13.   ><        Symmetric difference
14.   <=        Contains
15.   include   Include an element in the set
16.   exclude   Exclude an element from the set
17.   in        Check wether an element is in a set
18. }
19. interface
20. uses
21.   generics.collections;
22.
23. type
24.   TWordSet  = specialize TSortedHashSet<word >;
25.   { The next two can cause some real memory pressure, be careful }
26.   TDWordSet = specialize TSortedHashSet<dword>;
27.   TQWordSet = specialize TSortedHashSet<Qword>;
28.
29.   { union }
30.   operator + (const a,b:TWordset ):TWordSet;
31.   operator + (const a,b:TDWordset):TDWordSet;
32.   operator + (const a,b:TQWordset):TQWordSet;
33.
34.   { difference }
35.   operator - (const a,b:Twordset ):TWordSet;
36.   operator - (const a,b:TDwordset):TDWordSet;
37.   operator - (const a,b:TQwordset):TQWordSet;
38.
39.   { intersection }
40.   operator * (const a,b:TwordSet ):TWordSet;
41.   operator * (const a,b:TDwordSet):TDWordSet;
42.   operator * (const a,b:TQwordSet):TQWordSet;
43.
44.   { symmetric difference }
45.   operator >< (const a,b:TWordSet ):TWordSet;
46.   operator >< (const a,b:TDWordSet):TDWordSet;
47.   operator >< (const a,b:TQWordSet):TQWordSet;
48.
49.   { contains }
50.   operator <= (const a,b:TWordSet ):Boolean;
51.   operator <= (const a,b:TDWordSet):Boolean;
52.   operator <= (const a,b:TQWordSet):Boolean;
53.
54.   { in }
55.   operator in (const a:word; const b:TWordset ):Boolean;
56.   operator in (const a:dword;const b:TDWordset):Boolean;
57.   operator in (const a:qword;const b:TQWordset):Boolean;
58.
59.   { include }
60.   procedure include(const a:TWordSet; const b:Word);
61.   procedure include(const a:TDWordSet;const b:DWord);
62.   procedure include(const a:TQWordSet;const b:QWord);
63.
64.   { exclude }
65.   procedure exclude(const a:TWordSet;const  b:Word);
66.   procedure exclude(const a:TDWordSet;const b:DWord);
67.   procedure exclude(const a:TQWordSet;const b:QWord);
68.
69. implementation
70.
71.   { union }
72.   operator + (const a,b:TWordset):TWordSet;
73.   begin
74.     b.UnionWith(a);
75.     Result := b;
76.   end;
77.
78.   operator + (const a,b:TDWordset):TDWordSet;
79.   begin
80.     b.UnionWith(a);
81.     Result := b;
82.   end;
83.
84.   operator + (const a,b:TQWordset):TQWordSet;
85.   begin
86.     b.UnionWith(a);
87.     Result := b;
88.   end;
89.
90.   { difference }
91.   operator -(const a,b:Twordset):TWordSet;
92.   begin
93.     b.ExceptWith(a);
94.     result:=b;
95.   end;
96.
97.   operator -(const a,b:TDwordset):TDWordSet;
98.   begin
99.     b.ExceptWith(a);
100.     result:=b;
101.   end;
102.
103.   operator -(const a,b:TQwordset):TQWordSet;
104.   begin
105.     b.ExceptWith(a);
106.     result:=b;
107.   end;
108.
109.   { intersection }
110.   operator *(const a,b:TwordSet):TWordSet;
111.   begin
112.     b.IntersectWith(a);
113.     Result := b;
114.   end;
115.
116.   operator *(const a,b:TDwordSet):TDWordSet;
117.   begin
118.     b.IntersectWith(a);
119.     Result := b;
120.   end;
121.
122.   operator *(const a,b:TQwordSet):TQWordSet;
123.   begin
124.     b.IntersectWith(a);
125.     Result := b;
126.   end;
127.
128.   { symmetric difference }
129.   operator ><(const a,b:TWordSet):TWordSet;
130.   begin
131.     b.SymmetricExceptWith(a);
132.     Result := b;
133.   end;
134.
135.   operator ><(const a,b:TDWordSet):TDWordSet;
136.   begin
137.     b.SymmetricExceptWith(a);
138.     Result := b;
139.   end;
140.
141.   operator ><(const a,b:TQWordSet):TQWordSet;
142.   begin
143.     b.SymmetricExceptWith(a);
144.     Result := b;
145.   end;
146.
147.   { contains }
148.   operator <=(const a,b:TWordSet):Boolean;
149.   var
150.     c:word;
151.   begin
152.     Result := true;
153.     for c in a do
154.     if not b.contains(c) then
155.     begin
156.       Result := False;
157.       break;
158.     end;
159.   end;
160.
161.   operator <=(const a,b:TDWordSet):Boolean;
162.   var
163.     c:word;
164.   begin
165.     Result := true;
166.     for c in a do
167.     if not b.contains(c) then
168.     begin
169.       Result := False;
170.       break;
171.     end;
172.   end;
173.
174.   operator <=(const a,b:TQWordSet):Boolean;
175.   var
176.     c:word;
177.   begin
178.     Result := true;
179.     for c in a do
180.     if not b.contains(c) then
181.     begin
182.       Result := False;
183.       break;
184.     end;
185.   end;
186.
187.   { in }
188.   operator in (const a:word;const b:TWordset):Boolean;
189.   begin
190.     Result := b.Contains(a);
191.   end;
192.
193.   operator in (const a:dword;const b:TDWordset):Boolean;
194.   begin
195.     Result := b.Contains(a);
196.   end;
197.
198.   operator in (const a:qword;const b:TQWordset):Boolean;
199.   begin
200.     Result := b.Contains(a);
201.   end;
202.
203.   { include }
204.   procedure include(const a:TWordSet;const b:Word);
205.   begin
207.   end;
208.
209.   procedure include(const a:TDWordSet;const b:DWord);
210.   begin
212.   end;
213.
214.   procedure include(const a:TQWordSet;const b:QWord);
215.   begin
217.   end;
218.
219.   { exclude }
220.   procedure exclude(const a:TWordSet;const b:Word);
221.   begin
222.     a.Remove(b);
223.   end;
224.
225.   procedure exclude(const a:TDWordSet;const b:DWord);
226.   begin
227.     a.Remove(b);
228.   end;
229.
230.   procedure exclude(const a:TQWordSet;const b:QWord);
231.   begin
232.     a.Remove(b);
233.   end;
234.
235. end.
The only thing I can say about Putin - born st Petersburg- is that he is indeed Russian, as opposed to Stalin, who was Georgian. Depending of historical time frame they could both be Lithuanian or Polish...even German. Shut him up!

#### marcov

• Hero Member
• Posts: 10492
• FPC developer.
##### Re: Large or huge sets
« Reply #1 on: September 10, 2019, 11:05:51 am »
The problem is that useset is the same as for TBits.

The problem with the set 256 limit is that you have a enum (as in, with identifier names) that suddenly grows beyond 256, and you have to redo your program.  Specially when the enum set tests are used in business code that hurts.

So try if your solution also works for a large enum.

And a demo to test (how would code look?) would be nice. E.g. will my IN operator tests remain the same?
« Last Edit: September 10, 2019, 11:11:20 am by marcov »

#### Ñuño_Martínez

• Hero Member
• Posts: 1179
##### Re: Large or huge sets
« Reply #2 on: September 10, 2019, 11:09:43 am »
Not sure but I think that some operators don't work as expected.  I mean, operator + shouldn't be like this?
Code: Pascal  [Select][+][-]
1.
2. OPERATOR + (CONST a, b:TWordset): TWordSet;
3. BEGIN
4.   RESULT := TWordSet.Create;
5.   RESULT.Copy (b);
6.   RESULT.UnionWith (a)
7. END;
8.

Note: Never used generics.collections and I didn't look for information about it, so my code would be so wrong...

Also: I'm not a big fan of operators overloading using classes for obvious reasons (i.e: I expect that the result of a + b is a new object, not to modify one of the operands).
« Last Edit: September 10, 2019, 11:11:58 am by Ñuño_Martínez »
Are you interested in game programming? Join the Pascal Game Development community!
Also visit the Game Development Portal

• Hero Member
• Posts: 12621
##### Re: Large or huge sets
« Reply #3 on: September 10, 2019, 11:14:08 am »
So try if your solution also works for a large enum.
That works.  Becase it avoids the size of enums.  work in progress.
The only thing I can say about Putin - born st Petersburg- is that he is indeed Russian, as opposed to Stalin, who was Georgian. Depending of historical time frame they could both be Lithuanian or Polish...even German. Shut him up!

• Hero Member
• Posts: 12621
##### Re: Large or huge sets
« Reply #4 on: September 10, 2019, 11:18:30 am »
Not sure but I think that some operators don't work as expected.  I mean, operator + shouldn't be like this?
Also: I'm not a big fan of operators overloading using classes for obvious reasons (i.e: I expect that the result of a + b is a new object, not to modify one of the operands).
You would be amazed. This code is not fastest, but satisfies most tests.
You are wrong about the new object: that would cause leaks, and I tested against leaks.

Anyway, it is rough and probably needs refinement.

Thanks for the response, both.
The only thing I can say about Putin - born st Petersburg- is that he is indeed Russian, as opposed to Stalin, who was Georgian. Depending of historical time frame they could both be Lithuanian or Polish...even German. Shut him up!

#### Kays

• Sr. Member
• Posts: 479
• Whasup!?
##### Re: Large or huge sets
« Reply #5 on: September 10, 2019, 12:06:08 pm »
Large or huge sets are often requested. […]
Meh, your code doesn’t support set of real.
Yours Sincerely
Kai Burghardt

• Hero Member
• Posts: 12621
##### Re: Large or huge sets
« Reply #6 on: September 10, 2019, 12:20:14 pm »
Large or huge sets are often requested. […]
Meh, your code doesn’t support set of real.
No, because that is not mathematically a viable option for real sets.
The TSortedHashset supports that to some extend...But not like real sets, like in a Pascal set. My code does.
You need to have a certain amount of bit patterns, just like normal sets.
(I know it requires a small, just a small, bit of computer science...)
Furthermore, I never pretend to have infinite scale, just larger scale if you run out of 256....If you do not understand that plz refrain from taking part of the discussion,
I just got two very informed answers that really help.
« Last Edit: September 10, 2019, 12:28:02 pm by Thaddy »
The only thing I can say about Putin - born st Petersburg- is that he is indeed Russian, as opposed to Stalin, who was Georgian. Depending of historical time frame they could both be Lithuanian or Polish...even German. Shut him up!

#### 440bx

• Hero Member
• Posts: 3268
##### Re: Large or huge sets
« Reply #7 on: September 10, 2019, 02:50:24 pm »
Large or huge sets are often requested. […]
Meh, your code doesn’t support set of real.
I presume that comment was tongue in cheek as only countable sets can be represented digitally.  The set of reals is not countable.

FPC v3.0.4 and Lazarus 1.8.2 on Windows 7 SP1 64bit.

#### jamie

• Hero Member
• Posts: 5069
##### Re: Large or huge sets
« Reply #8 on: September 10, 2019, 04:01:17 pm »
There is one thing I can say about @Thaddy, good of bad, he knows how to write bloat!

The only true wisdom is knowing you know nothing

#### marcov

• Hero Member
• Posts: 10492
• FPC developer.
##### Re: Large or huge sets
« Reply #9 on: September 10, 2019, 04:09:39 pm »
Oops. I only read quickly initially, and it seems not to be a value type, meaning you must change business code (and streaming!). Not really an option for me. I wonder if changing the class to a record<>,  and then hacking till it works might do it. However the problem is dimensioning the static data field to the size of the enum.

Hmm. maybe abusing an ansistring/dynarray. Automated but many aspects like streaming and " IN " operator cost magnitude, remains the same
« Last Edit: September 10, 2019, 04:42:30 pm by marcov »

• Hero Member
• Posts: 12621
##### Re: Large or huge sets
« Reply #10 on: September 10, 2019, 04:31:28 pm »
There is one thing I can say about @Thaddy, good of bad, he knows how to write bloat!

Do it better. You can't. In 10 minutes. Clock is ticking....
( I am  one of the few that actually produce working code all the time..with compilable code....)
« Last Edit: September 10, 2019, 05:24:30 pm by Thaddy »
The only thing I can say about Putin - born st Petersburg- is that he is indeed Russian, as opposed to Stalin, who was Georgian. Depending of historical time frame they could both be Lithuanian or Polish...even German. Shut him up!

#### howardpc

• Hero Member
• Posts: 4104
##### Re: Large or huge sets
« Reply #11 on: September 10, 2019, 06:28:30 pm »
Thanks, Thaddy for sharing useful functionality.
My preference for "contains" (rather than via the slightly cryptic operator <=) is for a named function.

I think it also warrants a count function, which is what I miss most in the basic Pascal set implementation.

I would implement these as follows:
Code: Pascal  [Select][+][-]
1.   function SetAContainsSetB(const a, b: TWordSet): Boolean;
2.   var
3.     c: Word;
4.   begin
5.     for c in a do
6.       if not b.Contains(c) then
7.         Exit(False);
8.     Result := True;
9.   end;
10.
11.   function GetCount(const aSet: TWordSet): SizeInt;
12.   begin
13.     Exit(aSet.Count);
14.   end;

And corresponding  code for DWord and QWord if you need such enormous data structures.
If you do need such things, you are probably better off with a database anyway. But all things have their niche uses and appeal.
« Last Edit: September 10, 2019, 06:37:42 pm by howardpc »

• Hero Member
• Posts: 12621
##### Re: Large or huge sets
« Reply #12 on: September 10, 2019, 07:12:37 pm »
The slightly cryptic <= is a part of long standing Pascal set syntax, Howard  ....
But I will try to improve.... (I know I could do it with bitmasks, but that is not the point here)
The only thing I can say about Putin - born st Petersburg- is that he is indeed Russian, as opposed to Stalin, who was Georgian. Depending of historical time frame they could both be Lithuanian or Polish...even German. Shut him up!

#### howardpc

• Hero Member
• Posts: 4104
##### Re: Large or huge sets
« Reply #13 on: September 10, 2019, 07:34:38 pm »
Well I am aware of that syntax. I was merely expressing my preference for a more verbose alternative.
To those of a mathematical bent <= is no doubt transparent and self-explanatory when used as a set operator.
Lesser mortals like myself prefer wordy self-declaring function names, a luxury which is also Pascalish.
FPC provides also for those who prefer more C-like syntax, so we can have the best of both worlds.

#### Kays

• Sr. Member
• Posts: 479
• Whasup!?
##### Re: Large or huge sets
« Reply #14 on: September 10, 2019, 08:47:34 pm »
[…] Meh, your code doesn’t support set of real.
No, because that is not mathematically a viable option for real sets. […]
[…] I presume that comment was tongue in cheek as only countable sets can be represented digitally.  The set of reals is not countable.
Yes, it was tongue in cheek. Sorry, if that didn’t get through with the smiley. Of course I know there’s no bijective mapping ℝ → ℕ, so there’s no way to store 2 in a discrete manner, and bla, bla, bla. (Obviously I have *only* good memories about the math lectures. )
Yours Sincerely
Kai Burghardt