### Bookstore

 Computer Math and Games in Pascal (preview) Lazarus Handbook

### Author Topic: Sets of LongInt  (Read 10177 times)

#### Spribo

• New Member
• Posts: 41
##### Sets of LongInt
« on: July 17, 2010, 02:12:18 pm »
I recently made a program that has to store uniquely some numbers greater than 255 (LongInt). I couldn't use sets of LongInt so I used arrays, but elements were repeated and, when I had to check if a given number was in that array, it took ages to execute! The greater index of this array should be 6546, but instead was 35578! How could I use sets of LongInt or solve this problem in general? Any help will be greatly appreciated!

#### Martin_fr

• Hero Member
• Posts: 9362
• Debugger - SynEdit - and more
##### Re: Sets of LongInt
« Reply #1 on: July 17, 2010, 02:40:06 pm »
That depends on the usage pattern of the data.

If you do a lot of lookups, but very, very few inserts (or updates/changes), then you can have a sorted array/list => and use binary search for the lookup/search.
- easy to implement
- fast lookup
- slow insert/update

Otherwise you can look at structures like hashes.

A hash can perform very quickly. But you must handle clashes correctly.
Unless you work on a fixed set of numbers, and you can guarantee a perfect hashing http://en.wikipedia.org/wiki/Hash_function#Perfect_hashing

There may/should be some implementations already, but I don't know where...

#### bflm

• Jr. Member
• Posts: 54
##### Re: Sets of LongInt
« Reply #2 on: July 17, 2010, 05:12:08 pm »
I recently made a program that has to store uniquely some numbers greater than 255 (LongInt). I couldn't use sets of LongInt so I used arrays, but elements were repeated and, when I had to check if a given number was in that array, it took ages to execute! The greater index of this array should be 6546, but instead was 35578! How could I use sets of LongInt or solve this problem in general? Any help will be greatly appreciated!

A possible solution (not tested):
Code: [Select]
`uses  fgl;type  { TMySet }  TMySet = class(specialize TFPGMap<Integer, Boolean>)  public    function IsMember(const N: Integer): Boolean;    procedure Include(const N: Integer);    procedure Exclude(const N: Integer);  end;implementation  { TMySet }  function TMySet.IsMember(const N: Integer): Boolean;  begin    Result := IndexOf(N) >= 0;  end;  procedure TMySet.Include(const N: Integer);  begin    KeyData[N] := True;  end;  procedure TMySet.Exclude(const N: Integer);  begin    Remove(N);  end;`
Might require FPC >=2.5.1

#### marcov

• Hero Member
• Posts: 11080
• FPC developer.
##### Re: Sets of LongInt
« Reply #3 on: July 17, 2010, 05:52:01 pm »
Have a look at Classes.TBits.

#### Spribo

• New Member
• Posts: 41
##### Re: Sets of LongInt
« Reply #4 on: July 17, 2010, 08:40:21 pm »
Thank you very much for your help! I have a question: what does "specialize TFPGMap<Integer, Boolean>" mean?

#### Martin_fr

• Hero Member
• Posts: 9362
• Debugger - SynEdit - and more
##### Re: Sets of LongInt
« Reply #5 on: July 17, 2010, 10:07:09 pm »
@Spribo:

using TBits indeed is as similar to a set as you can get (and also the fastest implementation (equal to perfect hash) O=1),
but if Spribo uses some very big longint, then it comes at the cost of high memory usage.

@befelemepeseveze:
Unless I missed a critical point: TFPGMap seems to be based on a list implementation with key/value pairs. Therefore using it, is as good/bad as using a normal list.
Operations on the list can be potentially slow.

#### Spribo

• New Member
• Posts: 41
##### Re: Sets of LongInt
« Reply #6 on: July 18, 2010, 09:07:32 am »
I will certainly use TBits... As the biggest LongInt I will store will be 50000000.

#### Spribo

• New Member
• Posts: 41
##### Re: Sets of LongInt
« Reply #7 on: July 18, 2010, 12:57:27 pm »
When I try to create a TBit with size 1000000 (myarray:=TBit.Create(1000000)), it raises error (EBitsError), and says that the index of bit has exceeded the size limit. Why?

#### bflm

• Jr. Member
• Posts: 54
##### Re: Sets of LongInt
« Reply #8 on: July 18, 2010, 12:59:37 pm »
@befelemepeseveze:
Unless I missed a critical point: TFPGMap seems to be based on a list implementation with key/value pairs. Therefore using it, is as good/bad as using a normal list.
Operations on the list can be potentially slow.

Sure. I thought the problem is the large sparsity of the set which the map solves nicely and in a memory efficient way. TBits is definitely the fast path but wastes a lot of memory with sets like {1, 1000000, <<other thousand members>>, 50000000} (cf. one of the other Spiro's post above) where the mem usage diff is pretty big (~6MB vs few tens of kB per set instance).

#### Spribo

• New Member
• Posts: 41
##### Re: Sets of LongInt
« Reply #9 on: July 18, 2010, 01:06:30 pm »
But could you tell me why it raises that error? And where could I find documentation on unit fgl?

#### Leledumbo

• Hero Member
• Posts: 8718
• Programming + Glam Metal + Tae Kwon Do = Me
##### Re: Sets of LongInt
« Reply #10 on: July 19, 2010, 09:04:56 am »
Quote
it raises error (EBitsError), and says that the index of bit has exceeded the size limit. Why?
The constant is here, and and it's ~ 2 GB. I have no idea why it didn't pass 1M in your system, at least it works here up to ~ 2 GB.
Quote
where could I find documentation on unit fgl?
There are none atm, except the source itself.

#### marcov

• Hero Member
• Posts: 11080
• FPC developer.
##### Re: Sets of LongInt
« Reply #11 on: July 19, 2010, 01:48:35 pm »
Quote
it raises error (EBitsError), and says that the index of bit has exceeded the size limit. Why?
The constant is here, and and it's ~ 2 GB. I have no idea why it didn't pass 1M in your system, at least it works here up to ~ 2 GB.

Old compiler probably, I can remember the value was upped not too long ago. (2.2.4+?, 2.4.0+ ?)

Quote
Quote
where could I find documentation on unit fgl?
There are none atm, except the source itself.

Generics are officially still "experimental"