Recent

Author Topic: Sets of LongInt  (Read 10898 times)

Spribo

  • New Member
  • *
  • Posts: 41
Sets of LongInt
« on: July 17, 2010, 02:12:18 pm »
I recently made a program that has to store uniquely some numbers greater than 255 (LongInt). I couldn't use sets of LongInt so I used arrays, but elements were repeated and, when I had to check if a given number was in that array, it took ages to execute! The greater index of this array should be 6546, but instead was 35578! How could I use sets of LongInt or solve this problem in general? Any help will be greatly appreciated!

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 10553
  • Debugger - SynEdit - and more
    • wiki
Re: Sets of LongInt
« Reply #1 on: July 17, 2010, 02:40:06 pm »
That depends on the usage pattern of the data.

If you do a lot of lookups, but very, very few inserts (or updates/changes), then you can have a sorted array/list => and use binary search for the lookup/search.
- easy to implement
- fast lookup
- slow insert/update

Otherwise you can look at structures like hashes.

A hash can perform very quickly. But you must handle clashes correctly.
Unless you work on a fixed set of numbers, and you can guarantee a perfect hashing http://en.wikipedia.org/wiki/Hash_function#Perfect_hashing

There may/should be some implementations already, but I don't know where...



bflm

  • Jr. Member
  • **
  • Posts: 54
    • Free Pascal Random Bits
Re: Sets of LongInt
« Reply #2 on: July 17, 2010, 05:12:08 pm »
I recently made a program that has to store uniquely some numbers greater than 255 (LongInt). I couldn't use sets of LongInt so I used arrays, but elements were repeated and, when I had to check if a given number was in that array, it took ages to execute! The greater index of this array should be 6546, but instead was 35578! How could I use sets of LongInt or solve this problem in general? Any help will be greatly appreciated!

A possible solution (not tested):
Code: [Select]
uses
  fgl;

type

  { TMySet }

  TMySet = class(specialize TFPGMap<Integer, Boolean>)
  public
    function IsMember(const N: Integer): Boolean;
    procedure Include(const N: Integer);
    procedure Exclude(const N: Integer);
  end;

implementation

  { TMySet }

  function TMySet.IsMember(const N: Integer): Boolean;
  begin
    Result := IndexOf(N) >= 0;
  end;

  procedure TMySet.Include(const N: Integer);
  begin
    KeyData[N] := True;
  end;

  procedure TMySet.Exclude(const N: Integer);
  begin
    Remove(N);
  end;

Might require FPC >=2.5.1

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11940
  • FPC developer.
Re: Sets of LongInt
« Reply #3 on: July 17, 2010, 05:52:01 pm »
Have a look at Classes.TBits.

Spribo

  • New Member
  • *
  • Posts: 41
Re: Sets of LongInt
« Reply #4 on: July 17, 2010, 08:40:21 pm »
Thank you very much for your help! I have a question: what does "specialize TFPGMap<Integer, Boolean>" mean?

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 10553
  • Debugger - SynEdit - and more
    • wiki
Re: Sets of LongInt
« Reply #5 on: July 17, 2010, 10:07:09 pm »
@Spribo:

using TBits indeed is as similar to a set as you can get (and also the fastest implementation (equal to perfect hash) O=1),
but if Spribo uses some very big longint, then it comes at the cost of high memory usage.

@befelemepeseveze:
Unless I missed a critical point: TFPGMap seems to be based on a list implementation with key/value pairs. Therefore using it, is as good/bad as using a normal list.
Operations on the list can be potentially slow.

Spribo

  • New Member
  • *
  • Posts: 41
Re: Sets of LongInt
« Reply #6 on: July 18, 2010, 09:07:32 am »
I will certainly use TBits... As the biggest LongInt I will store will be 50000000.

Spribo

  • New Member
  • *
  • Posts: 41
Re: Sets of LongInt
« Reply #7 on: July 18, 2010, 12:57:27 pm »
When I try to create a TBit with size 1000000 (myarray:=TBit.Create(1000000)), it raises error (EBitsError), and says that the index of bit has exceeded the size limit. Why?

bflm

  • Jr. Member
  • **
  • Posts: 54
    • Free Pascal Random Bits
Re: Sets of LongInt
« Reply #8 on: July 18, 2010, 12:59:37 pm »
@befelemepeseveze:
Unless I missed a critical point: TFPGMap seems to be based on a list implementation with key/value pairs. Therefore using it, is as good/bad as using a normal list.
Operations on the list can be potentially slow.

Sure. I thought the problem is the large sparsity of the set which the map solves nicely and in a memory efficient way. TBits is definitely the fast path but wastes a lot of memory with sets like {1, 1000000, <<other thousand members>>, 50000000} (cf. one of the other Spiro's post above) where the mem usage diff is pretty big (~6MB vs few tens of kB per set instance).

Spribo

  • New Member
  • *
  • Posts: 41
Re: Sets of LongInt
« Reply #9 on: July 18, 2010, 01:06:30 pm »
But could you tell me why it raises that error? And where could I find documentation on unit fgl?

Leledumbo

  • Hero Member
  • *****
  • Posts: 8776
  • Programming + Glam Metal + Tae Kwon Do = Me
Re: Sets of LongInt
« Reply #10 on: July 19, 2010, 09:04:56 am »
Quote
it raises error (EBitsError), and says that the index of bit has exceeded the size limit. Why?
The constant is here, and and it's ~ 2 GB. I have no idea why it didn't pass 1M in your system, at least it works here up to ~ 2 GB.
Quote
where could I find documentation on unit fgl?
There are none atm, except the source itself.

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11940
  • FPC developer.
Re: Sets of LongInt
« Reply #11 on: July 19, 2010, 01:48:35 pm »
Quote
it raises error (EBitsError), and says that the index of bit has exceeded the size limit. Why?
The constant is here, and and it's ~ 2 GB. I have no idea why it didn't pass 1M in your system, at least it works here up to ~ 2 GB.

Old compiler probably, I can remember the value was upped not too long ago. (2.2.4+?, 2.4.0+ ?)

Quote
Quote
where could I find documentation on unit fgl?
There are none atm, except the source itself.

Generics are officially still "experimental"

 

TinyPortal © 2005-2018