Forum > General

Using generics

<< < (2/3) > >>

jmm72:
I think I got it already, and even without being optimized it's already faster than the implementation I did with TStringList. (And forget the initial implementation "let's-make-it-work" which was like 2 minutes long, now it takes 3 seconds). I think I would have bumped my head against the wall too many times without your help.

However I'm unsure about ordered insertion of a lot of items will be slow. If it behaves like TStringList in that aspect, adding lots of items and then sorting is much more slower than adding items while the list is already sorted. I might do my test to assert this, tho.

The question about being slow is about TFPGMap.IndexOfData. I believe that even when the map is sorted, the data is not, thus it's always a sequential search when using IndexOfData. Some looks at the source code seems to follow this. If I'm right, data iteration for "reverse lookup" would be awfully slow. I don't mean doing an iteration through all the elements, but just searching a set of all the data. Then how could I implement a two-way dictionary using fgl?

howardpc:
Why not share the code you have so far? There are forum members who enjoy suggesting ways of enhancing code (whether to make it faster, or more reliable, or simpler etc.). It might also be instructive for people who are unfamiliar with generics, and who could learn about the usefulness and widespread applicability of an emerging FCL library, for which good examples are still rather scarce.
In answering your question, it would be easier to comment on specific visible code, than to talk in generalities.

Leledumbo:

--- Quote from: jmm72 on January 08, 2015, 01:24:04 pm ---However I'm unsure about ordered insertion of a lot of items will be slow. If it behaves like TStringList in that aspect, adding lots of items and then sorting is much more slower than adding items while the list is already sorted. I might do my test to assert this, tho.

--- End quote ---
Let me help you with that:

--- Code: ---{$mode objfpc}{$H+}

uses
  Classes,SysUtils,DateUtils,fgl;

var
  c1,c2,c3,c4,c5: Integer;

function GetKey(i: Integer): String;

  function CharOf(c: Integer): Char; inline;
  begin
    Result := Chr(c + Ord('A'));
  end;

  procedure UpdateCValues; inline;
  const
    MaxC = 26;
  begin
    Inc(c5);
    if c5 >= MaxC then begin
      c5 := 0;
      Inc(c4);
      if c4 >= MaxC then begin
        c4 := 0;
        Inc(c3);
        if c3 >= MaxC then begin
          c3 := 0;
          Inc(c2);
          if c2 >= MaxC then begin
            c2 := 0;
            Inc(c1);
            if c1 >= MaxC then begin
              c1 := 0;
            end;
          end;
        end;
      end;
    end;
  end;

begin
  Result := CharOf(c1) + CharOf(c2) + CharOf(c3) + CharOf(c4) + CharOf(c5);
  UpdateCValues;
end;

procedure ResetCValues;
begin
  c1 := 0;
  c2 := 0;
  c3 := 0;
  c4 := 0;
  c5 := 0;
end;

const
  MAX = 10000;
type
  TStringIntMap = specialize TFPGMap<String,Integer>;
var
  m1,m2: TStringIntMap;
  st,et: TTime;
  i: Integer;
  Same: Boolean;
begin
  ResetCValues;
  m1 := TStringIntMap.Create;
  m1.Sorted := false;
  st := Now;
  for i := 0 to MAX - 1 do
    m1[GetKey(i)] := 0;
  m1.Sort;
  et := Now;
  WriteLn('Time elapsed: ',SecondSpan(st,et):1:6,' second(s)');

  ResetCValues;
  m2 := TStringIntMap.Create;
  m2.Sorted := true;
  st := Now;
  for i := 0 to MAX - 1 do
    m2[GetKey(i)] := 0;
  et := Now;
  WriteLn('Time elapsed: ',SecondSpan(st,et):1:6,' second(s)');

  Same := true;
  for i := 0 to m1.Count - 1 do
    if m1.Keys[i] <> m2.Keys[i] then begin
      Same := false;
      WriteLn(i,' ',m1.Keys[i] + ' ' + m2.Keys[i]);
      WriteLn('Different output, invalid benchmark');
      break;
    end;
  if Same then WriteLn('Same output, valid benchmark');
end.

--- End code ---

--- Quote from: jmm72 on January 08, 2015, 01:24:04 pm ---The question about being slow is about TFPGMap.IndexOfData. I believe that even when the map is sorted, the data is not, thus it's always a sequential search when using IndexOfData. Some looks at the source code seems to follow this. If I'm right, data iteration for "reverse lookup" would be awfully slow. I don't mean doing an iteration through all the elements, but just searching a set of all the data. Then how could I implement a two-way dictionary using fgl?

--- End quote ---
Create a descendant with nested type specialized in the reverse order from outer map. e.g.:

--- Code: ---generic TTwoWayMap<TKey,TData> = class(specialize TFPGMap<TKey,TData>)
private type
  TMirrorMap = specialize TFPGMap<TData,TKey>;
private var
  FMirrorMap: TMirrorMap;
...
end;

--- End code ---
then override all required methods (including constructor and destructor) to make it, perhaps PutKeyData and IndexOfData are enough. Well, it's up to your implementation.

marcov:
(Note that in 2.7/3.x versions it might also be worthwhile to have a look at gmap,ghash etc in packages/fcl-stl )

jmm72:
I did a quick benchmark of making ordered and unordered insertion in a TFPGMap, both sorted and unsorted, and while ordered insertion is reasonably slow (IIRC a sorted list is the worst scenario for quicksort), both ordered and unordered insertions are so much faster when sorted.

@ Leledumbo: That TwoWayMap is such a great approach of what I'd need. I don't think I could write a custom two way map where both keys and data are hashed or sorted in some way, but that usage of nested maps and overriding the needed methods is genius. I think I can live with the extra memory requirement, because once the implementation is done, using it will be so easy.

I didn't want to put any code because the way I use it is not exactly <string, integer>, and the code is actually spread in several functions. What Leledumbo said is enough for anyone to start using a TFPGMap, so putting my code as an example is not needed IMHO (and it's a PITA to translate all the identifiers). However I think I could write down a quick example and post it, so if anyone searching for "how to use fgl TFPGMap" hits this thread can have real code to read.

And I don't think I can optimize it anymore without making it unreadable/uncomprehensible. I got it down to 3 seconds and I think I got the TFPGMap thing right and straight. There is another place where I can change a TStringListEx with this TFPGMap thingy, so when my code advances to make heavy use of that part, it'll be faster than it could have been without the map.

If anyone is curious, it's a custom file renamer to maintain a large collection of files, using certain words in the filename to categorize and rename accordingly in a certain way. For now it's 26,000 files, so making this as a pet/personal project was not a bad idea. Last time I counted there were like 13,000 of those keywords , and probably have some thousands more to find and categorize. The renamer also detects uncategorized keywords to help in maintaining the collection with the best possible names using that information.

Thanks everyone for all the help.

edit: Here you have some test code using the most common features of TFPGMap.


--- Code: ---program testmap;

uses
  fgl;

type
  TDictionary = specialize TFPGMap<string, integer>;

var
  i: integer;
  dict: TDictionary;

begin
  dict := TDictionary.Create;
  dict.Sorted := True; // for fast lookup of keys
  // assign initial values
  dict['one'] := 1;
  dict['two'] := 2;
  dict['three'] := 3;

  // get data back
  writeln('The value of ''one'' is ', dict['one']);

  // if it's not known if the key is in there
  // use Find for sorted maps, but use IndexOf otherwise
  if dict.Find('four', i) then
  begin
    writeln('The value of ''four'' is ', dict['four']);
    writeln('The value of ''four'' is ', dict.Data[i]);
  end
  else
  begin
    writeln('Can''t find key four, adding it');
    dict['four'] := 4;
  end;

  // change one of the numbers
  dict['one'] := 5;
  writeln('The value of ''one'' is now ', dict['one']);

  // iterate with the keys and/or data
  writeln('All the keys and their values from the current map');
  for i := 0 to dict.Count - 1 do
    writeln(dict.Keys[i], '=', dict.Data[i]);

  // delete one key (with its value) from the map
  writeln('Removing ''two''');
  dict.Remove('two');
  writeln('All the keys and their values from the current map');
  for i := 0 to dict.Count - 1 do
    writeln(dict.Keys[i], '=', dict.Data[i]);

  // find which key has a certain data (always sequential search, slow)
  i := dict.IndexOfData(3);
  writeln('Value 3 was found to have the keyname ', dict.Keys[i]);
  dict.Free;
end.

--- End code ---

Navigation

[0] Message Index

[#] Next page

[*] Previous page

Go to full version