Author Topic: Functional Iterator Library (Container independent)  (Read 793 times)

Warfley

Functional Iterator Library (Container independent)
« on: September 28, 2024, 06:36:09 pm »
Hello everyone,

from time to time I experiment a bit with what is possible using modern language features. One of those experiments was an iterator library, which I now consider to be in a publishable state.

It's a library for container-independent iterators. It lets you iterate over data from different sources (e.g. arrays, lists, streams, etc.) and modify the data on the fly (map, filter, fold, take, etc.). This makes it possible to reduce complex functionality involving multiple nested loops to a short series of iterator operations.

As a practical example of why this is useful, consider the following problem: you want to do a frequency analysis of a text file, i.e. count the occurrences of each character and print out the most used characters. You can do this the classical way:
Code: Pascal
  m := TDict.Create;
  try
    fs := TFileStream.Create(filename, fmOpenRead);
    try
      // Read the file character by character
      while True do
      begin
        if fs.Read(c, SizeOf(c)) < SizeOf(c) then
          Break;
        CountMap(m, c); // Will add count to m
      end;
      // Sort by occurrence
      arr := m.ToArray;
      Sort(arr, Greater); // Sort by value, most used first
      // Print out the top 5
      for i := 0 to 4 do
        WriteLn('  ''', arr[i].Key, ''': ', arr[i].Value);
    finally
      fs.Free;
    end;
  finally
    m.Free;
  end;
This works, but it is very bulky. Using simple iterators instead, it can be reduced to:
Code: Pascal
  m := TDict.Create;
  try
    for p in Take<TDictPair>(5, // Show top 5 used chars
             Sorted<TDictPair>(Greater, // Sort by most used
             Iterate<Char, Integer>( // Iterate through resulting dict
             FoldR<TDict, Char>(CountMap, m, // Count characters in a dict
             Iterate<Char>(TFileStream.Create(filename, fmOpenRead) // Iterate over file contents
    ))))) do
      WriteLn('  ''', p.Key, ''': ', p.Value);
  finally
    m.Free;
  end;
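
Both snippets rely on TDict, TDictPair, CountMap and Greater, none of which are shown in this post. As a rough sketch only, here is one way they might look when built on Generics.Collections. The type definitions and both helper signatures are assumptions made for illustration; the library uses its own tuple-based pair type, so the real definitions may well differ. The program counts the characters of a fixed string instead of a file, and uses a plain loop rather than the library's Sorted/Take to find the most frequent one.

```pascal
program FreqSketch;
{$mode objfpc}{$H+}

uses
  Generics.Collections;

type
  // Assumed definitions; the original post does not show them.
  TDict = specialize TDictionary<Char, Integer>;
  TDictPair = specialize TPair<Char, Integer>;

// Hypothetical helper: increments the counter stored for c and returns
// the dictionary, so it can be used as a statement or as a fold function.
function CountMap(m: TDict; c: Char): TDict;
var
  n: Integer;
begin
  if not m.TryGetValue(c, n) then
    n := 0;
  m.AddOrSetValue(c, n + 1);
  Result := m;
end;

// Hypothetical comparison: True if a was counted more often than b.
function Greater(const a, b: TDictPair): Boolean;
begin
  Result := a.Value > b.Value;
end;

var
  m: TDict;
  p, best: TDictPair;
  s: String;
  c: Char;
  first: Boolean;
begin
  s := 'hello world';
  m := TDict.Create;
  try
    // Count every character of the sample string
    for c in s do
      CountMap(m, c);
    // Find the most frequent character with a plain loop
    first := True;
    for p in m do
      if first or Greater(p, best) then
      begin
        best := p;
        first := False;
      end;
    WriteLn('''', best.Key, ''': ', best.Value);
  finally
    m.Free;
  end;
end.
```

Returning the dictionary from CountMap is a guess that lets the same function serve both the statement call in the classical version and the fold in the iterator version; the signatures FoldR and Sorted actually expect are documented in the repository.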

You can find the code and package in the GitHub Repository, together with additional information about the usage. Examples for all provided functions can be found in the Examples.

Warning: This library uses a lot of generics and other modern compiler features, which FPC handles... well, let's say roughly. This means it is very unstable, not because the code itself is (though of course I could only test so much by myself), but because FPC may at any point decide to throw internal exceptions, internal errors or other errors. I tested it locally with 3.2.2, which works sometimes, and with trunk, which seems to be a bit more stable. It may also break Lazarus CodeTools, as they have trouble parsing generics correctly.

Warfley

Re: Functional Iterator Library (Container independent)
« Reply #1 on: September 29, 2024, 11:09:15 am »
Found a way to work around the CodeTools Bug (by using Mode Delphi for defining a helper type). Now it is also usable in Lazarus, and also fpc 3.2 seems to be a bit happier and breaks less often :)

Thaddy

Re: Functional Iterator Library (Container independent)
« Reply #2 on: September 29, 2024, 01:36:26 pm »
I don't know if it is feasible for your purpose, but defining a specialized type first and then a var of the specialized type should also shut up CodeTools.
Code: Pascal
  {$mode objfpc}
  uses
    Generics.Collections;

  type
    TMyPair = specialize TPair<String, Integer>; // not codetools business...
  var
    p: TMyPair; // now codetools does not have to deal with specialize ever.
  begin
  end.
I highly recommend using the above if possible, because it makes for very clean code.
Depending on the purpose this is not always possible, but it is highly desirable:
Specialize to a type, not to a var. This not only circumvents CodeTools bugs, it is simply better, cleaner code in most cases.
« Last Edit: September 29, 2024, 01:47:41 pm by Thaddy »

Warfley

Re: Functional Iterator Library (Container independent)
« Reply #3 on: September 29, 2024, 10:19:36 pm »
The problem is that I use my own tuple library (which is now also part of the RTL) for a TPair implementation, so I need this disambiguation. Also as I am working with generic functions I can't do type definitions (unlike for example with generic types which can contain subtype definitions).

This is why this library is so "harsh" on FPC: inline specializations are something it does not deal with very well (often resulting in internal errors, duplicate identifier issues, etc.).

But the new workaround of putting it in another unit as a wrapper type with another name, and including that instead, has made it much better. It's now usable with Lazarus 3.6 and FPC 3.2.2 without many issues...
Well, there is one: it seems that the RangeChecks and ZeroBasedStrings compiler switches don't go well together in FPC 3.2.2, but this is solved in trunk. As there should be a new FPC version released soon™, I don't want to optimize for older FPC versions.

 
