Author Topic: I'm seeking (beta-)testers for my fast regular expression engine FLRE  (Read 28266 times)

Jurassic Pork

  • Hero Member
  • *****
  • Posts: 1228
Re: I'm seeking (beta-)testers for my fast regular expression engine FLRE
« Reply #30 on: September 06, 2015, 11:54:28 am »
Example doing the same thing with TFLRE as with TRegExpr:
Code:
const stringInFile = '79324817350003235658981449032';
var
  regex: TRegExpr;
  regexFLRE: TFLRE;
  pattern: TFLRERawByteString;
  MultiExtractions: TFLREMultiStrings;
begin
  writeln('=========  Tregexpr ==========');
  regex := TRegExpr.Create();
  try
    regex.Expression := '^.*([0-9]{10,10})([0-9]{14,14})([0-9]{5,5}).*';
    // only read the groups if Exec actually found a match
    if regex.Exec(stringInFile) and (regex.SubExprMatchCount = 3) then
      Writeln('phone number : ' + regex.Match[1] + #13#10 +
              'ID : ' + regex.Match[2] + #13#10 +
              'Zip Code : ' + regex.Match[3]);
  finally
    regex.Free;
  end;
  writeln('=========   TFLRE   ==========');
  // pattern := '^.*([0-9]{10,10})([0-9]{14,14})([0-9]{5,5}).*';
  // 8/09/2015 more compact pattern:
  pattern := '.*(\d{10})(\d{14})(\d{5}).*';
  regexFLRE := TFLRE.Create(pattern, []);
  try
    regexFLRE.MaximalDFAStates := 65536;
    if regexFLRE.ExtractAll(stringInFile, MultiExtractions) then
    begin
      // entry 0 is the whole match, entries 1..3 are the three groups
      if Length(MultiExtractions[0]) = 4 then
        Writeln('phone number : ' + MultiExtractions[0,1] + #13#10 +
                'ID : ' + MultiExtractions[0,2] + #13#10 +
                'Zip Code : ' + MultiExtractions[0,3]);
      writeln;
    end;
  finally
    regexFLRE.Free;
  end;
end;

Results :
Quote
=========  Tregexpr ==========
phone number : 7932481735
ID : 00032356589814
Zip Code : 49032
=========   TFLRE   ==========
phone number : 7932481735
ID : 00032356589814
Zip Code : 49032
« Last Edit: September 08, 2015, 01:02:19 am by Jurassic Pork »
Jurassic computer : Sinclair ZX81 - Zilog Z80A at 3.25 MHz - RAM 1 KB - ROM 8 KB

Roland57

  • Sr. Member
  • ****
  • Posts: 419
    • msegui.net
Re: I'm seeking (beta-)testers for my fast regular expression engine FLRE
« Reply #31 on: September 06, 2015, 05:26:08 pm »
Quote
FLRE has TFLRE.PtrReplaceCallback, TFLRE.ReplaceCallback and TFLRE.UTF8ReplaceCallback now

Could you post an example of that?  :)
My projects are on Gitlab and on Codeberg.

BeRo

  • New Member
  • *
  • Posts: 45
    • My site
Re: I'm seeking (beta-)testers for my fast regular expression engine FLRE
« Reply #32 on: September 07, 2015, 03:09:31 pm »
Quote
FLRE has TFLRE.PtrReplaceCallback, TFLRE.ReplaceCallback and TFLRE.UTF8ReplaceCallback now

Could you post an example of that?  :)

With the latest FLRE version from today:

Code:
type TExample=class
      public
       function Callback(const Input:PFLRERawByteChar;const Captures:TFLRECaptures):TFLRERawByteString;
     end;

function TExample.Callback(const Input:PFLRERawByteChar;const Captures:TFLRECaptures):TFLRERawByteString;
begin
 // FLREPtrCopy is needed here instead of Copy, because the PFLRERawByteChar indexing and the values inside the Captures array are 0-based here, not 1-based
 result:=FLREPtrCopy(Input,Captures[3].Start,Captures[3].Length)+'.'+
         FLREPtrCopy(Input,Captures[2].Start,Captures[2].Length)+'.'+
         FLREPtrCopy(Input,Captures[1].Start,Captures[1].Length);
end;

var FLREInstance:TFLRE;
    Example:TExample;
begin
 FLREInstance:=TFLRE.Create('(\d+)\/(\d+)\/(\d+)',[]);
 try
  Example:=TExample.Create;
  try
   writeln(FLREInstance.ReplaceCallback('123/456/789',Example.Callback)); // prints 789.456.123
  finally
   Example.Free;
  end;
 finally
  FLREInstance.Free;
 end;
 readln;
end;
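
To illustrate the indexing remark in the code comment above, here is a minimal sketch in plain Free Pascal. PtrCopy0 is a hypothetical helper written for this illustration only; it is not FLRE's FLREPtrCopy, just an assumption about what a 0-based pointer copy looks like. It shows why a 0-based capture offset cannot be passed unchanged to the 1-based Copy.

```pascal
program IndexingSketch;
{$mode objfpc}{$H+}

// Hypothetical helper (NOT FLRE's FLREPtrCopy): copies Len bytes from a
// PAnsiChar buffer, starting at the 0-based offset Start0.
function PtrCopy0(const Src: PAnsiChar; const Start0, Len: Integer): AnsiString;
begin
  Result := '';
  if Len <= 0 then
    Exit;
  SetLength(Result, Len);
  Move(Src[Start0], Result[1], Len);
end;

const
  S: AnsiString = '123/456/789';

begin
  // The '4' sits at 0-based offset 4, which is 1-based index 5.
  WriteLn(PtrCopy0(PAnsiChar(S), 4, 3)); // 456
  WriteLn(Copy(S, 5, 3));                // 456
  // Passing the 0-based offset straight to Copy is off by one.
  WriteLn(Copy(S, 4, 3));                // /45
end.
```

So inside a callback that receives a pointer and 0-based captures, either use a pointer-based copy or add 1 to the start before calling Copy.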


And as a side note: FLRE isn't compatible with Delphi's mobile compilers, only with Delphi's desktop compilers, because the mobile compilers have these new silly incompatibilities in the string and char data types (for example, no PAnsiChar). But that doesn't matter to me anyway, because I think the Delphi mobile compilers in their current form are an undesirable development, with a lot of wrong design decisions at the cost of backward compatibility. At least FPC does it right (in my opinion), without sacrificing backward compatibility.
« Last Edit: September 07, 2015, 03:28:26 pm by BeRo »

Roland57

  • Sr. Member
  • ****
  • Posts: 419
    • msegui.net
Re: I'm seeking (beta-)testers for my fast regular expression engine FLRE
« Reply #33 on: September 07, 2015, 06:54:53 pm »
Quote
With the latest FLRE version from today:

Great! Thank you for the code. I believe I now have all I need.  :)

For information, I compiled the example successfully with Lazarus 1.4 and Delphi XE10.

BeniBela

  • Hero Member
  • *****
  • Posts: 905
    • homepage
Re: I'm seeking (beta-)testers for my fast regular expression engine FLRE
« Reply #34 on: September 07, 2015, 07:35:23 pm »
Quote
But I should maybe change that in the API, since \A and \Z are also supported, hmmmmm. I think I will make some inquiries into how the majority of other engines handle it.

Always?

Can you add an option to disable/customize some expressions?

I have a list of around a thousand regular expressions that I need to validate/match, in the XQuery regex flavor (^ and $ are allowed, but not \A or \Z), which is a superset of the XML Schema regex flavor (which does not even allow ^ or $).

It would be nice if it could be used for both cases.

Roland57

  • Sr. Member
  • ****
  • Posts: 419
    • msegui.net
Re: I'm seeking (beta-)testers for my fast regular expression engine FLRE
« Reply #35 on: September 07, 2015, 11:36:00 pm »
A first version of a simple FEN (Forsyth-Edwards Notation) syntax validator using FLRE.  :)

Code:
program fenvalidator1;
{$I DIRECTIVES}

uses
  SysUtils,
  Classes,
  FLRE in '..\src\FLRE.pas',
  FLREUnicode in '..\src\FLREUnicode.pas';

type
  TCallbackClass = class
    public
      function Callback(const Input: PFLRERawByteChar; const Captures: TFLRECaptures): TFLRERawByteString;
  end;

function TCallbackClass.Callback(const Input: PFLRERawByteChar; const Captures: TFLRECaptures): TFLRERawByteString;
begin
  result := StringOfChar('-', StrToInt(FLREPtrCopy(Input, Captures[1].Start, 1)));
end;

function IsValidFEN(const s: string): boolean;
var
  re1, re2, re3, re4, re5, re6, re7, re8, re9: TFLRE;
  ss1, ss2: TFLREStrings;
  _: TFLRECaptures;
  cc: TCallbackClass;
  i: Integer;
begin
  re1 := TFLRE.Create('\s', []);
  re2 := TFLRE.Create('/', []);
  re3 := TFLRE.Create('^[BKNPQRbknpqr1-8]+$', []);
  re4 := TFLRE.Create('(\d)', []);
  re5 := TFLRE.Create('^[wb]$', []);
  re6 := TFLRE.Create('^([KQkq]+|-)$', []);
  re7 := TFLRE.Create('([a-h][36]|-)', []);
  re8 := TFLRE.Create('\d+', []);
  re9 := TFLRE.Create('[1-9]\d*', []);

  try
    if re1.Split(s, ss1) then
      result := (Length(ss1) = 6)
    else
      result := FALSE;

    if not result then
      exit;

    if re2.Split(ss1[0], ss2) then
      result := (Length(ss2) = 8)
    else
      result := FALSE;

    if not result then
      exit;

    for i := 0 to High(ss2) do
      result := result and (re3.Match(ss2[i], _));

    cc := TCallbackClass.Create;
    try
      for i := 0 to High(ss2) do
        result := result and (Length(re4.ReplaceCallback(ss2[i], cc.Callback)) = 8);
    finally
      cc.Free;
    end;

    result := result and (re5.Match(ss1[1], _));
    result := result and (re6.Match(ss1[2], _));
    result := result and (re7.Match(ss1[3], _));
    result := result and (re8.Match(ss1[4], _));
    result := result and (re9.Match(ss1[5], _));
  finally
    re1.Free;
    re2.Free;
    re3.Free;
    re4.Free;
    re5.Free;
    re6.Free;
    re7.Free;
    re8.Free;
    re9.Free;
    SetLength(ss1, 0);
  end;
end;

const
  SAMPLE: array[1..5]of string = (
    'rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1',
    'rnbqkbnr/pp1ppppp/8/2p5/4P3/8/PPPP1PPP/RNBQKBNR w KQkq c6 0 2',
    'rnbqkbnr/pp1ppppp/8/2p5/4P3/5N2/PPPP1PPP/RNBQKB1R b KQkq - 1 2',
    '4k3/8/8/8/8/8/4P3/4K3 w - - 5 39',

    'rnb2kbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1'
  );

var
  i: integer;

begin
  WriteLn('+----------+----------+');
  WriteLn('| RESULT   | EXPECTED |');
  WriteLn('+----------+----------+');

  for i := Low(SAMPLE) to High(SAMPLE) do
    WriteLn(Format(
      '| %8s | %8s |',
      [UpperCase(BoolToStr(IsValidFEN(SAMPLE[i]), TRUE)), UpperCase(BoolToStr(i < 5, TRUE))]
    ));

  WriteLn('+---------------------+');
  ReadLn; 
end.
« Last Edit: September 08, 2015, 08:29:12 am by Roland Chastain »

Jurassic Pork

  • Hero Member
  • *****
  • Posts: 1228
Re: I'm seeking (beta-)testers for my fast regular expression engine FLRE
« Reply #36 on: September 16, 2015, 05:44:54 pm »
Hello,

BeRo, do you have an example showing how to use named groups in your regexp engine?   :-\

Roland57

  • Sr. Member
  • ****
  • Posts: 419
    • msegui.net
Re: I'm seeking (beta-)testers for my fast regular expression engine FLRE
« Reply #37 on: September 26, 2015, 08:05:24 am »
Hello!

FLRE has been updated (link).

You can try this example to see which bug was fixed:

Code:
program alternative;
{$I directives}

uses
  SysUtils,
  Classes,
  FLRE in '..\src\FLRE.pas',
  FLREUnicode in '..\src\FLREUnicode.pas';

(* Alternative expressions (alternation) *)

const
  SAMPLE: array[0..2] of string = (
    'jeudi',
    '24/09/2015',
    '13:33'
  );

  FRENCH_DAY_NAME_PATTERN = '([A-Za-z]{5,8})';
  DATE_PATTERN = '(\d{2})/(\d{2})/(\d{4})';
  TIME_PATTERN = '(\d{2}):(\d{2})';

  PATTERN = FRENCH_DAY_NAME_PATTERN + '|' + DATE_PATTERN + '|' + TIME_PATTERN;

var
  e: TFLRE;
  c: TFLRECaptures;
  i: integer;
  j: integer;

begin
  e := TFLRE.Create(PATTERN, []);

  for i := Low(SAMPLE) to High(SAMPLE) do
    if e.Match(SAMPLE[i], c) then
    begin
      (*
      for j := 0 to High(c) do
        WriteLn(Format('%d %d %s', [i, j, Copy(SAMPLE[i], c[j].Start, c[j].Length)]));
      *)
      if c[1].Length <> 0 then
        WriteLn(Format('SAMPLE[%d] is a %s', [i, 'day name']))
      else if c[2].Length <> 0 then
        WriteLn(Format('SAMPLE[%d] is a %s', [i, 'date']))
      else
        WriteLn(Format('SAMPLE[%d] is a %s', [i, 'time']));

    end;

  e.Free;

  ReadLn;
end.

@BeRo

What about this request?  :)

Quote
BeRo, do you have an example showing how to use named groups in your regexp engine?   :-\

BeniBela

  • Hero Member
  • *****
  • Posts: 905
    • homepage
Re: I'm seeking (beta-)testers for my fast regular expression engine FLRE
« Reply #38 on: February 14, 2017, 07:14:11 pm »
Do you think Bero is still working on this? https://github.com/BeRo1985/flre/issues

BeRo

  • New Member
  • *
  • Posts: 45
    • My site
Re: I'm seeking (beta-)testers for my fast regular expression engine FLRE
« Reply #39 on: March 01, 2017, 05:13:40 pm »
Quote
Do you think Bero is still working on this? https://github.com/BeRo1985/flre/issues

Yes, I'm still working on it, but I can't clone myself to work on all my projects at the same time, so I have to set priorities according to which project will benefit me most in the long run, professionally and technically.

At the moment that is (since mid-2014) still my unreleased, work-in-progress game engine, UnrealEngine4-style but implemented in Pascal, which is a huge amount of work for a single individual (me). Besides, my primary job is working for Viprinet, so I have to split up my time sensibly. It follows that FLRE is not my first priority, but I do keep working on it, just not in first place at the moment.

And by the way, if you want to interpret my GitHub commit activity: PasMP, KRAFT, PasVulkan, PACC etc. are open-sourced subprojects of my current game engine.

« Last Edit: March 01, 2017, 05:18:58 pm by BeRo »

Okoba

  • Hero Member
  • *****
  • Posts: 528
Re: I'm seeking (beta-)testers for my fast regular expression engine FLRE
« Reply #40 on: February 17, 2019, 02:38:52 pm »
Hi,
It seems FLRE has a bug with Win64.
I'm testing it with Lazarus 2.0.0 and FPC 3.0.4, and it seems to work fine on Win32 but not on Win64.
Code: Pascal
program project1;

uses
  FLRE;

var
  Pat, Inpt: TFLRERawByteString;
  re: TFLRE;
  Captures: TFLREMultiCaptures;
  i, j: Integer;
begin
  Inpt := 'book';
  Pat := '^[bk]+$';
  re := TFLRE.Create(Pat, []);
  re.MaximalDFAStates := 65536;
  Captures := nil;
  re.UTF8MatchAll(Inpt, Captures);
  for i := 0 to High(Captures) do
    for j := 0 to High(Captures[i]) do
      with Captures[i][j] do
        WriteLn(i, ',', j, ': ', Copy(Inpt, Start, Length));
  re.Free;
  ReadLn;
end.

BeRo

  • New Member
  • *
  • Posts: 45
    • My site
Re: I'm seeking (beta-)testers for my fast regular expression engine FLRE
« Reply #41 on: February 20, 2019, 05:35:53 pm »
Are you using the latest FLRE version from GitHub? With that version it works for me with FPC 3.3.1 SVN trunk from 1 Jan 2019, or at least I see no differences between the Win32 and Win64 builds.

Okoba

  • Hero Member
  • *****
  • Posts: 528
Re: I'm seeking (beta-)testers for my fast regular expression engine FLRE
« Reply #42 on: February 21, 2019, 07:25:21 am »
Yes.
The error is in the PtrPosCharSetOf2Search function, on line 3724.
I put both the FLRE and PUCA units from the FLRE repo in the project and ran it with the Win64 config, with both the latest FPC stable and trunk, and it gives the same error.
I should say that a simple regex like "\d" works OK.

Code: Pascal
 movq xmm0,rsi
 movq xmm1,rsi
 pxor xmm2,xmm2
{$ifdef fpc}
 movdqa xmm3,[rip+XMM1Constant]
{$else}
 movdqa xmm3,[rel XMM1Constant]
{$endif}
 pshufb xmm0,xmm2//<<<<<======
 pshufb xmm1,xmm3

 mov ecx,edi
 and rdi,-32
 and ecx,31

BeRo

  • New Member
  • *
  • Posts: 45
    • My site
Re: I'm seeking (beta-)testers for my fast regular expression engine FLRE
« Reply #43 on: February 21, 2019, 09:34:26 am »
See https://youtu.be/N16MOmEeUEk as video proof, where your code example is working with FPC 3.3.1 as both Win32 and Win64 builds. I used a slightly modified variant, because ^ and $ are begin/end anchors, which shouldn't be used in this case if [bk]+ is supposed to find the b and the k in the "book" string.
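
The anchor point can be checked without FLRE. The sketch below deliberately swaps in TRegExpr from Free Pascal's RegExpr unit (the engine used for comparison earlier in this thread), so it stays self-contained; AllMatches is a hypothetical helper name. It only illustrates the general semantics of ^ and $ and makes no claim about FLRE internals.

```pascal
program AnchorSketch;
{$mode objfpc}{$H+}

uses
  RegExpr; // TRegExpr, shipped with Free Pascal

// Hypothetical helper: returns every match of Pattern in Input,
// joined with commas (empty string when nothing matches).
function AllMatches(const Pattern, Input: string): string;
var
  r: TRegExpr;
begin
  Result := '';
  r := TRegExpr.Create;
  try
    r.Expression := Pattern;
    if r.Exec(Input) then
      repeat
        if Result <> '' then
          Result := Result + ',';
        Result := Result + r.Match[0];
      until not r.ExecNext;
  finally
    r.Free;
  end;
end;

begin
  // Anchored: the whole string would have to consist of b/k only.
  WriteLn('[', AllMatches('^[bk]+$', 'book'), ']'); // []
  // Unanchored: finds each run of b/k inside the string.
  WriteLn('[', AllMatches('[bk]+', 'book'), ']');   // [b,k]
end.
```

With the anchors in place there is nothing to report for "book", so an empty result by itself does not point to a platform bug.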

BeRo

  • New Member
  • *
  • Posts: 45
    • My site
Re: I'm seeking (beta-)testers for my fast regular expression engine FLRE
« Reply #44 on: February 21, 2019, 10:30:05 am »
And your CPU must support SSSE3, because pshufb is an SSSE3 instruction, see https://www.felixcloutier.com/x86/pshufb.
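
A quick way to check this on the machine in question is to read the CPUID feature flags. The program below is a minimal sketch for Free Pascal on i386/x86-64, not part of FLRE; CPUIDLeaf1ECX and HasSSSE3 are hypothetical names. It assumes the CPU implements the CPUID instruction (true for any x86-64 CPU) and relies on the documented fact that SSSE3 is reported in bit 9 of ECX for CPUID leaf 1.

```pascal
program SSSE3Check;
{$mode objfpc}{$H+}
{$asmmode intel}

// Returns ECX from CPUID leaf 1 (0 on non-x86 targets).
// The register lists tell the compiler what CPUID overwrites.
function CPUIDLeaf1ECX: LongWord;
var
  FeatureBits: LongWord;
begin
  FeatureBits := 0;
  {$if defined(CPUX86_64)}
  asm
    mov eax, 1
    cpuid
    mov FeatureBits, ecx
  end ['rax', 'rbx', 'rcx', 'rdx'];
  {$elseif defined(CPUI386)}
  asm
    mov eax, 1
    cpuid
    mov FeatureBits, ecx
  end ['eax', 'ebx', 'ecx', 'edx'];
  {$endif}
  Result := FeatureBits;
end;

function HasSSSE3: Boolean;
begin
  // CPUID.01H:ECX bit 9 is the SSSE3 feature flag.
  Result := (CPUIDLeaf1ECX and (LongWord(1) shl 9)) <> 0;
end;

begin
  if HasSSSE3 then
    WriteLn('SSSE3 is available, pshufb can be executed')
  else
    WriteLn('No SSSE3, pshufb will fault as an invalid opcode');
end.
```

If this reports no SSSE3, a crash at the pshufb line is expected on that CPU regardless of the FPC version or of Win32 versus Win64.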