
Author Topic: [solution] Easy way to strip comments?  (Read 6625 times)

indydev

  • Full Member
  • ***
  • Posts: 116
[solution] Easy way to strip comments?
« on: May 29, 2024, 07:24:19 pm »
Maybe I missed something, but is there a way in the Lazarus IDE to either copy or save a unit stripped of all its comments?

For the inevitable question 'why?': sometimes I get units loaded with unhelpful comments; sometimes I need to send someone code whose comments are meant only for me; and sometimes I want clean code for things like training AI models. The idea isn't to remove comments from a project, but to have a way to strip them out for uses beyond the project you are working on.
« Last Edit: June 03, 2024, 09:49:02 pm by indydev »

cdbc

  • Hero Member
  • *****
  • Posts: 1678
    • http://www.cdbc.dk
Re: Easy way to strip comments?
« Reply #1 on: May 29, 2024, 09:55:12 pm »
Hi
I think I can whip something together in a jiffy, hang on...
edit:
I'm back...
With a quick'n'dirty console app, it's got some beauty-flaws, but it's a first try, so bear with me
  :D
It could perhaps also do with a bit of debugging, but it's getting late, here in Denmark...
Let me know what you think...
  8-)
Regards Benny
« Last Edit: May 30, 2024, 01:24:47 am by cdbc »
If it ain't broke, don't fix it ;)
PCLinuxOS(rolling release) 64bit -> KDE5 -> FPC 3.2.2 -> Lazarus 2.2.6 up until Jan 2024 from then on it's: KDE5/QT5 -> FPC 3.3.1 -> Lazarus 3.0

cdbc

Re: Easy way to strip comments?
« Reply #2 on: May 30, 2024, 03:06:34 am »
Hi
Right, couldn't help myself...  :D
Here's a new version, with some bug-fixes.
It should be better; let me know what you think  %)
edit: I call it like this:
 "./strip_pas_comments comment_states.pas > strip.pas"
/without the quotes/

Regards Benny
« Last Edit: May 30, 2024, 03:25:40 am by cdbc »

indydev

Re: Easy way to strip comments?
« Reply #3 on: May 30, 2024, 04:05:50 am »
Thanks.  I had some trouble getting back on here for some reason, and I decided to write my own. Will check yours. Here's my command line version that works "ok" for now.

Code: Pascal
program stripcmts;

{$mode objfpc}{$H+}

uses
  {$IFDEF UNIX}
  cthreads,
  {$ENDIF}
  Classes, SysUtils, CustApp,
  RegExpr;

type

  { TMyApplication }

  TMyApplication = class(TCustomApplication)
  protected
    procedure DoRun; override;
  public
    constructor Create(TheOwner: TComponent); override;
    procedure WriteHelp; virtual;
  end;

var
  FileName: String;
  FileContent: TStringList;
  RegEx: TRegExpr;
  i: integer;

{ TMyApplication }

procedure TMyApplication.DoRun;
var
  ErrorMsg: String;
begin
  // quick check parameters
  ErrorMsg:=CheckOptions('h', 'help');
  if ErrorMsg<>'' then begin
    ShowException(Exception.Create(ErrorMsg));
    Terminate;
    Exit;
  end;

  // parse parameters
  if HasOption('h', 'help') then begin
    WriteHelp;
    Terminate;
    Exit;
  end;

  if ParamCount < 1 then
  begin
    WriteLn('Usage: ./stripcmts <filename>');
    Terminate;
    Exit;
  end;

  FileName := ParamStr(1);
  FileContent := TStringList.Create;
  try
    FileContent.LoadFromFile(FileName);

    RegEx := TRegExpr.Create;
    RegEx.ModifierM:=TRUE;

    try
      // Pass 1: line comments, handled one line at a time.
      // Limitation: this also removes '//' found inside string literals.
      RegEx.Expression := '\/\/.*';
      RegEx.ModifierI := TRUE; // case-insensitive (no effect on these patterns)

      for i := 0 to FileContent.Count - 1 do
      begin
        if RegEx.Exec(FileContent[i]) then FileContent[i] := StringReplace(FileContent[i], RegEx.Match[0], '', []);
      end;

      // Pass 2: brace comments over the whole text; {$...} directives are kept.
      RegEx.Expression := '\{\s*[^\s\$].*?\}'; //'\{[^\$].*?\}';
      RegEx.ModifierS := TRUE; // let '.' match line breaks too
      FileContent.Text := RegEx.Replace(FileContent.Text, '', FALSE);

    finally
      RegEx.Free;
    end;

    // 'unit1.pas' is saved as 'unit1_CmtFree.pas'
    FileContent.SaveToFile(ChangeFileExt(FileName, '_CmtFree.pas'));
  finally
    FileContent.Free;
  end;
  // stop program loop
  Terminate;
end;

constructor TMyApplication.Create(TheOwner: TComponent);
begin
  inherited Create(TheOwner);
  StopOnException:=True;
end;

procedure TMyApplication.WriteHelp;
begin
  { still need to write help code here }
  writeln('Usage: ', ExeName, ' -h');
end;

var
  Application: TMyApplication;
begin
  Application:=TMyApplication.Create(nil);
  Application.Title:='Strip Comments';
  Application.Run;
  Application.Free;
end.

I still need to add '(*' '*)' blocks to be more complete, and this doesn't handle nested comment blocks.

Thanks for writing a couple up.
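
For the two gaps mentioned above ('(*' '*)' blocks and nested blocks), a character scanner with a depth counter is one possible way to go. The following is only a minimal sketch, not code from this thread: StripComments and the sample text are hypothetical, and it assumes that only delimiters of the same kind nest (braces inside braces, '(*' inside '(*'). It keeps {$...} directives and copies string literals untouched, so '//' or braces inside quotes survive. It does not treat (*$...*) directives specially.

```pascal
program strip_sketch;
{$mode objfpc}{$H+}

{ Hypothetical helper: returns Src with //, nested brace and nested
  paren-star comments removed. Directives and strings are kept. }
function StripComments(const Src: string): string;
var
  i, n, depth: Integer;
begin
  Result := '';
  i := 1;
  n := Length(Src);
  while i <= n do
  begin
    if Src[i] = '''' then
    begin
      // String literal: copy up to the closing quote or the end of line.
      // A doubled quote simply closes one literal and opens the next.
      Result := Result + Src[i];
      Inc(i);
      while (i <= n) and (Src[i] <> '''') and not (Src[i] in [#10, #13]) do
      begin
        Result := Result + Src[i];
        Inc(i);
      end;
      if (i <= n) and (Src[i] = '''') then
      begin
        Result := Result + Src[i];
        Inc(i);
      end;
    end
    else if (Src[i] = '/') and (i < n) and (Src[i + 1] = '/') then
    begin
      // Line comment: skip to the end of line, keep the line break itself.
      while (i <= n) and not (Src[i] in [#10, #13]) do
        Inc(i);
    end
    else if (Src[i] = '{') and (i < n) and (Src[i + 1] = '$') then
    begin
      // Compiler directive: copy through the closing brace.
      while (i <= n) and (Src[i] <> '}') do
      begin
        Result := Result + Src[i];
        Inc(i);
      end;
      if i <= n then
      begin
        Result := Result + Src[i];
        Inc(i);
      end;
    end
    else if Src[i] = '{' then
    begin
      // Brace comment: count nested braces until the depth is back at zero.
      depth := 0;
      repeat
        if Src[i] = '{' then
          Inc(depth)
        else if Src[i] = '}' then
          Dec(depth);
        Inc(i);
      until (depth = 0) or (i > n);
    end
    else if (Src[i] = '(') and (i < n) and (Src[i + 1] = '*') then
    begin
      // Paren-star comment: same idea with two-character delimiters.
      depth := 0;
      repeat
        if (Src[i] = '(') and (i < n) and (Src[i + 1] = '*') then
        begin
          Inc(depth);
          Inc(i, 2);
        end
        else if (Src[i] = '*') and (i < n) and (Src[i + 1] = ')') then
        begin
          Dec(depth);
          Inc(i, 2);
        end
        else
          Inc(i);
      until (depth = 0) or (i > n);
    end
    else
    begin
      // Ordinary code: copy as is.
      Result := Result + Src[i];
      Inc(i);
    end;
  end;
end;

const
  // Made-up sample text, only for trying the function out.
  Sample =
    'const Url = ''http://x''; // trailing note' + LineEnding +
    '{$H+} { outer { inner } still comment } x := 1; (* a (* b *) c *)' + LineEnding +
    'IID = ''{AAAA-BBBB}'';';

begin
  WriteLn(StripComments(Sample));
end.
```

The scanner makes a single pass over the text, so line comments and block comments cannot interfere with each other the way two separate regex passes can. Building the result one character at a time is slow for very large files; that is a simplification for readability.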

indydev

Re: Easy way to strip comments?
« Reply #4 on: May 30, 2024, 04:19:38 am »
Quote from: cdbc on May 30, 2024, 03:06:34 am
Hi
Right, couldn't help myself...  :D
Here's a new version, with some bug-fixes
Should be better, let me know, what you think  %)
edit: I call it like this:
 "./strip_pas_comments comment_states.pas > strip.pas"
/without the quotes/

Regards Benny

Ok. You went to town on this. I just put some regular expressions together. I didn't bother with a more advanced state machine. Thanks for your work!

Thaddy

  • Hero Member
  • *****
  • Posts: 16201
  • Censorship about opinions does not belong here.
Re: Easy way to strip comments?
« Reply #5 on: May 30, 2024, 07:29:48 am »
I like both approaches.
Personally, I have been using Dipp by Ralf Junker for decades. It is freeware and also works with Free Pascal code, but he never releases the source code, which is a pity.
You can get it from his website. I hope both of you will look at "what is missing".
Just a teaser:

https://www.yunqa.de/delphi/apps/dipp/index
« Last Edit: May 30, 2024, 05:35:51 pm by Thaddy »
If I smell bad code it usually is bad code and that includes my own code.

cdbc

Re: Easy way to strip comments?
« Reply #6 on: May 30, 2024, 10:09:15 am »
Hahaha Thaddy  :D
Thanks for that one...  :P
I bet, 'Dipp' isn't a 3 hour job  ;D
I quite enjoyed the break from other stuff  8-)
Regards Benny

indydev

Re: Easy way to strip comments?
« Reply #7 on: May 30, 2024, 04:25:30 pm »
Quote from: Thaddy on May 30, 2024, 07:29:48 am
I like both approaches.
Personally, I have been using Dipp by Ralf Junker for decades. It is freeware and also works with Free Pascal code, but he never releases the source code, which is a pity.
You can get it from his website. I hope both of you will look at "what is missing".
Just a teaser:

https://www.yunqa.de/delphi/apps/dipp/index

Unfortunately, this is Windows only.  I just don't have ready access to try it (a situation I need to resolve soon), let alone use it.

indydev

Re: Easy way to strip comments?
« Reply #8 on: May 30, 2024, 04:28:03 pm »
Ok. Looks like there was some interest. I went ahead and completed my attempt to make it at least a little more useable.

Code: Pascal
program stripcmts;

{$mode objfpc}{$H+}

uses
  {$IFDEF UNIX}
  cthreads,
  {$ENDIF}
  Classes, SysUtils, CustApp, Process,
  RegExpr;

type

  { TStripComments }

  TStripComments = class(TCustomApplication)
  protected
    procedure DoRun; override;
  public
    constructor Create(TheOwner: TComponent); override;
    procedure WriteHelp; virtual;
  end;

var
  FileName, NewFileName, OpenProgram: String;
  FileContent: TStringList;
  RegEx: TRegExpr;
  i: integer;

{ TStripComments }

procedure TStripComments.DoRun;
var
  ErrorMsg: String;

  // Collapses every run of blank lines into a single blank line.
  procedure RemoveExtraBlankLines(CodeLines: TStringList);
  var
    i: integer;
    BlankLineFound: Boolean;
  begin
    BlankLineFound := False;
    i := 0;
    while i < CodeLines.Count do
    begin
      if Trim(CodeLines[i]) = '' then
      begin
        if BlankLineFound then CodeLines.Delete(i) else
        begin
          BlankLineFound := True;
          Inc(i);
        end;
      end else
      begin
        BlankLineFound := False;
        Inc(i);
      end;
    end;
  end;

begin
  // quick check parameters
  ErrorMsg:=CheckOptions('h', 'help');
  if ErrorMsg<>'' then begin
    ShowException(Exception.Create(ErrorMsg));
    Terminate;
    Exit;
  end;

  // parse parameters
  if HasOption('h', 'help') then begin
    WriteHelp;
    Terminate;
    Exit;
  end;

  if ParamCount < 1 then
  begin
    WriteLn('Usage: ./stripcmts <filename> [program to open unit]');
    WriteLn('The second parameter is optional');
    Terminate;
    Exit;
  end;

  FileName := ParamStr(1);
  if ParamCount >= 2 then
    OpenProgram := ParamStr(2)
  else
    OpenProgram := 'xed';  // default program

  FileContent := TStringList.Create;
  try
    FileContent.LoadFromFile(FileName);

    RegEx := TRegExpr.Create;
    RegEx.ModifierM:=TRUE;

    try
      // Pass 1: line comments, handled one line at a time.
      // Limitation: this also removes '//' found inside string literals.
      RegEx.Expression := '\/\/.*';
      RegEx.ModifierI := TRUE; // case-insensitive (no effect on these patterns)

      for i := 0 to FileContent.Count - 1 do
      begin
        if RegEx.Exec(FileContent[i]) then FileContent[i] := StringReplace(FileContent[i], RegEx.Match[0], '', []);
      end;

      // Pass 2: block comments over the whole text; {$...} directives are kept.
      // The (* ... *) part is non-greedy ('.*?'); a greedy '.*' would remove
      // everything between the first '(*' and the last '*)' in the file.
      RegEx.Expression := '\{\s*[^\s\$].*?\}|\(\*.*?\*\)';
      RegEx.ModifierS := TRUE; // let '.' match line breaks too
      FileContent.Text := RegEx.Replace(FileContent.Text, '', FALSE);

    finally
      RegEx.Free;
    end;

    // Remove extra blank lines
    RemoveExtraBlankLines(FileContent);

    NewFileName := ChangeFileExt(FileName, '_CmtFree.pas');
    FileContent.SaveToFile(NewFileName);

    // Launch the new file in the chosen program (xed by default)
    with TProcess.Create(nil) do
    try
      Executable := OpenProgram;
      Parameters.Add(NewFileName);
      Options := Options + [poWaitOnExit];
      Execute;
    finally
      Free;
    end;

  finally
    FileContent.Free;
  end;
  // stop program loop
  Terminate;
end;

constructor TStripComments.Create(TheOwner: TComponent);
begin
  inherited Create(TheOwner);
  StopOnException:=True;
end;

procedure TStripComments.WriteHelp;
begin
  writeln('Usage: ', ExeName, ' <filename> [program to open unit]');
  writeln;
  writeln('Description:');
  writeln('  This program strips comments from a Free Pascal source file.');
  writeln('  It supports single-line comments (//), multi-line comments ({...}),');
  writeln('  and Pascal-style block comments ( (* ... *) ). The cleaned file');
  writeln('  is saved with a "_CmtFree" suffix added to the original filename.');
  writeln;
  writeln('Parameters:');
  writeln('  <filename>             The name of the file to process.');
  writeln('  [program to open unit] Optional. The program to use to open the cleaned file.');
  writeln('                         If not specified, the default program "xed" will be used.');
  writeln;
  writeln('Examples:');
  writeln('  ', ExeName, ' myfile.pas');
  writeln('    Strips comments from "myfile.pas" and opens the result with the default program.');
  writeln;
  writeln('  ', ExeName, ' myfile.pas nano');
  writeln('    Strips comments from "myfile.pas" and opens the result with "nano".');
end;


var
  Application: TStripComments;
begin
  Application:=TStripComments.Create(nil);
  Application.Title:='Strip Comments';
  Application.Run;
  Application.Free;
end.

Roland57

  • Sr. Member
  • ****
  • Posts: 475
    • msegui.net
Re: Easy way to strip comments?
« Reply #9 on: May 30, 2024, 04:53:42 pm »
@cdbc
Nice! And seems to work well.

@indydev
I will try your latest version. I love regular expressions.

I enter the competition with my Pascal Code Cleaner written in Lua.  :)

Usage: lua pcc.lua SOURCE DESTINATION [DEBUGFILE1] [DEBUGFILE2]

Where all parameters are file names. Parameters 3 and 4 are optional.

« Last Edit: May 30, 2024, 07:19:27 pm by Roland57 »
My projects are on Gitlab and on Codeberg.

alpine

  • Hero Member
  • *****
  • Posts: 1303
Re: Easy way to strip comments?
« Reply #10 on: May 30, 2024, 05:10:24 pm »
Given the regex used, why not just use sed or awk for that?
... or Shift+Ctrl+F in Lazarus?
"I'm sorry Dave, I'm afraid I can't do that."
—HAL 9000

cdbc

Re: Easy way to strip comments?
« Reply #11 on: May 30, 2024, 07:38:58 pm »
Hi
@indydev: Nifty little program, Me Likey  8) seems to also work well and nice touch, with the opening in an editor for review \o/
@Roland57: Yours is nice too  ;) works well and although it's the first time I see Lua, I can follow the code \o/
Cool - Benny

indydev

Re: Easy way to strip comments?
« Reply #12 on: May 30, 2024, 07:49:19 pm »
Quote from: alpine on May 30, 2024, 05:10:24 pm
Given the regex used, why not just use sed or awk for that?
... or Shift+Ctrl+F in Lazarus?

Dependencies! Well, at least that was my thought. I have written bash scripts to do work for myself, only to find later that I needed them somewhere else. While my code is geared toward Linux, it is easily converted for use on Windows. I haven't been on Windows in years, but does it come with sed or awk?
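
As a rough illustration of the Linux-to-Windows point, the main platform-specific piece in the program above is the default editor. The sketch below is an assumption and not part of the posted program: DefaultEditor is a hypothetical helper, and 'notepad.exe' is just a guess at a sensible Windows default. The rest uses TProcess from the Process unit, as the posted code does.

```pascal
program open_in_editor;
{$mode objfpc}{$H+}

uses
  SysUtils, Process;

{ Hypothetical helper: picks a default editor per platform. }
function DefaultEditor: string;
begin
  {$IFDEF WINDOWS}
  Result := 'notepad.exe';  // assumed default, not from the thread
  {$ELSE}
  Result := 'xed';          // the default used in the thread
  {$ENDIF}
end;

var
  Editor: string;
begin
  if ParamCount < 1 then
  begin
    WriteLn('Usage: ', ExtractFileName(ParamStr(0)), ' <file> [editor]');
    Halt(1);
  end;

  // An explicit second parameter still overrides the platform default.
  if ParamCount >= 2 then
    Editor := ParamStr(2)
  else
    Editor := DefaultEditor;

  with TProcess.Create(nil) do
  try
    Executable := Editor;
    Parameters.Add(ParamStr(1));
    Options := Options + [poWaitOnExit];
    try
      Execute;
    except
      // Report a missing editor instead of ending with an unhandled exception.
      on E: EProcess do
        WriteLn('Could not start "', Editor, '": ', E.Message);
    end;
  finally
    Free;
  end;
end.
```

The comment stripping itself only relies on Classes, SysUtils and RegExpr, so it should not need any platform-specific changes.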

cdbc

Re: Easy way to strip comments?
« Reply #13 on: May 30, 2024, 11:36:09 pm »
Hi
Saw you've noticed my 'IStringList', here's the lpr-file using the IStringList...:
Code: Pascal
program strip_pas_comments;
{$mode objfpc}{$H+}
{ ©2024 Benny Christensen a.k.a. cdbc, All rights reserved }
uses
  Classes, sysutils, StrUtils, istrlist, bc_statemachine, comment_states;
type
  { TStateData provides an object for our event-handler, we don't need to instantiate :o) }
  TStateData = object
    procedure HandleStateData(aSender: TbcState;aFieldList: TStrings;var {%H-}Cancel: boolean);
  end;

var
  raw: boolean = false; // suppresses the writing of 'filename', '<- - ->' & 'Done!'
  Sd: TStateData; // works just fine
  sl: IStringList; // now using the *new* IStringList
  Sm: TbcStateMachine;
  Statenames: TStringArray;
  pasFile: string;

{ TStateData }
procedure TStateData.HandleStateData(aSender: TbcState;aFieldList: TStrings;var Cancel: boolean);
var
  S: string;
begin
  case IndexStr(aSender.ClassName,Statenames) of
    0: ; { nothing to do in idle }
    1..4: { these four states all just write out their non-empty fields }
      for S in aFieldList do
        if S <> '' then writeln(S);
    5: ; { nothing to do in error }
    else
      writeln('not in case -> ',trim(aFieldList.Text));
  end;
end;

begin
  if ParamCount < 1 then begin
    writeln('* * * please provide a filename in 1.st param * * *');
    writeln('* * * optionally provide -r in 2.nd param for raw output * * *');
    Halt(-1);
  end;
  pasFile:= ParamStr(1);
  if ParamCount >= 2 then raw:= (ParamStr(2) = '-r');
  if not raw then begin
    writeln('Parsing pascal file: ',pasFile); // has been made optional
    writeln('<- - - - ->'); // has been made optional
  end;
  Sl:= CreateStrList; // now making use of the *new* IStringList
  Sl.LoadFromFile(pasFile);
  Sm:= TbcStateMachine.Create;
  try
    RegisterPASStates(Sm,StateNames); { convenience procedure in comment_states }
    Sm.OnStateData:= @{%H-}Sd.HandleStateData; { event-handler for getting at the data }
    Sm.ExecuteFile(Sl.List);        { unleash the beast on the poor file :-D }
  finally Sm.Free; end;
  if not raw then writeln('Done!'); // has been made optional
end.

Added also the '-r' option to get /raw/ output.
For others who'd like to try 'IStringList' out, you can get it here:  https://gitlab.com/cdbc-public/ibcstringlist.
Regards Benny

cdbc

Re: Easy way to strip comments?
« Reply #14 on: May 31, 2024, 05:10:06 pm »
Hi
If anyone is interested, I found a bug in 'comment_states.pas' that /eats/ GUIDs as if they were comments...
Attached is a bug-fixed unit.
Regards Benny

 
