Author Topic: Search in files  (Read 2302 times)

d7_2_laz

  • Hero Member
  • *****
  • Posts: 510
Search in files
« on: June 07, 2023, 04:19:29 pm »
First, a twofold sorry: I'm not sure if this really belongs here, and for text search I have usually used a separate tool.

What's the correct answer to the question of how to exclude binaries from searches?

What I mean is: I'm interested in speed, and I assume others are not really interested in searching within exe's, dll's, obj's, bmp etc. either, e.g.:  exe,dll,dcu,zip,rar,ttf,chm,pdf,xls,png,ico,jpg,gif,bmp,svg,tga,pcx,mp3,wav,db3,dbf,mdb,o,a,obj,ppu,res,ppm,bin,mdx,dat,mo,mod,blue,green,red,tlb,cur,dcr,lrs,xpm,msg,odg,odt,ods,RUS,uni,odp,mbf,fbk,icns,ans,fpcmake,raw,dia,bgf,s3m,grit,it,pew
It's a matter of speed (search, with some tool, within a whole Lazarus or Delphi (or other) root dir, including vs. excluding such file extensions, and you'll know what I mean).

Imo the edit field used to specify a positive list, or a specific file search pattern (e.g. project*.pas), should not be burdened with this very basic task.

What's the recommended way to do that?

My vote would be for a negative list: set it once, benefit from the speed gain and forget about it.

Lazarus 3.2  FPC 3.2.2 Win10 64bit

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 9793
  • Debugger - SynEdit - and more
    • wiki
Re: Search in files
« Reply #1 on: June 07, 2023, 04:47:40 pm »
Quote
First, a twofold sorry: I'm not sure if this really belongs here, and for text search I have usually used a separate tool.
Thread was split off from https://forum.lazarus.freepascal.org/index.php/topic,63599.0.html



Please have a closer look at the "Find in files" dialog.

If you select "find in directories" you can specify a list of file extensions, so you only search *.pas, *.pp or *.inc files. Or *.txt. Or ....

Then, once you have run the search, all the results are listed in a new window. The thread linked above is about how to filter the results in that window.
« Last Edit: June 07, 2023, 04:50:56 pm by Martin_fr »

d7_2_laz

Re: Search in files
« Reply #2 on: June 07, 2023, 05:42:21 pm »
Martin, thanks for the correction regarding which thread this belongs to, understood.

About the positive list of file extensions: that's not the point.
When searching across different programming environments, maybe CPP, Basic, html/css et al. too, it would be nice
not to have to care about the file extensions each time (use case: search everything within a specified folder), but still to exclude binaries.

Wouldn't it be nice to say: search in all, but not in binaries, instead of:
*.pas;*.dpr;*.lpr;*.dpk;*.inc;*.dof;*.dfm;*.lfm;*.cfg;*.pp;*.xct;*.cpp;*.c;*.h;*.hpp;*.cxx;*.hxx;
*.js;*.xml;*.xsl;*.dtd;*.hta;*.xul;*.rdf;*.html;*.htm;*.shtml;*.xhtml;*.htt;*.css;*.liquid;*.php;*.php3;*.phtml;
*.json;*.java;*.bat;*.cmt;*.ini;*.vbp;*.reg;*.pl;*.pm;*.cgi;*.bas;*.frm;*.vbs;*.sql;*.txt;*.log;*.upl;*.vbw;*.diz

- A positive list of the candidates to be taken in scope might vary multiple times per day,
  as it might contain very specific search patterns
- A negative list (what to exclude in any search) could be set up once and for all (until something new comes up)

If it's not of interest, please simply forget it.
It was only a small idea, as I was just thinking about how to get rid of switching to a separate tool for that each time.


Martin_fr

Re: Search in files
« Reply #3 on: June 07, 2023, 06:05:23 pm »
I could see it on Windows, though I would want to hear what others think.

!*.exe
Or some prefix to negate the filter.

How would it work on Linux? Exclude all files that do not have an extension?

Also, lists of file extensions are common, as in any file open dialog.

Personally I don't know how often I would benefit. My restrictions are usually much more than just "no exe".
In most cases I also do not search in
   lpi, lps, lpk, sh, bat, xml, ini 
And then some package may introduce a new extension that I need to add.
So my exclusion list would often be longer than my inclusion list.
Though then I have (in any editor) different inclusion lists depending on the language of the project/file in which I search. (But if somehow an html file ends up in my Lazarus dir, I do *not* want it to be searched.)
But again: Let's see what others think.


A similar case for me, however, would be the directory to search in. I want subfolders to be searched, except if they match certain names.

So, in general I am open to the idea of extending such filters.

wp

  • Hero Member
  • *****
  • Posts: 11857
Re: Search in files
« Reply #4 on: June 07, 2023, 06:07:23 pm »
Quote
- A positive list of the candidates to be taken in scope might vary multiple times per day,
  as they might contain very specific search patterns
Just tested it: When I edit the filter list the new item is added to the combobox. So, when you need a previously used combination of extensions, just drop down the list in the combo and pick the one needed. I don't know, though, how long the list is allowed to become. And to avoid too much scrolling it is always a good idea to increase the drop-down count in "Tools" > "Options" > "General".

d7_2_laz

Re: Search in files
« Reply #5 on: June 07, 2023, 08:25:29 pm »
Yes, file extensions …  that's probably somewhat Windows-centric. So I wouldn't mind if the verdict is: not of much interest from a cross-platform perspective.
And it's a matter of taste and personal needs.  But at least I wanted to share the idea.

!*.exe  - exclude candidates in an inclusion list:
   Notepad++ actually behaves that way  (!*.exe !*.dll !*.dcu !*.zip   ...).
Possible, but it wouldn't be my favorite, as I think it would be better to keep the two distinct:
that way we could leave the input field empty (= all files) or say *.pas or xy*.pp, while with a separate exclusion list you can rely on those files never being touched by a search.
And if you want to go back to "all files", you don't need to set up your list of negative candidates (!*.exe !*.dll) again.
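
To make the two variants concrete, here is a minimal sketch of how a combined filter with a "!" prefix could be split into an inclusion part and an exclusion part. It is not how the Find in Files dialog works today. SplitFilter and ShouldSearch are hypothetical names, the ";" separator is an assumption taken from the "!*.exe;!*.dll" notation, and TMaskList comes from the LazUtils Masks unit (which has to be on the unit path when compiling outside the IDE).

```pascal
program FilterSplitDemo;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils, Masks;  // Masks is shipped with Lazarus (LazUtils)

// Hypothetical helper: splits 'a;!b;c' into IncludeMasks = 'a;c'
// and ExcludeMasks = 'b'.
procedure SplitFilter(const Filter: string; out IncludeMasks, ExcludeMasks: string);
var
  Parts: TStringList;
  Item: string;
  i: Integer;
begin
  IncludeMasks := '';
  ExcludeMasks := '';
  Parts := TStringList.Create;
  try
    Parts.StrictDelimiter := True;   // split on ';' only, not on spaces
    Parts.Delimiter := ';';
    Parts.DelimitedText := Filter;
    for i := 0 to Parts.Count - 1 do
    begin
      Item := Trim(Parts[i]);
      if Item = '' then
        Continue;
      if Item[1] = '!' then
      begin
        // Negated entry: drop the '!' and collect it on the exclusion side.
        Item := Trim(Copy(Item, 2, MaxInt));
        if Item = '' then
          Continue;
        if ExcludeMasks <> '' then
          ExcludeMasks := ExcludeMasks + ';';
        ExcludeMasks := ExcludeMasks + Item;
      end
      else
      begin
        if IncludeMasks <> '' then
          IncludeMasks := IncludeMasks + ';';
        IncludeMasks := IncludeMasks + Item;
      end;
    end;
  finally
    Parts.Free;
  end;
end;

// True if FileName matches at least one mask of a ';'-separated list.
function MatchesAny(const FileName, MaskList: string): Boolean;
var
  ML: TMaskList;
begin
  Result := False;
  if MaskList = '' then
    Exit;
  ML := TMaskList.Create(MaskList);
  try
    Result := ML.Matches(FileName);
  finally
    ML.Free;
  end;
end;

// Hypothetical helper: decides whether FileName is searched under Filter.
function ShouldSearch(const FileName, Filter: string): Boolean;
var
  IncludeMasks, ExcludeMasks: string;
begin
  SplitFilter(Filter, IncludeMasks, ExcludeMasks);
  if IncludeMasks = '' then
    IncludeMasks := '*';   // an empty inclusion part means "all files"
  Result := MatchesAny(FileName, IncludeMasks)
            and not MatchesAny(FileName, ExcludeMasks);
end;

begin
  WriteLn('unit1.pas  ', ShouldSearch('unit1.pas', '*.pas;!*.exe'));
  WriteLn('setup.exe  ', ShouldSearch('setup.exe', '!*.exe;!*.dll'));
  WriteLn('readme.txt ', ShouldSearch('readme.txt', '!*.exe;!*.dll'));
end.
```

With a separate exclusion field, the same ShouldSearch logic would apply; only SplitFilter would go away, because the two lists would already arrive as two strings.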
 
ComboBox version with remembered old entries:
Yes, of course that is possible. But in any case you need to make sure that you identify the correct (and only partially visible) old item that covers the extensions to include.
Why not, alternatively, let me define a single exclusion list only once and forget about it?
 
ComboBox version with different inclusion lists depending on the language:
Possible of course. But why should I want to switch between combobox items each time I go from Lazarus root to Delphi root or CPP-example-project root and simply intend to do a quick fulltext search (excluding binaries)?

Recursive search or not: that should of course be an option.

ASerge

  • Hero Member
  • *****
  • Posts: 2223
Re: Search in files
« Reply #6 on: June 07, 2023, 08:41:18 pm »
Quote
What's the correct answer to the question of how to exclude binaries from searches?
Code: Pascal
uses Masks;  // TMaskList (LazUtils)

const
  // Positive list: only files matching one of these masks are reported.
  CValidExtensions = '*.pas;*.dpr;*.lpr;*.dpk;*.inc;*.dof;*.dfm;*.lfm;*.cfg;*.pp;*.xct;' +
    '*.cpp;*.c;*.h;*.hpp;*.cxx;*.hxx;*.js;*.xml;*.xsl;*.dtd;*.hta;*.xul;*.rdf;*.html;*.htm;' +
    '*.shtml;*.xhtml;*.htt;*.css;*.liquid;*.php;*.php3;*.phtml;*.json;*.java;*.bat;*.cmt;*.ini;' +
    '*.vbp;*.reg;*.pl;*.pm;*.cgi;*.bas;*.frm;*.vbs;*.sql;*.txt;*.log;*.upl;*.vbw;*.diz';

procedure TForm1.Button1Click(Sender: TObject);
var
  R: TSearchRec;
  ML: TMaskList;
begin
  ML := TMaskList.Create(CValidExtensions);
  try
    // Lists the top level of C:\ only; sub-directories are not entered.
    if FindFirst('C:\*', faAnyFile, R) = 0 then
    try
      repeat
        // Skip directories, keep files whose name matches the positive list.
        if ((R.Attr and faDirectory) = 0) and ML.MatchesWindowsMask(R.Name) then
          Memo1.Append(R.Name);
      until FindNext(R) <> 0;
    finally
      FindClose(R);
    end;
  finally
    ML.Free;
  end;
end;

Thaddy

  • Hero Member
  • *****
  • Posts: 14205
  • Probably until I exterminate Putin.
Re: Search in files
« Reply #7 on: June 07, 2023, 08:51:27 pm »
*.cgi is wrong. Compiled with, among others, FreePascal or C or ..., they ARE binaries.
Specialize a type, not a var.

d7_2_laz

Re: Search in files
« Reply #8 on: June 07, 2023, 09:21:14 pm »
Thanks ASerge and Thaddy for your attention!
That is why it's good that the user of the Find In Files dialog can edit / provide such a list.

But my post this time was less about the technical how-to and more about whether it basically makes sense
for this dialog to take care of skipping binaries in searches.  Also, the contents of such a list might vary; see what Martin said:
"In most cases I also do not search in  lpi, lps, lpk, sh, bat, xml, ini"
In that case those could be part of an exclusion list   .... at least imo; different opinions will exist for sure.

d7_2_laz

Re: Search in files
« Reply #9 on: June 07, 2023, 11:02:31 pm »
Let me try to explain why I may appear a bit insistent about this approach.
One of my very old private Delphi projects that I ported to Lazarus (a convinced fan since then) was a kind of Wingrep which, when you click on a result line, presents the matches in a view. My interest was very basic: input only a search string (no regex) and let me navigate to it in the files found. The main aim was speed.

So I used a file extension exclude list (same as in reply #2 above).
When measuring times, the first run must always be ignored because of Windows disk caching effects. For run #2 and later I get, for the Lazarus root directory (1.5 GB), searching for the string "InvalidateSelected":
Run time, 2294 directories read:    1.92 seconds.
I think that's a number one can work with, and since then I've been interested in this criterion and at least try to point it out.
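
A rough way to reproduce this kind of measurement is sketched below. It is not the tool described above. It only walks a directory tree, counts how many files an extension exclude list would skip, and reports the elapsed time. The shortened list in CExcludedExts and all identifier names are assumptions for illustration. Symbolic links are not treated specially, so a link cycle on Linux would need extra handling.

```pascal
program TimedScanDemo;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

const
  // Illustrative subset of an extension exclude list; extend as needed.
  CExcludedExts = 'exe,dll,dcu,zip,rar,png,ico,jpg,bmp,o,a,obj,ppu,res';

var
  Excluded: TStringList;
  DirCount: Integer = 0;
  KeptCount: Integer = 0;
  SkippedCount: Integer = 0;

// True if the extension of FileName is on the exclude list.
function IsExcludedExt(const FileName: string): Boolean;
var
  Ext: string;
begin
  // ExtractFileExt returns e.g. '.exe'; drop the leading dot for the lookup.
  Ext := LowerCase(Copy(ExtractFileExt(FileName), 2, MaxInt));
  Result := (Ext <> '') and (Excluded.IndexOf(Ext) >= 0);
end;

// Recursively walks Dir and counts kept and skipped files.
procedure ScanDir(const Dir: string);
var
  R: TSearchRec;
  Base: string;
begin
  Inc(DirCount);
  Base := IncludeTrailingPathDelimiter(Dir);
  if FindFirst(Base + AllFilesMask, faAnyFile, R) = 0 then
  try
    repeat
      if (R.Name = '.') or (R.Name = '..') then
        Continue;
      if (R.Attr and faDirectory) <> 0 then
        ScanDir(Base + R.Name)
      else if IsExcludedExt(R.Name) then
        Inc(SkippedCount)
      else
        Inc(KeptCount);   // a real search would open and scan the file here
    until FindNext(R) <> 0;
  finally
    FindClose(R);
  end;
end;

var
  Root: string;
  StartTick: QWord;
begin
  if ParamCount >= 1 then
    Root := ParamStr(1)
  else
    Root := GetCurrentDir;
  Excluded := TStringList.Create;
  try
    Excluded.CommaText := CExcludedExts;
    Excluded.Sorted := True;           // sorted list, so IndexOf is a binary search
    StartTick := GetTickCount64;
    ScanDir(Root);
    WriteLn(DirCount, ' directories, ', KeptCount, ' files kept, ',
      SkippedCount, ' skipped, ', GetTickCount64 - StartTick, ' ms');
  finally
    Excluded.Free;
  end;
end.
```

As noted above, the first run mostly measures disk caching, so only the second and later runs are comparable.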

Martin_fr

Re: Search in files
« Reply #10 on: June 07, 2023, 11:18:57 pm »
Quote
But why should I want to switch between combobox items each time I go from Lazarus root to Delphi root or CPP-example-project root and simply intend to do a quick fulltext search

Well, because when *I* go between, say, a web project (html) and Pascal, then when in Pascal I do not want to search html or xml files (and yet they can exist in the project directory), because usually I only want to search the sources (and maybe, in exceptional cases, lfm files).

Anyway, everyone has their preferences.


As far as 2 dropdowns go: there have been various voices about the dialog already being very crowded, and I am doubtful that there will be a consensus that a 2nd dropdown is important enough (especially since the inverted filter can be done in the existing dropdown, and it's only semi-useful on Linux). And really, the "!*.exe;!*.dll" could easily be appended to each entry. That is a one-off task.
But let's see....

d7_2_laz

Re: Search in files
« Reply #11 on: June 08, 2023, 12:30:13 am »
Quote
then when in Pascal I do not want to search html or xml files (and yet they can exist in the project directory)
Yep - right!  So, I'd simply not click on these items in the result list ...  or, if I don't even want to see them, I'd additionally activate the "Filter" combobox (dropdown type 2; -> screenshot).
Honestly, I myself am not very interested in it and almost never use it; it's not important enough, just as you say.  I wouldn't propose it. And right, it could be somehow mimicked by dropdown type 1 (remembered items).

What is important (and that's the only thing I wanted to say here) is an easily configurable way to exclude, from any search in any environment, items that I/one/some developers (how to put it?) are generally not interested in for text searches: exe, obj, dll, bitmaps, png, jpg, font or db files and many more.

Many thanks for the discussion so far, I think it was good to focus this topic a bit.
I see (and agree) that it wouldn't be a good idea to lean too heavily on a Windows perspective. Thank you for your attention!
« Last Edit: June 08, 2023, 12:38:07 am by d7_2_laz »

Martin_fr

Re: Search in files
« Reply #12 on: June 08, 2023, 12:56:52 am »
I would suggest filing a feature request.

I think that the idea of exclusion is a good one (as is the same for excluding subfolders by match "C:\laz;!backup" ).
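
As a sketch of what the subfolder variant might look like, the code below parses a directory entry in the "C:\laz;!backup" notation and walks the tree without descending into excluded folders. The notation is only a proposal in this thread, not an existing IDE feature. ParseDirSpec and ListDirs are hypothetical names, and folder names are compared literally (ignoring case) rather than as masks.

```pascal
program DirExcludeDemo;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

// Hypothetical syntax 'root;!name1;!name2': the first entry without '!' is
// taken as the root directory, every '!' entry as a folder name to skip.
procedure ParseDirSpec(const Spec: string; out Root: string; Skip: TStrings);
var
  Parts: TStringList;
  Item: string;
  i: Integer;
begin
  Root := '';
  Skip.Clear;
  Parts := TStringList.Create;
  try
    Parts.StrictDelimiter := True;   // split on ';' only
    Parts.Delimiter := ';';
    Parts.DelimitedText := Spec;
    for i := 0 to Parts.Count - 1 do
    begin
      Item := Trim(Parts[i]);
      if Item = '' then
        Continue;
      if Item[1] = '!' then
        Skip.Add(Trim(Copy(Item, 2, MaxInt)))
      else if Root = '' then
        Root := Item;
    end;
  finally
    Parts.Free;
  end;
end;

// Collects all sub-directories of Dir in Found, leaving out (and not
// descending into) folders whose name is listed in Skip.
procedure ListDirs(const Dir: string; Skip, Found: TStrings);
var
  R: TSearchRec;
  Base: string;
begin
  Base := IncludeTrailingPathDelimiter(Dir);
  if FindFirst(Base + AllFilesMask, faDirectory, R) = 0 then
  try
    repeat
      if ((R.Attr and faDirectory) = 0) or (R.Name = '.') or (R.Name = '..') then
        Continue;
      if Skip.IndexOf(R.Name) >= 0 then
        Continue;                    // excluded folder
      Found.Add(Base + R.Name);
      ListDirs(Base + R.Name, Skip, Found);
    until FindNext(R) <> 0;
  finally
    FindClose(R);
  end;
end;

var
  Root: string;
  Skip, Found: TStringList;
begin
  Skip := TStringList.Create;
  Found := TStringList.Create;
  try
    ParseDirSpec('C:\laz;!backup', Root, Skip);
    ListDirs(Root, Skip, Found);
    Write(Found.Text);
  finally
    Found.Free;
    Skip.Free;
  end;
end.
```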

As for one or two dropdowns, well, when you request the feature you can say you prefer 2. Then let's see if anyone from the team comments.

And if it is one checkbox with  !*.foo => still an improvement,  isn't it?

dbannon

  • Hero Member
  • *****
  • Posts: 2786
    • tomboy-ng, a rewrite of the classic Tomboy
Re: Search in files
« Reply #13 on: June 08, 2023, 01:24:47 am »
I'd think that this action would be better performed on unix systems with the stat command. It can tell, with reasonable accuracy, just what a file is. The 'file' command, for example, does this: "file tests each argument in an attempt to classify it.  There are three sets of tests, performed in this order: filesystem tests, magic tests, and language tests."

(It's good there is still a place for magic in this world ... )
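
For comparison, a much cruder content check than what 'file' does can be sketched in a few lines. The code below does not replicate the filesystem, magic or language tests. It only assumes that a NUL byte within the first 4 KB indicates a binary file, which is a common heuristic but misclassifies, for example, UTF-16 text. LooksBinary and CProbeSize are hypothetical names.

```pascal
program BinaryGuessDemo;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

// Hypothetical helper: guesses "binary" if a NUL byte occurs in the
// first CProbeSize bytes of the file. Empty files count as text.
function LooksBinary(const FileName: string): Boolean;
const
  CProbeSize = 4096;
var
  FS: TFileStream;
  Buf: array[0..CProbeSize - 1] of Byte;
  BytesRead, i: Integer;
begin
  Result := False;
  FS := TFileStream.Create(FileName, fmOpenRead or fmShareDenyNone);
  try
    BytesRead := FS.Read(Buf, SizeOf(Buf));
    for i := 0 to BytesRead - 1 do
      if Buf[i] = 0 then
        Exit(True);      // the finally block still frees the stream
  finally
    FS.Free;
  end;
end;

var
  i: Integer;
begin
  // Usage: BinaryGuessDemo file1 [file2 ...]
  for i := 1 to ParamCount do
    if LooksBinary(ParamStr(i)) then
      WriteLn(ParamStr(i), ': looks binary')
    else
      WriteLn(ParamStr(i), ': looks like text');
end.
```

Such a check works without file extensions, so it would also cover the Linux case, at the cost of opening every file once.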

David

Lazarus 3, Linux (and reluctantly Win10/11, OSX Monterey)
My Project - https://github.com/tomboy-notes/tomboy-ng and my github - https://github.com/davidbannon

d7_2_laz

Re: Search in files
« Reply #14 on: June 08, 2023, 11:08:14 am »
Quote
as is the same for excluding subfolders by match "C:\laz;!backup"

Yes, absolutely. I myself use an entry field for this purpose (screenshot 1),
but I can live with any possibility that allows ignoring certain file types: e.g. in Notepad++ I "misuse" the regular file filter for that (see screenshot 2).
The only reason to keep the blacklist distinct from the whitelist is surely a matter of taste (I feel it's cleaner and does not burden the whitelist with the task of picking up candidates from a "set once" blacklist, thus keeping it free for changes that might happen more frequently).

Quote
>  checkbox with  !*.foo => still an improvement
Absolutely! I.e. placing a checkbox in front would allow activating/deactivating a set-once ignore list on demand.

An additional dropdown for this special purpose? Rather not ...  (why?)

Quote
on unix systems  It can tell, with reasonable accuracy just what a file is. The 'file' command, for example does this
Right. A bit funny: even the author of Notepad++ (Windows only) thought about replicating the logic of the file command in order to reliably detect a binary. For anybody interested, here is the thread:
https://github.com/notepad-plus-plus/notepad-plus-plus/issues/9445#issue-comment-box

I think the idea has been laid out well enough now that one can consider the pros and cons.
I myself now somewhat doubt whether it is really of interest for non-Windows OSes.

 
