Author Topic: findfirst/findnext wrong results with Linux  (Read 5039 times)

winni

  • Hero Member
  • *****
  • Posts: 3197
findfirst/findnext wrong results with Linux
« on: October 25, 2021, 07:58:16 pm »
Hi!

I have often noticed that findfirst/findnext is buggy on Linux.
Now I have checked that systematically.

Two big errors appear:

1) Symbolic links are never reported correctly. The attribute is shown as 32/$20, which is the value of faArchive. It should be $0400.

2) The attribute for directories is always 48/$30, which is faDirectory or faArchive. Why is the faArchive flag always set? This is nonsense.
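
For reference, the two values above can be decoded bit by bit. This is only a minimal sketch, assuming the fa* constants exported by FPC's SysUtils unit; AttrToText is a hypothetical helper name, not part of the RTL.

```pascal
program DecodeAttr;
{$mode objfpc}{$H+}
uses
  SysUtils;

// Hypothetical helper (not part of the RTL): spell out the flags set in Attr.
function AttrToText(Attr: Longint): string;
var
  S: string;

  // Append FlagName to S when the corresponding bit is set in Attr.
  procedure Add(Flag: Longint; const FlagName: string);
  begin
    if (Attr and Flag) <> 0 then
    begin
      if S <> '' then
        S := S + ' or ';
      S := S + FlagName;
    end;
  end;

begin
  S := '';
  Add(faReadOnly, 'faReadOnly');
  Add(faHidden, 'faHidden');
  Add(faSysFile, 'faSysFile');
  Add(faDirectory, 'faDirectory');
  Add(faArchive, 'faArchive');
  Add(faSymLink, 'faSymLink');
  Result := S;
end;

begin
  WriteLn('$20 = ', AttrToText($20));   // prints: $20 = faArchive
  WriteLn('$30 = ', AttrToText($30));   // prints: $30 = faDirectory or faArchive
  WriteLn('$420 = ', AttrToText($420)); // prints: $420 = faArchive or faSymLink
end.
```

Some of these constants are marked platform-specific in FPC, so the compiler may emit hints; the values themselves are the ones quoted in this post.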

Winni

MarkMLl

  • Hero Member
  • *****
  • Posts: 6676
Re: findfirst/findnext wrong results with Linux
« Reply #1 on: October 25, 2021, 08:52:37 pm »
Hmm. To what extent is anything other than the name considered public by the library definition?

I've just taken a look and the code that I use to walk /sys to find e.g. serial device parameters does this:

Code: Pascal
  if FindFirst(dir + '*'{%H-}, faDirectory, searchRec) = 0 then begin
    sorter := TStringList.Create;
    sorter.Sorted := true;
    try

(* It is not safe to assume that the kernel returns names in a natural or       *)
(* reproducible order, particularly when looking at e.g. /sys, so handle it in  *)
(* two stages.                                                                  *)

      repeat
        if searchRec.Name[1] = '.' then (* Always ignore . and .. completely    *)
          continue;

(* For everything in the directory, make sure it's a subdirectory and not a     *)
(* symlink. Don't try to use searchRec.Attr for this.                           *)

        if FileGetAttr({%H-}dir + searchRec.Name) and (faDirectory + faSymLink{%H-}) = faDirectory then
          sorter.Add(searchRec.Name{%H-})
      until FindNext(searchRec) <> 0;

...and so on, heavily recursively. So faDirectory is definitely good as a parameter, and for some reason I've specifically noted that searchRec.Attr is suspect... I wonder how I arrived at that conclusion?
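
A self-contained version of the first stage of that fragment might look like the sketch below. It assumes, as the fragment itself does, that FileGetAttr reports faSymLink for links on this platform; SortedSubdirs and the program name are hypothetical, and the recursive second stage is left out.

```pascal
program ListSubdirs;
{$mode objfpc}{$H+}
uses
  SysUtils, Classes;

// Hypothetical helper: sorted names of the real subdirectories of Dir.
// Dir must end with a path delimiter. The caller owns the returned list.
function SortedSubdirs(const Dir: string): TStringList;
var
  SearchRec: TSearchRec;
begin
  Result := TStringList.Create;
  Result.Sorted := True;
  if FindFirst(Dir + '*', faDirectory, SearchRec) = 0 then
    try
      repeat
        // Skip '.', '..' and every other dot-name, as the fragment does.
        if (SearchRec.Name <> '') and (SearchRec.Name[1] = '.') then
          Continue;
        // Ask FileGetAttr rather than trusting SearchRec.Attr:
        // keep entries that are directories and are not symlinks.
        if (FileGetAttr(Dir + SearchRec.Name) and (faDirectory or faSymLink)) = faDirectory then
          Result.Add(SearchRec.Name);
      until FindNext(SearchRec) <> 0;
    finally
      FindClose(SearchRec);
    end;
end;

var
  Names: TStringList;
  Dir: string;
  I: Integer;
begin
  // Directory to scan: first command-line argument, else the current directory.
  if ParamCount >= 1 then
    Dir := IncludeTrailingPathDelimiter(ParamStr(1))
  else
    Dir := IncludeTrailingPathDelimiter(GetCurrentDir);
  Names := SortedSubdirs(Dir);
  try
    for I := 0 to Names.Count - 1 do
      WriteLn(Names[I]);
  finally
    Names.Free;
  end;
end.
```

The extra FileGetAttr call per entry is exactly the overhead discussed later in the thread; the sketch trades speed for not depending on searchRec.Attr.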

I would however make two observations. The first is that I've previously kvetched about the change in the behaviour of FileExists(), where directories were comparatively recently removed. The second is that some version checking I was doing earlier today reminded me that the definitions of some fields in the data structure returned by FpStat() and FpFStat() changed around FPC 3.2.

Are you absolutely certain that the behaviour you're seeing isn't of recent onset (or for that matter hasn't been fixed recently)?

MarkMLl
MT+86 & Turbo Pascal v1 on CCP/M-86, multitasking with LAN & graphics in 128Kb.
Pet hate: people who boast about the size and sophistication of their computer.
GitHub repositories: https://github.com/MarkMLl?tab=repositories

winni

Re: findfirst/findnext wrong results with Linux
« Reply #2 on: October 25, 2021, 09:18:08 pm »
Hi!

My fpc version is 3.2.2

I did a run over 4 Linux ext4 partitions with > 1,000,000 files (overnight job).

* Not even one symlink was reported
* All directories have a minimum of 48/$30, sometimes more, including hidden, sys or readonly

Yes, I know how to work around it. But that slows things down heavily when you are searching for something specific on a big partition.

And don't forget to ignore '..'!!!
Otherwise the job never finishes. Or your stringlist will explode ......

Winni

MarkMLl

Re: findfirst/findnext wrong results with Linux
« Reply #3 on: October 25, 2021, 09:35:07 pm »
Quote
Yes, I know how to work around it. But that slows things down heavily when you are searching for something specific on a big partition.

Agreed. Or in the case of the code I posted, an exhaustive walk of the relevant bits of /sys is already uncomfortably slow.

Quote
And don't forget to ignore '..'!!!
Otherwise the job never finishes. Or your stringlist will explode ......

I wasn't. /And/ I commented what I was doing :-)

The documentation says that "Not all fields of this record should be used"... and continues by mentioning a field which doesn't even appear in the declaration as given.

MarkMLl

AlexTP

  • Hero Member
  • *****
  • Posts: 2386
    • UVviewsoft
Re: findfirst/findnext wrong results with Linux
« Reply #4 on: October 25, 2021, 09:44:50 pm »
I tried to find the place in the FPC source that does this, but I ended up at
Code: Pascal
Function InternalFindNext (var Rslt : TAbstractSearchRec; var Name : RawByteString) : Longint;
...
Begin
      UnixFindData^.DirPtr := fpopendir(Pchar(DirName));
....
     p:=fpreaddir(pdir(UnixFindData^.dirptr)^);
which uses

Code: Pascal
    Function  FpReaddir    (var dirp : Dir) : pDirent; external name 'FPC_SYSC_READDIR';

for which I could not find the source code.

Kays

  • Hero Member
  • *****
  • Posts: 569
  • Whasup!?
    • KaiBurghardt.de
Re: findFirst/findNext wrong results with Linux
« Reply #5 on: October 25, 2021, 09:45:12 pm »
Quote
2) The attribute for directories is always 48/$30, which is faDirectory or faArchive. Why is the faArchive flag always set? This is nonsense.
The documentation of findFirst says:
Quote
  • faArchive
    • file needs to be archived. Not possible on Unix
An strace(1) reveals that findFirst/findNext do a plain stat(2). This system call conveys only limited information (particularly in the st_mode field).

In order to determine whether a file should be archived, you would [at least] need to
  • examine the mount, the 5th field in fstab(5), and
  • the file attributes (absence/presence of the d flag).
For the latter you would need to do an additional ioctl(…, FS_IOC_GETFLAGS, …), confer what lsattr(1) does.

Evidently this is quite a hassle. I guess simply marking all files with faArchive is an implementation choice.
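
The plain stat(2) mentioned above also bears on the symbolic-link question, because stat follows a link and describes its target, whereas lstat(2) describes the link itself. A minimal sketch of the difference using FPC's BaseUnix unit; the two helper functions and the program name are hypothetical, and the path to inspect is taken from the command line.

```pascal
program StatVsLStat;
{$mode objfpc}{$H+}
uses
  BaseUnix;

// Hypothetical helper: True if lstat(2) says Path itself is a symbolic link.
function IsLinkPerLStat(const Path: string): Boolean;
var
  Info: TStat;
begin
  Result := (fpLStat(Path, Info) = 0) and fpS_ISLNK(Info.st_mode);
end;

// Hypothetical helper: the same test after stat(2). Because stat follows
// links, st_mode describes the target, so this is not expected to be True.
function IsLinkPerStat(const Path: string): Boolean;
var
  Info: TStat;
begin
  Result := (fpStat(Path, Info) = 0) and fpS_ISLNK(Info.st_mode);
end;

begin
  if ParamCount < 1 then
  begin
    WriteLn('usage: StatVsLStat <path>');
    Halt(1);
  end;
  WriteLn('lstat reports a symlink: ', IsLinkPerLStat(ParamStr(1)));
  WriteLn('stat reports a symlink:  ', IsLinkPerStat(ParamStr(1)));
end.
```

Run against a known symlink, the two lines should differ; that would be consistent with a stat-based search never setting the link attribute.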


Quote
1) Symbolic links are never reported correctly. The attribute is shown as 32/$20, which is the value of faArchive. It should be $0400.
Again, the documentation of findFirst reads:
Quote
  • faAnyFile
    • Find any file (this is a combination of the other flags).
The phrase “the other flags” only refers to those six listed flags. faSymLink is not one of them; compare their numeric values.

If you do a
Code: Pascal
findFirst('*', faAnyFile or faSymLink, info)
(note the or faSymLink) you will find all information, including symbolic links being reported properly.
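
To check that claim on a given system, the two filters can be compared side by side. The following is a sketch only: CountFlaggedLinks is a hypothetical helper, and /usr/bin is just an assumed default directory that usually contains symlinks.

```pascal
program CountSymlinks;
{$mode objfpc}{$H+}
uses
  SysUtils;

// Hypothetical helper: number of entries in Dir that come back with
// faSymLink set in Attr when FindFirst is called with the given Filter.
function CountFlaggedLinks(const Dir: string; Filter: Longint): Integer;
var
  Info: TSearchRec;
begin
  Result := 0;
  if FindFirst(IncludeTrailingPathDelimiter(Dir) + '*', Filter, Info) = 0 then
    try
      repeat
        if (Info.Attr and faSymLink) <> 0 then
          Inc(Result);
      until FindNext(Info) <> 0;
    finally
      FindClose(Info);
    end;
end;

var
  Dir: string;
begin
  // Assumed default; pass another directory as the first argument.
  if ParamCount >= 1 then
    Dir := ParamStr(1)
  else
    Dir := '/usr/bin';
  WriteLn('faAnyFile only:         ', CountFlaggedLinks(Dir, faAnyFile));
  WriteLn('faAnyFile or faSymLink: ', CountFlaggedLinks(Dir, faAnyFile or faSymLink));
end.
```

If the behaviour described in this thread holds, the first count stays at zero and the second matches the number of links in the directory.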
Yours Sincerely
Kai Burghardt

MarkMLl

  • Hero Member
  • *****
  • Posts: 6676
Re: findFirst/findNext wrong results with Linux
« Reply #6 on: October 25, 2021, 10:13:01 pm »
Quote
An strace(1) reveals that findFirst/findNext do a plain stat(2).

...which is notoriously platform-specific.

Quote
confer what lsattr(1) does.

Although in fairness, lsattr is specific to ext2 and its successors, so is of limited relevance to a problem which (presumably) affects Linux in general.

The bottom line appears to be that at some point over the last few versions the search record has changed from being a data structure in memory with .Attr declared as a byte to a badly-documented advanced record with .Attr defined as a longint. (Numeric) constants such as faDirectory are documented as being used in TSearchRec but not explicitly as being used in TRawbyteSearchRec... I suspect that the documented relationship between the two types is incorrect.

MarkMLl

winni

Re: findFirst/findNext wrong results with Linux
« Reply #7 on: October 25, 2021, 10:14:20 pm »

Quote
If you do a
Code: Pascal
findFirst('*', faAnyFile or faSymLink, info)
(note the or faSymLink) you will find all information, including symbolic links being reported properly.

Hi!

Just tested with /usr/bin.

The contents of the whole directory are returned, with all files, directories and links.
So something there is totally broken.

But that is not the way we have treated findfirst/findnext since Delphi days:

Read everything in and then decide with the attribute and/or the filename what to do.
With your suggestion you have to read all directories two, three or four times. Too slow.

Winni

MarkMLl

Re: findFirst/findNext wrong results with Linux
« Reply #8 on: October 25, 2021, 10:24:02 pm »
Quote
Read everything in and then decide with the attribute and/or the filename what to do.
With your suggestion you have to read all directories two, three or four times. Too slow.

In fairness, there are two distinct things here: the filter applied to FindFirst() etc., and the bitset used for returning actual attributes.

I suspect that the faXXX constants are good for the first case on all platforms, but not reliably for the second since the .Attr field is undocumented and apparently unreliable except possibly in the case of the original DOS API.

MarkMLl

Kays

  • Hero Member
  • *****
  • Posts: 569
  • Whasup!?
    • KaiBurghardt.de
Re: findFirst/findNext wrong results with Linux
« Reply #9 on: October 25, 2021, 10:59:46 pm »
Quote
An strace(1) reveals that findFirst/findNext do a plain stat(2).
...which is notoriously platform-specific.
The question was about Linux, so I’m not sure what your point is here.

Quote
Although in fairness, lsattr is specific to ext2 and its successors, so is of limited relevance to a problem which (presumably) affects Linux in general.
Uhm, it’s an implementation. You call it chattr on Linux, you call it chflags on FreeBSD (ZFS), and I don’t know what the Windoze folks do (NTFS), but they have their implementation of dump/nodump [with file granularity] too.

Quote
[…] The contents of the whole directory are returned, with all files, directories and links.
So something there is totally broken. […]
Quote from findFirst:
Quote
It is a common misconception that Attr specifies a set of attributes which must be matched in order for a file to be included in the list. This is not so: The value of Attr specifies additional attributes, this means that the returned files are either normal files or have an attribute which is present in Attr.
It is correct that you get a complete directory listing. You will need to filter the results to get just what you want.
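
Filtering after the fact does not require reading the directory more than once. A minimal single-pass sketch, assuming that searching with faAnyFile or faSymLink returns usable attribute bits as described earlier in the thread; TEntryKind and KindOf are hypothetical names.

```pascal
program ClassifyOnePass;
{$mode objfpc}{$H+}
uses
  SysUtils;

type
  TEntryKind = (ekFile, ekDirectory, ekSymLink);

// Hypothetical helper: classify one entry from its attribute bits.
// The link bit is tested first because a link that points to a
// directory may carry faDirectory as well.
function KindOf(Attr: Longint): TEntryKind;
begin
  if (Attr and faSymLink) <> 0 then
    Result := ekSymLink
  else if (Attr and faDirectory) <> 0 then
    Result := ekDirectory
  else
    Result := ekFile;
end;

var
  Info: TSearchRec;
  Counts: array[TEntryKind] of Integer = (0, 0, 0);
  Dir: string;
begin
  // Directory to scan: first command-line argument, else the current directory.
  if ParamCount >= 1 then
    Dir := IncludeTrailingPathDelimiter(ParamStr(1))
  else
    Dir := IncludeTrailingPathDelimiter(GetCurrentDir);

  // One pass over the directory; each entry is sorted into a bucket.
  if FindFirst(Dir + '*', faAnyFile or faSymLink, Info) = 0 then
    try
      repeat
        if (Info.Name <> '.') and (Info.Name <> '..') then
          Inc(Counts[KindOf(Info.Attr)]);
      until FindNext(Info) <> 0;
    finally
      FindClose(Info);
    end;

  WriteLn('files:       ', Counts[ekFile]);
  WriteLn('directories: ', Counts[ekDirectory]);
  WriteLn('symlinks:    ', Counts[ekSymLink]);
end.
```

The always-set faArchive bit is simply ignored here, so it does not affect the classification.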
Yours Sincerely
Kai Burghardt

MarkMLl

Re: findFirst/findNext wrong results with Linux
« Reply #10 on: October 26, 2021, 09:44:00 am »
Quote
An strace(1) reveals that findFirst/findNext do a plain stat(2).
...which is notoriously platform-specific.
The question was about Linux, so I’m not sure what your point is here.

Yes, and my answer is specifically about Linux. Look at the definitions: the structure returned by stat() on Linux is processor-specific, and all code that does anything with that has to be cautious. The reason is that when Linux started supporting multiple CPUs, somebody decided that it was more important to look like the dominant unix flavour on that particular target... resulting in a mess.

Quote
Although in fairness, lsattr is specific to ext2 and its successors, so is of limited relevance to a problem which (presumably) affects Linux in general.
Uhm, it’s an implementation. You call it chattr on Linux, you call it chflags on FreeBSD (ZFS), and I don’t know what the Windoze folks do (NTFS), but they got their implementation of dump/nodump [with file granularity] too.

Well, that just reinforces my point: what you get back from all of those is outside the scope of the basic unix file-handling API, which is what FPC uses. Correct operation of FindFirst() etc. should comply with the Linux API, and be completely agnostic about the filesystem in use.

MarkMLl

winni

Re: findfirst/findnext wrong results with Linux
« Reply #11 on: October 26, 2021, 09:47:12 am »
Hi!

Okay, we summarize:

* faAnyFile or faSymLink does what faAnyFile alone is supposed to do, and what it does in Delphi or Lazarus on Windows

* faArchive is always set. faNormal is never set.

If six was  nine (Jimi Hendrix)


Winni

 
