Lazarus

Programming => Packages and Libraries => Topic started by: Gustavo 'Gus' Carreno on April 22, 2021, 03:00:47 am

Title: Unit Zipper from package paszlib has an inconsistency
Post by: Gustavo 'Gus' Carreno on April 22, 2021, 03:00:47 am
Hey All,

In the comments of the Unit it self it says that the Archive Name should always use / as Directory Separators.
I guess that's the ZIP standard.

But then, when under Windows you use(To try and be cross-platform compatible):
Code: Pascal  [Select][+][-]
  1.     for index:=1 to 4 do
  2.     begin
  3.       filename:= Format('data%sblock%d.txt', [DirectorySeparator,index]);
  4.       Zip.Entries.AddFileEntry(filename);
  5.     end;
  6.  

You get:
Code: Bash  [Select][+][-]
  1. & less file.zip
  2. Archive:  file.zip
  3.  Length   Method    Size  Cmpr    Date    Time   CRC-32   Name
  4. --------  ------  ------- ---- ---------- ----- --------  ----
  5.       16  Defl:N       16   0% 2021-04-21 21:20 fffc0087  data\block1.txt
  6.       16  Defl:N       16   0% 2021-04-21 21:20 d4d15344  data\block2.txt
  7.       16  Defl:N       16   0% 2021-04-21 21:20 cdca6205  data\block3.txt
  8.       16  Defl:N       16   0% 2021-04-21 21:21 828bf4c2  data\block4.txt
  9. --------          -------  ---                            -------
  10.       64               64   0%                            4 files

That has the wrong Directory Separator!!

But if instead you do:
Code: Pascal  [Select][+][-]
  1.     for index:=1 to 4 do
  2.     begin
  3.       filename:= Format('data%sblock%d.txt', [DirectorySeparator,index]);
  4.       archivename:= Format('data/block%d.txt', [index]);
  5.       Zip.Entries.AddFileEntry(filename, archivename);
  6.     end;
  7.  

You get:
Code: Bash  [Select][+][-]
  1. & less file.zip
  2. Archive:  file.zip
  3.  Length   Method    Size  Cmpr    Date    Time   CRC-32   Name
  4. --------  ------  ------- ---- ---------- ----- --------  ----
  5.       16  Defl:N       16   0% 2021-04-21 21:20 fffc0087  data/block1.txt
  6.       16  Defl:N       16   0% 2021-04-21 21:20 d4d15344  data/block2.txt
  7.       16  Defl:N       16   0% 2021-04-21 21:20 cdca6205  data/block3.txt
  8.       16  Defl:N       16   0% 2021-04-21 21:21 828bf4c2  data/block4.txt
  9. --------          -------  ---                            -------
  10.       64               64   0%                            4 files
This has the correct Directory Separator!!

So then I tried to unzip a BAD under Linux but trying to change the Entries properties, to no avail!!

Where can I post this bug? Is the unit still being maintained?

Cheers,
Gus
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: trev on April 22, 2021, 03:48:10 am
DirectorySeparator is system dependant constant, so on Windows it's \ but on Linux, macOS, FreeBSD etc it's  /.
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: Gustavo 'Gus' Carreno on April 22, 2021, 04:48:25 am
Hey Trev,

DirectorySeparator is system dependant constant, so on Windows it's \ but on Linux, macOS, FreeBSD etc it's  /.

I know you're not being condescendant here, cuz I know that for quite a while :P (around my first install of Slackware circa 1995), but that's not the issue.

You add a file under Windows using Zip.Entries.AddFileEntry(filename)(One param, not 2) and the Zip file is incorrectly formatted:
Code: Bash  [Select][+][-]
  1. & less file.zip
  2. Archive:  file.zip
  3.  Length   Method    Size  Cmpr    Date    Time   CRC-32   Name
  4. --------  ------  ------- ---- ---------- ----- --------  ----
  5.       16  Defl:N       16   0% 2021-04-21 21:20 fffc0087  data\block1.txt
  6.       16  Defl:N       16   0% 2021-04-21 21:20 d4d15344  data\block2.txt
  7.       16  Defl:N       16   0% 2021-04-21 21:20 cdca6205  data\block3.txt
  8.       16  Defl:N       16   0% 2021-04-21 21:21 828bf4c2  data\block4.txt
  9. --------          -------  ---                            -------
  10.       64               64   0%                            4 file

Internally it should replace those IN-correct Directory separators, but it doesn't !!!

Only a problem in Windows.

Cheers,
Gus
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: trev on April 22, 2021, 07:33:49 am
Internally it should replace those IN-correct Directory separators, but it doesn't !!!

Gotcha. It would be nice to have in that case for Windows (I no longer recall what OS/2 used for directory separators). On the other hand, you did specify the incorrect separator :)

Is the author known? If so, you could contact him/her and suggest an enhancement.
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: Gustavo 'Gus' Carreno on April 23, 2021, 02:05:45 am
Hey Trev,

Gotcha. It would be nice to have in that case for Windows (I no longer recall what OS/2 used for directory separators).

Hummm, good call, that also needs to be addressed!!

On the other hand, you did specify the incorrect separator :)

Well, yes and no...
In the case where you use the single param overload of Zip.Entries.AddFileEntry() you can only specify the file itself.
This means that under Windows it will be separated with \.
So INTERNALLY, in this case, TZipper should keep the Filename intact, but replace the \ with / on it's internally named ArchiveFilename, which it doesn't and causes the above mentioned issue to arise.

This is only an issue if you try to unzip that badly formed zip on a Linux system where data\block1.txt is a perfectly valid file name and NOT a folder and a file.
If you do so, you have a file called data\block1.txt on the current folder and not a file block1.txt under a data folder.

Is the author known? If so, you could contact him/her and suggest an enhancement.

Well, I was hoping that the oldest members of the Forum would know that, since this is the only info I got and it's the header of the Zipper unit(who's this micheal?):
Code: Pascal  [Select][+][-]
  1. {
  2.     $Id: header,v 1.3 2013/05/26 06:33:45 michael Exp $
  3.     This file is part of the Free Component Library (FCL)
  4.     Copyright (c) 1999-2014 by the Free Pascal development team
  5.  
  6.     See the file COPYING.FPC, included in this distribution,
  7.     for details about the copyright.
  8.  
  9.     This program is distributed in the hope that it will be useful,
  10.     but WITHOUT ANY WARRANTY; without even the implied warranty of
  11.     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
  12.  
  13.  **********************************************************************}

This unit is part of fpcsrc/packages/paszlib.

Can anyone else help?

Many MANY thanks in advance!!

Cheers,
Gus
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: GetMem on April 23, 2021, 06:24:01 am
@Gustavo 'Gus' Carreno
Quote
Can anyone else help?
Help with what? If you found a bug, just go to the bugtracker: https://bugs.freepascal.org/my_view_page.php , select FPC and report it.

Quote
Well, I was hoping that the oldest members of the Forum would know that, since this is the only info I got and it's the header of the Zipper unit(who's this micheal?):
Again bugtracker, FPC section and search for Michael. You will see him assigned to many bug reports.  He is really the Michael you are looking for, I know because I also reported issues related to paslib in the past.  :)
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: engkin on April 23, 2021, 06:27:26 am
 O:-)
They like patches when bugs are reported. The link is on your left side for FPC — I am sure you had seen it before. There is a section for packages in the bug tracker. You'll need a new user name/password to be able to create a bug report.

If you do report this bug, don't forget to leave the issue number here for us. Thank you in advance. 😃
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: Gustavo 'Gus' Carreno on April 23, 2021, 06:32:16 am
Hey GetMem,

Again bugtracker, FPC section and search for Michael. You will see him assigned to many bug reports.  He is really the Michael you are looking for, I know because I also reported issues related to paslib in the past.  :)

Thanks GetMem!

I was almost certain that I had to drop an issue on the bugtracker, but I think I was a bit lazy and kinda dreaded the fact that I have to be formal and give a proper report;

This means maybe diving more into the code and see if I can come with a solution and I'm deep in PHP land ATM so the context switch is hard :)

I'll do that then...

Cheers,
Gus
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: Gustavo 'Gus' Carreno on April 23, 2021, 06:35:57 am
Hey Engkin,

O:-)
They like patches when bugs are reported. The link is on your left side for FPC — I am sure you had seen it before. There is a section for packages in the bug tracker. You'll need a new user name/password to be able to create a bug report.

Thanks mate. I think because I'm knee deep into PHP land ATM I'm dreading the context switching...
But I do have to dive into the code and see where it's appropriate to drop the fix.
Not my first bug report/feature request on FPC, so I think I'm good with the drill :)

If you do report this bug, don't forget to leave the issue number here for us. Thank you in advance. 😃

Will do!!

Cheers,
Gus
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: wp on April 23, 2021, 09:09:48 am
I've been using forward slashes in Windows for zipper usage in fpspreadsheet all the time, and there have never been any reports of unreadable files in Linux. Also, the attached project results in a valid zip in windows and Linux.

Code: Pascal  [Select][+][-]
  1. program project1;
  2.  
  3. {$mode objfpc}{$H+}
  4.  
  5. uses
  6.   Classes, zipper;
  7.  
  8. const
  9.   zipfilename = 'test.zip';
  10.   filename1 = 'project1.lpr';
  11.   filename2 = 'project1.lpi';
  12.   filename3 = 'subdir/a.txt';
  13.   filename4 = 'subdir/b.txt';
  14.  
  15. var
  16.   zip: TZipper;
  17. begin
  18.   zip := TZipper.Create;
  19.   zip.Filename := zipfilename;
  20.   zip.Entries.AddFileEntry(filename1);
  21.   zip.Entries.AddFileEntry(filename2);
  22.   zip.Entries.AddFileEntry(filename3);
  23.   zip.Entries.AddFileEntry(filename4);
  24.   zip.SaveToFile(zipfilename);
  25.   zip.Free;
  26. end.
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: Gustavo 'Gus' Carreno on April 24, 2021, 12:27:45 am
Hey WP,

I've been using forward slashes in Windows for zipper usage in fpspreadsheet all the time, and there have never been any reports of unreadable files in Linux. Also, the attached project results in a valid zip in windows and Linux.

Well, if you invert those slashes, TZipper will zip it, TUnzipper will unzip it and it's even a valid zip file.
 
The problem is that if you invert those slashes the Unzipping in Windows will have ONE behaviour and the Unzipping in Linux will have ANOTHER, as I've demonstrated MANY times in the posts above.

The fact is that there isn't any error or warning or anything when you feed this: 'subdir/a.txt' OR this: 'subdir\a.txt' to AddFileEntry() and this a problem for any novice programmer that isn't aware that the ZIP standard is to use forward slashes.

IOW, if you feed this 'subdir/a.txt' to AddFileEntry() you get:

BUT, if you feed this 'subdir\a.txt' to AddFileEntry() you get:

Internally, the Entries class has 2 properties. One is called Filename and the other is Archivename, among other copies due to UTF8.

In my view, Filename can contain the platform specific slash AS LONG AS the Archivename will ALWAYS be platform independent and ZIP compliant by ALWAYS having the forward slash.
And I've demonstrated well enough, that that's not the case.

But that's all moot because a cross-platform component should behave the same on different platforms NO MATTER what the user inputs in terms of files, right?

Or am I missing something here, WP?

Cheers,
Guss
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: Gustavo 'Gus' Carreno on April 24, 2021, 12:53:53 am
Hey All,

As promised I come back with the link to the issue:

https://bugs.freepascal.org/view.php?id=38798

Cheers,
Gus
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: wp on April 24, 2021, 12:58:16 am
So the problem is that AddFileEntry does not convert back slash to forward slashes? Calling something like SwitchPathDelims in LazFileUtils (or a similar function because zipper is in FCL) would be a fix to your bug report.
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: Gustavo 'Gus' Carreno on April 24, 2021, 01:11:09 am
Hey WP,

So the problem is that AddFileEntry does not convert back slash to forward slashes? Calling something like SwitchPathDelims in LazFileUtils (or a similar function because zipper is in FCL) would be a fix to your bug report.

I can completely agree with you. And they use StringReplace() all over the place in TZipper / TUnzipper.

My problem is now: Where?? They are missing one more StringReplace(), but I'm not sure where :)

I tried to follow the code and got completely lost, but I did get the impression that it should be inside the class TZipEntry(I think that's the name).

So I've put all my arguments, finds and opinions on the bug issue, let's see what happens next :)

Cheers,
Gus
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: wp on April 24, 2021, 01:28:07 am
Found this one:
Code: Pascal  [Select][+][-]
  1. procedure TZipFileEntry.SetDiskFileName(const AValue: String);
  2. begin
  3.   if FDiskFileName=AValue then Exit;
  4.   // Zip file uses / as directory separator on all platforms
  5.   // so convert to separator used on current OS
  6.   if DirectorySeparator='/' then
  7.     FDiskFileName:=AValue
  8.   else
  9.     FDiskFileName:=StringReplace(AValue,'/',DirectorySeparator,[rfReplaceAll]);
  10. end;

And this looks as if your issue is already considered...
But the slash to be replaced by StringReplace is the wrong one!
Code: Pascal  [Select][+][-]
  1.     FDiskFileName := StringReplace(AValue, '\', '/', [rfReplaceAll]);
  2.  

There are several locations like this. Not sure, though, which one is the name to be used for zipping and which one for unzipping.

If you want to debug the zipper unit you should copy the zipper.pp file into your test project directory. This unit is at the "end of the food chain" and can be recompile without affecting others.
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: wp on April 24, 2021, 01:48:02 am
I think this fixes it:
Code: Pascal  [Select][+][-]
  1. procedure TZipFileEntry.SetDiskFileName(const AValue: String);
  2. begin
  3.   if FDiskFileName=AValue then Exit;
  4.   // Zip file uses / as directory separator on all platforms
  5.   FDiskFileName := StringReplace(AValue, '\', '/', [rfReplaceAll]);
  6. end;

It works on Windows and Linux - try this test
Code: Pascal  [Select][+][-]
  1. program project1;
  2.  
  3. {$mode objfpc}{$H+}
  4.  
  5. uses
  6.   Classes, zipper;
  7.  
  8. const
  9.   zipfilename = 'test.zip';
  10.   filename1 = 'project1.lpr';
  11.   filename2 = 'project1.lpi';
  12.   filename3 = 'subdir\a.txt';
  13.   filename4 = 'subdir/b.txt';
  14.  
  15. var
  16.   zip: TZipper;
  17.   unzipper: TUnzipper;
  18.  
  19. begin
  20.   zip := TZipper.Create;
  21.   zip.Filename := zipfilename;
  22.   zip.Entries.AddFileEntry(filename1);
  23.   zip.Entries.AddFileEntry(filename2);
  24.   zip.Entries.AddFileEntry(filename3);
  25.   zip.Entries.AddFileEntry(filename4);
  26.   zip.SaveToFile(zipfilename);
  27.   zip.Free;
  28.  
  29.   unzipper := TUnzipper.Create;
  30.   unzipper.Filename := zipFileName;
  31.   unzipper.OutputPath := 'unzip';
  32.   unzipper.Examine;
  33.   unzipper.UnzipAllFiles;
  34.   unzipper.Free;
  35. end.
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: Gustavo 'Gus' Carreno on April 24, 2021, 01:56:23 am
Hey WP,

Found this one:
Code: Pascal  [Select][+][-]
  1. procedure TZipFileEntry.SetDiskFileName(const AValue: String);
  2. begin
  3.   if FDiskFileName=AValue then Exit;
  4.   // Zip file uses / as directory separator on all platforms
  5.   // so convert to separator used on current OS
  6.   if DirectorySeparator='/' then
  7.     FDiskFileName:=AValue
  8.   else
  9.     FDiskFileName:=StringReplace(AValue,'/',DirectorySeparator,[rfReplaceAll]);
  10. end;

This one in particular makes absolute sense since it's the DiskFile and should be platform DEPENDANT.

I think the error is not making sure that TZipFileEntry.ArchiveName NEVER has a back slash.
TZipFileEntry.Filename and TZipFileEntry.Diskname can be, SHOULD be, platform dependant.
BUT TZipFileEntry.ArchiveName should ALWAYS be ZIP compliant.
I hope I'm making sense here.

And this looks as if your issue is already considered...
But the slash to be replaced by StringReplace is the wrong one!
Code: Pascal  [Select][+][-]
  1.     FDiskFileName := StringReplace(AValue, '\', '/', [rfReplaceAll]);
  2.  

There are several locations like this. Not sure, though, which one is the name to be used for zipping and which one for unzipping.

If you want to debug the zipper unit you should copy the zipper.pp file into your test project directory. This unit is at the "end of the food chain" and can be recompile without affecting others.

I'm sorry to disagree with you WP, but I think, respectfully, that you're barking at the wrong tree.

If you do that replace to TZipFileEntry.Diskname you will still have a zipped file like this:
Code: Bash  [Select][+][-]
  1.     & less file.zip
  2.     Archive:  file.zip
  3.      Length   Method    Size  Cmpr    Date    Time   CRC-32   Name
  4.     --------  ------  ------- ---- ---------- ----- --------  ----
  5.           16  Defl:N       16   0% 2021-04-21 21:20 fffc0087  data\block1.txt
  6.           16  Defl:N       16   0% 2021-04-21 21:20 d4d15344  data\block2.txt
  7.           16  Defl:N       16   0% 2021-04-21 21:20 cdca6205  data\block3.txt
  8.           16  Defl:N       16   0% 2021-04-21 21:21 828bf4c2  data\block4.txt
  9.     --------          -------  ---                            -------
  10.           64               64   0%                            4 file

Because you're not affecting TZipFileEntry.ArchiveName, which is the field that is used in the listing above.

So again, the issue is with TZipFileEntry.ArchiveName but not the other fields!!

Cheers,
Gus
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: Gustavo 'Gus' Carreno on April 24, 2021, 01:59:24 am
Hey WP,

It works on Windows and Linux - try this test

But have you looked at the generated zip file?

Are all the slashes ONE way or are they BOTH ways?

If they are BOTH ways, that's not ZIP compliant :P

Cheers,
Gus
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: wp on April 24, 2021, 08:37:13 am
I did not care about the variable names, I think somebody got confused here.

Fact is what the debugger does: in my line "zip.Entries.AddFileEntry(filename3)" it goes into TZipFileEntry.SetDiskFileName, and it is that DiskFileName which is used by ZipOneFile and written to the archive.

Later, the Unzipper uses a procedure GetOutputfileName (local to UnzipOneFile) the replace the forward slashes by DirectorySeparator of the OS.

Using ArchiveName in AddFileEntry, on the other hand, is a way to rename entries in the zip (and later as the extracted file).
Here is a modified test which also sets the ArchiveName of one entry. It works also.

Code: Pascal  [Select][+][-]
  1. program project1;
  2.  
  3. {$mode objfpc}{$H+}
  4.  
  5. uses
  6.   Classes, zipper;
  7.  
  8. const
  9.   zipfilename = 'test.zip';
  10.   filename1 = 'project1.lpr';
  11.   filename2 = 'project1.lpi';
  12.   filename3 = 'subdir\a.txt';
  13.   filename4 = 'subdir/b.txt';
  14.  
  15. var
  16.   zip: TZipper;
  17.   unzipper: TUnzipper;
  18.  
  19. begin
  20.   zip := TZipper.Create;
  21.   zip.Filename := zipfilename;
  22.   zip.Entries.AddFileEntry(filename1);
  23.   zip.Entries.AddFileEntry(filename2);
  24.   zip.Entries.AddFileEntry(filename3);
  25.   zip.Entries.AddFileEntry(filename4);
  26.   zip.Entries.AddFileEntry(filename3, 'subdir\renamed_a.txt');
  27.   zip.SaveToFile(zipfilename);
  28.   zip.Free;
  29.  
  30.   unzipper := TUnzipper.Create;
  31.   unzipper.Filename := zipFileName;
  32.   unzipper.OutputPath := 'unzip';
  33.   unzipper.Examine;
  34.   unzipper.UnzipAllFiles;
  35.   unzipper.Free;
  36. end.

When I run this in Linux the unzipper creates a subfolder "subdir", and that contains files "a.txt", "b.txt" and "renamed_a.txt" - just like it should be.

A patch should also take care of the confusing comments in the TZipEntry declaration regarding the meaning of the names DiskFileName and ArchiveFileName. "DiskfileName" should be "Name of the file to be zipped, directory separators will be replaced by '/' internally", and "ArchiveFileName" should be "Name of the file to be extracted, directory separators are '/' internally but will become those of the OS upon extraction".

There is one inconsistency in my patch, though: When AddFileEntry replaces '\' by '/' then ZipOnFile will search for a file with forward slashes in Windows. But this is no problem in Windows and FPC.

Alternatively ZipOneFile should introduce a local function GetZippedFileName which does the '\'-to-'/' replacement and which is called after the stream with the input file (original DiskFileName) has been opened
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: wp on April 24, 2021, 11:32:38 am
On the other hand, when a backslash is a valid character in a Linux filename, how should zipper know that the '\' in 'subdir\a.txt' refers to file 'subdir\a.txt' in the current directory or to 'a.txt' in subdirectory 'subdir' due to the user meaning a '/' rather than a '\'?

I get the feeling this "issue" should not be fixed on the zipper side. It is the user's responsibility to provide the correct path delimiters in cross-platform applications. There is the global variable PathDelim (in SysUtils) for exactly this purpose.

The worst software is the one which tries to guess what the user means.
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: Gustavo 'Gus' Carreno on April 24, 2021, 12:24:22 pm
Hey WP,

The worst software is the one which tries to guess what the user means.

This statement alone made me think of this matter in a total different light.
And in all frankness I completely agree with the statement.

But this means that in ALL the places where there is an example of TZipper, there has to be a clear explanation on how you should never use a backslash when dealing with this component UNLESS you really want one.

If this is not explained in ALL examples then someone else will use Windows backslashes to create a zip that will not Unzip correctly in *nix environments and we will be back here again.

So we have a dilemma:

One is going to be dependant that all new users of the Zipper unit read the correct examples but has no guaranties.
The other one is dependant that one user will stumble on this issue when he's in *nix land and wants to create a file with an actual backslash.

My vote is with the latter not the former, EVEN when I agree with you that software should not try to guess what the user's intentions are.
There will be WAY less users stumbling on the latter that on the former. I think this is lesser of two evils territory and the lesser evil is the latter.

Just my 2c.

Cheers,
Gus
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: wp on April 24, 2021, 01:28:25 pm
Let's think about it again...

The names first... I think the basic idea is correct. There is the name of the disk file which will be added to the zip - this is the DiskFileName. And there is the ArchiveFilename which is used inside the zip and requires forward slashes.

AddFileEntry has several overloads: two of them with an ArchiveFileName specified
Code: Pascal  [Select][+][-]
  1. procedure TZipFileEntry.SetArchiveFileName(const AValue: String);
  2. begin
  3.   if FArchiveFileName=AValue then Exit;
  4.   // Zip standard: filenames inside the zip archive have / path separator
  5.   if DirectorySeparator='/' then
  6.     FArchiveFileName:=AValue
  7.   else
  8.     FArchiveFileName:=StringReplace(AValue, DirectorySeparator, '/', [rfReplaceAll]);
  9. end;
  10.  
Here, nothing is changed on Linux, but on Windows (DirectorySeparator <> '/') the backslashes are replaced by '/'.

The third overload has no ArchiveFileName as argument. It leaves the ArchiveFileName empty, but I think we should set the ArchiveFilename here equal to the DiskFileName since this will replace the backslashes on windows for the zip's internal name.
Code: Pascal  [Select][+][-]
  1. function TZipFileEntries.AddFileEntry(const ADiskFileName): TZipFileEntry;
  2. begin
  3.   Result:=Add as TZipFileEntry;
  4.   Result.DiskFileName:=ADiskFileName;
  5.   Result.ArchiveFileName:=ADiskFileName;     // ADDED. The setter SetArchiveFileName will replace '\' by '/' in Windows
  6. end;

Let's look at the following line of my sample:
Code: Pascal  [Select][+][-]
  1.   zip.Entries.AddFileEntry('subdir\a.txt', 'subdir\renamed_a.txt');

On Windows, the second parameter is changed to 'subdir/renamed_a.txt' which makes the zip valid. And the file itself is found for packing - fine.

On Linux, we have two problems: the first one is that the file itself is not found because of the backslash. And if we refrain from guessing what the user thinks we must keep it like that - or the user must code correctly with the correct path delimiter, i.e. the first parameter must be specified as 'subdir' + PathDelim + 'a.txt'. But this is no limitation of the zipper, this is general practice in file handling on Linux.

The other problem is the ArchiveFilename parameter, 'subdir\renamed_a.txt'. When the user knows and understands that this is the name of the file inside the zip he would specify it as 'subdir/renamed_a.txt' and everything would be fine. But he probably thinks that this is the name of the file after unzipping. Now we have the same problem as before, and the ArchiveFileName must be specified as a valid name of the file system: 'subdir' + PathDelim + 'renamed_a.txt' and let the setter of ArchiveFileName do the work to replace the PathDelim back to a forward slash.

This way the zip/unzip operation yields the desired results, on both Windows and Linux - I tested it.

To summarize we must
- fill the ArchiveFileName in the one-parameter overload of AddFileName to be equal to the DiskFileName, as shown above.
- assume that all names are to be understood as names of the filesystem, i.e. with the correct path delimiter. But this is rather self-evident.
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: marcov on April 24, 2021, 10:48:07 pm
Contracase:  preparing a ZIP for dos  on Windows? Needs to have backslashes in the archive.
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: wp on April 25, 2021, 12:33:41 pm
In this case even the current (un)zipper does not work because there are several replacements of the directory separator by a forward slash. To implement a DOS-compatible version of the (un)zipper a property ArchivePathdelim = (pdDefault, pdDOS) should be introduced.
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: Gustavo 'Gus' Carreno on April 26, 2021, 07:51:09 am
Hey All,

Did I just stumble on a very big can of worms??

It really feels like that :)

@WP: Have you reached a final conclusion on this all?

Cheers,
Gus
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: wp on April 26, 2021, 01:00:07 pm
My conclusions do not matter since I cannot write to the fpc repository... But I submitted a patch similar to reply #21 to your report hoping that an fpc dev picks it up and finds it useful.

It does not address the DOS issue mentioned by marcov since it has been there already before your report.
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: tetrastes on April 26, 2021, 09:34:14 pm
It seems that this is not the only problem of this package. Try filenames with non-ASCII characters at Windows...
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: Gustavo 'Gus' Carreno on April 28, 2021, 04:34:43 am
Hey WP,

My conclusions do not matter since I cannot write to the fpc repository... But I submitted a patch similar to reply #21 to your report hoping that an fpc dev picks it up and finds it useful.

Well, I thank you so very much for dropping in the patch. I have to admit that I was too lazy to do it, so all praises to you my friend!!

It does not address the DOS issue mentioned by marcov since it has been there already before your report.

Hummm, yeah, the worms on this can are increasing...

Cheers,
Gus
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: Gustavo 'Gus' Carreno on April 28, 2021, 04:36:13 am
Hey Tetrastes,

It seems that this is not the only problem of this package. Try filenames with non-ASCII characters at Windows...

Ouch, this can of worms gets bigger and bigger !!!

Many thanks for the report Tetrastes!!

Cheers,
Gus
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: wp on April 28, 2021, 10:07:07 am
This has nothing to do with zipper. It is a standard problem in FPC string handling. If you use zipper in an LCL GUI program everything works fine because the standard encoding is UTF8. The problem mentioned appears in standalone commandline programs in which FPC does not switch to UTF8 by default.

Simply call SetMultiByteConversionCodePage(CP_UTF8) as the first line of your program and the zipper handles utf8-encoded file names correctly. See also attached demo.

Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: tetrastes on April 28, 2021, 12:43:30 pm
This has nothing to do with zipper. It is a standard problem in FPC string handling. If you use zipper in an LCL GUI program everything works fine because the standard encoding is UTF8. The problem mentioned appears in standalone commandline programs in which FPC does not switch to UTF8 by default.

Simply call SetMultiByteConversionCodePage(CP_UTF8) as the first line of your program and the zipper handles utf8-encoded file names correctly. See also attached demo.

Have you tested generated by your demo test.zip? 
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: wp on April 28, 2021, 12:56:11 pm
Yes, of course. But I use PowerArchiver to display the zip rather than 7zip, and it displays the directory correctly. And the unzipped folder contains the alpha-beta-gamma.txt file - did you look in the "unzipped" folder? (And alpha, beta, gamma are not on my Windows code page).
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: tetrastes on April 28, 2021, 01:37:08 pm
The unzipped folder contains the αβγ.txt file only if test.zip unzipped with your demo (i.e. zipper.pp) and may be with PowerArchiver, and may be smth else, which supposes that filenames in archive in utf8. But this is not what Windows uses for filenames. And other unzippers will unzip it to ╬▒╬▓╬│.txt.   
Title: Re: Unit Zipper from package paszlib has an inconsistency
Post by: wp on April 28, 2021, 04:43:33 pm
ok, I see it in the Win Explorer now. But after some seeking I found that there is a TZipper property UseLanguageEncoding which, when set to true, writes the archive names in UTF8 encoding.

There is also a UseUTF8 property for the Unzipper, but I did not find it necessary to set it to true. Also, calling the SetMultiByteConversionCodepage was not required any more after activating the UseLanguageEncoding property.

Oh man - I think the paszlib wiki page in which zipper is documented needs a rewrite...
TinyPortal © 2005-2018