
Author Topic: Pack  (Read 10473 times)

KodeZwerg

  • Hero Member
  • *****
  • Posts: 2269
  • Fifty shades of code.
    • Delphi & FreePascal
Re: Pack
« Reply #15 on: February 18, 2024, 02:46:17 pm »
So far there is no technical background information or documentation of any kind. Why?
Do you plan to make it open source, or to provide a library with headers so that programmers can use it from within our apps, as with 7z?
What are the current legal terms for using your product?
What specifications does your product have? Please summarize them somewhere; compression and encryption are subject to special rules in many countries.

O

  • New Member
  • *
  • Posts: 40
  • Creator of Pack
    • Pack
Re: Pack
« Reply #16 on: February 18, 2024, 03:07:06 pm »
Hey KodeZwerg,

Yes, as stated in the first and second posts of this topic, it will be open source soon.

Yes, it will get a library some day. For now the focus is on being a good CLI program, then a GUI, and then a library. I updated the first post to show that a library is planned.

The executable is free for you to do what you like. The source will get a permissive license.

As the first post noted, it is using Zstandard for compression, and for now no encryption feature is released.
Yes, I should make a repository or update a documentation page for that. For now, ask me anything and I will answer.

paweld

  • Hero Member
  • *****
  • Posts: 1268
Re: Pack
« Reply #17 on: February 18, 2024, 03:38:05 pm »
The packing and unpacking test for the first 4 folders (1-4) was performed several times, but for each type of archive. The remaining 3 folders (5-7) were compressed only once. Both VPSs were fresh, i.e. they were created specifically for the tests. The folders used for the tests were added to exceptions in Windows Defender.
The logs show what data each folder contained.
After unpacking, pairs of folders were compared in WinMerge to see if they contained the same data - they all matched.
Best regards / Pozdrawiam
paweld

O

  • New Member
  • *
  • Posts: 40
  • Creator of Pack
    • Pack
Re: Pack
« Reply #18 on: February 18, 2024, 03:44:32 pm »
I looked at your test code; it's neat.
Just a side note: you can use `--activate-other-options --verify-pack` to have Pack first pack the input, then unpack it to a temporary location, compare source and destination, and finally delete the unpacked data as cleanup.

I tried to keep the CLI as simple as I could, but these hidden `--activate-other-options` options can help during development.
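
For anyone who wants the same kind of check in their own test scripts, here is a minimal sketch of the pack, unpack, compare, clean up sequence. It is not Pack's implementation: it uses Python's `zipfile` as a stand-in container, and the names `tree_digest` and `verify_round_trip` are made up for this example.

```python
import hashlib
import tempfile
import zipfile
from pathlib import Path


def tree_digest(root: Path) -> dict[str, str]:
    """Map each file's relative path to the SHA-256 of its contents."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_round_trip(source: Path) -> bool:
    """Pack `source`, unpack it to a temporary folder, compare, then clean up."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp) / "check.zip"
        unpacked = Path(tmp) / "unpacked"

        # Step 1: pack every file under the source folder.
        with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
            for p in sorted(source.rglob("*")):
                if p.is_file():
                    zf.write(p, p.relative_to(source).as_posix())

        # Step 2: unpack into a temporary location.
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(unpacked)

        # Step 3: compare source and destination by content hash.
        # Step 4 (cleanup) happens when the TemporaryDirectory block exits.
        return tree_digest(source) == tree_digest(unpacked)
```

Comparing content hashes rather than file sizes catches silent corruption, which is the point of a verify step. Empty folders are ignored by this sketch.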

Thank you for the test and results!

fabiopesaju

  • Jr. Member
  • **
  • Posts: 96
Re: Pack
« Reply #19 on: February 18, 2024, 05:09:01 pm »
why not use sqlcipher? it can natively offer the encryption you need. or am I wrong?

O

  • New Member
  • *
  • Posts: 40
  • Creator of Pack
    • Pack
Re: Pack
« Reply #20 on: February 18, 2024, 05:34:24 pm »
@KodeZwerg I updated the notes, but as said, let me know if you have any questions.

@fabiopesaju sqlcipher is SQLite with encryption. Pack is much different; it is a container format, like Zip or Tar, but modern.

KodeZwerg

  • Hero Member
  • *****
  • Posts: 2269
  • Fifty shades of code.
    • Delphi & FreePascal
Re: Pack
« Reply #21 on: February 19, 2024, 01:16:22 am »
Today I played a little with your pack.exe. No speed benchmark; small size is my target, so I downloaded the running tiger logo from this site as my test subject.
Since I have no idea what settings you might have used, I picked my own, mostly best compression at a slower rate.
I used several different single-file archivers and container variants.

5.677 - FreePascal animated.zstd - (basic zstd with best options used)
5.723 - FreePascal animated.lzip
5.729 - FreePascal animated.lz4
5.762 - FreePascal animated.uha
5.793 - FreePascal animated.gif - (here is the original file)
5.814 - FreePascal animated (rar4).rar
5.832 - FreePascal animated.xz
5.852 - FreePascal animated.7z
5.871 - FreePascal animated.bzip2
5.886 - FreePascal animated (lzma).zipx
5.887 - FreePascal animated.zip
5.887 - FreePascal animated (deflate64).zipx
5.932 - FreePascal animated (xz).zipx
5.963 - FreePascal animated (rar5).rar
5.973 - FreePascal animated (ppmd).zipx
5.973 - FreePascal animated (bzip2).zipx
6.221 - FreePascal animated.kgb
7.680 - FreePascal animated.pack - (here is the file created by pack.exe)

I am unsure about the file format you have chosen; to me, the header (table data) it writes is very big.
(There are already variants out there that use Zstd with a better-matching container type.)

domasz

  • Hero Member
  • *****
  • Posts: 553
Re: Pack
« Reply #22 on: February 19, 2024, 08:06:44 am »
Did you just compress a .GIF file? Or did you uncompress it to .BMP or .PPM first? .GIF is already compressed, so the test might make a bit more sense if you first convert the file to an uncompressed format.
But I believe using SQLite as a container makes no sense, and it will slow things down when compressing huge files, like 10-20 GB.

KodeZwerg

  • Hero Member
  • *****
  • Posts: 2269
  • Fifty shades of code.
    • Delphi & FreePascal
Re: Pack
« Reply #23 on: February 19, 2024, 08:44:05 am »
Quote from: domasz
Did you just compress a .GIF file? Or did you uncompress it to .BMP or .PPM first? .GIF is already compressed, so the test might make a bit more sense if you first convert the file to an uncompressed format.

I did exactly what I wrote: I did not want a synthetic, "well prepared" test. I wanted a practical challenge against the other tools on the market, or just against the compression algorithm itself.
I do agree that the makers of that GIF prepared the tiger really well; it is very well size-optimized.
Compressing a GIF is like compressing mp3, png, mkv, etc.: the result, if there is any gain at all, will not benefit much because the format is already compressed.
Since not everyone needs to compress huge data sets or prepares data into the best possible shape, I was missing results for this kind of small file.
(And of course I wanted to see how much difference there is between a database and a real archive header.)
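
The point about pre-compressed formats is easy to reproduce. The sketch below is only an illustration with Python's `zlib` (not the tools or the GIF from the test above): it measures how much one compression pass saves on plain text, on data that was already compressed once, and on random bytes. The sample word list is made up.

```python
import os
import random
import zlib


def saved_percent(data: bytes, level: int = 9) -> float:
    """Percentage of bytes saved by one zlib pass (negative means it grew)."""
    packed = zlib.compress(data, level)
    return 100.0 * (1 - len(packed) / len(data))


# Made-up sample text: a seeded random sequence of a few words.
rng = random.Random(0)
words = [b"pack", b"archive", b"container", b"header",
         b"tiger", b"zstd", b"speed", b"size"]
text = b" ".join(rng.choices(words, k=20_000))

# The same text after one compression pass, standing in for a GIF/PNG/MP3.
once = zlib.compress(text, 9)

print(f"plain text:         {saved_percent(text):6.1f}% saved")
print(f"already compressed: {saved_percent(once):6.1f}% saved")
print(f"random bytes:       {saved_percent(os.urandom(len(text))):6.1f}% saved")
```

The second and third lines should show savings close to zero or slightly negative, because the container and stream overhead is added to data that has no redundancy left.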

Quote from: domasz
But I believe using SQLite as a container makes no sense, and it will slow things down when compressing huge files, like 10-20 GB.

I have not tested his pack.exe program that deeply. Does it first write the encoded/decoded content to a temporary file?

O

  • New Member
  • *
  • Posts: 40
  • Creator of Pack
    • Pack
Re: Pack
« Reply #24 on: February 19, 2024, 09:28:10 am »
@KodeZwerg thank you for the test results.
Can you tell me what the source file is? I do not think it is the famous tiger.svg, but if you can share it, I can test it.
Small files are a field that Pack can shine in if there are not just one or two, as you detected, and as I stated previously. Pack's lowest size is 2KB. So if you want to compress, for example, a 4KB or even 1KB file, you cannot get a win in compression ratio. After all, Pack is a container, not a compressor.
If you send me the file, I can test it more, but here are my results for tiger.svg, 95.6KB:
tar.gz: 30.8KB
RAR: 30.2KB
Zip: 28.1KB
7z: 26.3KB
Pack: 34KB
You can see that for small files, there is no win in size, only speed. And that is by design; Pack does not go after that last drop of size; it tries to make a balance between size and speed in a meaningful way. And by meaningful, it means, everyday use, and not going hard on your hardware just to save a kilobyte. There are a lot of alternatives that can go to extreme lengths to get the last drop; my goal was the other way around, doing the task optimized and not competitive.
But I get you; sometimes you want more. For example, if you want to pack it once and send it many times over the wire, there is a Hard Press available. You can use `--press=hard` to tell Pack to try harder to compress your data. For example, for the previous tests, the output will be 30.5KB. And for the first Linux source test, it is 146MB instead of the normal 194MB.
And again, even with Hard Press, Pack does not try to eat your hardware just to get a little more; it goes the optimized way I described.

So let us do more, and go with PNG files, a format that is already compressed.
Lazarus images, 3K files, and around 3.88MB in Size, and 8.39MB Size on Disk.
tar.gz: 240ms, 3.45MB (3.46MB on disk)
Pack: 83ms, 3.52MB (3.52MB On disk)
You can see again that for a small loss in size (2%), Pack produces the result about 2.9 times faster (83ms instead of 240ms).
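
For readers who want to redo the arithmetic, here is a small sketch using only the two PNG results above. "Times as fast" and "percent faster" are different numbers, so both are computed; the function names are made up for this example.

```python
def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """How many times as fast the candidate is compared to the baseline."""
    return baseline_ms / candidate_ms


def percent_faster(baseline_ms: float, candidate_ms: float) -> float:
    """Speed gain expressed as a percentage over the baseline speed."""
    return (speedup(baseline_ms, candidate_ms) - 1) * 100


def percent_larger(baseline_size: float, candidate_size: float) -> float:
    """How much bigger the candidate output is, in percent."""
    return (candidate_size / baseline_size - 1) * 100


# tar.gz: 240 ms, 3.45 MB; Pack: 83 ms, 3.52 MB (figures from the post)
print(f"speedup:        {speedup(240, 83):.2f}x")
print(f"percent faster: {percent_faster(240, 83):.0f}%")
print(f"size penalty:   {percent_larger(3.45, 3.52):.1f}%")
```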

And to answer your question, no, no temp files.
I am enjoying your input, and I will use it to optimize Pack further. Thank you.
« Last Edit: February 19, 2024, 09:36:14 am by O »

O

  • New Member
  • *
  • Posts: 40
  • Creator of Pack
    • Pack
Re: Pack
« Reply #25 on: February 19, 2024, 09:48:13 am »
@domasz SQLite, used correctly, is much faster than most of the databases out there. And certainly, faster than many archive or container formats out there, while still providing security and transactional updates. The popular idea about it comes from the default settings most people use, but Pack is not that way.
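
To illustrate what "used correctly" can mean, here is a minimal sketch of SQLite as a file container. It is not Pack's schema or configuration, which have not been published in this thread: the table layout, the `PRAGMA synchronous = NORMAL` setting, and the function names are all assumptions, and `zlib` stands in for Zstandard because it ships with Python. The one well-known technique it shows is writing the whole batch inside a single transaction instead of committing row by row.

```python
import sqlite3
import zlib


def open_container(path: str) -> sqlite3.Connection:
    """Open (or create) a container database with a hypothetical schema."""
    con = sqlite3.connect(path)
    # A non-default durability setting; whether Pack uses it is an assumption.
    con.execute("PRAGMA synchronous = NORMAL")
    con.execute(
        "CREATE TABLE IF NOT EXISTS item ("
        "name TEXT PRIMARY KEY, size INTEGER NOT NULL, body BLOB NOT NULL)"
    )
    return con


def add_items(con: sqlite3.Connection, items: dict[str, bytes]) -> None:
    """Store all items in ONE transaction; either all are added or none."""
    with con:  # commits on success, rolls back on error
        con.executemany(
            "INSERT OR REPLACE INTO item (name, size, body) VALUES (?, ?, ?)",
            ((name, len(data), zlib.compress(data)) for name, data in items.items()),
        )


def read_item(con: sqlite3.Connection, name: str) -> bytes:
    """Return the original bytes of one stored item."""
    row = con.execute("SELECT body FROM item WHERE name = ?", (name,)).fetchone()
    if row is None:
        raise KeyError(name)
    return zlib.decompress(row[0])
```

Usage: `con = open_container("demo.db")`, then `add_items(con, {"a.txt": b"hello"})` and `read_item(con, "a.txt")`.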

But let us check a bigger file, just one file, from the Lichess archive.
Testing the export from 2014-06, with a size of 907MB:
tar.gz: 31,227ms, 257MB
RAR: 22,055ms, 200MB
7z: 79,434ms, 169MB
Pack: 881ms, 245MB

It is clear that Pack is roughly 25X to 90X faster here, while using fewer resources in most cases and giving a pretty reasonable size.
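
The table above can be turned into throughput and ratio figures with a few lines. This sketch uses only the numbers from this post; the helper names are made up for the example.

```python
SOURCE_MB = 907  # size of the 2014-06 export, from the post

# (elapsed milliseconds, output size in MB), copied from the results above
results = {
    "tar.gz": (31_227, 257),
    "RAR": (22_055, 200),
    "7z": (79_434, 169),
    "Pack": (881, 245),
}


def throughput_mb_s(elapsed_ms: float) -> float:
    """Input megabytes processed per second."""
    return SOURCE_MB / (elapsed_ms / 1000)


def time_vs_pack(name: str) -> float:
    """How many times longer than Pack the named tool took."""
    return results[name][0] / results["Pack"][0]


for name, (ms, size_mb) in results.items():
    print(
        f"{name:7} {throughput_mb_s(ms):8.1f} MB/s   "
        f"output is {size_mb / SOURCE_MB:5.1%} of input   "
        f"{time_vs_pack(name):5.1f}x Pack's time"
    )
```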

But let's go higher, here is from the Lichess archive again.
Testing the export from 2017-01, with a size of 9.16GB:
RAR: 260s, 2.02GB  (Using ~65% CPU)
Pack: 9s, 2.52GB (Using ~50% CPU)

Again, more than 28X faster while using fewer resources, at the cost of an output about 25% larger.


And again, for publishing purposes for example, you can use Hard Press and get a 1.88GB result instead of 2.52GB. It takes 550s, while giving you 25% more compression in this case.
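
Whether Hard Press pays off depends on how often the result is sent. Here is a rough sketch of that trade-off using the timings and sizes above. The link speed is a hypothetical value chosen for illustration, and 1 GB is taken as 1000 MB.

```python
import math

normal_s, normal_gb = 9, 2.52   # default press, from the post
hard_s, hard_gb = 550, 1.88     # Hard Press, from the post

size_saved_pct = (1 - hard_gb / normal_gb) * 100
time_factor = hard_s / normal_s

# Hypothetical link speed: 12.5 MB/s (about 100 Mbit/s).
LINK_MB_S = 12.5
saved_mb = (normal_gb - hard_gb) * 1000          # assuming 1 GB = 1000 MB
seconds_saved_per_send = saved_mb / LINK_MB_S

# Number of transfers after which the extra packing time is paid back.
break_even_sends = math.ceil((hard_s - normal_s) / seconds_saved_per_send)

print(f"Hard Press output is {size_saved_pct:.1f}% smaller")
print(f"Hard Press takes {time_factor:.1f}x as long")
print(f"Break-even after {break_even_sends} transfers at {LINK_MB_S} MB/s")
```

On a slower link the break-even point arrives sooner, which matches the "pack once, send many times" use case.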
« Last Edit: February 19, 2024, 10:23:45 am by O »

domasz

  • Hero Member
  • *****
  • Posts: 553
Re: Pack
« Reply #26 on: February 19, 2024, 09:55:02 am »
Quote from: KodeZwerg
I do agree that the makers of that GIF prepared the tiger really well; it is very well size-optimized.
Compressing a GIF is like compressing mp3, png, mkv, etc.: the result, if there is any gain at all, will not benefit much because the format is already compressed.

The latest state-of-the-art compressors convert GIF, PNG, JPEG and other formats to uncompressed data and compress it with their own algorithms. When such an archive is decompressed, the file is compressed back to GIF/PNG/JPEG. This gives amazing results and beats general-purpose compression (like ZIP, RAR) by far. For example, try to compress a JPEG file with 7-Zip or RAR and you will usually gain about 10%. But try Lepton or WinZip (.ZIPX) and you might gain 50%.

O

  • New Member
  • *
  • Posts: 40
  • Creator of Pack
    • Pack
Re: Pack
« Reply #27 on: February 19, 2024, 10:22:00 am »
@domasz I updated the previous post with more tests.
Rearranging data may be a good choice in some cases. Despite great test results, I didn't implement it because it changes the input data permanently, and you cannot get back the exact same data with the same hash. It gives great results for formats like PNG, but changing their data may not be what users want. It needs more thought.

domasz

  • Hero Member
  • *****
  • Posts: 553
Re: Pack
« Reply #28 on: February 19, 2024, 10:34:08 am »
And here's the fun thing - programs like WinZIP or Lepton give back files that are identical byte-by-byte.
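
The principle can be shown on a plain zlib stream, which is what a PNG stores its pixel data in. This is only a sketch of the idea, not how WinZip or Lepton work internally (they use format-specific models): unpack the inner stream, store the raw data with a stronger compressor, and on restore re-deflate with the same settings so the output is identical byte for byte. The function names and the one-byte level prefix are made up for this example, and the trick only works when the original stream can be reproduced with a known zlib level.

```python
import hashlib
import lzma
import zlib


def repack(deflated: bytes, level: int) -> bytes:
    """Replace a zlib stream with an LZMA stream of the same raw data.

    Raises ValueError if re-deflating at `level` would not reproduce the
    input exactly, because then the original could not be restored.
    """
    raw = zlib.decompress(deflated)
    if zlib.compress(raw, level) != deflated:
        raise ValueError("cannot reproduce the original stream at this level")
    # Remember the level in one leading byte so restore() can repeat it.
    return bytes([level]) + lzma.compress(raw)


def restore(packed: bytes) -> bytes:
    """Rebuild the original zlib stream, byte for byte."""
    level, body = packed[0], packed[1:]
    return zlib.compress(lzma.decompress(body), level)


# Made-up sample data standing in for image pixels.
raw = bytes((x * y) % 251 for y in range(64) for x in range(256))
original = zlib.compress(raw, 6)

packed = repack(original, 6)
restored = restore(packed)

print("identical bytes:", restored == original)
print("same SHA-256:   ",
      hashlib.sha256(restored).hexdigest() == hashlib.sha256(original).hexdigest())
print("sizes (original, repacked):", len(original), len(packed))
```

The check inside `repack` is what keeps the hash intact: data is only rearranged when the exact original can be rebuilt.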

O

  • New Member
  • *
  • Posts: 40
  • Creator of Pack
    • Pack
Re: Pack
« Reply #29 on: February 19, 2024, 11:11:23 am »
If you mean Dropbox's Lepton, its repository is archived and suggests Brotli (a general compression algorithm) or Microsoft's port, which has no official release to test. I do not know why they left it behind, but it does not give me confidence. As for ZipX, I cannot get much benefit with PNG, but JPG works. Yet considering the speed, that is not what I prefer for Pack. Pack needs to be fast and simple while giving good compression. I agree with you that, done right, it is a great approach, but for now my tests and research have not shown promising results.

Let's see what happens in the future.

 
