Topic: Parallel archiver was updated to version 1.98...

aminer

« on: November 21, 2013, 05:46:39 pm »

Hello..

Parallel archiver was updated to version 1.98...

The LoadFromStream() method was already fault tolerant to archive
damage, power failures, etc., but there was still a problem:
the archive has to have a unique id for LoadFromStream() to
work correctly. So I have added this unique id, and now
Parallel archiver is rock solid and very stable.

You can download parallel archiver 1.98 from:

http://pages.videotron.com/aminer/

PArchiver 1.98  (stable version)

Description: Parallel archiver using my Parallel LZO, Parallel LZ4, Parallel Zlib, Parallel Bzip and Parallel LZMA compression algorithms.

Supported features:

- Opens and creates archives using my Parallel LZ4, Parallel LZO, Parallel Zlib, Parallel Bzip or Parallel LZMA compression algorithms.

- Wide range of Parallel compression algorithms: Parallel LZ4, Parallel LZO, Parallel ZLib, Parallel BZip and Parallel LZMA, with different compression levels

- Compiles into exe - no dll/ocx required.

- 64 bit support: lets you create archive files over 4 GB,
  supports archives up to 2^63 bytes, compresses and
  decompresses files up to 2^63 bytes.

- Now my Parallel Zlib gives 5% better performance than Pigz.

- Supports memory and file streams, adds compressed data
  directly from streams and extracts archived files to streams
  without creating temp files.

- Save/Load the archive from stream

- Supports in-memory archives

- You can use it as a hashtable from the hard disk or from memory

- Fault tolerant to power failures etc..

- Creates encrypted archives using Parallel AES encryption with 256 bit keys.

- Fastest compression levels are extremely fast

- Well balanced compression levels provide both a good compression ratio and high speed

- Maximum compression levels provide a much better compression ratio than Zip, RAR and BZIP, and the same as 7Zip with an 8 megabyte dictionary.

- It supports both a compression and a decompression rate indicator

- You can test the integrity of your archive

- Easy object programming interface

- Full source codes available.

- Platform: Win32 , Win64

Please look at the test_pzlib.pas, test_plzo.pas, test_plz4.pas, test_pbzip.pas and test_plzma.pas demos inside the zip file, then compile and execute them.

I have tried to do a worst case scalability prediction, with an HDD and a Q6600 quad core, for my Parallel archiver with Parallel LZMA, and I think it's good.

There are four steps in my Parallel LZMA algorithm:

- First we have to copy a stream serially from the hard disk to memory, and this takes 0.2 second on average.
- In the compression method we have to copy a stream to memory, and this takes 0.05 second on average.
- In the compression method we have to compress a stream to another stream in memory, and this takes 13 seconds on average.
- In the compression method we have to copy the compressed stream to a hard disk file, and this takes 0.01 second on average.

So the serial part is: 0.2 second + 0.01 second + 0.05 second = 0.26 second, which is about 2% of the total (a fraction of 0.02),
and the parallel part is: 13 seconds, which is about 98% of the total (a fraction of 0.98).

So the worst case scalability scenario using an HDD, and using Amdahl's equation, gives us: speedup = 1 / (0.02 + 0.98/N), where N is the number of cores. As N grows, this approaches 1/0.02 = 50X.

So this will scale up to 50X. As you can see, with an HDD this is a good scalability.
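The estimate above can be checked with a few lines of code. This is only a sketch of Amdahl's law applied to the timings quoted in the post; the function and constant names are made up for illustration and are not part of Parallel archiver. Note that using the unrounded serial share (0.26 s out of 13.26 s) gives a limit of about 51X rather than the rounded 50X.

```python
def amdahl_speedup(serial_seconds, parallel_seconds, cores):
    """Speedup predicted by Amdahl's law when only the parallel part scales."""
    total = serial_seconds + parallel_seconds
    return total / (serial_seconds + parallel_seconds / cores)


def amdahl_limit(serial_seconds, parallel_seconds):
    """Upper bound on the speedup as the number of cores grows without limit."""
    return (serial_seconds + parallel_seconds) / serial_seconds


# Timings from the post: disk read + memory copy + disk write are serial,
# the LZMA compression itself is the parallel part.
SERIAL_SECONDS = 0.2 + 0.05 + 0.01
PARALLEL_SECONDS = 13.0

if __name__ == "__main__":
    for cores in (1, 2, 4, 8, 16, 64):
        speedup = amdahl_speedup(SERIAL_SECONDS, PARALLEL_SECONDS, cores)
        print(f"{cores:3d} cores: {speedup:5.2f}X")
    print(f"limit: {amdahl_limit(SERIAL_SECONDS, PARALLEL_SECONDS):.1f}X")
```

The same two functions can be reused for other storage setups by changing the serial timings.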

So what can we do to make Parallel archiver scale more when using Parallel LZMA?

You can for example use a RAID 10 with a base configuration of 4 HDD drives, so this will cut the 0.2 second and the 0.01 second by 4, and this will give a scalability of 124X, which is better. But to speed things up more we can use SSD drives, which are 2X faster than HDD drives, also in a RAID 10 configuration, and this will give a 434X worst case scalability.

So as you can see, if you are using only an HDD with a multicore system you will get a 50X scalability with my Parallel archiver using Parallel LZMA, and if you use RAID 10 with SSD drives you will get a 434X scalability.

When you want to delete files inside the archive you have to call the DeleteFiles() method. The DeleteFiles() method will not physically delete the files, it will only mark them as deleted. When you want to delete the files completely, call the DeletedItems() method to see how many files are marked as deleted, and after that use the Clean() method to remove them completely from the archive. I have implemented it like that because it's better in my opinion.
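The two-phase deletion described above (mark first, reclaim later) can be sketched with a toy in-memory store. This is not the archiver's code: the class name and its internals are assumptions made for illustration, and only the method names mirror DeleteFiles(), DeletedItems() and Clean() from the post.

```python
class TombstoneStore:
    """Toy key/value store with two-phase deletion: mark, then compact."""

    def __init__(self):
        self._entries = {}     # name -> stored bytes
        self._deleted = set()  # names that are marked as deleted

    def add(self, name, data):
        self._entries[name] = data
        self._deleted.discard(name)

    def delete_files(self, names):
        """Mark entries as deleted; their bytes stay in place for now."""
        for name in names:
            if name in self._entries:
                self._deleted.add(name)

    def deleted_items(self):
        """Number of entries that are marked as deleted but not yet removed."""
        return len(self._deleted)

    def clean(self):
        """Physically remove every marked entry and report how many went away."""
        removed = len(self._deleted)
        for name in self._deleted:
            del self._entries[name]
        self._deleted.clear()
        return removed

    def exists(self, name):
        return name in self._entries and name not in self._deleted

    def stored_bytes(self):
        return sum(len(data) for data in self._entries.values())
```

Marking is cheap because nothing is moved; the expensive rewrite happens once, in clean(), for however many entries were marked.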

And my Parallel archiver uses a hashtable to store the file names and their corresponding file positions, so that you get direct access to the files inside the archive when decompressing, deleting, etc., so it's very fast.
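The idea of a hashtable that maps file names to file positions can be sketched as follows. This is a minimal illustration and not the archiver's on-disk format: the class name is hypothetical, Python's zlib stands in for the parallel compressors, and a dict plays the role of the in-memory hashtable.

```python
import io
import zlib


class IndexedBlobFile:
    """Toy archive: compressed blobs appended to a stream, located via a dict."""

    def __init__(self, stream=None):
        self._stream = stream if stream is not None else io.BytesIO()
        self._index = {}  # name -> (offset, compressed length)

    def add(self, name, data):
        packed = zlib.compress(data)
        self._stream.seek(0, io.SEEK_END)
        offset = self._stream.tell()
        self._stream.write(packed)
        self._index[name] = (offset, len(packed))

    def position(self, name):
        """Average O(1) lookup of where an entry lives in the stream."""
        return self._index[name]

    def extract(self, name):
        """Seek straight to the entry; no scan over the other entries."""
        offset, length = self._index[name]
        self._stream.seek(offset)
        return zlib.decompress(self._stream.read(length))

    def __len__(self):
        return len(self._index)
```

Extracting one entry costs one dictionary lookup plus one seek, regardless of how many other entries the stream holds.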

Please look at the test_pzlib.pas, test_plzo.pas, test_plz4.pas, test_pbzip.pas and test_plzma.pas demos inside the zip file to see how to use my Parallel archiver.

And please don't directly use the ParalleZlib.pas that I have included inside the Parallel archiver zip file, because I have modified it to work correctly with my Parallel archiver.

If you want to use my ParallelZlib library just download it from my website, or download my other Parallel compression library.

You can now use my Parallel archiver as a hashtable from the hard disk with O(1) access. For example, you can stream your database row with my ParallelVarFiler into a memory stream or into a string and store it with my Parallel archiver in an archive; after that you can access your rows on the hard disk as a hashtable with O(1) access. You can use it like that as a database: if, for example, you have id keys that you want to map to database rows, it is a good idea to use my Parallel archiver as a hashtable.

Question:

What are the newest ideas behind your Parallel archiver?

Answer:

Of course my Parallel Archiver supports Parallel compression etc., but the newest ideas behind my Parallel Archiver are the following:

I have played with WinZip and 7Zip, but if you give them some files to extract or to test for integrity, they both (WinZip and 7Zip) will use sequential access, and that's bad I think. So I decided to implement O(1) access in my Parallel Archiver, which is very fast for extraction, for testing integrity, etc., and for that I used an in-memory hashtable that maintains the file names and their corresponding file positions. My second idea is that my Parallel Archiver is fault tolerant to power failures, and also to the file corruption you get when your hard disk is full, etc. I think 7Zip and WinZip are not fault tolerant to these kinds of problems.

I have just played with 7Zip: I compressed 3 files into an archive, then opened the archive with an editor, deleted some bytes and saved the file. After that, when I tried to open the archive, 7Zip responded that the file is corrupted, so 7Zip is not fault tolerant, and I think that with WinZip it's the same. But I have done the same test with my Parallel archiver, and it recovers from the file damage, so it's fault tolerant to this kind of damage, such as power failures, or a full disk that leaves you with file corruption, etc. I have implemented this kind of fault tolerance in my Parallel archiver.

I have updated my Parallel archiver and I have added the Update() method. It's overloaded: in the first version you pass a key name and a TStream, and in the second version you pass a key name and a filename. Please look at the test_pzlib.pas demo inside the zip file to see how to use those methods.

So now you have all the methods to use my Parallel archiver as a hashtable from the hard disk, with direct and very fast O(1) access to the compressed and/or encrypted data. DeleteFiles() has O(1) complexity, ExtractFiles() and Extract() also have O(1) complexity, GetInfo() is also O(1), and of course AddFiles() and the Test() method are also O(1). So now it's extremely fast.

When you want to do solid compression with my Parallel archiver using Bzip, you can use the same approach as Tar: first archive your files with compression level 0, and after that compress the whole archive file using Bzip. And when you want to encrypt your data with Parallel AES encryption, just give a password by setting the password property; when you don't want to encrypt, just set the password property to an empty string or don't set it at all. That's all.

Parallel archiver supports the storing and restoring of the following file attributes:

Hidden, Archive, System, and Read only attributes.

To store and restore them just set the AddAttributes property like this:

pzr.AddAttributes:=[ffArchive,ffReadOnly,ffHidden,ffSystem];

I have added in-memory archive support, because this way Parallel archiver will be much faster than with disk archives, and you will be able to lower the response time and the load on your server much more.

If you want to use an in-memory archive, pass an empty string as the file name in the constructor, like this:

pzr := TPLZ4Archiver.Create('',1000,4);

And if you want to read your in-memory archive, read from the Stream property that is exposed (a TStream), like this:

pzr.stream.position:=0;
A_Memory_Stream.copyfrom(pzr.stream,pzr.stream.size);

You can also load your archive from a file or memory stream just by assigning your file or memory stream to the Stream property (a TStream).

I have overloaded the GetKeys() method, so now you can use wildcards: pass the wildcard in the first argument and the TStringList in the second argument, like this:  pzr.getkeys('*.pas',st);
and after that call the ExtractFiles() method and pass it the TStringList.
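The wildcard selection described above amounts to filtering the archive's key names against a pattern before extraction. Here is a small sketch of that filtering step; the function name is hypothetical, the matching is case sensitive by assumption, and the post does not say which matching rules the real GetKeys() applies.

```python
from fnmatch import fnmatchcase


def get_keys(keys, pattern="*"):
    """Return the key names that match a shell-style wildcard pattern."""
    return [key for key in keys if fnmatchcase(key, pattern)]


if __name__ == "__main__":
    names = ["main.pas", "readme.txt", "units/helper.pas"]
    # The filtered list plays the role of the TStringList that is
    # later handed to ExtractFiles().
    print(get_keys(names, "*.pas"))
```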

As you can see, the programming interface of my Parallel archiver is very easy to use.

And read this:

"We're a video sharing site located in China. We rewrote the PHP memcached client extension by replacing zlib with QuickLZ. Then our server loads were dramatically reduced by up to 50%, the page response time was also boosted. Thanks for your great work!

Jiang Hong"

http://www.quicklz.com/testimonials.html

http://www.quicklz.com/

So as you can see, like QuickLZ or Qpress, I have implemented Parallel archiver to be very fast also.

By using my Parallel Zlib, my Parallel LZ4 or my Parallel LZO compression algorithms, my Parallel archiver will be very fast, and as I wrote on my webpage:

"So now you have all the methods to use my Parallel archiver as a Hashtable from the hardisk with direct access to the compressed and/or encrypted data with O(1) very fast access to the data, the DeleteFiles() has a O(1) complexity, the ExtractFiles() and Extract() have also O(1) complexity and GetInfo() is also O(1) and of course the AddFiles() is also O(1), the Test() method is also O(1). So now it's extremely fast."

You can even use my Parallel archiver as a hashtable database from the hard disk, to lower the load on your server (internet or intranet) even more and to improve the response time.

I have used solid compression, as with the tar.lzma format, and I have found that my Parallel archiver, with the maximum compression level clLZMAMax, compresses to the same size as 7Zip with maximum level compression and a dictionary size of 8 MB, compresses 13% better than WinRar with maximum level compression, and is much better than WinZip on compression ratio.

How to use solid compression with my Parallel archiver?

Just archive your files with clLZMANone, and after that compress your archive with clLZMAMax. Parallel archiver will then compress to the same size as 7Zip with maximum level compression and a dictionary size of 8 MB, it will compress 13% better than WinRar with maximum level compression, and it will compress much better than WinZip with maximum level compression.
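Why the store-first, compress-afterwards approach helps can be shown with a small experiment. This sketch uses Python's standard lzma module as a stand-in for Parallel LZMA and made-up sample data; it only illustrates the general effect of solid compression (redundancy shared between files is seen by one compressor) and says nothing about the ratios quoted above.

```python
import lzma
import random


def make_files(count=8, size=4096, seed=0):
    """Build sample files that share one incompressible block of content."""
    rng = random.Random(seed)
    shared = bytes(rng.getrandbits(8) for _ in range(size))
    return [shared + f"file {i}".encode() for i in range(count)]


def per_file_size(files):
    """Total size when every file is compressed on its own."""
    return sum(len(lzma.compress(data)) for data in files)


def solid_size(files):
    """Size when the files are stored back to back and compressed as one stream."""
    return len(lzma.compress(b"".join(files)))


if __name__ == "__main__":
    files = make_files()
    print("per file:", per_file_size(files), "bytes")
    print("solid   :", solid_size(files), "bytes")
```

The price of the solid layout is that a single file can no longer be extracted without decompressing the outer stream first.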

I have updated my Parallel archiver to a new version, and I have decided to include the Parallel LZ4 compression algorithm (one of the fastest in the world) in my Parallel archiver. So to compress bigger data, such as terabytes of data, you can use my Parallel LZO or my Parallel LZ4 compression algorithms with my Parallel archiver. I have also added the high compression mode to the Parallel LZ4 compression algorithm: for the fast mode use clLZ4Fast, and for the high compression mode use clLZ4Max. The Parallel LZ4 high compression mode is interesting also: it compresses much better than LZO and it is very fast on decompression, faster than Parallel LZO. I have included a test_plz4.pas demo inside my Parallel archiver zip file to show you how to use the Parallel LZ4 algorithm with my Parallel archiver.

Here is the LZ4 website if you want to read about it:

http://code.google.com/p/lz4/


I have also downloaded the IHCA compression algorithm from the following website:

http://objectegypt.com/

And I have written a Parallel IHCA and begun testing it against my Parallel LZO and my Parallel LZ4. They say on the IHCA website that it has the same performance as the LZO algorithm, but I have noticed in my benchmarks that the Parallel IHCA (that I wrote) is much slower than my Parallel LZO and my Parallel LZ4. So I think the IHCA compression algorithm is poor quality software that you should avoid. Please use my Parallel archiver and Parallel compression library instead, because with my Parallel LZO and my Parallel LZ4 they are now among the fastest in the world.

I have also downloaded the QuickLZ algorithm from:

http://www.quicklz.com/

and I have written a Parallel QuickLZ and tested it against my Parallel LZO and Parallel LZ4, and I have noticed that Parallel QuickLZ is slower than my Parallel LZ4 algorithm. Other than that, with QuickLZ you have to pay for a commercial license, but with my Parallel archiver and my Parallel compression library you pay $0 for a commercial license.

My Parallel archiver was updated: I have ported the Parallel LZ4 compression algorithm (one of the fastest in the world) to Windows 64 bit, so the Parallel LZ4 compression algorithm now works perfectly with Windows 32 bit and 64 bit. If you want to use Parallel LZ4 with Windows 64 bit, just copy the lz4_2.dll inside the LZ4_64 directory (that you find inside the zip file) to your current directory or to the c:\windows\system32 directory, and if you want to use Parallel LZ4 with Windows 32 bit, use the lz4_2.dll inside the LZ4_32 directory.

Here is more information about my Parallel archiver:

Parallel LZO supports Windows 32 bit and 64 bit

Parallel Zlib supports Windows 32 bit and 64 bit

Parallel LZ4 supports Windows 32 bit and 64 bit

Parallel Bzip is Windows 32 bit only

Parallel LZMA is Windows 32 bit only

But even though Parallel LZMA and Parallel Bzip are Windows 32 bit only, my Parallel archiver supports terabyte files, and your archive can grow to terabytes in size even with 32 bit Windows executables, and that's good.

And look also at the prices of the XCEED products:

XCEED Streaming compression library:

http://xceed.com/Streaming_ActiveX_Intro.html

and the XCEED Zip compression library:

http://xceed.com/Zip_ActiveX_Intro.html

http://xceed.com/pages/TopMenu/Products/ProductSearch.aspx?Lang=EN-CA


I don't think the XCEED products support parallel compression as my Parallel archiver
and my Parallel compression library do.

And just look also at the Easy compression library, for example: as you may have noticed, it is not a parallel compression library either.

http://www.componentace.com/ecl_features.htm

And look at its pricing:

http://www.componentace.com/order/order_product.php?id=4


My Parallel archiver and Parallel compression library cost you $0, and they are parallel compression libraries, and they are very fast and very easy to use, and they support Parallel LZ, Parallel LZ4, Parallel LZO, Parallel Zlib, Parallel Bzip and Parallel LZMA, and they come with the source code and much more...

Hope you will enjoy my Parallel archiver.

Here are the public methods that I have implemented:

Constructor Create(file1:string;size:integer;nbrprocs:integer);
- Creates a new TPZArchiver ready to use. file1 is the archive file, size is the hashtable size for the index (key file names and their corresponding file positions), and nbrprocs is the number of cores you specify to run Zlib, LZ4, LZO, Bzip and LZMA in parallel.

Destructor Destroy;
- Destroys the TPZArchiver object and cleans up.

function AddFiles;
- Adds the files to the archive.

function AddStream;
- Adds the stream to the archive.

function DeleteFiles;
- Deletes the TStringList content from the archive.

function Erase;
- Erases the data inside the archive and inside the hashtable.

function Update;
- Updates the file or the stream inside the archive

function ExtractFiles;
- Extracts the TStringList content from the archive.

function ExtractAll;
- Extracts all the files from the archive.

function Extract;
- Extracts the file to the stream.

function Test;
- Tests the integrity of the files inside the archive.

function GetInfo;
- Gets the file info that is returned in a TZSearchRec record.

function ClearFile;
- Deletes all contents of the archive.

function Clean:boolean
- Cleans the marked deleted items from the file.

function DeletedItems:integer
- Returns the number of items marked deleted.

function LoadIndex:boolean
- Loads the file name keys and their corresponding file position values from the file passed to the constructor into the hashtable.

function Exists(Name : String) : Boolean;
- Returns True if a file Name exists

procedure GetKeys(Strings : Tstrings);
- Fills up a TStrings descendant with all the file names.

function Count : Integer;
- Returns the number of files inside the archive.


PUBLIC PROPERTIES:

Indicator: boolean
- Shows the compression and decompression indicator.

CompressionLevel;
- Sets and reads the compression level.

Overwrite: boolean
- Updates and overwrites the file without asking.

Freshen: boolean
- Adds newer files to the archive and extracts newer files from the archive.

AddRecurse: boolean
- The AddFiles() method will recurse into subdirectories.

Stream: TStream
- The archive is exposed as a TStream, use it for an in-memory archive or a disk archive.

AddAttributes: TAttrOptions
- FindFile attributes for the AddFiles() method, look inside the FindFile component.

Language: FPC Pascal v2.2.0+ and Lazarus  / Delphi 7 to 2007: http://www.freepascal.org/

Operating Systems: Win32 and Win64

And inside defines.inc you can use the following defines:

{$DEFINE CPU32} and {$DEFINE Win32} for 32 bit systems

{$DEFINE CPU64} and {$DEFINE Win64} for 64 bit systems

Required FPC switches: -O3 -Sd -dFPC -dWin32 -dFreePascal

-Sd for delphi mode....

Required Delphi switches: -DMSWINDOWS -$H+ -DDelphi



Thank you,
Amine Moulay Ramdane.

mercury

« Reply #1 on: February 28, 2015, 11:15:22 am »
The created files are not compatible with WinZip, 7z or other software. That's too bad.

mercury

« Reply #2 on: February 28, 2015, 01:02:05 pm »
- Compiles into exe - no dll/ocx required.

According to my test, only plzo and pzlib run without DLL.

- Full source codes available.

So where is the source code for these files?
Code:
Zlib_fpc32/adler32.o
Zlib_fpc32/crc32.o
Zlib_fpc32/deflate.o
Zlib_fpc32/infback.o
Zlib_fpc32/inffast.o
Zlib_fpc32/inflate.o
Zlib_fpc32/inftrees.o
Zlib_fpc32/match.o
Zlib_fpc32/trees.o
Zlib_fpc32/zutil.o

 
