
Author Topic: About my compression library and my Parallel archiver...  (Read 3291 times)

aminer

  • Hero Member
  • *****
  • Posts: 956
About my compression library and my Parallel archiver...
« on: April 08, 2013, 08:34:29 pm »

Hello,

I have downloaded the Easy Compression Library, which costs $149
and up.

Here it is:

http://www.componentace.com/ecl_features.htm

I noticed that it supports three compression algorithms: Zlib, Bzip and PPM.
I benchmarked it against my Parallel compression library, and I found
that the Easy Compression Library at its maximum compression level,
ppmMax (PPM with maximum compression level), gives a lower compression ratio than my Parallel compression library at its maximum compression level, clLZMAMax (LZMA with maximum compression level).
Try it yourself and see. So my Parallel compression library and my Parallel archiver are better on compression ratio.

If we now look at performance and scalability,
my Parallel compression library and my Parallel archiver
are very fast. You can even use my Parallel archiver as a hashtable on the hard disk with O(1) access, and they scale with the number of cores; the Easy Compression Library does not.

If we look at reliability, my Parallel compression library
and my Parallel archiver are very stable now, and they don't use
too many memory resources.

If we also look at usability, my Parallel compression library and my Parallel archiver are very easy to use.

My Parallel compression library supports the Parallel LZMA, Parallel Bzip, Parallel LZ, Parallel LZO and Parallel Gzip compression algorithms, and my Parallel archiver supports Parallel LZMA, Parallel Bzip, Parallel LZO and Parallel Zlib.
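
As a rough illustration of the general idea behind parallel compression (not the library's actual implementation, which is written in Pascal), the input can be split into independent blocks that are compressed on separate workers. The sketch below uses Python's standard zlib module; the block size and the function names compress_parallel and decompress_parallel are assumptions made up for this example.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import List

BLOCK_SIZE = 64 * 1024  # hypothetical block size, chosen only for illustration


def compress_parallel(data: bytes, workers: int = 4, level: int = 6) -> List[bytes]:
    """Split data into fixed-size blocks and compress each block independently."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # CPython's zlib releases the GIL while compressing, so the
        # worker threads can run on several cores at the same time.
        return list(pool.map(lambda block: zlib.compress(block, level), blocks))


def decompress_parallel(blocks: List[bytes], workers: int = 4) -> bytes:
    """Decompress the blocks in parallel and join them in their original order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(zlib.decompress, blocks))


if __name__ == "__main__":
    original = b"parallel archiver " * 20000
    packed = compress_parallel(original)
    assert decompress_parallel(packed) == original
    print(len(packed), "blocks,", sum(len(b) for b in packed), "compressed bytes")
```

Compressing blocks independently usually costs a little compression ratio compared with one continuous stream, because each block starts with an empty dictionary; that is the usual trade-off for scaling with the number of cores.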

They are compatible with 32-bit and 64-bit Windows.

My Parallel compression library and my Parallel archiver don't
cost you anything: they are free and they come with source code.

I have also updated my Parallel archiver.

Description: Parallel archiver using my Parallel Zlib, Parallel LZO, Parallel Bzip and Parallel LZMA compression algorithms.

Supported features:

- Opens and creates archives using my Parallel Zlib, Parallel LZO, Parallel Bzip or Parallel LZMA compression algorithms.

- Compiles into the exe: no dll/ocx required.

- 64-bit support: lets you create archive files over 4 GB

- Now my Parallel Zlib gives 5% better performance than Pigz.

- Supports memory and file streams

- You can use it as a hashtable on the hard disk

- Fault tolerant to power failures etc.

- Supports Parallel AES encryption.

- Parallel compression and parallel decompression are extremely fast

- Supports both compression and decompression rate indicators

- You can test the integrity of your archive

- Easy object programming interface

- Full source code available.

- Platform: Win32, Win64


Please look at the test_pzlib.pas, test_plzo.pas, test_pbzip.pas and test_plzma.pas demos inside the zip file; compile and execute them.

Note: the test_plzma.pas demo compiles only under Delphi.

To delete files inside the archive, call the DeleteFiles() method. DeleteFiles() does not remove the files physically; it only marks them as deleted. To remove them completely, call the DeletedItems() method to see how many files are marked as deleted, and then call the Clean() method to remove them from the archive for good. I have implemented it like that because it's better in my opinion.
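
The mark-then-clean pattern described above can be sketched in a few lines. This is a toy in-memory model in Python, not the archiver's Pascal code; the class name TombstoneStore and its method names are hypothetical and only loosely mirror DeleteFiles(), DeletedItems() and Clean().

```python
class TombstoneStore:
    """Toy store: deleting only marks entries, clean() really removes them."""

    def __init__(self):
        self._entries = {}     # name -> data
        self._deleted = set()  # names that are marked as deleted

    def add(self, name, data):
        self._entries[name] = data
        self._deleted.discard(name)  # re-adding a name clears its mark

    def delete_files(self, names):
        """Mark entries as deleted; their data stays where it is."""
        for name in names:
            if name in self._entries:
                self._deleted.add(name)

    def deleted_items(self):
        """Number of entries that are marked but not yet removed."""
        return len(self._deleted)

    def exists(self, name):
        return name in self._entries and name not in self._deleted

    def stored_bytes(self):
        """Bytes still occupied, including entries that are only marked."""
        return sum(len(data) for data in self._entries.values())

    def clean(self):
        """Physically remove all marked entries; return how many were removed."""
        removed = len(self._deleted)
        for name in self._deleted:
            del self._entries[name]
        self._deleted.clear()
        return removed


if __name__ == "__main__":
    store = TombstoneStore()
    store.add("a.txt", b"123")
    store.add("b.txt", b"45")
    store.delete_files(["a.txt"])
    print(store.deleted_items(), store.stored_bytes())  # space not reclaimed yet
    store.clean()
    print(store.deleted_items(), store.stored_bytes())  # space reclaimed
```

The point of the two-step design is that marking is cheap and can be done often, while the expensive rewrite of the stored data happens only when clean() is called.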

Other than that, you have to call LoadIndex() just after you create your TPZlibArchiver, TPLZOArchiver, TPBzipArchiver or TPLZMAArchiver object with the constructor. It's mandatory, and by calling the LoadIndex() method, my Parallel archiver will be fault tolerant to power failures etc.

My Parallel archiver uses a hashtable to store the file names and their corresponding file positions, so that you have direct access to the files inside the archive when decompressing, deleting etc., so it's very fast.
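
A minimal sketch of that idea, assuming a simple layout where compressed entries are appended to one file and an in-memory dictionary maps each name to its position. It is written in Python for brevity; the class IndexedArchive and its methods are hypothetical and are not the archiver's real interface.

```python
import io
import zlib


class IndexedArchive:
    """Toy archive: compressed entries in one file, plus a name -> position index."""

    def __init__(self, fileobj):
        self._f = fileobj  # any seekable binary file object
        self._index = {}   # file name -> (offset, compressed length)

    def add(self, name: str, data: bytes) -> None:
        blob = zlib.compress(data)
        self._f.seek(0, io.SEEK_END)  # always append at the end of the file
        offset = self._f.tell()
        self._f.write(blob)
        self._index[name] = (offset, len(blob))

    def exists(self, name: str) -> bool:
        return name in self._index

    def extract(self, name: str) -> bytes:
        offset, length = self._index[name]  # hash lookup, O(1) on average
        self._f.seek(offset)                # jump straight to the entry
        return zlib.decompress(self._f.read(length))


if __name__ == "__main__":
    archive = IndexedArchive(io.BytesIO())
    archive.add("x.txt", b"hello")
    archive.add("y.txt", b"world" * 100)
    print(archive.extract("y.txt")[:10])
```

The O(1) here refers to locating an entry, which does not depend on how many other files the archive holds; reading and decompressing the entry itself still takes time proportional to its size.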

Please look at the test_pzlib.pas, test_plzo.pas and test_pbzip.pas demos inside the zip file to see how to use my Parallel archiver.

And please don't use the ParallelZlib.pas that I have included inside the Parallel archiver zip file directly, because I have modified it to work correctly with my Parallel archiver.

If you want to use my ParallelZlib library just download it from my website, or download my other Parallel compression library.

You can now use my Parallel archiver as a hashtable on the hard disk with O(1) access. For example, you can
stream your database row with my ParallelVarFiler into a memory stream or into a string, and store it with my Parallel archiver in an archive; after that you can access your rows on the hard disk as a hashtable with O(1) access. You can use it like that as a database: if, for example, you have id keys that you want to map to database rows, it is a good idea to
use my Parallel archiver as a hashtable.

Question:

What are your newest ideas behind your Parallel archiver?

Answer:

Of course my Parallel Archiver supports Parallel compression etc. but my newest ideas behind my Parallel Archiver are the following:

I have played with WinZip and 7Zip, but if you give them some files to extract or to test their integrity, they both (WinZip and 7Zip) use sequential access, and that's bad I think. So I have decided to implement O(1) access in my Parallel Archiver, which is very fast for extraction, for testing the integrity etc., and for that I have used an in-memory hashtable that maintains the file names and their corresponding file positions. My second idea is that my Parallel Archiver is fault tolerant to power failures, and also to the file corruption you get when your hard disk is full etc. 7Zip and WinZip, I think, are not fault tolerant to those kinds of problems.

I have just played with 7Zip: I compressed 3 files into an archive, then I opened the archive with an editor, deleted some bytes and saved the file. After that, when I tried to open the archive, 7Zip responded that the file is corrupted, so 7Zip is not fault tolerant, and I think that with WinZip it's the same. I have done the same test with my Parallel archiver, and it recovers from the file damage, so it's fault tolerant to this kind of damage, such as power failures, or a full disk that gives you file corruption etc. I have implemented this kind of fault tolerance in my Parallel archiver.
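
The post does not say how the recovery works internally. One common way to get this kind of behaviour, shown here purely as an assumption, is to give every record its own marker and checksum, so that a scan can rebuild the index from the records that still verify and skip the damaged ones. The record layout, the marker PAR1 and the function names below are all invented for this Python sketch.

```python
import struct
import zlib

MAGIC = b"PAR1"                   # hypothetical record marker
HEADER = struct.Struct("<4sHII")  # marker, name length, data length, CRC-32


def pack_record(name: str, data: bytes) -> bytes:
    """Build one self-describing record: header, then name, then data."""
    raw_name = name.encode("utf-8")
    crc = zlib.crc32(raw_name + data)
    return HEADER.pack(MAGIC, len(raw_name), len(data), crc) + raw_name + data


def rebuild_index(blob: bytes) -> dict:
    """Scan for records whose checksum verifies and skip anything damaged."""
    index = {}  # file name -> (data offset, data length)
    pos = 0
    while True:
        pos = blob.find(MAGIC, pos)
        if pos < 0 or pos + HEADER.size > len(blob):
            break  # no further complete header in the file
        _, name_len, data_len, crc = HEADER.unpack_from(blob, pos)
        start = pos + HEADER.size
        end = start + name_len + data_len
        payload = blob[start:end]
        if end <= len(blob) and zlib.crc32(payload) == crc:
            name = payload[:name_len].decode("utf-8")
            index[name] = (start + name_len, data_len)
            pos = end  # continue right after this valid record
        else:
            pos += 1   # damaged or truncated: search for the next marker
    return index


if __name__ == "__main__":
    records = [pack_record(n, n.encode() * 20) for n in ("a.txt", "b.txt", "c.txt")]
    damaged = bytearray(b"".join(records))
    damaged[len(records[0]) + HEADER.size + 10] ^= 0xFF  # corrupt one byte of b.txt
    print(sorted(rebuild_index(bytes(damaged))))
```

With such a layout, damage to one record costs only that record: the entries before and after it remain reachable, which matches the behaviour reported in the test above.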

I have updated my Parallel archiver and added the Update() method. It is overloaded: in the first version you pass a key name and a TStream, and in the second version you pass a key name and a filename. Please look at the test_pzlib.pas demo inside the zip file to see how to use those methods.

So now you have all the methods to use my Parallel archiver as a hashtable on the hard disk, with direct and very fast O(1) access to the compressed and/or encrypted data: DeleteFiles() has O(1) complexity, ExtractFiles() and Extract() also have O(1) complexity, GetInfo() is also O(1), and of course AddFiles() and the Test() method are also O(1). So now it's extremely fast.

When you want to do solid compression with my Parallel archiver using Bzip, you can use the same method that Tar uses: first archive your files with compression level 0, and after that compress the whole archive file using Bzip.

When you want to encrypt your data with Parallel AES encryption, just give a password by setting the password property; when you don't want to encrypt, set the password property to a null string or don't set it at all. That's all.
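
The Tar-style solid compression mentioned above (store first, compress the whole container afterwards) can be illustrated with Python's standard tarfile and bz2 modules. This is only a sketch of the technique, not of the archiver itself; pack_solid and unpack_solid are hypothetical helper names.

```python
import bz2
import io
import tarfile


def pack_solid(files: dict) -> bytes:
    """Store files uncompressed in a tar container, then bzip2 the whole container."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:  # mode "w": no compression
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return bz2.compress(buf.getvalue())


def unpack_solid(blob: bytes) -> dict:
    """Reverse of pack_solid: decompress once, then read every member."""
    out = {}
    with tarfile.open(fileobj=io.BytesIO(bz2.decompress(blob)), mode="r") as tar:
        for member in tar.getmembers():
            out[member.name] = tar.extractfile(member).read()
    return out


if __name__ == "__main__":
    files = {
        f"row{i}.txt": (f"id={i};" + "name=example;city=Montreal;" * 8).encode()
        for i in range(100)
    }
    solid = pack_solid(files)
    per_file = sum(len(bz2.compress(data)) for data in files.values())
    assert unpack_solid(solid) == files
    print("solid:", len(solid), "bytes; file by file:", per_file, "bytes")
```

Solid compression tends to win on many small, similar files because the compressor sees the redundancy across files. The price is that a single file can no longer be extracted without decompressing the container first, so direct O(1) access to individual entries is lost.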

Parallel archiver supports the storing and restoring of the following file attributes:

Hidden, Archive, System, and Read only attributes.

To store and restore them just set the AddAttributes property like this:

pzr.AddAttributes:=[ffArchive,ffReadOnly,ffHidden,ffSystem];

The compression ratio of my Parallel archiver at maximum compression level is the same as WinZip at maximum compression level, but it is 15% to 17% lower than 7Zip at maximum compression level, because 7Zip uses "block solid compression".

And read this:

"We're a video sharing site located in China. We rewrote the PHP memcached client extension by replacing zlib with QuickLZ. Then our server loads were dramatically reduced by up to 50%, the page response time was also boosted. Thanks for your great work!

Jiang Hong"

http://www.quicklz.com/testimonials.html

http://www.quicklz.com/

So as you have noticed, like QuickLZ or Qpress, I have implemented my Parallel archiver with the idea of favoring speed over compression ratio. At maximum compression level my Parallel archiver is the same as WinZip but 15% to 17% lower than 7Zip, but with my Parallel Zlib or my Parallel LZO compression algorithms my Parallel archiver will be very fast, and as I have written on my webpage:

"So now you have all the methods to use my Parallel archiver as a hashtable on the hard disk, with direct and very fast O(1) access to the compressed and/or encrypted data: DeleteFiles() has O(1) complexity, ExtractFiles() and Extract() also have O(1) complexity, GetInfo() is also O(1), and of course AddFiles() and the Test() method are also O(1). So now it's extremely fast."

So as you have noticed, since my Parallel archiver favors speed over compression ratio, you can use it as a hashtable database on the hard disk to further lower the load on your server (internet or intranet) and boost the response time.

Hope you will enjoy my Parallel archiver.


You can download my Parallel archiver and my Parallel compression library from:

http://pages.videotron.com/aminer/


Here are the public methods that I have implemented:

Constructor Create(file1: string; size: integer; nbrprocs: integer);
- Creates a new TPZArchiver ready to use. file1 is the archive file, size is the hashtable size for the index (the file name keys and their corresponding file positions), and nbrprocs is the number of cores you specify to run Zlib, LZO, Bzip and LZMA in parallel.

Destructor Destroy;
- Destroys the TPZArchiver object and cleans up.

function AddFiles;
- Add the files to the archive.

function AddStream;
- Add the stream to the archive.

function DeleteFiles;
- Delete the TStringList content from the archive.

function Erase;
- Erase the data inside the archive and inside the hashtable.

function Update;
- Update the file or the stream inside the archive

function ExtractFiles;
- Extract the TStringList content from the archive.

function ExtractAll;
- Extract all the files from the archive.

function Extract;
- Extract the file to the stream.

function Test;
- Test the integrity of the files inside the archive.

function GetInfo;
- Get the file info that is returned in a TZSearchRec record.

function ClearFile;
- Deletes all contents of the archive.

function Clean:boolean
- Clean the marked deleted items from the file.

function DeletedItems:integer
- Return the number of items marked deleted.

function LoadIndex:boolean
- Load the file name keys and their corresponding file position values from the file passed to the constructor into the hashtable.

function Exists(Name : String) : Boolean;
- Returns True if a file Name exists

procedure GetKeys(Strings : Tstrings);
- Fills up a TStrings descendant with all the file names.

function Count : Integer;
- Returns the number of files inside the archive.


PUBLIC PROPERTIES:

Indicator : boolean
- To show the compression and decompression indicator.
CompressionLevel;
- Set and read the compression level.
Overwrite:boolean
- To update and overwrite the file without asking.
Freshen: boolean
- Add newer files to the archive and extract newer files from the archive.
AddRecurse: boolean
- AddFiles() method will recurse on subdirectories.
AddAttributes: TAttrOptions
- FindFile attributes for the AddFiles() method; see the FindFile component.

Language: FPC Pascal v2.2.0+ / Delphi 7+: http://www.freepascal.org/

Operating Systems: Win32 and Win64 (will be ported soon to Linux and Mac (x86)).

And inside defines.inc you can use the following defines:

{$DEFINE CPU32} and {$DEFINE Win32} for 32-bit systems

{$DEFINE CPU64} and {$DEFINE Win64} for 64-bit systems

Required FPC switches: -O3 -Sd -dFPC -dWin32 -dFreePascal

-Sd for Delphi mode.

Required Delphi switches: -DMSWINDOWS -$H+ -DDelphi



Thank you,
Amine Moulay Ramdane.

aminer

  • Hero Member
  • *****
  • Posts: 956
Re: About my compression library and my Parallel archiver...
« Reply #1 on: April 08, 2013, 08:52:18 pm »

On 4/8/2013 2:28 PM, aminer wrote:> If we take a look at the Relibalitity, my Parallel compression library
> and my Parallel archiver are very stable now, and they don't take
> too much memory ressources.


Sorry, I mean reliability: my Parallel compression library
and my Parallel archiver are very stable now, and they don't use
too many memory resources.



Amine Moulay Ramdane.

 
