Author Topic: Very early support for .xlsx protection [PATCH]  (Read 11688 times)

shobits1

  • Sr. Member
  • ****
  • Posts: 271
Very early support for .xlsx protection [PATCH]
« on: March 04, 2017, 12:07:51 am »
I was trying to carry the Workbook, Worksheet and Cell protection information over from the .xlsx file that is read to the one that is saved.
I'm happy with the result so far: the saved file retains all the protection information from the original one (except for the shared-file protection/revision). You can also protect unprotected files, but there is no password support for now.

For anyone interested, the patch is attached.

BTW, wp, if I want to add password support, which hash library would you prefer?
« Last Edit: March 04, 2017, 12:10:52 am by shobits1 »

wp

  • Hero Member
  • *****
  • Posts: 11906
Re: Very early support for .xlsx protection [PATCH]
« Reply #1 on: March 04, 2017, 01:00:42 am »
Wow, that's a surprise. Thank you. I'll have to look into the patch in more detail. For the moment, just a few remarks:

Maybe one thing to consider: FPSpreadsheet should not be an Excel-only library. Could you find out how LibreOffice handles protection? Even if you would not want to write a reader/writer for protection in ods files the data types must be general enough to be compatible with both cases. But I think it will be similar.

How about removing the sheet protection enumerations of unsupported items from TsWorksheetProtection? We don't need spAutoFilter, spPivotTables or spScenario because being unsupported they will be dropped in reading anyway. They can be added later if these features should be available in some time.

In InitFormatRecord you initialize the ProtectionMode with [prLockCell]. Doesn't this mean that cells are locked by default? Or does it mean that cells are locked by default if the sheet is protected - that weird way of thinking which always confused me... Does Calc of LibreOffice follow the same strategy?

As for password hashing, most preferably I would not want to add a dependence on another package or an external dll. If absolutely necessary, the password stuff must be IFDEF'ed in order to be able to make fpspreadsheet standalone again. Maybe you should have a look at the fpc folder packages/hash/src - if you find anything suitable here this would be the way to go.

shobits1

Re: Very early support for .xlsx protection [PATCH]
« Reply #2 on: March 04, 2017, 02:25:48 am »
Thank you for the quick reply :).

To tell the truth, I don't have deep experience with FPSpreadsheet or with the xlsx/ods file formats. I just implemented this when I saw a need for the modified file to stay protected from the user.
Anyway, I was reading ECMA-376, Fifth Edition, Part 1 - Fundamentals And Markup Language Reference.zip, which .xlsx files are based on, and that's why I added all the sheet protection enumerations (you may remove any unsupported enumeration; I would have done that myself if I knew FPSpreadsheet better).

InitFormatRecord initializes the ProtectionMode with [prLockCell] because that is the default in Excel: each cell has Locked checked. The correct behavior (according to Excel) is that no protection is enforced until the sheet is protected (once again, I don't know how LibreOffice behaves).
Quote

Locking cells or hiding formulas has no effect until you protect the worksheet ...

from Excel
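
The rule quoted above can be written down as a tiny truth function. This is only a sketch: `CellIsEditable` and its two Boolean parameters are hypothetical names chosen for illustration and are not part of FPSpreadsheet or of the patch.

```pascal
program LockSemantics;
{$mode objfpc}{$H+}

// Hypothetical helper (not an FPSpreadsheet function): a cell can be
// edited unless the sheet is protected AND the cell carries the lock flag.
function CellIsEditable(ASheetProtected, ACellLocked: Boolean): Boolean;
begin
  Result := not (ASheetProtected and ACellLocked);
end;

begin
  // Lock flag set (the Excel default), but the sheet is not protected
  WriteLn(CellIsEditable(False, True));   // TRUE
  // Lock flag set and the sheet is protected
  WriteLn(CellIsEditable(True, True));    // FALSE
  // Sheet protected, but the cell was explicitly unlocked
  WriteLn(CellIsEditable(True, False));   // TRUE
end.
```

So a default of [prLockCell] changes nothing for the user until sheet protection is switched on.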

As for the hashing algorithm, the format says you can use MD2, MD4, MD5, RIPEMD-128, RIPEMD-160, SHA-1, SHA-256, SHA-384, SHA-512 or WHIRLPOOL, but I found that Excel 2010 generates files that use none of those (maybe it is using a legacy algorithm).

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11444
  • FPC developer.
Re: Very early support for .xlsx protection [PATCH]
« Reply #3 on: March 04, 2017, 05:25:45 am »

wp

Re: Very early support for .xlsx protection [PATCH]
« Reply #4 on: March 04, 2017, 10:53:44 pm »
Please test r5783 in which I applied your patch with some modifications:
  • I renamed some identifiers
  • Since Excel has that strange system of some protection options being on and others being off by default, it is not possible to have the worksheet protection empty for an unprotected sheet. Hence I added another option, soProtected, which must be included in the worksheet Options to enable protection. Simply call worksheet.Protect(true) to do this (the parameter false removes the option again and unprotects the sheet).
  • I wrote some unit tests to check integrity after writing and reading. All these tests are passed. In this context I realized that some access methods should be helpful. Therefore, the worksheet now has methods WriteCellProtection and ReadCellProtection.
  • I did not test the crypto parameters. My Excel 2007 does not show them, possibly they came in later. Could you test this please?
Also see the wiki (http://wiki.lazarus.freepascal.org/FPSpreadsheet#Protection) for a first version of the documentation.

Something strange: When I define a password for the workbook protection in Excel the file cannot be opened by fpspreadsheet any more, and the zip file signature (PK) is gone. Therefore, I assume that this password encrypts the file. But what is the difference to the "Encrypt document" command in the menu?
« Last Edit: March 04, 2017, 10:57:11 pm by wp »

shobits1

Re: Very early support for .xlsx protection [PATCH]
« Reply #5 on: March 05, 2017, 01:46:00 am »
I'll try and do some tests tomorrow, but I have some remarks:

Quote from: wp
...it is not possible to have the worksheet protection empty for an unprotected sheet...

Code: XML
<sheetProtection password="CF7A" sheet="1" selectLockedCells="1"/>
The sheet attribute of sheetProtection is the one that controls whether the worksheet is protected or not. If sheet="0", all other protection settings are ignored.

Quote from: wp
I did not test the crypto parameters. My Excel 2007 does not show them, possibly they came in later

I think Excel 2010, Excel 2007 and the older formats use the same password hashing algorithm; Excel 2013 and later use the more secure hashes.
Code: Pascal
TsCryptoInfo = record
  Password: string; // holds the password hash for older Excel versions (2010 and earlier)

  // These fields are for the later versions
  AlgorithmName: string;
  HashValue: string;
  SaltValue: string;
  SpinCount: Integer;
end;
I intentionally added the Password field to make distinguishing between old and new hashing a bit easier; maybe removing it and assigning AlgorithmName := EXCELOLDHASH would make more sense.

Code: Pascal
// This is the code for generating the password hash of Excel 2010 and earlier
function Hash_Password( const APassword: string ): string;
const
  Key = $CE4B;
var
  i : Integer;
  HashValue: Word = 0;
begin
  for i := Length( APassword ) downto 1 do
  begin
    HashValue := ord(APassword[i]) xor HashValue;
    HashValue := HashValue shl 1;
  end;
  HashValue := HashValue xor Length( APassword ) xor Key;

  Result := IntToHex(HashValue, 4);
end;

Quote from: wp
Something strange: When I define a password for the workbook protection in Excel the file cannot be opened by fpspreadsheet any more, and the zip file signature (PK) is gone. Therefore, I assume that this password encrypts the file. But what is the difference to the "Encrypt document" command in the menu?

That never happened to me with Excel 2010 (I assume it's the same with 2007); the file loses its signature only when the document is encrypted. I usually protect the workbook from the Review tab.

shobits1

Re: Very early support for .xlsx protection [PATCH]
« Reply #6 on: March 05, 2017, 11:23:44 am »
OK, I did some testing and I think everything works as it should.

One thing though, according to the documentation:
Code: Pascal
cellprot := worksheet.ReadCellProtection(cell);
this should work, but there is no ReadCellProtection, although ReadProtection is there (maybe it needs renaming).

wp

Re: Very early support for .xlsx protection [PATCH]
« Reply #7 on: March 05, 2017, 12:20:40 pm »
Quote from: shobits1
I'll try and do some tests tomorrow, but I have some remarks:

...it is not possible to have the worksheet protection empty for an unprotected sheet...
Code: XML
<sheetProtection password="CF7A" sheet="1" selectLockedCells="1"/>
The attribute sheet under sheetProtection is the one controlling if the worksheet is protected or not. if sheet="0",, it will ignore all other protection.

I think this is a very Excel-specific way to do it. I think that the elements of the sheet Protection set should behave like those of the cell protection: they need another "event" to become active - in case of the cell protection the sheet protection must be activated before the cell protection items become active. And in case of the sheet protection? The event to activate this in Excel's implementation is an element of the set itself. This concept is very confusing, and it took me a long time to understand what spSheet really means, and maybe some of the words in my previous post are wrong for this reason. In my implementation sheet protection is activated by adding soProtected to the all-purpose Options of the worksheet. I left spSheet in the set, but renamed it to spCells because I think there might be concepts in which all cells can be allowed for editing but anything else is frozen. (Probably "spEdit" would be a better word?). Next, I'll look at how LibreOffice handles security. If they do it like Excel, I'll probably remove spCells altogether.

Quote from: shobits1
I intentionally add the password field so to make distinguishing between old and new hashing bit easier; maybe removing it and assigning AlgorithmName:=EXCELOLDHASH will make more sense
[...]
This is the code for generating Excel 2010 and earlier password's hash

Thanks for posting the Excel hashing algorithm.

Since the old Excel versions also have the encrypted password in the file, I would prefer to write it to the field "HashValue" of the TsCryptoInfo, yes, with EXCELHASH (or similar) in the AlgorithmName field. Maybe a better word for "HashValue" would be "PasswordHash".

Quote from: shobits1
Never happened to me under Excel 2010 (I assume it's the same with 2007),, only when Encrypting the document the files loses its signature; I usually protect the workbook from the Review tab.

My Excel experience with protection and encryption is very limited. Here's what I do: I open Excel 2007, it comes up with an empty worksheet. I go to "Review", select "Protect workbook" and enter a password ("test") in the edit box (leave the two checkboxes alone). I confirm the password in the next dialog, and then I save as xlsx. I open the file in a hex editor. The first two bytes in the file are $D0 $CF which is not the signature of a zip file which would be $50 $4B ('PK'). Therefore, fpspreadsheet cannot read the file. Even if we had an implementation of decryption it would not be possible to read the workbook cryptoinfo field for the algorithm to be applied for decryption. Really bad implementation by MS unless they assume that the file is always encrypted using their own algorithm...
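
The hex-editor check described above can also be done in code. The following is a minimal sketch, assuming only the two signatures mentioned in this post ($50 $4B for a zip file, $D0 $CF for the other container); `TContainerKind` and `DetectContainer` are hypothetical names and not part of fpspreadsheet.

```pascal
program CheckContainer;
{$mode objfpc}{$H+}
uses
  Classes, SysUtils;

type
  // Hypothetical classification, based only on the first two bytes
  TContainerKind = (ckUnknown, ckZip, ckOleCompound);

// Reads the first two bytes of the stream and restores the stream position.
function DetectContainer(AStream: TStream): TContainerKind;
var
  sig: array[0..1] of Byte;
  savedPos: Int64;
begin
  Result := ckUnknown;
  savedPos := AStream.Position;
  try
    AStream.Position := 0;
    if AStream.Read(sig, SizeOf(sig)) <> SizeOf(sig) then
      Exit;                                   // too short to carry a signature
    if (sig[0] = $50) and (sig[1] = $4B) then
      Result := ckZip                         // 'PK': plain xlsx
    else if (sig[0] = $D0) and (sig[1] = $CF) then
      Result := ckOleCompound;                // what Excel 2007 wrote here
  finally
    AStream.Position := savedPos;
  end;
end;

var
  fs: TFileStream;
begin
  if ParamCount < 1 then
  begin
    WriteLn('Usage: checkcontainer <file>');
    Halt(1);
  end;
  fs := TFileStream.Create(ParamStr(1), fmOpenRead or fmShareDenyWrite);
  try
    case DetectContainer(fs) of
      ckZip:         WriteLn('zip signature (PK) found');
      ckOleCompound: WriteLn('$D0 $CF signature found, not a zip file');
    else
      WriteLn('unknown signature');
    end;
  finally
    fs.Free;
  end;
end.
```

A reader could run such a check before trying to unzip, so that a file like the one described here is rejected with a clear message instead of a zip error.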

Is there an Excel setting somewhere which automatically causes file encryption when workbook protection with a password is activated?

shobits1

Re: Very early support for .xlsx protection [PATCH]
« Reply #8 on: March 05, 2017, 06:18:01 pm »
I ended up downloading Office/Excel 2007; unfortunately, it does indeed encrypt the file once the workbook protection is activated, unlike Excel 2010. I'll try searching to see if I can find a way to decrypt the file (no promises though).

shobits1

Re: Very early support for .xlsx protection [PATCH]
« Reply #9 on: March 05, 2017, 10:30:57 pm »
While searching, I found that the algorithm for calculating the password hash that I posted previously doesn't work well: it can't handle special characters or characters from other languages (e.g. the calculated hash of ok1YES3!@ is not the same as Excel's). The following one should return the correct hash value (tested with Excel 2010/2007 for a protected worksheet).

Code: Pascal
function ExcelPasswordHash( const APassword: string ): string;
var
  i : Integer;
  PassLen : Integer;
  Password: string;
  PassHash: Word = 0;
begin
  // The algorithm works on single-byte characters, so convert from UTF-8
  // first (UTF8ToWinCP is declared in the LazUTF8 unit).
  Password:= UTF8ToWinCP( APassword );
  PassLen := Length(Password);

  // An empty password has no hash
  if PassLen = 0 then
  begin
    Result := '';
    exit;
  end;

  // Walk the password backwards: rotate the 15-bit hash left by one bit,
  // then xor in the character code.
  for i:= PassLen downto 1 do
  begin
    PassHash:= ((PassHash shr 14) and  $0001) or ((PassHash shl  1) and  $7fff);
    PassHash:= PassHash xor ord(Password[i]);
  end;

  // One final rotation, then mix in the length and the constant $CE4B
  PassHash:= ((PassHash shr 14) and  $0001) or ((PassHash shl  1) and  $7fff);
  PassHash:= PassHash xor PassLen xor $CE4B;

  Result := IntToHex(PassHash, 4);
end;

Sorry for any inconvenience.  :(
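
For anyone who wants to check the routine without Lazarus, here is a standalone sketch of the same steps. It assumes an ASCII-only password so that the code page conversion can be left out; `LegacyExcelHashAscii` is a hypothetical name used only for this example.

```pascal
program LegacyHashDemo;
{$mode objfpc}{$H+}
uses
  SysUtils;

// ASCII-only variant of the 16-bit hash posted above (assumption: no
// code page conversion is needed because every character is one byte).
function LegacyExcelHashAscii(const APassword: AnsiString): string;
var
  i: Integer;
  h: Word;
begin
  Result := '';
  if APassword = '' then
    Exit;
  h := 0;
  for i := Length(APassword) downto 1 do
  begin
    // rotate the 15-bit value left by one bit, then xor in the character
    h := ((h shr 14) and $0001) or ((h shl 1) and $7FFF);
    h := h xor Ord(APassword[i]);
  end;
  // final rotation, then mix in the length and the constant
  h := ((h shr 14) and $0001) or ((h shl 1) and $7FFF);
  h := h xor Length(APassword) xor $CE4B;
  Result := IntToHex(h, 4);
end;

begin
  WriteLn(LegacyExcelHashAscii('test'));  // prints CBEB
end.
```

Traced by hand for 'test': the loop yields $74, $9B, $153, $2D2; the final rotation gives $5A4, xor 4 gives $5A0, and xor $CE4B gives $CBEB.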

wp

Re: Very early support for .xlsx protection [PATCH]
« Reply #10 on: March 06, 2017, 12:33:02 am »
Thank you, no problem. Although we don't have a use for this function at the moment, I put it into the new unit fpscrypto, which is meant to contain all encryption/decryption related stuff.

Today I wrote readers for the old xls formats (biff2, biff5, biff8) and the ods format. All follow the same philosophy as the xlsx reader, but at a much coarser level of control. One exception, maybe, is that xls has additional records for file write protection and file read/write protection, each of them with its own password; this means that the workbook is protected by three passwords here. While the read/write protection is something like file encryption, I still don't know what the difference is between write protection and the combination of cell/sheet/book protection.

In ods I learned what the passwords for sheet and workbook protection are good for: they do not cause any encryption of any file content, they are just requested when the user wants to change the related protection settings. Is this the same with Excel2010+?

shobits1

Re: Very early support for .xlsx protection [PATCH]
« Reply #11 on: March 06, 2017, 08:13:35 am »
Quote from: wp
In ods I learned what the passwords for sheet and workbook protection are good for: they do not cause any encryption of any file content, they are just requested when the user wants to change the related protection settings. Is this the same with Excel2010+?

Yes, it is the same, at least for Excel 2010 (I don't have Excel 2013/2016 at the moment). The attached file, created with Excel 2010, has the workbook and worksheet protected.


shobits1

Re: Very early support for .xlsx protection [PATCH]
« Reply #12 on: March 12, 2017, 12:59:24 am »
I finally decrypted the Office 2007 protected workbook successfully.

For those who are interested, the attached file is all you need:

Code: Pascal
uses
   ....., xlsxdecrypter;


procedure TForm1.Button3Click(Sender: TObject);
const
  EncryptedFile = 'PATH TO ENCRYPTED FILE';
  DecryptedFile = 'PATH TO DECRYPT THE FILE';
var
  ExcelDecrypt: TExcelFileDecryptor;
  DecryptedStream: TFileStream;
begin
  ExcelDecrypt := TExcelFileDecryptor.Create;
  try
    // Only attempt decryption if the file is encrypted in a supported way
    if ExcelDecrypt.isEncryptedAndSupported(EncryptedFile) then
    begin
      // fmCreate alone creates (or overwrites) the output file
      DecryptedStream := TFileStream.Create(DecryptedFile, fmCreate);
      try
        ExcelDecrypt.Decrypt(EncryptedFile, DecryptedStream);
      finally
        DecryptedStream.Free;
      end;
    end;
  finally
    ExcelDecrypt.Free;
  end;
end;

If you want fpspreadsheet to open the file automatically, just apply the patch and then copy `xlsxdecrypter.pas` to `[fpspreadsheet path]\source\common`.

wp

Re: Very early support for .xlsx protection [PATCH]
« Reply #13 on: March 12, 2017, 11:24:08 am »
Very interesting. Thank you.

I have a problem, though. I suppose that, regarding encryption/decryption, you have already looked through what comes with fpc and Lazarus. There is sha1 support (in fpc/source/packages/hash), but I did not find anything for aes. Therefore, I think that the use of DCPCrypt is unavoidable. On the other hand, since encryption/decryption will not be needed in most cases, I want to keep fpspreadsheet independent of this package.

So, let me think about how this useful unit could be added without adding a dependence on DCPCrypt. Maybe by adding another package laz_fpspreadsheet_crypto?

Anyway, do you know if your decrypter is applicable also to the standard file protection? How would a password be applied to your class?

shobits1

Re: Very early support for .xlsx protection [PATCH]
« Reply #14 on: March 12, 2017, 02:14:38 pm »
Quote from: wp
So, let me think about how this useful unit could be added without adding a dependence on DCPCrypt. Maybe by adding another package laz_fpspreadsheet_crypto?

Or maybe by adding AES support to `fpscrypto.pas`.

Quote from: wp
Anyway, do you know if your decrypter is applicable also to the standard file protection? How would a password be applied to your class?

1. I don't know if it will work with other encrypted files created with Excel 2007 and earlier, but I do know it won't work with Excel 2010 encrypted files, since they use agile encryption (the EncryptionInfo is a sort of XML file).

2. There are:
Decrypt(inFileName: string; outStream: TStream; APassword: UnicodeString): string;
Decrypt(inStream: TStream; outStream: TStream; APassword: UnicodeString): string;

I made them private since I didn't have the time to test files encrypted with Office 2007 using a password other than the default one, but theoretically it should work.

[EDIT]
I just tried an encrypted Excel 2007 file with a random password; it worked fine.
« Last Edit: March 12, 2017, 07:41:42 pm by shobits1 »

 
