Recent

Author Topic: Excel file open problem [SOLVED]  (Read 10954 times)

totya

  • Hero Member
  • *****
  • Posts: 720
Excel file open problem [SOLVED]
« on: September 24, 2016, 07:33:55 pm »
Hi!

I like this component, but the newest version of one excel file is unreadable from this component (earlier version of this excel file are opened!). This is a very simple file, created/modified with official Office XP (2002).

I do not like to share this excel file, is there any way to know what is the problem with this file between FPSpreadsheet?

I tried these:
Examples/fpsgrid_no_install and excel8demo do not open this file

I tried 1.6.2 (no error message, but xls didn't open)
and the latest svn version: Examples/fpsgrid_no_install: I got error messages, for example: "...BIFF8Reader.ReadWideString Continue record excepted, but not found"
"Cannot read file Wrong, unknown or defective format?"

Edit: Lazarus version is 1.6 x64, this is official install package + official install x86 cross addon. The result are same with x64 and x86 output exe.

Thanks!
« Last Edit: September 25, 2016, 01:19:37 pm by totya »

lainz

  • Hero Member
  • *****
  • Posts: 4460
    • https://lainz.github.io/
Re: Excel file open problem
« Reply #1 on: September 24, 2016, 07:51:06 pm »
Try to create a simple excel file so anyone that knows how to use that component can help you, else how he can test?!

wp

  • Hero Member
  • *****
  • Posts: 11854
Re: Excel file open problem
« Reply #2 on: September 24, 2016, 07:51:59 pm »
Of course, it would be easier for me if you could give me the file. But let me ask you some "diagnostic" questions:
  • Which fpspreadsheet version did you use when the file could be opened? Which version are you using now?
  • It seems to be an xls file. Correct? biff8 (Excel97 format) or older?
  • How many sheets?
  • What is the size of each sheet? (i.e. number of rows and columns)
  • Which content is stored? Numbers, texts, hypertext links, images, charts, embedded OLE objects? Some of these are not supported and should be ignored, but who knows...
  • The CONTINUE record is needed if the sheet contains comments. Does it? I know that comments are not correctly handled for the xls format - a long-existing bug due to poor documentation. If the file does contain comments can you create a new file with just a comment, and see if the error occurs again. If it does please upload the file here (pack it into a zip, otherwise the forum software would reject it).
Wasn't there a Microsoft update which allowed Excel 2002 to write the xlsx format? Please seek for it, install it and create the same file in xlsx format. Does it have the same issues in fpspreadsheet?
« Last Edit: September 24, 2016, 07:59:03 pm by wp »

totya

  • Hero Member
  • *****
  • Posts: 720
Re: Excel file open problem
« Reply #3 on: September 24, 2016, 07:59:43 pm »
Try to create a simple excel file so anyone that knows how to use that component can help you, else how he can test?!

Hi!

All other excel files are opened, except this. But the excel open it without any problem.

"how to use that component": I wrote above, I used "official" examples: fpspreadsheet\examples\visual\fpsgrid_no_install.

I thinked I got answer from one developer, for example this component can create debug.log if possible or I truncate this file header and so on...

totya

  • Hero Member
  • *****
  • Posts: 720
Re: Excel file open problem
« Reply #4 on: September 24, 2016, 08:29:35 pm »
Hi!

Of course, it would be easier for me if you could give me the file. But let me ask you some "diagnostic" questions:

  • Which fpspreadsheet version did you use when the file could be opened? Which version are you using now?

As I wrote, I tried official fpspreadsheet-1.6.2, and now (and always) I use the svn, now is the 5205 revision.
And now I open these files with the official examples: \fpspreadsheet\examples\visual\fpsgrid_no_install

  • It seems to be an xls file. Correct? biff8 (Excel97 format) or older?

Office XP default format, if I open one of them earlier version from this problematic file  with fpsgrid_no_install, I see BIFF8 on the caption, next to the file name. This is mean, this file opened with BIFF8.

  • How many sheets?

Only one.

  • What is the size of each sheet? (i.e. number of rows and columns)

Columns: A-G
Rows: ~850

  • Which content is stored? Numbers, texts, hypertext links, images, charts, embedded OLE objects? Some of these are not supported and should be ignored, but who knows...

I wrote already, this is the very simple file.
Numbers YES
texts YES
hypertext links NO
images NO
charts NO
embedded OLE objects LOL NO

  • The CONTINUE record is needed if the sheet contains comments. Does it? I know that comments are not correctly handled for the xls format - a long-existing bug due to poor documentation. If the file does contain comments can you create a new file with just a comment, and see if the error occurs again. If it does please upload the file here (pack it into a zip, otherwise the forum software would reject it).

I don't see comments.
Okay I created sample file with excel 2003, I filled 4 cell with number, and I wrote one comment. If I open this file from fpsgrid_no_install, this file opened without any problem, and I see the comment too... lol.

  • Wasn't there a Microsoft update which allowed Excel 2002 to write the xlsx format? Please seek for it, install it and create the same file in xlsx format. Does it have the same issues in fpspreadsheet?

    I will try it later...

    Thank you...
« Last Edit: September 24, 2016, 08:36:26 pm by totya »

wp

  • Hero Member
  • *****
  • Posts: 11854
Re: Excel file open problem
« Reply #5 on: September 24, 2016, 08:56:46 pm »
The other points where CONTINUE records are used are
  • cells with very long text (> about 8000 bytes)
  • cells with rich-text formatting, i.e. individual formatting of characters (such as sub/superscript of individual characters, bold of individual words etc).
Does your file contain such features? If yes, again create a new file with this feature and see if the error is there.

An idea how you could locate the problematic cell:
  • Make a backup copy of the file, and modify this backup copy only
  • In Excel, delete the second half of all rows. Save under a temp name. Try to open with fpspreadsheet and check for the error
  • If the error occurs again delete the second half of this temp file now and repeat.
  • If there is no error go back to the previous version of the file, and now remove the first half of all rows. Do the test with this version.
  • Repeat the procedure until you have only one (or a few rows) left.
  • Repeat with the columns in the same way
  • Store again under a new name (to remove the undo buffer). If required modify the cell text and remove any confidential data. Test this file again. It it still shows the error upload it here.
« Last Edit: September 24, 2016, 09:01:29 pm by wp »

totya

  • Hero Member
  • *****
  • Posts: 720
Re: Excel file open problem
« Reply #6 on: September 24, 2016, 10:19:46 pm »
    The other points where CONTINUE records are used are

       
    • cells with very long text (> about 8000 bytes)

    I export this file to the csv, ant the longest line about 220 char.

       
    • cells with rich-text formatting, i.e. individual formatting of characters (such as sub/superscript of individual characters, bold of individual words etc).Does your file contain such features? If yes, again create a new file with this feature and see if the error is there.

    Individual formatting of characters, yes you are right, this file contains this. I create sample (with Excel 2003), and yes, I get error messages from Lazarus environment:
    first: "system read error" (this is different!)
    second: "cannot read file: wrong, unknown or defective file format?"

    An idea how you could locate the problematic cell:
    • Make a backup copy of the file, and modify this backup copy only
    • In Excel, delete the second half of all rows. Save under a temp name. Try to open with fpspreadsheet and check for the error
    • If the error occurs again delete the second half of this temp file now and repeat.
    • If there is no error go back to the previous version of the file, and now remove the first half of all rows. Do the test with this version.
    • Repeat the procedure until you have only one (or a few rows) left.
    • Repeat with the columns in the same way
    • Store again under a new name (to remove the undo buffer). If required modify the cell text and remove any confidential data. Test this file again. It it still shows the error upload it here.

    Very thank you for this detailed instructions, but in home I have only excel 2003 and 2007.
    If I open the problematic (office XP) file with office 2003, and I press save, the problem is gone, I can open this file from fpspreadsheet. But this is not solution for me! Because my job is important, and my app must open all (simple) files without problem  :)

    But interesting, If I open the problematic (office XP) file with office 2007, and I press save, the problem is stay, I can't open  this file from fpspreadsheet. This is great, because is good for your instructions!
    If I delete "D" column, the problem is gone. If I delete all column, except: "D", the problem is stay.
    This is mean, the problem is the "D" column. This column contains individual formatting chars only.
    If I delete lines with individual formatting chars (19 lines), the file is opened by fpspreadsheet, without error.
    Or, if I open original problematic file, and I delete individual formatting chars (select lines and press bold icon twice) then problem is gone too.

    Okay, possible the errors come from the " individual formatting chars", but these individual formatting must be present in the source file, and I can open this file without any magic and save. This is possible anyhow? If  individual formatting lost after open from FPSpreadsheet, that isn't a big problem, because I only need the real datas from the source file.

    I attached a simple sample with individual formatting chars.
    virustotal

    Thanks for the many help and instructions![/list]
    « Last Edit: September 24, 2016, 11:17:04 pm by totya »

    wp

    • Hero Member
    • *****
    • Posts: 11854
    Re: Excel file open problem
    « Reply #7 on: September 24, 2016, 11:43:53 pm »
    Please test r5207. It fixes at least the issue with the two files you supplied.

    The bug, indeed, is in the "individual formatting", in a special case where all strings of the workbook are collected in a "shared strings table". This is a feature of biff8. The bug was not detected in the fpspreadsheet standard write-read tests because fpspreadsheet does not use the shared string table when writing biff8 text cells.
    « Last Edit: September 24, 2016, 11:58:21 pm by wp »

    totya

    • Hero Member
    • *****
    • Posts: 720
    Re: Excel file open problem
    « Reply #8 on: September 25, 2016, 12:45:05 am »
    Please test r5207. It fixes at least the issue with the two files you supplied.

    The bug, indeed, is in the "individual formatting", in a special case where all strings of the workbook are collected in a "shared strings table". This is a feature of biff8. The bug was not detected in the fpspreadsheet standard write-read tests because fpspreadsheet does not use the shared string table when writing biff8 text cells.

    Hi!

    Very thank you for your quick job!

    It seems to me, two sample files are opened correctly in latest fpspreadsheet svn.

    Unfortunatelly the problem is stay, the original problematic xls file don't open. This file contains sensitive data, I do not want to share it. But I'm working on it to split smaller size, but it's hard, if I cut it, fpspreadsheet  opened it correctly...

    Thanks again!

    wp

    • Hero Member
    • *****
    • Posts: 11854
    Re: Excel file open problem
    « Reply #9 on: September 25, 2016, 09:53:51 am »
    You said that the problem goes away after removing column G. If, on the other hand, you drop everything except column G, is the problem still there? If it is could you share this reduced file?

    totya

    • Hero Member
    • *****
    • Posts: 720
    Re: Excel file open problem
    « Reply #10 on: September 25, 2016, 11:07:32 am »
    You said that the problem goes away after removing column G. If, on the other hand, you drop everything except column G, is the problem still there? If it is could you share this reduced file?

    Hi!

    Good morning! :)

    Okay, I know, my english is quite terrible... I wrote this:
    Quote
    If I delete "D" column, the problem is gone. If I delete all column, except: "D", the problem is stay.
    I try it again, but the result is same, certainly.

    So, the problematic column is the D. If I drop (delete) all column, except D, the problem is stay. Unfortunatelly, this column contain the all important data, if I split or rewrite data, FPSpreadsheet can open it.

    But interesting... I split the problematic excel file with the smallest size, which prduces FPSpreadsheet open error. The file is now:

    1-19 lines (rows) loks like similar of these (but with real datas):
    Quote
    1 test text
    2. test text 2
    3. test text 4

    Only the number, and the dot "." bolded.

    After these lines (rows)  (20-311) follow about 290 lines, without any formatting(!!!), similar of these:
    Quote
    name1
    name2
    name3
    name4
    (and so on...)

    These names are real names, not "name1" or "name2".

    Well, if I delete  1-19 lines, the problem is gone, But (but!) if 1-19 lines remain, and I try to delete the 20-311 lines, the problem is gone too...

    Elsewhere, I read your words carefoully:
    The bug, indeed, is in the "individual formatting", in a special case where all strings of the workbook are collected in a "shared strings table". This is a feature of biff8. The bug was not detected in the fpspreadsheet standard write-read tests because fpspreadsheet does not use the shared string table when writing biff8 text cells.

    This is mean for me: This error always possible with BIFF8 format/special individual formatting, because this "shared strings table" doesn't supported in FPSpreadsheet. So, if FPSpreadsheet will support "shared strings table", I think all of problematic excel file will open correctly.

    Thank you!

    Edit.: 5208 version arrived, but I don't see changes, the problematic xls file doesn't open.
    « Last Edit: September 25, 2016, 11:10:23 am by totya »

    wp

    • Hero Member
    • *****
    • Posts: 11854
    Re: Excel file open problem
    « Reply #11 on: September 25, 2016, 11:49:31 am »
    I think the problem happens when the SST (=shared strings table) contains rtf-formatted strings ("individual formatting of characters/words") and is longer than the minimum size of a record, such that a CONTINUE record is required. I tried to create such a file, but it is read correctly. Also a file created according to your instructions is read correctly.

    I am rather sure that the individual characters are not important. Could you replace the characters in the confidential words by just an 'x' or anything else? The length of each string as well as the location of the formatting should be kept. An example: if some cell contains the text "This is TOP-SECRET" the modified cell should be "xxxx xx xxx-xxxxxx". I guess that such a file would be suitable for upload here. If you still don't want to show the file publicly you could contact me by PM to get my e-mail address. I know this is a lot of work, maybe you can write a macro to help you. Or, maybe it is sufficient to modify only the top 19 rows that you mention and just use a constant text in the other 300-or-so rows.

    Quote
    Elsewhere, I read your words carefoully:...
    FPSpreadsheet does support the SST for reading, but not for writing. This means while Excel writes the text of a label cell into the SST and its index into the cell, FPSpreadsheet write the text into the cell directly. This is some kind of micro-optimization of Excel which reduces memory for the rather rare case of multiple usage of strings. But it introduces some overhead which I avoided for FPSpreadsheet. Adding write support for the SST would not solve your issue, but probably introduce new ones.

    totya

    • Hero Member
    • *****
    • Posts: 720
    Re: Excel file open problem
    « Reply #12 on: September 25, 2016, 12:22:18 pm »
    Hi!

    First... thank you for your patience!

    I am rather sure that the individual characters are not important.

    You are right, not important for process data (as I wrote), but my app must open this file without any reformatting (and save).

    Could you replace the characters in the confidential words by just an 'x' or anything else? (...)

    Belive me, I tried this for many hours...many-many hours... without success.. but now... :)

    I successfuly created this xls file (public sample), which produce open errors, and similar of the original problematic file. See attached file.

    Virustotal

    FPSpreadsheet does support the SST for reading, but not for writing. (...)
    Okay, I understand now. But it doesn't work in every case, see my case.

    Thank you!
    « Last Edit: September 25, 2016, 12:25:21 pm by totya »

    wp

    • Hero Member
    • *****
    • Posts: 11854
    Re: Excel file open problem
    « Reply #13 on: September 25, 2016, 12:38:32 pm »
    Thank you for the file, I can reproduce the issue now. It may take some time until I'll figure out what exactly is going on here.

    totya

    • Hero Member
    • *****
    • Posts: 720
    Re: Excel file open problem
    « Reply #14 on: September 25, 2016, 01:19:14 pm »
    Thank you for the file, I can reproduce the issue now. It may take some time until I'll figure out what exactly is going on here.

    Big thanks to you!

    The latest sample xls file, and the original problematic xls file now opened without any error messages in FPSpreadsheet 5209 svn version.

    Thank you again! :)

     

    TinyPortal © 2005-2018