Lazarus

Announcements => Third party => Topic started by: xinyiman on January 19, 2020, 08:00:10 pm

Title: XLSX file read
Post by: xinyiman on January 19, 2020, 08:00:10 pm
Hi guys, I don't understand how to read the contents of an XLSX file with lazarus and fpspreedsheet. Is there a small example somewhere?
Title: Re: XLSX file read
Post by: Handoko on January 19, 2020, 08:05:34 pm
Try this:
https://wiki.freepascal.org/FPSpreadsheet#Examples
Title: Re: XLSX file read
Post by: xinyiman on January 19, 2020, 08:31:10 pm
I have created a small example. That doesn't work. Can anyone tell me why?

Handoko I had already read that page.
Title: Re: XLSX file read
Post by: wp on January 19, 2020, 10:58:36 pm
There are two errors in your code:
Here is one possibility to fix the issue -- it requires that the TsWorkbookSource that you had put onto the form must be connected to the WorkbookSource property of the grid:
Code: Pascal  [Select][+][-]
  1. uses
  2.   Classes, SysUtils, Forms, Controls, Graphics, Dialogs, StdCtrls,
  3.   fpspreadsheetgrid, fpspreadsheetctrls, fpstypes,
  4.   xlsxooxml;   // THIS MUST BE ADDED
  5.  
  6. procedure TForm1.Button1Click(Sender: TObject);
  7. var
  8.    filename : string;
  9. begin
  10.      filename := Application.Location + 'file.xlsx';
  11.      if FileExists(filename) then
  12.      begin
  13.        sWorkbookSource1.LoadFromSpreadsheetFile(filename, sfOOXML);
  14.      end else begin
  15.        ShowMessage('File not exists: ' + filename);
  16.      end;
  17. end;
And this is another solution, here you only need the WorksheetGrid, no WorkbookSource:
Code: Pascal  [Select][+][-]
  1. uses
  2.   Classes, SysUtils, Forms, Controls, Graphics, Dialogs, StdCtrls,
  3.   fpspreadsheetgrid, fpspreadsheetctrls, fpstypes,
  4.   xlsxooxml;   // THIS MUST BE ADDED
  5.  
  6. procedure TForm1.Button1Click(Sender: TObject);
  7. var
  8.    filename : string;
  9. begin
  10.      filename := Application.Location + 'file.xlsx';
  11.      if FileExists(filename) then
  12.      begin
  13.        sWorksheetGrid1.LoadFromSpreadsheetFile(filename, sfOOXML);
  14.      end else begin
  15.        ShowMessage('File not exists: ' + filename);
  16.      end;
  17. end;
Title: Re: XLSX file read
Post by: xinyiman on January 20, 2020, 11:54:23 am
Hi wp, thanks for the info. They work properly on small files. But having to process a file with 300000 lines, the memory (8 GB of ram) and the program crashes.

Is there a way to read the file line by line without first having imported the entire block of lines from the file?
Title: Re: XLSX file read
Post by: wp on January 20, 2020, 12:36:03 pm
having to process a file with 300000 lines
What is a "line" in this context? Rows of the spreadsheet? How many columns, i.e. how many cells?

Assume there are 1 million cells and that each cell occupies 100 bytes on the average you would need 100 MB to store the worksheet. During reading, the worksheet xml is loaded into memory as well. But i doubt that memory usage exceeds several hundreds of MB -- far away from your 8GB. How much of your memory is occupied? Or did you compile a 32-bit program where you can run out of memory relatively easily? Compile a 64-bit program so that you can access all the free space of the 8GB.

It also depends on what you want to do with the data? Display in a grid? Then there's no other option (BTW - who should look at 1 million cells?). On the other hand, if you want to do an analysis on the fly without storing all cells in a worksheet you can go to "virtual mode" which disposes each cell after reading, of course after giving you the opportunity to analyze the cell. Virtual mode is explained in the wiki article (https://wiki.lazarus.freepascal.org/FPSpreadsheet#Virtual_mode), and there are also some demos among the examples provided.
Title: Re: XLSX file read
Post by: xinyiman on January 20, 2020, 01:51:56 pm
Line is row of the spreedsheet. 120 columns * 300000 rows.
Title: Re: XLSX file read
Post by: wp on January 20, 2020, 05:21:58 pm
This comes up to 36 million cells. With maybe 100 bytes per cell, we have something like 3.6 GB. And since the uncompressed input xml file is in memory, too, (maybe during parsing even twice) I'd guess: no chance...
TinyPortal © 2005-2018