Forum > FPSpreadsheet

memory consumption

(1/3) > >>

Hansvb:
With FPspread sheet i write datasets to different worksheets. I notice that when there a lot of rows and columns to export, the memory consumption is verry high.

I can not use
--- Code: Pascal  [+][-]window.onload = function(){var x1 = document.getElementById("main_content_section"); if (x1) { var x = document.getElementsByClassName("geshi");for (var i = 0; i < x.length; i++) { x[i].style.maxHeight='none'; x[i].style.height = Math.min(x[i].clientHeight+15,306)+'px'; x[i].style.resize = "vertical";}};} ---workbook.OnWriteCellData := @WriteCellDataHandler;because i write to different worksheets.

Is there a way to lower the memory consumption? Or am i doing something wrong?
When i use Ole Excel automation, there is almost no memory use at all. But then the export is verry verry slow.


--- Code: Pascal  [+][-]window.onload = function(){var x1 = document.getElementById("main_content_section"); if (x1) { var x = document.getElementsByClassName("geshi");for (var i = 0; i < x.length; i++) { x[i].style.maxHeight='none'; x[i].style.height = Math.min(x[i].clientHeight+15,306)+'px'; x[i].style.resize = "vertical";}};} ---procedure TExport_To_File.SaveAs_FPExcelFile_Complete_lvwoz_scan(  Adbgrid: TDBGrid; ASheetName: widestring);var  sheet: TsWorksheet;  szLeesuit : String;  f, R: Integer;begin  screen.Cursor:=crHourglass;   if (Active_DBGrid.DataSource = nil) or     (Active_DBGrid.DataSource.DataSet.IsEmpty = True) then    begin      Frm_main.Logging.WriteToLog('INFORMATIE | De dataset is leeg, er is geen export naar Excel aangemaakt.');      //hier eventueel een leeg werkblad laten aanmaken.    end  else    begin //De export naar Excel uitvoeren      Frm_Main.Logging.WriteToLog('INFORMATIE | Start met het opslaan van ' + ASheetName);       //Werkblad aanmaken      sheet := MyWorkbook.AddWorksheet(Asheetname);      szLeesuit :='';                    //kopteksten      for f := 0 to  Adbgrid.DataSource.DataSet.FieldCount -1 do        begin          sheet.WriteText(0, f, widestring(Adbgrid.DataSource.DataSet.Fields[f].FieldName));        end;       //de data naar Excel schrijven      with Adbgrid.DataSource.DataSet do        begin          Adbgrid.DataSource.DataSet.DisableControls;          First;          for R := 0 to Adbgrid.DataSource.DataSet.RecordCount-1 do            begin              for F := 0 to Adbgrid.DataSource.DataSet.FieldCount-1 do                begin                  szLeesuit :='';                  if Adbgrid.DataSource.DataSet.Fields[F].AsString <> '' then                    szLeesuit :=  Adbgrid.DataSource.DataSet.Fields[F].AsString;                  sheet.WriteText(R+1, F+0, widestring(szLeesuit) );                end;              next;            end;          First;   //weer terug naar het eerste record.          Adbgrid.DataSource.DataSet.EnableControls;           //tussentijds opslaan (dus na elke query)          try            if SaveDialog_FileName <> '' then              MyWorkbook.WriteToFile(widestring(SaveDialog_FileName));          except            on E: Exception do              begin                Frm_main.Logging.WriteToLog('FOUT      | Het tussentijds opslaan van het Fpspreadsheet Workbook is mislukt.');                Frm_main.Logging.WriteToLog('FOUT      | Een onverwachte uitzondering heeft plaats gevonden:');                Frm_main.Logging.WriteToLog('FOUT      | ' + E.Message);                Frm_main.Logging.WriteToLogAndFlush('FOUT      | ');              end;          end;        end;    end;     screen.Cursor:=crdefault;end;

wp:
Memory consumption of fpspreadsheet is high because the entire workbook is held in memory. A lot of optimization has been applied in the past to reduce the memory load per cell, I still see a few places where a few bytes can gained, but in total this will be a marginal effect. Excel, on the other hand, applies some tricky disk swapping mechanism which is not available here. Of course somebody can write a patch for it...

To overcome the memory limitation virtual mode was introduced. This allows to write cells "on the fly" without keeping them in memory, and this is what you mean with the OnWriteCellData event. Unfortunately I made a design error and assigned the virtual mode to the workbook, not to the worksheet. Therefore, you cannot handle worksheets differently in virtual mode. At the moment I am busy with updating the Turbo Power Visual PlanIt components. When I am done the virtual mode design flaw will be fixed. Most probably, however, it will break existing code.

I am not the expert with OLE automation. When I applied it in some old Delphi project I soon got disappointed by this low-performance programming model and the interference with already running Excel instances. I did notice that writing an Excel file cell by cell (as you do) is extremely slow, but I found somewhere a note that speed increases considerably if cell ranges are used and the cells are addressed as elements in such arrays - sorry, I don't know any more where I had picked this up.

Since you deal with databases (I guess), did you consider using tools provided by your database engine? I did a lot of work with Delphi and Microsoft's ADO these days; there was an extension of the SELECT query which could be used to write to Excel (and other formats). I am posting here the code that I had used. Note that this is Delphi code, it will not work with Lazarus, but maybe you can get similar things with your database engine, or with ODBC, or with ZEOS, I don't know:


--- Code: Pascal  [+][-]window.onload = function(){var x1 = document.getElementById("main_content_section"); if (x1) { var x = document.getElementsByClassName("geshi");for (var i = 0; i < x.length; i++) { x[i].style.maxHeight='none'; x[i].style.height = Math.min(x[i].clientHeight+15,306)+'px'; x[i].style.resize = "vertical";}};} ---type  TExportFormat = (    efMDB,    efExcel3, efExcel4, efExcel5, efExcel8,    efLotusWK1, efLotusWK3, efLotusWK4,  // Lotus WK4 is not working    efDBase3, efDBase4, efDBase5,    efParadox3, efParadox4, efParadox5,    efText,    efHTML,    efXML  ); const  SQL_Mask = 'SELECT %s INTO [%s] IN "%s" %s FROM %s';  //                 |        |       |   |        |  //               fields     |       |   |        |  //                     dest table   |   |        |  //                        dest database |        |  //                              database format  |  //                                         source dataset  DBCodes : array[TExportFormat] of string = (    '',    '"Excel 3.0;"', '"Excel 4.0;"', '"Excel 5.0;"', '"Excel 8.0;"',    '"Lotus WK1;"', '"Lotus WK3;"', '"Lotus WK4;"',    '"dBASE III;"', '"dBASE IV;"', '"dBASE 5.0;"',    '"Paradox 3.x;"', '"Paradox 4.x;"', '"Paradox 5.x;"',    '"Text;"',    '"HTML Export;"',    ''  ); function ADOExportQuery(AConnection:TADOConnection;   ASourceSQL,ADestFile:string;  AFormat:TExportFormat; ADestTableName:string) : integer;const  WhiteSpace = [' ', #13, #10, #9];  Ls = 6;  // = Length('SELECT');  Lf = 4;  // = Length('FROM');var  SQL : string;  p, p1, p2, p3 : integer;  done : boolean;  sql1, sql2, _sql : string;begin  result := -1;  if (ASourceSQL='') or (ADestFile='') or (AConnection=nil) then    raise Exception.Create('Incomplete parameters.');   if AFormat=efXML then begin    result := ADOExportToXML(AConnection, ASourceSQL, ADestFile);  end else begin    while (ASourceSQL<>'') and (ASourceSQL[1] in WhiteSpace) do       System.Delete(ASourceSQL, 1, 1);    _sql := Uppercase(ASourceSQL) + ' ';    p1 := Ls+1;    while (_sql<>'') and (_sql[p1] in WhiteSpace) do inc(p1);    p2 := 0;    done := false;    while not done do begin      p := pos('FROM', _sql);      if p<>0 then begin        done := (_sql[p-1] in WhiteSpace) and (_sql[p+Lf+1] in WhiteSpace);        if not done then begin          p2 := p2 + p;          _sql := copy(_sql, p+Lf, Length(_sql));        end;      end else        done := true;    end;    p3 := p2 + Lf;    while (p2>0) and (ASourceSQL[p2] in WhiteSpace)      do dec(p2);    while (p3<Length(ASourceSQL)) and (ASourceSQL[p3] in WhiteSpace)      do inc(p3);    sql1 := copy(ASourceSQL, p1, p2-p1-1);  // between "SELECT" and "FROM"    sql2 := copy(ASourceSQL, p3, Length(ASourceSQL)); // part after "FROM"    result := ADOExportTable(AConnection, sql2, sql1, ADestFile, AFormat,      ADestTableName);  end;end; function ADOExportTable(AConnection:TADOConnection;  ASourceTable,AFieldList,ADestFile:string;  AFormat:TExportFormat; ADestTableName:string) : integer;var  SQL : string;  wasConn : boolean;  fdir : string;  fnam : string;  L : TStringList;begin  result := -1;   if (ASourceTable='') or (ADestFile='') or (AConnection=nil) then    raise Exception.Create('Incomplete parameters.');   if AFieldList='' then AFieldList := '*';  if AFormat=efXML then begin    SQL := Format('SELECT %s FROM %s', [AFieldList, ASourceTable]);    result := ADOExportToXML(AConnection, SQL, ADestFile);  end else begin    fdir := ExtractFileDir(ADestFile);    fnam := ExtractfileName(ADestFile);    case AFormat of      efDBase3,      efDBase4,      efDBase5,      efParadox3,      efParadox4,      efParadox5,      efLotusWK1,      efText,      efHTML :        begin          if FileExists(ADestFile) then DeleteFile(ADestFile);          SQL := Format(SQL_Mask,            [AFieldList, fnam, fdir, DBCodes[AFormat], ASourceTable]);        end;      efLotusWK3,      //efLotusWK4,  --> not working !      efExcel3,      efExcel4 :         begin           if FileExists(ADestFile) then DeleteFile(ADestFile);           SQL := Format(SQL_Mask,             [AFieldList, fnam, ADestFile, DBCodes[AFormat], ASourceTable]);         end;      efExcel5,      efExcel8 :        begin         if FileExists(ADestFile) then DeleteFile(ADestFile);         SQL := Format(SQL_Mask,           [AFieldList, ADestTableName, ADestFile, DBCodes[AFormat],             ASourceTable]);        end;      efMDB :        SQL := Format(SQL_Mask,          [AFieldList, ADestTableName, ADestFile, DBCodes[AFormat],            ASourceTable]);      else        raise Exception.Create('ExportTable: Fileformat not supported.');    end;    with AConnection do begin      wasConn := Connected;      Connected := true;      Execute(SQL, result);      if not wasConn then Connected := false;    end;  end;end;

Thaddy:

--- Quote from: wp on July 08, 2016, 11:21:43 pm ---. Excel, on the other hand, applies some tricky disk swapping mechanism which is not available here.

--- End quote ---
Well, actually the Excel behavior  is available by default when you use the standard windows functions for memory allocation / de-allocation in your memory manager.
That would be VirtualAlloc and the likes with proper flags to allow swapping to disk instead of forcing into actual memory hardware.
(I told many times the FPC memory manager sucks, here's another example)
In Delphi the memory manager on top of the OS code allows this and that should work.

Holding just the sheet in memory, though, would be an improvement either way. I concur.

But there are no 'tricks'  involved other than OS memory management behavior.

There are alternative memory managers available to solve the issue, including one I wrote myself for fun. Even very simple ones that just use the heap calls under windows.

The offset is that they - mostly - consume more physical memory than FPC's default memory manager. But they will swap automatically.

Note this write up is highly Windows specific! Warranted by the appearance of Excel in this story ;)

wp:
@Hansvb:
The new fpspreadsheet version on trunk has a modified virtual mode. The properties VirtualColCount and VirtualRowCount, as well as the event OnWriteCellData, have been moved from the workbook to the worksheet. Therefore, is is possible now to treat worksheets individually when they are written in virtual mode.

As already noted above, this modification breaks existing virtual mode code. But it's easy to fix, the compiler will tell you what's wrong:

old code                   ---> new code
--------------------------------------------------------------
MyWorkbook.VirtualRowCount ---> MyWorksheet.VirtualRowCount
MyWorkbook.VirtualColCount ---> MyWorksheet.VirtualColCount
MyWorkbook.OnWriteCellData ---> MyWorksheet.OnWriteCellData

And, the first parameter of the OnWriteCellData event handler is a TsWorksheet now, no longer a TObject.

[EDIT]
When you apply this to your project be sure to set workbook option boBufStream, otherwise all cells will be written to a temporary memory stream which will overflow memory in the same way as non-virtual mode, just a bit delayed.

Hansvb:
Thanks, i will try this

Navigation

[0] Message Index

[#] Next page

Go to full version