FPSpreadsheet SVN: reading files .ods

bonmario

Sr. Member
Posts: 346

FPSpreadsheet SVN: reading files .ods

« on: September 02, 2015, 08:48:51 am »

Hi,
while debugging a program, I noticed something strange in how .ods files are handled.
For each .ods file, three files are processed: styles.xml, content.xml, settings.xml.
For each of them, these operations are performed:
- The XML file is unzipped and saved to disk
- The XML file is read from disk into a stream
- The XML file is deleted

This can cause a problem if, for example, two programs that use fpspreadsheet access two .ods files in the same directory at the same time, since both would extract identically named XML files.
Also, if a program has to process several .ods files, this approach causes many disk accesses, which slows the program down.

I think this could be solved by unzipping the three files to a stream, without saving them to disk, as explained here:
http://wiki.lazarus.freepascal.org/paszlib#Unzip_file_to_a_stream
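A minimal sketch of that idea, assuming the FPC `zipper` unit and its `OnCreateStream`/`OnDoneStream` events as described on the wiki page above. The class name `TEntryReader` and the command-line handling are only illustrative; this is not how fpspreadsheet itself is implemented.

```pascal
program UnzipOdsToStream;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils, zipper;

type
  { Hypothetical helper: extracts one archive entry into a memory stream. }
  TEntryReader = class
  private
    FData: TMemoryStream;
    procedure CreateStream(Sender: TObject; var AStream: TStream;
      AItem: TFullZipFileEntry);
    procedure DoneStream(Sender: TObject; var AStream: TStream;
      AItem: TFullZipFileEntry);
  public
    constructor Create;
    destructor Destroy; override;
    procedure Load(const AZipFile, AEntryName: string);
    property Data: TMemoryStream read FData;
  end;

constructor TEntryReader.Create;
begin
  inherited Create;
  FData := TMemoryStream.Create;
end;

destructor TEntryReader.Destroy;
begin
  FData.Free;
  inherited Destroy;
end;

procedure TEntryReader.CreateStream(Sender: TObject; var AStream: TStream;
  AItem: TFullZipFileEntry);
begin
  // Hand our own memory stream to the unzipper so no file is created.
  FData.Clear;
  AStream := FData;
end;

procedure TEntryReader.DoneStream(Sender: TObject; var AStream: TStream;
  AItem: TFullZipFileEntry);
begin
  // The stream is owned by this object: rewind it and do not free it here.
  FData.Position := 0;
  AStream := nil;
end;

procedure TEntryReader.Load(const AZipFile, AEntryName: string);
var
  unzip: TUnZipper;
  names: TStringList;
begin
  unzip := TUnZipper.Create;
  names := TStringList.Create;
  try
    names.Add(AEntryName);
    unzip.FileName := AZipFile;
    unzip.OnCreateStream := @CreateStream;
    unzip.OnDoneStream := @DoneStream;
    unzip.UnZipFiles(names);   // extracts only the listed entry
  finally
    names.Free;
    unzip.Free;
  end;
end;

var
  reader: TEntryReader;
begin
  reader := TEntryReader.Create;
  try
    // ParamStr(1) is expected to be the path of an .ods file.
    reader.Load(ParamStr(1), 'content.xml');
    WriteLn('content.xml: ', reader.Data.Size, ' bytes held in memory');
  finally
    reader.Free;
  end;
end.
```

The same call could be repeated for styles.xml and settings.xml; each entry then lives only in memory.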

Thanks in advance, Mario

wp

Hero Member
Posts: 11916

Re: FPSpreadsheet SVN: reading files .ods

« Reply #1 on: September 02, 2015, 11:49:59 am »

Quote from: bonmario on September 02, 2015, 08:48:51 am

I think this could be solved by unzipping the three files to a stream, without saving them to disk, as explained here:
http://wiki.lazarus.freepascal.org/paszlib#Unzip_file_to_a_stream

Putting them into memory streams would mean duplication of memory usage of the program while reading. There are guys here in this forum who want to read really HUGE spreadsheet files; for them, this strategy is not a good idea.

But what you request is already contained in fpspreadsheet: Instead of using "ReadFromFile" you should call "ReadFromStream" which unzips directly to memory streams:

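Code:

// Read-only access is enough for loading; flags are combined with "or".
stream := TFileStream.Create(ASpreadsheetFileName, fmOpenRead or fmShareDenyNone);
try
  // Reading from a stream unzips to memory streams instead of temp files.
  myworkbook.ReadFromStream(stream, sfOpenDocument);
finally
  stream.Free;
end;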

In order to avoid the ambiguity issue that you mention I am using a "GetUniqueTempDir" function now (--> r4310) which creates a uniquely named subfolder of the system's temp dir. Unzipped files are stored in this folder. Now it is possible to run multiple instances in parallel because every instance unzips to its own directory.
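For readers wondering what such a function might look like: below is a rough sketch, not the actual fpspreadsheet code from r4310. The name `MakeUniqueTempDir` and the `fps_` prefix are made up for illustration; only standard `SysUtils` routines are used.

```pascal
program UniqueTempDirDemo;
{$mode objfpc}{$H+}

uses
  SysUtils;

{ Hypothetical helper: creates a uniquely named subfolder of the temp dir
  and returns its path. }
function MakeUniqueTempDir: string;
var
  guid: TGUID;
begin
  if CreateGUID(guid) <> 0 then
    raise Exception.Create('Could not create a GUID');
  // GUIDToString returns '{....}'; Copy drops the surrounding braces.
  Result := IncludeTrailingPathDelimiter(GetTempDir(False)) +
            'fps_' + Copy(GUIDToString(guid), 2, 36);
  if not ForceDirectories(Result) then
    raise Exception.CreateFmt('Could not create directory "%s"', [Result]);
end;

var
  dir: string;
begin
  dir := MakeUniqueTempDir;
  WriteLn('Unzipping would go to: ', dir);
  // Clean up again; a real reader would delete the unzipped files first.
  RemoveDir(dir);
end.
```

Because each call yields a different folder, two instances running in parallel never write to the same path.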

bonmario

Sr. Member
Posts: 346

Re: FPSpreadsheet SVN: reading files .ods

« Reply #2 on: September 02, 2015, 02:03:34 pm »

Quote from: wp on September 02, 2015, 11:49:59 am

In order to avoid the ambiguity issue that you mention I am using a "GetUniqueTempDir" function now (--> r4310) which creates a uniquely named subfolder of the system's temp dir. Unzipped files are stored in this folder. Now it is possible to run multiple instances in parallel because every instance unzips to its own directory.

Ok, thanks, Mario

felipemdc

Administrator
Hero Member
Posts: 3538

Re: FPSpreadsheet SVN: reading files .ods

« Reply #3 on: September 07, 2015, 05:44:15 pm »

Quote from: wp on September 02, 2015, 11:49:59 am

In order to avoid the ambiguity issue that you mention I am using a "GetUniqueTempDir" function now (--> r4310) which creates a uniquely named subfolder of the system's temp dir. Unzipped files are stored in this folder. Now it is possible to run multiple instances in parallel because every instance unzips to its own directory.

Temporary files? Why are they needed? Are you absolutely sure this is necessary? I think FPC can handle zip files entirely in memory.

wp

Hero Member
Posts: 11916

Re: FPSpreadsheet SVN: reading files .ods

« Reply #4 on: September 07, 2015, 06:00:22 pm »

Quote from: felipemdc on September 07, 2015, 05:44:15 pm

Temporary files? Why are they needed? Are you absolutely sure this is necessary? I think FPC can handle zip files entirely in memory.

Read my response above: "Putting them into memory streams would mean duplication of memory usage of the file while reading. There are guys here in this forum who want to read really HUGE spreadsheet files; for them, this strategy is not a good idea."

felipemdc

Administrator
Hero Member
Posts: 3538

Re: FPSpreadsheet SVN: reading files .ods

« Reply #5 on: September 08, 2015, 01:31:50 pm »

Quote from: wp on September 07, 2015, 06:00:22 pm

Read my response above: "Putting them into memory streams would mean duplication of memory usage of the file while reading. There are guys here in this forum who want to read really HUGE spreadsheet files; for them, this strategy is not a good idea."

Maybe in this case a switch could be provided to control the behavior. Creating files might be undesirable in many cases. On Android phones, for example, it requires an additional permission, and storage space might be scarce.

wp

Hero Member
Posts: 11916

Re: FPSpreadsheet SVN: reading files .ods

« Reply #6 on: September 08, 2015, 02:13:35 pm »

Files are created in the system's temp folder by calling osutils.GetTempDir. Is this an issue as well?

There is some kind of switch already, a bit hidden, though: If the file is read by means of "Workbook.ReadFromStream" via a TFileStream then all the unzipping is done to memory streams instead of temp files. I have to admit, though, that this is not very straightforward because everybody would call the direct "Workbook.ReadFromFile". I'll think about it...

felipemdc

Administrator
Hero Member
Posts: 3538

Re: FPSpreadsheet SVN: reading files .ods

« Reply #7 on: September 08, 2015, 04:50:29 pm »

I think it is a potential problem in some situations, and it should be easy to fix. Hidden solutions are often problematic: since it's not clear that the code is supposed to behave like that, they might stop working in the future. It should be easy to make this configurable:

Either add a parameter with a default value to ReadFromFile, like AAllowCreatingTempFiles: Boolean = True
or
add a new function ReadFromFile_NoTempFiles
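To illustrate the first variant, here is a sketch written as a standalone helper rather than as a change to fpspreadsheet itself. `ReadWorkbook` is a hypothetical name, and the unit names `fpsTypes` and `fpsopendocument` are assumptions about the installed fpspreadsheet version. The no-temp-files branch simply relies on the behavior wp described above for `ReadFromStream`.

```pascal
program ReadWithSwitch;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils,
  fpsTypes, fpspreadsheet, fpsopendocument;  // unit names are an assumption

{ Hypothetical wrapper: the default-valued parameter selects the strategy. }
procedure ReadWorkbook(AWorkbook: TsWorkbook; const AFileName: string;
  AFormat: TsSpreadsheetFormat; AAllowCreatingTempFiles: Boolean = True);
var
  stream: TFileStream;
begin
  if AAllowCreatingTempFiles then
    AWorkbook.ReadFromFile(AFileName, AFormat)
  else
  begin
    // Per the discussion above, reading via a stream unzips to memory.
    stream := TFileStream.Create(AFileName, fmOpenRead or fmShareDenyNone);
    try
      AWorkbook.ReadFromStream(stream, AFormat);
    finally
      stream.Free;
    end;
  end;
end;

var
  wb: TsWorkbook;
begin
  wb := TsWorkbook.Create;
  try
    // ParamStr(1) is expected to be the path of an .ods file.
    ReadWorkbook(wb, ParamStr(1), sfOpenDocument, False);
    WriteLn('Worksheets read: ', wb.GetWorksheetCount);
  finally
    wb.Free;
  end;
end.
```

Existing callers that omit the last argument keep the old behavior, which is the main appeal of the default-valued parameter.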

wp

Hero Member
Posts: 11916

Re: FPSpreadsheet SVN: reading files .ods

« Reply #8 on: September 11, 2015, 07:27:06 pm »

Ok - I switched the readers/writers to use memory streams by default now, no more temporary files. If a program runs out of memory with this default, the new workbook option "boFileStream" can be set, which instructs the readers/writers to access data files directly by means of file streams and to create temporary files if required. This is an extension of the other workbook option "boBufStream", which uses a "buffered stream" - this is essentially a memory stream, but it swaps content to disk if the buffer becomes too small.
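A short usage sketch for the new option, assuming `Options` is a set-type property of `TsWorkbook` and that the types live in the units `fpsTypes` and `fpsopendocument`; please check both against your fpspreadsheet revision.

```pascal
program ReadHugeOds;
{$mode objfpc}{$H+}

uses
  SysUtils,
  fpsTypes, fpspreadsheet, fpsopendocument;  // unit names are an assumption

var
  wb: TsWorkbook;
begin
  wb := TsWorkbook.Create;
  try
    // Opt back in to file streams/temporary files for very large files.
    wb.Options := wb.Options + [boFileStream];
    // ParamStr(1) is expected to be the path of an .ods file.
    wb.ReadFromFile(ParamStr(1), sfOpenDocument);
    WriteLn('Worksheets read: ', wb.GetWorksheetCount);
  finally
    wb.Free;
  end;
end.
```

Without the `Options` line the same program would use the new default, i.e. memory streams only.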
