
Author Topic: TFileStream file size limit  (Read 14394 times)

PascalDragon

  • Hero Member
  • *****
  • Posts: 5446
  • Compiler Developer
Re: TFileStream file size limit
« Reply #15 on: October 05, 2021, 01:32:03 pm »
In the context of 32 bit, Sarah. That's why I suggested it to express address space.

You don't mention 32-bit here anywhere except for the PE flag (and the other messages before yours hadn't mentioned 32-bit either):

AFAIK address space is limited to high(qword), a lot..., but single reads/writes are limited to high(nativeint), which is indeed 2G.
Note that in practice it is limited to the available memory, otherwise an EOutOfMemory is thrown.
Note that for 32-bit Windows there is a PE flag - since Windows 7 - that extends the available memory to 4G, but still with the limitation for single reads/writes.
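
To make the per-call limit concrete, here is a minimal, untested sketch (the helper name and the 16 MiB chunk size are just placeholders) that keeps every single Read/Write call well below that limit by copying in bounded chunks; it only needs the Classes and SysUtils units:

Code: Pascal
// Copies TotalSize bytes from Src to Dst in bounded chunks, so no single
// Read/Write call approaches the per-call limit discussed above.
procedure CopyInChunks(Src, Dst: TStream; TotalSize: Int64);
const
  ChunkSize = 16 * 1024 * 1024; // 16 MiB per call; the exact value is a tuning choice
var
  Buffer: array of Byte;
  Remaining: Int64;
  ToMove: LongInt;
begin
  SetLength(Buffer, ChunkSize);
  Remaining := TotalSize;
  while Remaining > 0 do
  begin
    if Remaining >= ChunkSize then
      ToMove := ChunkSize
    else
      ToMove := LongInt(Remaining);
    Src.ReadBuffer(Buffer[0], ToMove);   // raises an exception on a short read
    Dst.WriteBuffer(Buffer[0], ToMove);  // raises an exception on a short write
    Dec(Remaining, ToMove);
  end;
end;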

lagprogramming

  • Sr. Member
  • ****
  • Posts: 405
Re: TFileStream file size limit
« Reply #16 on: October 17, 2021, 04:54:01 pm »
   I've stopped using the file-related classes like TFileStream and went to a lower level: FileOpen, FileCreate, FileClose, FileSeek and so on.
   The following function returns a negative number when passed a 3GB file. The problem is at FileSeek.

Code: Pascal
Function GetFileSizeUsingFileSeek(FilePath:string):Int64;
var
  FileH:THandle;
begin
  FileH:=FileOpen(FilePath,fmOpenRead);
  Result:=FileSeek(FileH,0,fsFromEnd); //Also FileSeek returns a wrong result for files greater than 2GB. In linux-x86_64 it may return negative values.
  FileClose(FileH);
end;

   In order to avoid such problems I've decided to modify the file format used by the application and to avoid the file-related classes in favour of the low-level routines, which means that for me the problem will be solved soon. To be frank, I never expected such a problem. I'm probably one of the few who uses files this large, but in the future I expect other developers to run into the same problem.

AlexTP

  • Hero Member
  • *****
  • Posts: 2384
    • UVviewsoft
Re: TFileStream file size limit
« Reply #17 on: October 17, 2021, 05:04:10 pm »
Quote
The following function returns a negative number when passed a 3GB file. The problem is at FileSeek.
Code: Pascal
Function FileSeek (Handle : THandle; FOffset, Origin: Longint) : Longint;
Function FileSeek (Handle : THandle; FOffset: Int64; Origin: Longint) : Int64;
Your code calls the 1st overload, so it returns a negative result.
Try calling the 2nd overload:

Result:=FileSeek(FileH,Int64(0),fsFromEnd);
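
For completeness, here is a possible rework of the GetFileSizeUsingFileSeek function from reply #16 along these lines - an untested sketch; the error check and try/finally are additions that were not in the original:

Code: Pascal
function GetFileSizeUsingFileSeek(const FilePath: string): Int64;
var
  FileH: THandle;
begin
  Result := -1;
  FileH := FileOpen(FilePath, fmOpenRead);
  if FileH = THandle(-1) then
    Exit; // could not open the file
  try
    // Int64(0) selects the Int64 overload of FileSeek, so files > 2 GiB
    // report their real size instead of a truncated or negative value.
    Result := FileSeek(FileH, Int64(0), fsFromEnd);
  finally
    FileClose(FileH);
  end;
end;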
« Last Edit: October 17, 2021, 05:06:46 pm by Alextp »

Thaddy

  • Hero Member
  • *****
  • Posts: 14197
  • Probably until I exterminate Putin.
Re: TFileStream file size limit
« Reply #18 on: October 17, 2021, 05:05:39 pm »
The problem is merely the declaration as a signed type instead of unsigned.
Specialize a type, not a var.

SymbolicFrank

  • Hero Member
  • *****
  • Posts: 1313
Re: TFileStream file size limit
« Reply #19 on: October 26, 2021, 11:30:21 pm »
What is the speed increase of using the largest buffer the OS will give you over that 1 GB buffer? Is that measurable?

I mean, you need at least an SSD, which writes in blocks megabytes in size. Many random reads will overflow its cache memory, because the individual blocks are too small.

But that's just the thing SSDs excel at: a very large number of random writes. Because there is no mechanical movement, there is no need to wait for a head to slowly travel to the right location.

For the fastest possible, sustained sequential write of a large file, the main thing is to kill all other applications that read or write files, or even use the PCI bus. Kill all other running programs.

And at that point, it doesn't matter how large your buffer is, because the tiny overhead is washed away by having all available bandwidth for yourself. Operating systems are quite good at optimizing the available resources.

PascalDragon

  • Hero Member
  • *****
  • Posts: 5446
  • Compiler Developer
Re: TFileStream file size limit
« Reply #20 on: October 27, 2021, 09:33:08 am »
What is the speed increase of using the largest buffer the OS will give you over that 1 GB buffer? Is that measurable?

Please note that this will depend on the hardware. If the device is connected via USB, for example, the transfer size is capped at 1 MiB or at most 2 MiB, because larger sizes tend not to be supported that well by the hardware. ATA, SCSI and NVMe also have limits. The OS will handle that transparently for you, of course, but this might result in copying and thus decreased performance compared to manually splitting the transfers (of course, one then needs to know the best transfer size...).
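
Whether the buffer size still matters on a given setup is easy to check directly. A rough, purely illustrative sketch (file name, test size and buffer sizes are arbitrary) that times writes for a few buffer sizes using GetTickCount64 might look like this; note that OS write-back caching will mask part of the difference unless the file is flushed to the device:

Code: Pascal
// Needs Classes (TFileStream) and SysUtils (GetTickCount64).
procedure CompareBufferSizes(const FileName: string);
const
  TestSize = 512 * 1024 * 1024; // 512 MiB written per run
  Sizes: array[0..3] of LongInt = (64 * 1024, 1024 * 1024, 8 * 1024 * 1024, 64 * 1024 * 1024);
var
  Buffer: array of Byte;
  fs: TFileStream;
  i: Integer;
  Written: Int64;
  T0: QWord;
begin
  for i := Low(Sizes) to High(Sizes) do
  begin
    SetLength(Buffer, Sizes[i]);
    fs := TFileStream.Create(FileName, fmCreate);
    try
      T0 := GetTickCount64;
      Written := 0;
      while Written < TestSize do
      begin
        fs.WriteBuffer(Buffer[0], Sizes[i]);
        Inc(Written, Sizes[i]);
      end;
      WriteLn(Sizes[i] div 1024, ' KiB buffer: ', GetTickCount64 - T0, ' ms');
    finally
      fs.Free;
    end;
  end;
end;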

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11382
  • FPC developer.
Re: TFileStream file size limit
« Reply #21 on: October 27, 2021, 10:12:13 am »
What is the speed increase of using the largest buffer the OS will give you over that 1 GB buffer? Is that measurable?

I mean, you need at least an SSD, which writes in blocks megabytes in size. Many random reads will overflow its cache memory, because the individual blocks are too small.

Most SSD DRAM caches are in the single-GIGAbyte range too, at best. And some of those many small writes might vacate the cache faster than one big block would (depending on the firmware).

post edited, one gigabyte as magnitude, not megabyte of course
« Last Edit: October 27, 2021, 02:23:45 pm by marcov »

SymbolicFrank

  • Hero Member
  • *****
  • Posts: 1313
Re: TFileStream file size limit
« Reply #22 on: October 27, 2021, 11:13:58 am »
So, the difference between one and many IOPS? Then again, the OS will try to use all free memory as disk cache as well. What is faster: a single large buffer managed by the application with only a small cache managed by the OS, or the other way around? Does the OS limit the cache size for a single IOP?

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11382
  • FPC developer.
Re: TFileStream file size limit
« Reply #23 on: October 27, 2021, 02:34:13 pm »
So, the difference between one and many IOPS?

IOPS where? Between application and kernel (in other words, syscalls), between kernel and device controller, or between controller and device?

Quote
Then again, the OS will try to use all free memory as disk cache as well.

But if it must hold part of a large buffer until the controller is ready for it, that is waste too, since the already-written part could have been returned to the OS.

Quote
What is faster: a single large buffer managed by the application with only a small cache managed by the OS, or the other way around? Does the OS limit the cache size for a single IOP?

For the last bit: it seems Win7 at least had a 2GB limit: https://community.osr.com/discussion/193831/limitations-on-dma-transfer-size

In general, the rule of thumb is that reducing the number of IOPS only improves performance logarithmically, while extremely large buffers might hit all kinds of small inefficiencies. Note also that many DMA controllers (even on embedded ARMs and MIPS parts like the PIC32) have scatter/gather DMA, which allows multiple buffers to be written in one operation.

Also, most IOPS benchmarks are for large (high core/socket count) transactional systems with multiple threads doing independent (unrelated) I/O, and such benchmarks scale badly to synchronous (single-threaded) I/O, because all the time spent waiting for the device to complete a transaction is lost IOPS.

SymbolicFrank

  • Hero Member
  • *****
  • Posts: 1313
Re: TFileStream file size limit
« Reply #24 on: October 27, 2021, 03:06:59 pm »
In other words: do use a buffer and don't write each byte individually, but leave the rest to the OS and hardware?

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11382
  • FPC developer.
Re: TFileStream file size limit
« Reply #25 on: October 27, 2021, 03:19:26 pm »
In other words: do use a buffer and don't write each byte individually, but leave the rest to the OS and hardware?

Do use a buffer, but 32 KB to 1 MB is quite alright for most things. Maybe if you write disk cloning software larger buffers are more worthwhile, but even there the returns diminish as the buffer gets larger. I can't really imagine 1 GB buffers adding much.

SymbolicFrank

  • Hero Member
  • *****
  • Posts: 1313
Re: TFileStream file size limit
« Reply #26 on: October 27, 2021, 05:40:35 pm »
Agreed.

PascalDragon

  • Hero Member
  • *****
  • Posts: 5446
  • Compiler Developer
Re: TFileStream file size limit
« Reply #27 on: October 28, 2021, 09:21:29 am »
Maybe if you write disk cloning software larger buffers are more worthwhile, but even there the returns diminish as the buffer gets larger.

Our main product at work is disk cloning software, and we see that decrease already with 4 or 8 MiB buffers.

lagprogramming

  • Sr. Member
  • ****
  • Posts: 405
Re: TFileStream file size limit
« Reply #28 on: January 22, 2022, 02:57:43 pm »
   I expect this to be related to the subject.
   TMemoryStream.SaveToFile silently writes an empty file if the stream size is greater than the maximum value of a longint, at least in a linux-x86_64 environment.

Code: Pascal
procedure WriteGarbage(const FileAddr:string; const GarbageSize:longword);
var s:TMemoryStream;
begin
  s:=TMemoryStream.Create;
  s.SetSize(GarbageSize);
  s.SaveToFile(FileAddr);
  s.Free;
end;

procedure TForm1.Button1Click(Sender: TObject);
begin
  writegarbage('garbagelongint.bin', high(longint)); //Works OK!
  writegarbage('garbagelongword.bin', high(longint)+1024*1024*1024); //FAILURE!!!
end;
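
Until that is fixed, one possible workaround is to bypass SaveToFile entirely. This is an untested sketch (the helper name and the chunk size are made up) that writes the stream's memory through a TFileStream in pieces that stay well below high(longint) per call:

Code: Pascal
procedure SaveMemoryStreamToFile(ms: TMemoryStream; const FileName: string);
const
  ChunkSize = 64 * 1024 * 1024; // 64 MiB per WriteBuffer call
var
  fs: TFileStream;
  p: PByte;
  Remaining: Int64;
  ToWrite: LongInt;
begin
  fs := TFileStream.Create(FileName, fmCreate);
  try
    p := ms.Memory;        // walk the stream's memory block directly
    Remaining := ms.Size;
    while Remaining > 0 do
    begin
      if Remaining >= ChunkSize then
        ToWrite := ChunkSize
      else
        ToWrite := LongInt(Remaining);
      fs.WriteBuffer(p^, ToWrite);
      Inc(p, ToWrite);
      Dec(Remaining, ToWrite);
    end;
  finally
    fs.Free;
  end;
end;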

ASerge

  • Hero Member
  • *****
  • Posts: 2222
Re: TFileStream file size limit
« Reply #29 on: January 22, 2022, 03:20:47 pm »
I think this is a question for a separate topic. On Windows, a 3 GB file of zeros is written.

 
