
Author Topic: TFileStream, read files larger than 2gig  (Read 3451 times)

MaartenJB

  • Full Member
  • ***
  • Posts: 112
TFileStream, read files larger than 2gig
« on: June 15, 2019, 04:50:11 pm »
Hi,

I'm using TFileStream, and when I read beyond 2 GB I get garbage. Is there a limit with TFileStream? If so, is there a way to read data past the 2 GB mark?

I tried it on Windows 10 (NTFS) and Linux (ext4).

Best regards,

Maarten

howardpc

  • Hero Member
  • *****
  • Posts: 4144
Re: TFileStream, read files larger than 2gig
« Reply #1 on: June 15, 2019, 04:59:35 pm »
Everything has its limit.
TFileStream Size is a Longint. But since a stream can only have positive Size values, the maximum Size possible is 21,447,483,647.
However, any file size that approaches close to that absolute limit is probably unwise.

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11383
  • FPC developer.
Re: TFileStream, read files larger than 2gig
« Reply #2 on: June 15, 2019, 05:11:00 pm »
Are you trying to read it in one read() command ? Or multiple?

Afaik you can use larger files, but only read up to 2GB at a time on Windows.
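
A minimal sketch of reading a large file in several Read calls rather than one. As far as I know, TStream.Read takes its byte count as a Longint, so a single call cannot ask for more than about 2 GiB anyway. The chunk size and the command-line file name are assumptions chosen for illustration; nothing here comes from the original poster's code.

```pascal
program ChunkedRead;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

const
  ChunkSize = 1024 * 1024; // 1 MiB per Read call; an arbitrary choice

var
  Stream: TFileStream;
  Buffer: array of Byte;
  Got: LongInt;   // bytes returned by one Read call
  Total: Int64;   // running total; 64-bit so it can pass 2 GiB
begin
  if ParamCount < 1 then
  begin
    WriteLn('usage: ChunkedRead <file>');
    Halt(1);
  end;
  SetLength(Buffer, ChunkSize);
  Stream := TFileStream.Create(ParamStr(1), fmOpenRead or fmShareDenyWrite);
  try
    Total := 0;
    repeat
      // Read returns how many bytes were actually delivered (0 at end of file)
      Got := Stream.Read(Buffer[0], ChunkSize);
      if Got > 0 then
        Inc(Total, Got);
    until Got <= 0;
    WriteLn('bytes read: ', Total, ', Stream.Size reports: ', Stream.Size);
  finally
    Stream.Free;
  end;
end.
```

If the two numbers printed at the end differ, that would point at the read path; if they match on a file larger than 2 GB, the problem is more likely in how offsets are computed elsewhere.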

MaartenJB

  • Full Member
  • ***
  • Posts: 112
Re: TFileStream, read files larger than 2gig
« Reply #3 on: June 15, 2019, 05:21:56 pm »
Thanks for replying.

@howard, but I'm not even close to that limit, just at a 10th, but maybe it's indeed better to split the file to a maximum of 2gig.

@marcov, my file contains structures (packed records), I read them one by one.
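
For reference, a sketch of what "reading packed records one by one" could look like. The record layout TSample is hypothetical, since the real structure was not posted; the point is only that the loop itself never needs a 32-bit offset, and that the record counter is an Int64.

```pascal
program ReadRecords;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

type
  // Hypothetical record layout; the real one was not posted in the thread.
  TSample = packed record
    Id: LongWord;
    Value: Double;
  end;

var
  Stream: TFileStream;
  Rec: TSample;
  Count: Int64;  // 64-bit so the counter cannot overflow on huge files
begin
  if ParamCount < 1 then
  begin
    WriteLn('usage: ReadRecords <file>');
    Halt(1);
  end;
  Stream := TFileStream.Create(ParamStr(1), fmOpenRead or fmShareDenyWrite);
  try
    Count := 0;
    // Stop on the first short read: end of file or a truncated last record.
    while Stream.Read(Rec, SizeOf(Rec)) = SizeOf(Rec) do
      Inc(Count);
    WriteLn('records read: ', Count, ', final position: ', Stream.Position);
  finally
    Stream.Free;
  end;
end.
```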



marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11383
  • FPC developer.
Re: TFileStream, read files larger than 2gig
« Reply #4 on: June 15, 2019, 06:00:55 pm »
It's totally against my knowledge, so that would require retesting. And doubly weird that it is on multiple OSes.

Afaik linearly reading files that are >4GB doesn't need much support from the RTL; the reads are just passed to the OS, so it is weird you have problems.

I do assume you are at least using the last release (3.0.4). Maybe you can prepare something we can test?
« Last Edit: June 15, 2019, 06:02:27 pm by marcov »

lucamar

  • Hero Member
  • *****
  • Posts: 4219
Re: TFileStream, read files larger than 2gig
« Reply #5 on: June 15, 2019, 06:11:03 pm »
Everything has its limit.
TFileStream Size is a Longint. But since a stream can only have positive Size values, the maximum Size possible is 21,447,483,647.

Huh... don't you mean rather 2,147,483,647?
Seems an extra "4" inserted itself from somewhere :)
Turbo Pascal 3 CP/M - Amstrad PCW 8256 (512 KB !!!) :P
Lazarus/FPC 2.0.8/3.0.4 & 2.0.12/3.2.0 - 32/64 bits on:
(K|L|X)Ubuntu 12..18, Windows XP, 7, 10 and various DOSes.

SymbolicFrank

  • Hero Member
  • *****
  • Posts: 1313
Re: TFileStream, read files larger than 2gig
« Reply #6 on: June 15, 2019, 06:32:42 pm »
If your CPU and OS are 64-bits, you might want to memory-map the file.

bylaardt

  • Sr. Member
  • ****
  • Posts: 309
Re: TFileStream, read files larger than 2gig
« Reply #7 on: June 15, 2019, 06:33:50 pm »
Everything has its limit.
TFileStream Size is a Longint.
Actually it is a SizeInt:
Code: Pascal
    {$ifdef CPU64}
      SizeInt = Int64;
    ...
    {$ifdef CPU32}
      SizeInt = Longint;

In theory it may work on 64-bit systems.
In that case the theoretical limit (I don't have enough memory to test it) would be 8 EiB - 1 byte = 9,223,372,036,854,775,807.
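
The limits quoted in this thread can be checked with a few lines. This sketch only prints the size of SizeInt on the current target and the upper bounds of the two signed types under discussion; it says nothing about which type TStream.Size actually uses.

```pascal
program IntLimits;
{$mode objfpc}{$H+}

begin
  // 4 on a 32-bit target, 8 on a 64-bit target
  WriteLn('SizeOf(SizeInt) = ', SizeOf(SizeInt), ' bytes');
  // Upper bound of a 32-bit signed integer: 2147483647
  WriteLn('High(LongInt)   = ', High(LongInt));
  // Upper bound of a 64-bit signed integer: 9223372036854775807
  WriteLn('High(Int64)     = ', High(Int64));
end.
```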

Thaddy

  • Hero Member
  • *****
  • Posts: 14204
  • Probably until I exterminate Putin.
Re: TFileStream, read files larger than 2gig
« Reply #8 on: June 15, 2019, 06:43:05 pm »
If that's the case, it is a bug. It should be High(NativeUInt) in both cases; otherwise you can only address half the memory.
So even on 64-bit you come up short.
« Last Edit: June 15, 2019, 06:45:45 pm by Thaddy »
Specialize a type, not a var.

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11383
  • FPC developer.
Re: TFileStream, read files larger than 2gig
« Reply #9 on: June 15, 2019, 06:44:38 pm »
But where do you use _any_ of such integers when linearly reading records ?

I can imagine some *nix needing O_LARGEFILE or so, but the same issue on multiple OSes at the same time? We need to see code.

Thaddy

  • Hero Member
  • *****
  • Posts: 14204
  • Probably until I exterminate Putin.
Re: TFileStream, read files larger than 2gig
« Reply #10 on: June 15, 2019, 06:47:16 pm »
I consider it a bug, Marco. In TStream.
« Last Edit: June 15, 2019, 06:50:02 pm by Thaddy »

ASerge

  • Hero Member
  • *****
  • Posts: 2223
Re: TFileStream, read files larger than 2gig
« Reply #11 on: June 15, 2019, 09:36:53 pm »
I still don't understand what the problem is. The Size and Position properties of TStream have long been of type Int64. Yes, you won't be able to read the whole file at once on 32-bit systems, simply because the address space doesn't allow it. But I don't see any problem with reading it in parts. Clearly, on x64 it is possible to read a big fragment at once.
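
A small test along these lines might settle the question. It is only a sketch: the scratch file name is made up, and it assumes a filesystem where writing one byte at a 3 GiB offset is cheap (sparse files on ext4, for example). On other filesystems it may really allocate and zero 3 GiB, so run it with care.

```pascal
program SeekPast2G;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

const
  TestName = 'bigfile.tmp';                       // hypothetical scratch file
  Target: Int64 = Int64(3) * 1024 * 1024 * 1024;  // 3 GiB, beyond High(LongInt)

var
  Stream: TFileStream;
  B: Byte;
begin
  // Write a single marker byte at the 3 GiB offset.
  Stream := TFileStream.Create(TestName, fmCreate);
  try
    Stream.Position := Target;
    Stream.WriteByte($AB);
  finally
    Stream.Free;
  end;

  // Reopen, seek to the same offset and read the marker back.
  Stream := TFileStream.Create(TestName, fmOpenRead);
  try
    WriteLn('Size = ', Stream.Size);
    Stream.Position := Target;
    B := Stream.ReadByte;
    WriteLn('byte read back = ', B);
  finally
    Stream.Free;
    DeleteFile(TestName);
  end;
end.
```

If Size and Position are 64-bit all the way down, Size should come out as Target + 1 and the byte read back should be 171 ($AB). Any other result would be worth reporting with the exact FPC version and platform.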

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11383
  • FPC developer.
Re: TFileStream, read files larger than 2gig
« Reply #12 on: June 15, 2019, 09:45:13 pm »
I consider it a bug, Marco. In TStream.

I haven't seen anything yet, so I wonder how you can conclude that.

SymbolicFrank

  • Hero Member
  • *****
  • Posts: 1313
Re: TFileStream, read files larger than 2gig
« Reply #13 on: June 15, 2019, 10:38:34 pm »
I still don't understand what the problem is. The Size and Position properties of TStream have long been of type Int64. Yes, you won't be able to read the whole file at once on 32-bit systems, simply because the address space doesn't allow it. But I don't see any problem with reading it in parts. Clearly, on x64 it is possible to read a big fragment at once.

Even on ancient Windows, file sizes have been unsigned 64-bit integers. If you throw a single signed 32-bit integer into the mix somewhere, bad stuff happens.

engkin

  • Hero Member
  • *****
  • Posts: 3112
Re: TFileStream, read files larger than 2gig
« Reply #14 on: June 15, 2019, 10:41:37 pm »
howard mentions: 21,447,483,647
MaartenJB says 10th of that: ~ 2,144,748,364
lucamar corrects howard's number: 2,147,483,647
crystal ball: predicts that MaartenJB is using Position or Seek with the wrong variable type. Something along these lines:
Code: Pascal
var
  SomePos: Integer; { 32bit system }
...
begin
...
  FileStream.Position := SomePos; { or Seek }

The variable is a 4-byte signed type, which limits the range to 2 GB on both operating systems.

@MaartenJB, use Int64. And pay some attention to the warnings/hints the compiler gave you.
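
To illustrate the suspected cause, here is a sketch of how an offset held in a 32-bit variable goes wrong once it passes 2 GB. The record size and record index are hypothetical values picked so that the product lands at roughly 2.4 GiB. Overflow and range checks are switched off on purpose so the wrap-around is visible instead of raising a runtime error.

```pascal
program OffsetOverflow;
{$mode objfpc}{$H+}
{$Q-}{$R-} // checks off on purpose, so the truncation shows up as a wrong value

const
  RecSize = 64;         // hypothetical record size in bytes
  RecIndex = 40000000;  // hypothetical record number

var
  Idx: LongInt;
  Pos32: LongInt;  // 4-byte signed offset, as in the snippet above
  Pos64: Int64;    // 8-byte signed offset
begin
  Idx := RecIndex;
  // The true offset does not fit in 32 bits, so this wraps to a negative value.
  Pos32 := Idx * RecSize;
  // Widen one operand first so the multiplication is done in 64 bits.
  Pos64 := Int64(Idx) * RecSize;
  WriteLn('32-bit offset: ', Pos32);
  WriteLn('64-bit offset: ', Pos64);  // 2560000000
end.
```

Assigning the 32-bit value to FileStream.Position would then seek to the wrong place, which would look like reading garbage. Declaring every offset, index and byte count as Int64 avoids this on both 32-bit and 64-bit targets.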

 
