
Author Topic: Strange bytes when using TFileStream on Linux  (Read 528 times)

dogriz

  • Full Member
  • ***
  • Posts: 119
    • Tech blog - Delphi, Lazarus, Firebird, Windows, Linux, Android...
Strange bytes when using TFileStream on Linux
« on: September 25, 2020, 09:45:56 am »
AssignFile/Rewrite saves text to a file exactly as given in the code below.
When TFileStream is used to save the same text to a file, it adds 4 bytes at the beginning (hex: 0C 00 00 00).
Code: Pascal
program writeFileTest;
{$mode objfpc}{$H+}
uses
  Classes, sysutils;
var
  fs: TFileStream;
  s: String;
  tf: TextFile;
begin
  s := 'Hello world.';
  AssignFile(tf, ExtractFilePath(ParamStr(0)) + 'file1.txt');
  Rewrite(tf);
  Write(tf, s);
  CloseFile(tf);
  fs := TFileStream.Create(ExtractFilePath(ParamStr(0)) + 'file2.txt', fmCreate);
  fs.WriteAnsiString(s);
  fs.Free;
end.

Code:
file1.txt HEX: 48 65 6C 6C 6F 20 77 6F 72 6C 64 2E
file2.txt HEX: 0C 00 00 00 48 65 6C 6C 6F 20 77 6F 72 6C 64 2E


My first guess is that this is some kind of encoding information for text files on Linux, but I'm not sure.
The "file" command in a console gives me the following information:

Code:
$ file -bi file1.txt
text/plain; charset=us-ascii

Code:
$ file -bi file2.txt
application/octet-stream; charset=binary

Is there a way to change how TFileStream saves files, so that the result is the same as with AssignFile/Rewrite?
FPC 3.0.4
Lazarus 2.0.6
Debian x86_64, arm

hansotten

  • Jr. Member
  • **
  • Posts: 59
Re: Strange bytes when using TFileStream on Linux
« Reply #1 on: September 25, 2020, 10:02:29 am »
It looks like the length is stored in the stream as well: 0C 00 00 00 is 12 as a little-endian 4-byte integer, and the text has 12 characters.
Both shortstrings and ansistrings carry their length; see

https://wiki.lazarus.freepascal.org/Character_and_string_types

TFileStream is not specific to text files, so WriteAnsiString stores the string together with its length. The result is a binary file without line endings.

Write on a text file stores the string as text. Because you did not use WriteLn, no line ending is written to the text file.

If you want a text file from TFileStream, you should feed it the characters and the line endings yourself, not just a string.


 
« Last Edit: September 25, 2020, 10:15:07 am by hansotten »

lucamar

  • Hero Member
  • *****
  • Posts: 3227
Re: Strange bytes when using TFileStream on Linux
« Reply #2 on: September 25, 2020, 10:14:10 am »
It's written in the docs about TStream (the base class):

Quote
WriteAnsiString writes the AnsiString S (i.e. 4 bytes) to the stream. This is a utility function which simply calls the Write function. The ansistring is written as a 4 byte length specifier, followed by the ansistring's content. The ansistring can be read from the stream using the ReadAnsiString function.

To emulate Assign/Rewrite/Write you have to use Stream.Write or Stream.WriteBuffer:

Code: Pascal
  fs := TFileStream.Create(ExtractFilePath(ParamStr(0)) + 'file2.txt', fmCreate);
  fs.WriteBuffer(s[1], Length(s));
  fs.Free;
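
To make the difference between the two calls visible, here is a minimal sketch that writes the same string both ways into a TMemoryStream and prints the resulting sizes. The program name, the variable names and the use of a memory stream instead of a file are assumptions made for illustration. It also assumes the 4-byte length is stored in native byte order, which is what the hex dump in the first post suggests for x86_64.

```pascal
program LengthPrefixDemo;
{$mode objfpc}{$H+}
uses
  Classes, SysUtils;
var
  ms: TMemoryStream;
  s: String;
  len: LongInt;
begin
  s := 'Hello world.';
  ms := TMemoryStream.Create;
  try
    // Length-prefixed form: a 4-byte length, then the characters.
    ms.WriteAnsiString(s);
    WriteLn('WriteAnsiString size: ', ms.Size);

    // Read the prefix back as a 4-byte integer (native byte order assumed).
    ms.Position := 0;
    ms.ReadBuffer(len, SizeOf(len));
    WriteLn('Length prefix: ', len);

    // ReadAnsiString consumes the prefix and returns the stored string.
    ms.Position := 0;
    WriteLn('Round trip: ', ms.ReadAnsiString);

    // Raw form: only the characters, no prefix.
    // The guard avoids indexing s[1] on an empty string.
    ms.Clear;
    if Length(s) > 0 then
      ms.WriteBuffer(s[1], Length(s));
    WriteLn('WriteBuffer size: ', ms.Size);
  finally
    ms.Free;
  end;
end.
```

For the 12-character string used in this thread, the two sizes should come out as 16 and 12 bytes, matching the hex dumps of file2.txt and file1.txt.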
« Last Edit: September 25, 2020, 10:19:53 am by lucamar »
Turbo Pascal 3 CP/M - Amstrad PCW 8256 (512 KB !!!) :P
Lazarus/FPC 2.0.8/3.0.4 & 2.0.10/3.2.0 - 32/64 bits on:
(K|L|X)Ubuntu 12..18, Windows XP, 7, 10 and various DOSes.

rvk

  • Hero Member
  • *****
  • Posts: 4395
Re: Strange bytes when using TFileStream on Linux
« Reply #3 on: September 25, 2020, 10:16:33 am »
You could do this:

Code: Pascal
fs := TFileStream.Create(ExtractFilePath(ParamStr(0)) + 'file2.txt', fmCreate);
fs.WriteBuffer(Pointer(s)^, length(s));
fs.Free;

Note: you used Write(tf, s), so this example with WriteBuffer also writes only the string, without a line ending (CR+LF on Windows, LF on Linux) after it.
Reading would be more difficult because you can't use ReadLn and would have to recognize the line endings yourself.

If you want to handle longer files with strings you might want to use TStringList, TStringStream etc.
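
As a rough sketch of the TStringList route, the following writes two lines to a file and reads them back. The file name file3.txt and the variable names are hypothetical, chosen only for this example. TStringList takes care of the line endings on both sides, and no length prefix ends up in the file.

```pascal
program StringListDemo;
{$mode objfpc}{$H+}
uses
  Classes, SysUtils;
var
  lines: TStringList;
  fileName: String;
  i: Integer;
begin
  // Hypothetical file name, chosen for this example.
  fileName := ExtractFilePath(ParamStr(0)) + 'file3.txt';
  lines := TStringList.Create;
  try
    lines.Add('Hello world.');
    lines.Add('Second line.');
    // Writes plain text with line endings; no length prefix is stored.
    lines.SaveToFile(fileName);

    lines.Clear;
    // Splits the file content at the line endings again.
    lines.LoadFromFile(fileName);
    for i := 0 to lines.Count - 1 do
      WriteLn(i, ': ', lines[i]);
  finally
    lines.Free;
  end;
end.
```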
« Last Edit: September 25, 2020, 10:18:51 am by rvk »

dogriz

  • Full Member
  • ***
  • Posts: 119
    • Tech blog - Delphi, Lazarus, Firebird, Windows, Linux, Android...
Re: Strange bytes when using TFileStream on Linux
« Reply #4 on: September 25, 2020, 10:48:59 am »
Thanks all for the clarification. TFileStream.WriteBuffer gives me what I expected.

hansotten

  • Jr. Member
  • **
  • Posts: 59
Re: Strange bytes when using TFileStream on Linux
« Reply #5 on: September 25, 2020, 11:18:45 am »
Quote
Thanks all for clarification. TFileStream.WriteBuffer gives me what I expected.

fs.WriteBuffer(s[1], Length(s));

Yes, this way you skip writing the length. But the result is not a text file.
You still have to think about line endings if what you want is the equivalent of WriteLn to a text file.

rvk

  • Hero Member
  • *****
  • Posts: 4395
Re: Strange bytes when using TFileStream on Linux
« Reply #6 on: September 25, 2020, 11:25:53 am »
writeln() equivalent would be

Code: Pascal
s := s + sLineBreak;
fs.WriteBuffer(s[1], length(s));

Reading would be much harder  :D
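
One possible way to read such a file back, sketched under the assumption that it is small enough to hold in memory: load the whole stream into a string with ReadBuffer, then let TStringList split it at the line endings. The program and variable names are made up for this example, and this is only one of several ways to do it.

```pascal
program ReadRawDemo;
{$mode objfpc}{$H+}
uses
  Classes, SysUtils;
var
  fs: TFileStream;
  s, fileName: String;
  lines: TStringList;
begin
  fileName := ExtractFilePath(ParamStr(0)) + 'file2.txt';

  // Write one line as raw bytes, with an explicit line ending.
  s := 'Hello world.' + sLineBreak;
  fs := TFileStream.Create(fileName, fmCreate);
  try
    fs.WriteBuffer(s[1], Length(s));
  finally
    fs.Free;
  end;

  // Read the whole file back into a string of matching size.
  fs := TFileStream.Create(fileName, fmOpenRead or fmShareDenyWrite);
  try
    SetLength(s, fs.Size);
    if Length(s) > 0 then
      fs.ReadBuffer(s[1], Length(s));
  finally
    fs.Free;
  end;

  // Let TStringList split the text at the line endings.
  lines := TStringList.Create;
  try
    lines.Text := s;
    WriteLn('Lines read: ', lines.Count);
    WriteLn('First line: ', lines[0]);
  finally
    lines.Free;
  end;
end.
```

For large files, reading in chunks and scanning for line endings would avoid holding everything in memory, at the cost of more code.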

lucamar

  • Hero Member
  • *****
  • Posts: 3227
Re: Strange bytes when using TFileStream on Linux
« Reply #7 on: September 25, 2020, 01:00:12 pm »
Quote
You still have to think about line endings if you want the equivalent of writeln to a textfile if that is what you want.

It's still a text file; it just has a single never-ending-story line. That is, among other reasons, why TStream.WriteAnsiString() writes the string's length before its contents and TStream.ReadAnsiString() reads it back first: so that the string can be stored and restored exactly as it was.

PascalDragon

  • Hero Member
  • *****
  • Posts: 2424
  • Compiler Developer
Re: Strange bytes when using TFileStream on Linux
« Reply #8 on: September 25, 2020, 03:12:16 pm »
Quote
Yes, you skip the length being written this way. But the result is not a text file.
You still have to think about line endings if you want the equivalent of writeln to a textfile if that is what you want.

Of course the result is a text file. If I save a file in Notepad that contains only Hello World without any trailing line ending, it will have the same data. (If you want to write multi-line text, however, you need to make sure that line endings are written as well.)

 
