Recent

Author Topic: Potential compiler bug with fpc_ansistr_decr_ref  (Read 3567 times)

ASerge

  • Hero Member
  • *****
  • Posts: 2222
Re: Potential compiler bug with fpc_ansistr_decr_ref
« Reply #15 on: April 07, 2020, 12:20:08 am »
According to the example, the main mistake of the topic's author is trying to combine incompatible things. Read (or write) at a time an entry containing of managed types. This is absurd. Managed types are pointers to memory. Reading and writing them to the stream is pointless. Therefore, for data whose length is unknown, you can only use sequential byte-by-byte reads, and not the entire structure at a time. It is clear that in this case, the packed declaration for records is meaningless.
Need to change the algorithm. Read and write record fields by one of the stream, byte-by-byte, or built-in TStream functions that have already been mentioned, such as ReadString, ReadDWord, etc.

jamie

  • Hero Member
  • *****
  • Posts: 6090
Re: Potential compiler bug with fpc_ansistr_decr_ref
« Reply #16 on: April 07, 2020, 12:25:02 am »
I kind of see what the attempt was I hope..

The code I offered should work in a variant type of data stream, that is, it checks for both the send of the stream via the position and #0 marker... So ether way, if the end of the stream comes up before the #0 does, it'll terminate with what ever it has..

 The proxy function should be testing for the stream.position prior to attempting to read a Null string in from  a stream.

 This could be coming from a un-uniformed file format which is very common..
The only true wisdom is knowing you know nothing

Superdisk

  • Jr. Member
  • **
  • Posts: 73
Re: Potential compiler bug with fpc_ansistr_decr_ref
« Reply #17 on: April 07, 2020, 04:56:55 am »
According to the example, the main mistake of the topic's author is trying to combine incompatible things. Read (or write) at a time an entry containing of managed types. This is absurd. Managed types are pointers to memory. Reading and writing them to the stream is pointless. Therefore, for data whose length is unknown, you can only use sequential byte-by-byte reads, and not the entire structure at a time. It is clear that in this case, the packed declaration for records is meaningless.
Need to change the algorithm. Read and write record fields by one of the stream, byte-by-byte, or built-in TStream functions that have already been mentioned, such as ReadString, ReadDWord, etc.

If I'm not mistaken, I didn't try to do a stream read into managed variables. What do you mean by this?

My problem turned out to be that I was performing TStream.Read into a dynamic array, but I only specified the variable of the array itself and so the pointer (and following memory) was destroyed.

PascalDragon

  • Hero Member
  • *****
  • Posts: 5446
  • Compiler Developer
Re: Potential compiler bug with fpc_ansistr_decr_ref
« Reply #18 on: April 07, 2020, 09:46:54 am »
Well, The TStream.readansistring relies on what I explained, Sven. Look at its implementation. (Be a bit kind, I test this...and it is not nonsense)

Where does the TStream.ReadAnsiString come from, suddenly? It reads first the length and then the string data. This is not what Superdisk needs. They need to read a NUL-terminated string which can only be read by reading Byte-by-Byte until a NUL-character. >:(

Note also that in your code this line:
Code: [Select]
  if B = 0 then Exit(Result);might produce unexpected results. Simply doing:
Code: [Select]
  if B = 0 then Exit;should be enough.

While it is unusual and unneeded it won't lead to unexpected results.

According to the example, the main mistake of the topic's author is trying to combine incompatible things. Read (or write) at a time an entry containing of managed types. This is absurd. Managed types are pointers to memory. Reading and writing them to the stream is pointless. Therefore, for data whose length is unknown, you can only use sequential byte-by-byte reads, and not the entire structure at a time. It is clear that in this case, the packed declaration for records is meaningless.
Need to change the algorithm. Read and write record fields by one of the stream, byte-by-byte, or built-in TStream functions that have already been mentioned, such as ReadString, ReadDWord, etc.

Superdisk is reading character-by-character and appends each to a managed string. There is nothing wrong with that.

My problem turned out to be that I was performing TStream.Read into a dynamic array, but I only specified the variable of the array itself and so the pointer (and following memory) was destroyed.

So is your problem solved now or do you still have it?

Bart

  • Hero Member
  • *****
  • Posts: 5275
    • Bart en Mariska's Webstek
Re: Potential compiler bug with fpc_ansistr_decr_ref
« Reply #19 on: April 07, 2020, 10:56:20 am »
My problem turned out to be that I was performing TStream.Read into a dynamic array, but I only specified the variable of the array itself and so the pointer (and following memory) was destroyed.

And you did a Read with Size=0.

Bart

BrunoK

  • Sr. Member
  • ****
  • Posts: 452
  • Retired programmer
Re: Potential compiler bug with fpc_ansistr_decr_ref
« Reply #20 on: April 07, 2020, 11:54:22 am »
And you did a Read with Size=0.
Without digging into that, I would consider that reading a count of 0 bytes should be reading nothing and shouldn't raise an exception. But that's a different matter.

ASerge

  • Hero Member
  • *****
  • Posts: 2222
Re: Potential compiler bug with fpc_ansistr_decr_ref
« Reply #21 on: April 07, 2020, 02:37:15 pm »
A slightly safer version of the program:
Code: Pascal  [Select][+][-]
  1. program rgb2gbdk;
  2.  
  3. {$mode objfpc}{$H+}
  4.  
  5. uses
  6.   Classes, SysUtils;
  7.  
  8. type
  9.   TRSymbol = record
  10.     Name: string;
  11.     SymType: Byte;
  12.     FileName: string;
  13.     LineNum, SectionID, Value: LongInt;
  14.   end;
  15.  
  16.   TRSection = record
  17.     Name: string;
  18.     Size: LongInt;
  19.     SectType: Byte;
  20.     Org: LongInt;
  21.     Bank: LongInt;
  22.     Align: LongInt;
  23.     Data: array of Byte;
  24.     NumPatches: LongInt;
  25.   end;
  26.  
  27.   TRObjFixPart = packed record
  28.     ID: array[0..3] of AnsiChar;
  29.     RevisionNumber: LongInt;
  30.     NumberOfSymbols: LongInt;
  31.     NumberOfSections: LongInt;
  32.   end;
  33.  
  34.   TRObj = record
  35.     FixPart: TRObjFixPart;
  36.     Symbols: array of TRSymbol;
  37.     Sections: array of TRSection;
  38.   end;
  39.  
  40. const
  41.   //WRAM0 = 0;
  42.   //VRAM = 1;
  43.   ROMX = 2;
  44.   ROM0 = 3;
  45.   //HRAM = 4;
  46.   //WRAMX = 5;
  47.   //SRAM = 6;
  48.   //OAM = 7;
  49.  
  50. function ReadNullTerm(S: TStream): string; // really
  51. var
  52.   B: Byte;
  53. begin
  54.   Result := '';
  55.   repeat
  56.     B := S.ReadByte;
  57.     if B = 0 then
  58.       Break;
  59.     Result += Char(B);
  60.   until False;
  61. end;
  62.  
  63. function ReadRSymbol(S: TStream): TRSymbol;
  64. begin
  65.   Result.Name := ReadNullTerm(S);
  66.   Result.SymType := S.ReadByte;
  67.   Result.FileName := '';
  68.   Result.LineNum := 0;
  69.   Result.SectionID := 0;
  70.   Result.Value := 0;
  71.   if (Result.SymType and $7F) <> 1 then
  72.   begin
  73.     Result.FileName := ReadNullTerm(S);
  74.     Result.LineNum := S.ReadDWord;
  75.     Result.SectionID := S.ReadDWord;
  76.     Result.Value := S.ReadDWord;
  77.   end;
  78. end;
  79.  
  80. function ReadRSection(S: TStream): TRSection;
  81.  
  82.   procedure Die(const S: string);
  83.   begin
  84.     WriteLn(S);
  85.     Readln;
  86.     Halt;
  87.   end;
  88.  
  89. begin
  90.   Result.Name := ReadNullTerm(S);
  91.   Result.Size := S.ReadDWord;
  92.   Result.SectType := S.ReadByte;
  93.   Result.Org := S.ReadDWord;
  94.   Result.Bank := S.ReadDWord;
  95.   Result.Align := S.ReadDWord;
  96.   Result.Data := nil;
  97.   Result.NumPatches := 0;
  98.   if (Result.SectType = ROMX) or (Result.SectType = ROM0) then
  99.   begin
  100.     SetLength(Result.Data, Result.Size);
  101.     S.ReadBuffer(Result.Data[0], Result.Size);
  102.     Result.NumPatches := S.ReadDWord;
  103.     if Result.NumPatches <> 0 then
  104.       Die('Patches not supported');
  105.   end;
  106. end;
  107.  
  108. function ReadObjFile(S: TStream): TRObj;
  109. var
  110.   i: SizeInt;
  111. begin
  112.   Result.Symbols := nil;
  113.   Result.Sections := nil;
  114.   S.ReadBuffer(Result.FixPart, SizeOf(TRObj.FixPart));
  115.   SetLength(Result.Symbols, Result.FixPart.NumberOfSymbols);
  116.   SetLength(Result.Sections, Result.FixPart.NumberOfSections);
  117.   for i := 0 to Result.FixPart.NumberOfSymbols - 1 do
  118.     Result.Symbols[i] := ReadRSymbol(S);
  119.   for i := 0 to Result.FixPart.NumberOfSections - 1 do
  120.     Result.Sections[i] := ReadRSection(S);
  121. end;
  122.  
  123. var
  124.   S: TStream;
  125.   RObj: TRObj;
  126. begin
  127.   S := TFileStream.Create('preview.obj', fmOpenRead or fmShareDenyWrite);
  128.   try
  129.     RObj := ReadObjFile(S);
  130.   finally
  131.     S.Free;
  132.   end;
  133. end.

Warfley

  • Hero Member
  • *****
  • Posts: 1499
Re: Potential compiler bug with fpc_ansistr_decr_ref
« Reply #22 on: April 07, 2020, 05:15:24 pm »
On a side note, for my websockets libarary I've written a function that reads a stream until a pattern is found, you could simply reuse that function from me: https://github.com/Warfley/LazWebsockets/blob/master/src/wsutils.pas#L268

A little bit off topic: Dependening on how much you want to read, with this method of reading (i.e. appending each char) you can run into performance issues pretty fast. The reason being is that Result += char is actually equivalent to:
Code: Pascal  [Select][+][-]
  1. SetLength(Result, Result.length + 1);
  2. Result[Result.Length] := char;
And SetLength is rather runtime expensive (especially if the heap is very fragmented). Alternatives would be buffered reads (i.e. reading into a fixed size buffer and writing this buffer as a whole to the string) or geometric size increase (as TList does, i.e. increase the memory usage when needed by a factor of 2).

Simple experiment:
Code: Pascal  [Select][+][-]
  1. program Project1;
  2.  
  3. {$mode objfpc}{$H+}
  4.  
  5. uses
  6.   Classes, sysutils;
  7.  
  8. function readStr(str: TStream; len: SizeInt): String;
  9. begin
  10.   Result := '';
  11.   while len > 0 do
  12.   begin
  13.     Result += chr(str.ReadByte);
  14.     dec(Len);
  15.   end;
  16. end;
  17.  
  18. function readStrBuffered(str: TStream; len: SizeInt): String;
  19.  
  20. procedure appendToString(var str: String; const buff; buffLen: SizeInt); inline;
  21. var
  22.   oldLen: Integer;
  23. begin
  24.   oldLen := str.length;
  25.   SetLength(str, oldLen + buffLen);
  26.   Move(buff, str[oldLen + 1], buffLen);
  27. end;
  28.  
  29. var
  30.   buff: Array[0..1023] of Byte;
  31.   i: SizeInt;
  32. begin
  33.   i:=0;
  34.   Result := '';
  35.   while len > 0 do
  36.   begin
  37.     buff[i] := str.ReadByte;
  38.     Inc(i);
  39.     if i >= Length(buff) then
  40.     begin
  41.       appendToString(Result, buff[Low(buff)], Length(buff));
  42.       i := 0;
  43.     end;
  44.     dec(Len);
  45.   end;
  46.   appendToString(Result, buff[low(buff)], i);
  47. end;
  48.  
  49. function readStrGeo(str: TStream; len: SizeInt): String;
  50. var
  51.   capacity: SizeInt;
  52.   resLen: SizeInt;
  53.   c: Char;
  54. begin
  55.   capacity := 32;
  56.   resLen := 0;
  57.   SetLength(Result, capacity);
  58.   while len > 0 do
  59.   begin
  60.     c := chr(str.ReadByte);
  61.     if resLen >= capacity then
  62.     begin
  63.       capacity *= 2;
  64.       SetLength(Result, capacity);
  65.     end;
  66.     Result[resLen + 1] := c;
  67.     Inc(resLen);
  68.     dec(Len);
  69.   end;
  70.   SetLength(Result, resLen);
  71. end;
  72.  
  73. var str: TStringStream;
  74.   start, extendTime, buffTime, geoTime: LongWord;
  75.   s1, s2, s3, s: String;
  76. begin;
  77.   str := TStringStream.Create('');
  78.   while not eof do
  79.   begin
  80.     ReadLn(s);
  81.     str.Write(s[1], s.Length);
  82.   end;
  83.   str.Seek(0, 0);
  84.   start := GetTickCount64;
  85.   s1 := readStr(str, str.Size);
  86.   extendTime := GetTickCount64 - start;
  87.   str.Seek(0, 0);
  88.   start := GetTickCount64;
  89.   s2 := readStrBuffered(str, str.Size);
  90.   buffTime := GetTickCount64 - start;    
  91.   str.Seek(0, 0);
  92.   start := GetTickCount64;
  93.   s3 := readStrGeo(str, str.Size);
  94.   geoTime := GetTickCount64 - start;
  95.   WriteLn(extendTime);
  96.   WriteLn(buffTime);
  97.   WriteLn(geoTime);
  98. end.

On around 2 megabytes of data simply appending characters takes around 100 ms, reading via a buffer takes around 30ms (i.e. a factor 3 faster) and reading using geometric size increase takes around 15 ms (i.e. a factor 6 faster).  This is of course without any memory fragmentation, because this program does nothing else than this job. So the numbers are rather optimistic, and they scale very differently.
For less than 4 k the buffered read only needs a single resize, which is much better than any other option while on very large (mutiple mb) texts the geometic increase will outperform the others even more. But the geometric increase takes in the worst case 2 times the memory the string actually takes, while the buffered approach only adds a flat the buffer size (here) 4k.

If you are only reading strings of a few dozend bytes, your approach should be fast enough, but keep in mind if you are doing this really often, or you are reading long strings, this is not a very efficient way of doing so
« Last Edit: April 07, 2020, 05:17:30 pm by Warfley »

Bart

  • Hero Member
  • *****
  • Posts: 5275
    • Bart en Mariska's Webstek
Re: Potential compiler bug with fpc_ansistr_decr_ref
« Reply #23 on: April 07, 2020, 07:46:40 pm »
Without digging into that, I would consider that reading a count of 0 bytes should be reading nothing and shouldn't raise an exception. But that's a different matter.

I stand corrected, it is not the read with size=0 that raises the exception, but accessing Data[0], because we just did SetLength(Data, Size).

Bart

Superdisk

  • Jr. Member
  • **
  • Posts: 73
Re: Potential compiler bug with fpc_ansistr_decr_ref
« Reply #24 on: April 07, 2020, 09:48:32 pm »
Without digging into that, I would consider that reading a count of 0 bytes should be reading nothing and shouldn't raise an exception. But that's a different matter.

I stand corrected, it is not the read with size=0 that raises the exception, but accessing Data[0], because we just did SetLength(Data, Size).

Bart

What's the correct way to read into a dynamic buffer like that? Surely not the Data[0] thing.

Cyrax

  • Hero Member
  • *****
  • Posts: 836
Re: Potential compiler bug with fpc_ansistr_decr_ref
« Reply #25 on: April 07, 2020, 10:15:54 pm »
Without digging into that, I would consider that reading a count of 0 bytes should be reading nothing and shouldn't raise an exception. But that's a different matter.

I stand corrected, it is not the read with size=0 that raises the exception, but accessing Data[0], because we just did SetLength(Data, Size).

Bart

What's the correct way to read into a dynamic buffer like that? Surely not the Data[0] thing.

Yes, it is the correct way if you are using dynamic arrays. If you allocate a buffer from memory by using GetMem/AllocMem then there is no need to access a first index of the array. Only pointer dereference is needed (Buffer^).

 

TinyPortal © 2005-2018