Well, they probably make assumptions about internla structures they shouldn't:
https://wiki.freepascal.org/User_Changes_Trunk#System_-_Ref._count_of_strings
Please and re-please would it be possible to do commits that does not break compatibility?
Hum, who are "they probably make assumptions"?Your application probably makes an assumption about the internal layout of the header that precedes ansistrings (tansirec in the system unit, rtl/inc/astrings.inc). This record type is not exported via the interface of the system unit, which means code outside the compiler/RTL cannot rely on its layout or contents (except via public helpers such as System.StringRefCount, as mentioned on the User Changes page).
The fpc committers or the libraries of my applications?
Hum, who are "they probably make assumptions"?Your application probably makes an assumption about the internal layout of the header that precedes ansistrings (tansirec in the system unit, rtl/inc/astrings.inc). This record type is not exported via the interface of the system unit, which means code outside the compiler/RTL cannot rely on its layout or contents (except via public helpers such as System.StringRefCount, as mentioned on the User Changes page).
The fpc committers or the libraries of my applications?
except via public helpers such as System.StringRefCountSadly StringRefCount works only for Tansistring and Tunicodestrings.
So the proposition is this, use a define to give the possibility to user to not use the patch:
So the proposition is this, use a define to give the possibility to user to not use the patch:
This is not how it works. We can't cater to faulty code. If you can't fix it, you simply have to stay
with older versions till you can fix the problem in your code.
After deep tests it appears that the problems occur with the patch done to rtl/inc/astrings.inc and /rtl/inc/ustrings ( patch for compiler/aasmcnst.pas is OK. )If you revert the changes in the RTL, you also have to revert the compiler change. Otherwise the compiler will generate data for the new definition while the code in the RTL uses the old definition.
Quoteexcept via public helpers such as System.StringRefCountSadly StringRefCount works only for Tansistring and Tunicodestrings.
And so everybody is happy, even the people that uses libraries that dont like the original commit.
Yes but the problem is not only in my code but in code of other libraries too.
Quoteexcept via public helpers such as System.StringRefCountSadly StringRefCount works only for Tansistring and Tunicodestrings.
First of they're named AnsiString and UnicodeString without any T-prefix and second these two types are the only two types that use the record so that StringRefCount only works with those two types is no problem.
patch by Sergey Larin: Reducing and aligning the size of TAnsiRec, TUnicodeRec for CPU64
Yes but the problem is not only in my code but in code of other libraries too.No, the problem is relying on undocumented implementation details. We don't give guarantees that implementation details stay the same, because these details might be because of some bug that needs to be fixed.
Does somebody have a idea what could affect that difference in the reference counter of AnsiString and UnicodeString ?The reason nobody really has any idea about this or can't tell you this, is because this change cannot affect any code that uses ansistrings/unicodestrings "normally": concatenating, passing as parameters, returning as function result, using them as a pchar, converting them to other string types, ...
Does somebody have a idea what could affect that difference in the reference counter of AnsiString and UnicodeString ?It can only cause problems if MSE somewhere e.g. typecasts an ansistring or unicodestring to a pointer, and then uses pointer arithmetic to manually calculate what it thinks is the offset of the reference count (or another field whose offset changed).
Your code probably casts an (ansi-)string to a pointer, and accesses memory in front of the pointer.
Something like:
if PLongInt(MySomeString)[-1] = 0 then
Such code assumes that it finds the length or refcount or codepage of the string at the address accessed by the expression.
For example this procedure (in kernel/msestrings.pas)?Yes, absolutely. Here's an alternative way to implement that hack. It's still quite iffy, but at least a bit safer, and compatible both with old and new FPC versions:
procedure stringaddref(var str: ansistring);
var
temp: ansistring;
begin
{ increase refcount if it was > 0}
temp:=str;
{ prevent the compiler from finalising temp, so it won't decrease the refcount again }
pointer(temp):=nil;
end;
For example this procedure (in kernel/msestrings.pas)?Exactly that.
Signal: SIGSEGV: File msestrings.pas:878 Function: DOSTRINGTOUTF8
An unhandled exception occurred at $000000000060E501 :
??????????????????????
... (+ much other chars)
[Inferior 1 (process 8215) exited with code 0331]
Note that I did already try, before your post, with this but it crash without any info from debugger:
procedure stringaddref(var str: ansistring); var po1: int64; begin if pointer(str) <> nil then begin po1:= StringRefCount(str); if po1 >= 0 then begin inc(po1); end; end; end;
The excellent news is that with Jonas procedure, it crash with note from the debugger.This code is fine. It probably crashes here because it's working on an invalid string.
The debugger message:QuoteSignal: SIGSEGV: File msestrings.pas:878 Function: DOSTRINGTOUTF8
Here the code pointing to error:
procedure dostringtoutf8(const value: pmsechar; var count: integer; const dest: pointer; const options: utfoptionsty); var ps,pe: pcard16; pd: pcard8; ca1: card16; ... begin pd:= dest; ps:= pointer(value); pe:= ps + count; while ps < pe do begin ca1:= ps^; // Debugger point to this ...
[EDIT] Tried with changing this:Don't start changing the types of random variables used during string processing. Analyse what needs changing, then change.
var ps,pe: pcard32; // here change to card32 pd: pcard8; ca1: card32; // here change to card32
There is also that code that surely needs change (but with what?) :Here you can change the assignment of po1 with StringRefCount(str).
procedure stringsafefree(var str: string; const onlyifunique: boolean); var po1: psizeint; begin if pointer(str) <> nil then begin po1:= pointer(str)-2*sizeof(sizeint); if (po1 >= 0) and (not onlyifunique or (po1 = 1)) then begin fillchar(pointer(str)^,length(str)*sizeof(str[1]),0); str:= ''; end; end; end;
This code is fine. It probably crashes here because it's working on an invalid string.
Here you can change the assignment of po1 with StringRefCount(str).
3) Using your previous code for stringaddref for ansistring and msestring:
procedure stringaddref(var str: ansistring); var temp: ansistring; begin { increase refcount if it was > 0} temp:=str; { prevent the compiler from finalising temp, so it won't decrease the refcount again } pointer(temp):=nil; end;
@Jonas, Martin
Thank you for your help.
I wonder if there won't be an issue also with this (still in msestrings.pas):
stringheaderty = packed record {$ifdef hascodepage} CodePage: TSystemCodePage; ElementSize: Word; {$ifdef CPU64} { align fields } Dummy: DWord; {$endif CPU64} {$endif} ref: sizeint; len: sizeint; end;
I will look at it closer tomorrow.
If I understood well, that code should not make conflict with some fpc because stringheaderty is not defined in fpc.
But that code does not profit of the ee10850a commit from fpc that remove the "Dummy" element in 64 bit, nor do better alignment.
If this is correct, imho it would be good to adapt the mse stringheaderty code and do the same as TAnsiRec, TUnicodeRec in ee10850a fpc commit.
If I understood well, that code should not make conflict with some fpc because stringheaderty is not defined in fpc.
But that code does not profit of the ee10850a commit from fpc that remove the "Dummy" element in 64 bit, nor do better alignment.
There won't be a type conflict, yes, but any code that uses stringheaderty will access wrong data on 64-bit, because the only use of tstringheaderty is to interact with the header record of Ansi- and UnicodeString which now have a mismatching memory layout.If this is correct, imho it would be good to adapt the mse stringheaderty code and do the same as TAnsiRec, TUnicodeRec in ee10850a fpc commit.
It would not only be good, but also correct to adjust the code. Better yet: check whether you can drop it.
Better yet: check whether you can drop it.
It would not only be good, but also correct to adjust the code. Better yet: check whether you can drop it.