Lazarus

Free Pascal => Windows => Topic started by: CCRDude on January 30, 2019, 09:30:51 pm

Title: TRegistry - cleanup and fixes
Post by: CCRDude on January 30, 2019, 09:30:51 pm
I’ve spend a few days rewriting an old application of mine. Back then, it was written in Delphi 7, Ansi. These days, it needs to be Unicode. And that’s where the things get complicated 😉
A registry editor needs to support all existing registry entry types. I remember many years ago I supplied a patch that adds more types (issue 32114 (https://bugs.freepascal.org/view.php?id=32114)). Meanwhile, a few things have changed:

Issue 1: REG_QWORD
There are 8 bit integers as well. I’ve supplied a patch (https://bugs.freepascal.org/view.php?id=34875) to add rdInt64 to the enum, REG_QWORD to the associated array, and ReadInt64/WriteInt64 functions. Any idea how I can speed this up? Working around this in every registry handling operation is ugly.

Issue 2: *A vs *W functions for ReadString/WriteString
Back when I first supplied a patch for more types, functions were using the *A api calls. Now they use the *W calls. ReadString and WriteString do convert the PWideChar using UTF8Encode before returning them as string. According to this (https://bugs.freepascal.org/view.php?id=34876), this is wrong, since the FCL should not use UTF8. How would I be able to get Unicode values from the registry at all otherwise?

Issue 3: *W function for ReadStringList/WriteStringList
Same issue as above, ReadStringList / WriteStringList do use the *W functions, but do not use any conversion from PWideChar to String, resulting in broken TStringList content with #0 inserted between each character (since it’s latin unicode characters). My patch was to use UTF8 here as well, but since UTF8 is not wanted, should I supply a patch that converts it to plain AnsiString, loosing content, or is there any other way to retrieve Unicode registry content through TRegistry?

Issue 4: Extending existing TRegDataType without the patch applied above
I do have written my own TRegistry descendant to handle ReadInt64/WriteInt64, for example. I need it anyway for a bunch of advanced functions (moving keys, duplicating keys, &c), I could implement all the string functions as Unicode functions with my own data reading/writing backend there - but adding a custom extended TRegDataType clone type would make things unnecessarily complex. Is there any soft way to extend it without FCL modification? One issue is that even if I could „extend“ the enum using code, the associated array is not in the interface, but an implementation, and would need to be overridden as well.

Any recommendations on how patches that both do comply with the FCL AnsiString idea and at the same time provide Unicode support should look like are welcome.
I'm willing to spend more time on this, just need directions how this time should be spent to have a chance to improve fcl-registry :)
Title: Re: TRegistry - cleanup and fixes
Post by: Remy Lebeau on February 02, 2019, 12:33:26 am
Issue 2: *A vs *W functions for ReadString/WriteString
Back when I first supplied a patch for more types, functions were using the *A api calls. Now they use the *W calls. ReadString and WriteString do convert the PWideChar using UTF8Encode before returning them as string. According to this (https://bugs.freepascal.org/view.php?id=34876), this is wrong, since the FCL should not use UTF8. How would I be able to get Unicode values from the registry at all otherwise?

I would make ReadString()/WriteString() return/take UnicodeString explicitly, to match the encoding of the Registry's W functions. Let the caller deal with any conversions between String<->UnicodeString as needed. When String is UnicodeString, this would be a no-op both ways. And when String is a codepaged AnsiString, converting from String to UnicodeString would use whatever codepage is stored in the String, and converting from UnicodeString to String would use the RTL's default codepage, which can be customized at runtime.

Issue 3: *W function for ReadStringList/WriteStringList
Same issue as above, ReadStringList / WriteStringList do use the *W functions, but do not use any conversion from PWideChar to String, resulting in broken TStringList content with #0 inserted between each character (since it’s latin unicode characters). My patch was to use UTF8 here as well, but since UTF8 is not wanted, should I supply a patch that converts it to plain AnsiString, loosing content, or is there any other way to retrieve Unicode registry content through TRegistry?

In this case, you are stuck dealing with a list of String values and not UnicodeString values, but really that is not a problem because of what I said above.
Title: Re: TRegistry - cleanup and fixes
Post by: ASerge on February 02, 2019, 10:41:01 am
As far as I can see now in the trunk is an implementation with Utf8Encode conversions and a W version API call. Unfortunately, if you do not use LCL, this results in encoding errors for the returned names.
Title: Re: TRegistry - cleanup and fixes
Post by: CCRDude on February 08, 2019, 09:58:37 am
@ASerge: mostly, yes, expect for REG_MULTISZ, where it's broken.

@Remy Lebeau: I would favor a WideString version as well - but that would break the Delphi 7 compatibility.
I'm indeed stuck with a list of string values - the issue is actually is a problem because it's not a list of string values, but due to missing string type conversion, a list of characters (for standard latin characters, every second byte is #0, which for REG_MULTISZ is a line separator, so I get one character per list item).

My main question is how to do updates to it that won't go wasted but comply with what follows conventions.
Title: Re: TRegistry - cleanup and fixes
Post by: Remy Lebeau on February 09, 2019, 03:07:22 am
@Remy Lebeau: I would favor a WideString version as well - but that would break the Delphi 7 compatibility.

Not really.  Delphi 7 has WideString, and conversions between AnsiString and WideString.

for standard latin characters, every second byte is #0

That is true only if you store characters U+0000..U+00FF in a Unicode UTF-16 string.  That is not true at all for ANSI/UTF-8 strings.

which for REG_MULTISZ is a line separator, so I get one character per list item

You misunderstand how REG_MULTI_SZ actually works.

List items are separated by a NULL CHARACTER, with an extra NULL CHARACTER at the end of the list.  The size of each character is 1 or 2 bytes, determined by whether you use an ANSI or Unicode API to write/read the REG_MULTI_SZ data.

ANSI uses 1-byte character elements, so the item separator is a 1-byte NULL CHARACTER #00.  As such, it is not possible for ANSI list items to have embedded NULL BYTES/CHARACTERS.

UTF-16 uses 2-byte character elements, so the item separator is a 2-byte NULL CHARACTER #0000.  As such, it is perfectly acceptable for Unicode list items to have embedded NULL BYTES, such as for Latin characters, they just can't contain embedded NULL CHARACTERS.

If you need to store string lists with embedded NULL CHARACTERS, you can always use REG_BINARY instead.

My main question is how to do updates to it that won't go wasted but comply with what follows conventions.

There is no problem if you stick with Unicode UTF-16 strings at the API layer.  You just have to worry about possible conversions between ANSI/UTF-8 and UTF-16 at the input/output layer.  Which I think you should let the RTL handle for you.
Title: Re: TRegistry - cleanup and fixes
Post by: marcov on February 09, 2019, 02:58:18 pm
Any idea how I can speed this up?

Whine about it on the forum, apparently. r41267
Title: Re: TRegistry - cleanup and fixes
Post by: engkin on February 09, 2019, 06:22:44 pm
Issue 2: *A vs *W functions for ReadString/WriteString
Back when I first supplied a patch for more types, functions were using the *A api calls. Now they use the *W calls. ReadString and WriteString do convert the PWideChar using UTF8Encode before returning them as string. According to this (https://bugs.freepascal.org/view.php?id=34876), this is wrong, since the FCL should not use UTF8. How would I be able to get Unicode values from the registry at all otherwise?
Instead of:
Code: Pascal  [Select]
  1.        AList.Text := UTF8Encode(Data);
use:
Code: Pascal  [Select]
  1.        AList.Text := String(Data);

Issue 3: *W function for ReadStringList/WriteStringList
Same issue as above, ReadStringList / WriteStringList do use the *W functions, but do not use any conversion from PWideChar to String, resulting in broken TStringList content with #0 inserted between each character (since it’s latin unicode characters). My patch was to use UTF8 here as well, but since UTF8 is not wanted, should I supply a patch that converts it to plain AnsiString, loosing content, or is there any other way to retrieve Unicode registry content through TRegistry?
Instead of:
Code: Pascal  [Select]
  1.   Data := UTF8Decode(StringReplace(List.Text, LineEnding, #0, [rfReplaceAll]) + #0#0);
use
Code: Pascal  [Select]
  1.   Data := UnicodeString(StringReplace(List.Text, LineEnding, #0, [rfReplaceAll]) + #0#0);
Title: Re: TRegistry - cleanup and fixes
Post by: ASerge on February 09, 2019, 09:19:20 pm
use
Code: Pascal  [Select]
  1.   Data := UnicodeString(StringReplace(List.Text, LineEnding, #0, [rfReplaceAll]) + #0#0);
May be better
Code: Pascal  [Select]
  1.   Data := UnicodeString(StringReplace(List.Text, List.LineEnding, #0, [rfReplaceAll]) + #0#0);
Title: Re: TRegistry - cleanup and fixes
Post by: CCRDude on February 09, 2019, 09:50:56 pm
Not really.  Delphi 7 has WideString, and conversions between AnsiString and WideString.

My apologies, in my memory it was a pre-String=WideString version and the RTL was still AnsiString back then. Should've checked first though.

That is true only if you store characters U+0000..U+00FF in a Unicode UTF-16 string.  That is not true at all for ANSI/UTF-8 strings.

I was speaking about TRegistry, which used the *W functions, which return UTF-16 ;)

You misunderstand how REG_MULTI_SZ actually works. ...
List items are separated by a NULL CHARACTER, with an extra NULL CHARACTER at the end of the list.  The size of each character is 1 or 2 bytes, determined by whether you use an ANSI or Unicode API to write/read the REG_MULTI_SZ data.

Fully understood here :) Since TRegistry was using *W, I was just ignoring the *A case.

If you need to store string lists with embedded NULL CHARACTERS, you can always use REG_BINARY instead.

I don't need to do that at all :)

But if you want to have some fun, find programs that use REG_MULTI_SZ and use the binary editor of regedit to remove the trailing zeros... you'll be astonished how many fail of correctly handling the data then (e.g.  the start page setting of many older IE versions was just interpreting the stuff in memory behind as "start pages").

Any idea how I can speed this up?

Whine about it on the forum, apparently. r41267

Great, so I'll just stop trying to find out how to submit useful patches, and start whining more?

Come on, I wasn't asking for someone to do the job for me (at least that was not my intention), but for instructions on how to better supply qualified patches.
Title: Re: TRegistry - cleanup and fixes
Post by: Remy Lebeau on February 09, 2019, 10:29:24 pm
My apologies, in my memory it was a pre-String=WideString version and the RTL was still AnsiString back then. Should've checked first though.

It is, actually.  But it does have WideString available.  WideString was first introduced in Delphi 3.

You misunderstand how REG_MULTI_SZ actually works. ...
List items are separated by a NULL CHARACTER, with an extra NULL CHARACTER at the end of the list.  The size of each character is 1 or 2 bytes, determined by whether you use an ANSI or Unicode API to write/read the REG_MULTI_SZ data.

Fully understood here :) Since TRegistry was using *W, I was just ignoring the *A case.

Apparently not, since you thought the presence of NULL BYTES in a UTF-16 string would cause items to be separated incorrectly.  IT DOES NOT.  NULL BYTES are fine in a UTF-16 string.  NULL CHARACTERS are not fine.  There is a difference!
Title: Re: TRegistry - cleanup and fixes
Post by: ASerge on February 10, 2019, 07:19:00 am
I saw a new implementation of TRegistry in the trunk (revision 41273) and could not resist to fix it (new version attached).

===In all===
Fixed: Excluded annoying hint for unused parameters

===regini.inc===
Fixed: TRegIniFile.ReadSectionValues check value type before read it as string to exclude runtime errors

===registry.pp===
Fixed: TRegistry LoadKey, ReplaceKey, RestoreKey, SaveKey, UnloadKey, MoveKey tag as unimplemented
Fixed: TRegistry.WriteBinaryData, append const to Buffer param, now accept const buffers too
Fixed: TRegistry WriteDate and WriteTime set inline to code WriteDateTime, because it do the same
Fixed: TRegistry.CloseKey set as class and static, because it's not used Self
Fixed: TRegistryIniFile ReadBinaryStream, WriteBinaryStream, UploadFile tag as unimplemented
Fixed: TRegistry.Create micro optimization: exclude zeroing fields (already done in constructor and documented)
Fixed: TRegistry Create, GetData, PutData, ReadString move raise to internal procedure - speed optimization
Fixed: TRegistry GetDataType, GetDataSize call SysGetData directly, exclude unnecessary checks - speed optimization
Fixed: TRegistry ReadBinaryData, ReadInteger, ReadInt64, ReadStringList move raise to separate procedure - speed and size (used more than ones) optimization
Fixed: TRegistry ReadDate and ReadTime call ReadDateTime with Trunc or Frac - size optimization
Fixed: TRegistry.ReadString use local var to exclude multiple call of Length
Fixed: TRegistry.ReadString exclude / and round for integers value, use div
Fixed: TRegistry.ReadString use string(U) instead of Utf8Encode(U) to work with or without LCL
Fixed: TRegistry.ReadStringList unicode support
Fixed: TRegistry ReadStringList, WriteStringList use AList.LineBreak, not global LineEnding
Fixed: TRegistry.WriteExpandString use UnicodeString(Value) instead of Utf8Decode(U) to work with or without LCL
Fixed: TRegistry.WriteStringList unicode support
Fixed: TRegistry.WriteString use UnicodeString(Value) instead of Utf8Decode(U) to work with or without LCL
Fixed: TRegistry DeleteValue, SysGetData, SysPutData, ReadString, ReadStringList API support nil, so change typecast from UnicodeChar(U), to Pointer(U) - speed optimization
Fixed: internal PrepKey rename to Normalize and return UnicodeString, not PChar, and change anywhere Utf8Encode(PChar(PrepKey(...))) to Normalize(...) - speed optimization and work with or without LCL

Code style refactor according coding style from  http://wiki.freepascal.org/DesignGuidelines (http://wiki.freepascal.org/DesignGuidelines)

If I post in bugtracker, the admins, as usual, will ignore, because a lot of changes  :'(
Title: Re: TRegistry - cleanup and fixes
Post by: Bart on February 10, 2019, 01:40:27 pm
If I post in bugtracker, the admins, as usual, will ignore, because a lot of changes  :'(

It will surely go unnoticed if you leave it here.
Further more: you did not attach a patch but complete sources, this is a sure way to have it rejected.
If you create a proper patch (either using svn or git) it makes it a lot easier to review.

Also, devels often ask to split big changes into seperate patches, so thing can be more easily studied and/or reverted when necessary.

And specific to this topic: hardly any fpc devel uses the registry, so response time may be slow.
Also there are no tests w.r.t. fcl-registry, so it is very hard to test regressions, which in turn makes devels reluctant to apply changes...
I know this because I recently wrote some fixes for the TRegIniFile wrapper around the registry.
(I don't use the registry myself, and never used TRegIniFile in my Delphi days, and I wonder why anybody would nowadays. Just learn to use TRegistry directly, it's not that difficult).

Making tests for all basic functionality of registry would be a first step (not one big test, but separate tests for each functionality and also tests for solved bugs).
Please also notice that, while the registry is a Windows thing, the code is supposed to work in other platforms as well (it stores data in some xml file), so you should not break that.
(IMO they shold not have done that, but then again, who am I to judge?)

And FWIW: I think that the more patches for the registry appear in the bugtracker, the higher the chance is that we will get feedback.

Bart
Title: Re: TRegistry - cleanup and fixes
Post by: Bart on February 10, 2019, 02:29:11 pm
Fixed: TRegistry.ReadString use string(U) instead of Utf8Encode(U) to work with or without LCL
Why? This already works without LCL. The string returned has codepage UTF8 and assigning it to another string will trigger the conversion from UTF8 to current codepage.

Here's a snippet from a pure fpc program (no LCL/LazUtils dependancy):
(DefaultSystemCodePage=1252)
Code: Pascal  [Select]
  1.     S := Reg.ReadString('StringUtf8');
  2.     if (S <> '') then
  3.     begin
  4.       writeln('ReadString(''StringUtf8'')');
  5.       writeln('StringCodePage=',StringCodePage(S));
  6.       for i := 1 to length(S) do write(IntToHex(Ord(S[i]),2),#32);
  7.       writeln;
  8.       writeln(S);
  9.     end;

This is the output:
Code: Pascal  [Select]
  1. ReadString('StringUtf8') //value should be 'äëï'. This is of course inside my codepage, if it were outside at any stage conversion would loose characters
  2. StringCodePage=65001
  3. C3 A4 C3 AB C3 AF  //(bytes are UTF8 encoded)
  4. äëï //output in console

Bart
Title: Re: TRegistry - cleanup and fixes
Post by: ASerge on February 10, 2019, 04:46:36 pm
Why? This already works without LCL. The string returned has codepage UTF8 and assigning it to another string will trigger the conversion from UTF8 to current codepage.
This show wrong names with trunc, but right with my patch.
Code: Pascal  [Select]
  1. program RegTest;
  2. {$APPTYPE CONSOLE}
  3. {$MODE OBJFPC}
  4. {$LONGSTRINGS ON}
  5.  
  6. uses SysUtils, Classes, Registry;
  7.  
  8. var
  9.   R: TRegistry;
  10.   L: TStringList;
  11.   Name, Value: string;
  12.   Info: TRegKeyInfo;
  13. begin
  14.   try
  15.     R := TRegistry.Create;
  16.     try
  17.       if R.OpenKeyReadOnly('...Some not ansi key') then
  18.       begin
  19.         if R.GetKeyInfo(Info) then
  20.           Writeln(DateTimeToStr(Info.FileTime)); // Forgot, this also patched, because original return in UTC not local time
  21.         L := TStringList.Create;
  22.         try
  23.           R.GetValueNames(L);
  24.           for Name in L do
  25.             if R.GetDataType(Name) in [rdString, rdExpandString] then
  26.             begin
  27.               Value := R.ReadString(Name);
  28.               Writeln(Name, '=', Value);
  29.             end;
  30.         finally
  31.           L.Free;
  32.         end;
  33.       end;
  34.     finally
  35.       R.Free;
  36.     end;
  37.   except
  38.     on E: Exception do
  39.       ShowException(E, nil);
  40.   end;
  41.   Readln;
  42. end.
Title: Re: TRegistry - cleanup and fixes
Post by: Bart on February 10, 2019, 07:03:47 pm
This may or may not applye. Please don't shoot me.

The "some not ansi key" string must be defined somewhere in your sourcecode.
If you write that code in Lazarus IDE then the sourcecode will be stored as UTF8, without BOM.
If you then compile this from commandline, the string literal will contains UTF encoded bytes, but it will be interpreted as local codepage encoded string, so it will have a totally other content than what you expected.
As a result you will not be able to open the key.

Bart
Title: Re: TRegistry - cleanup and fixes
Post by: ASerge on February 10, 2019, 07:20:23 pm
The "some not ansi key" string must be defined somewhere in your sourcecode.
It's in registry, not is code! Key may be ansi, but value names not - error appear.
Title: Re: TRegistry - cleanup and fixes
Post by: Bart on February 10, 2019, 07:26:14 pm
The "some not ansi key" string must be defined somewhere in your sourcecode.
It's in registry, not is code! Key may be ansi, but value names not - error appear.

If you can open the key then ReadString will return a string with UTF8 encoding.
That should not be a problem.

Opening a key of which the name is not ascii however turns out to be a problem.

Code: Pascal  [Select]
  1. program notascii;
  2.  
  3. {$codepage cp1252}
  4. {$mode objfpc}{$h+}
  5. uses
  6.   registry;
  7.  
  8. var
  9.   R: TRegistry;
  10.   S: String;
  11. begin
  12.   R := TRegistry.Create(KEY_READ);
  13.   R.RootKey := HKEY_CURRENT_USER;
  14.   if not R.OpenKeyReadOnly('Software\XXXXXXXXXX') then
  15.   begin
  16.     writeln('OpenKey failed');
  17.     exit;
  18.   end;
  19.   S := R.ReadString('äëï');
  20.   //S := R.ReadString('abc');
  21.   writeln('S="',S,'"');
  22.   R.Free;
  23. end.

Saved in notepad with default encoding, compiled from commandline gives with fpc trunk:
Code: [Select]
C:\Users\Bart\LazarusProjecten\bugs\Console\registry>notascii
S=""

Compiled with fpc 3.0.4 it gives:
Code: [Select]
C:\Users\Bart\LazarusProjecten\bugs\Console\registry>notascii
S="a-umlaut,e-umlaut,i-umlaut"

So, that definitely is a bug and should be reported.


Bart
Title: Re: TRegistry - cleanup and fixes
Post by: engkin on February 10, 2019, 07:42:17 pm
Trunk version is using UTF8Decode:
Code: Pascal  [Select]
  1. function TRegistry.SysGetData(const Name: String; Buffer: Pointer;
  2.           BufSize: Integer; Out RegData: TRegDataType): Integer;
  3. Var
  4.   u: UnicodeString;
  5.   RD : DWord;
  6.  
  7. begin
  8.   u := UTF8Decode(Name);
  9.   FLastError:=RegQueryValueExW(fCurrentKey,PWideChar(u),Nil,@RD,Buffer,lpdword(@BufSize));
Title: Re: TRegistry - cleanup and fixes
Post by: CCRDude on February 10, 2019, 07:53:19 pm
Apparently not, since you thought the presence of NULL BYTES in a UTF-16 string would cause items to be separated incorrectly.  IT DOES NOT.  NULL BYTES are fine in a UTF-16 string.  NULL CHARACTERS are not fine.  There is a difference!

That is not what I thought, but obviously as a non-native speaker I'm not able to transport what I thought.

What happened is simple - I had a REG_MULTI_SZ with two lines, "Hello" and "World", and using ReadStringList I got items "H", "e", "l", "l", "o" .... "d".

I did NOT think the presence of NULL BYTES in a UTF-16 string was causing the problem, but that the code interpreted the UTF-16 bytes as Ansi bytes.
Title: Re: TRegistry - cleanup and fixes
Post by: marcov on February 10, 2019, 08:21:45 pm
Whine about it on the forum, apparently. r41267

Great, so I'll just stop trying to find out how to submit useful patches, and start whining more?

No, no. I just wanted to post a notice that I had commited it, and couldn't quickly come up with a text. Tried to be funny, nothing seriously meant with it.

Note though that I'm not really knowledgeable about registry and even less regini stuff, since I usually use TXMLConfig (even under Delphi).
Title: Re: TRegistry - cleanup and fixes
Post by: Remy Lebeau on February 10, 2019, 09:52:19 pm
What happened is simple - I had a REG_MULTI_SZ with two lines, "Hello" and "World", and using ReadStringList I got items "H", "e", "l", "l", "o" .... "d".

The only way that can happen is if ReadStringList() read the list into a UTF-16 string and then parsed the raw byte octets as if they were an ANSI instead of as UTF-16.

I did NOT think the presence of NULL BYTES in a UTF-16 string was causing the problem

That is not what your original message suggested.  But so be it, let's just chalk it up to language differences.

but that the code interpreted the UTF-16 bytes as Ansi bytes.

Exactly.
Title: Re: TRegistry - cleanup and fixes
Post by: Bart on February 10, 2019, 11:16:09 pm
So, that definitely is a bug and should be reported.

Reported as Issue 35060 (https://bugs.freepascal.org/view.php?id=35060).

Bart
Title: Re: TRegistry - cleanup and fixes
Post by: CCRDude on February 11, 2019, 08:21:59 am
I've written a demonstration for the ReadStringList/WriteStringList issue because I seem to have expressed myself unclear repeatedly :)

https://gitlab.com/ccrdude/freepascal-issue-34876-readmultistring-bug

The test code writes a correct two lined REG_MULTI_SZ (you can verify using regedit.exe), but the output is 25 lines.

edit: ASerge has the fix in his huge list of fixes, for example.
Title: Re: TRegistry - cleanup and fixes
Post by: Bart on February 11, 2019, 12:07:33 pm
I've written a demonstration for the ReadStringList/WriteStringList issue because I seem to have expressed myself unclear repeatedly :)

https://gitlab.com/ccrdude/freepascal-issue-34876-readmultistring-bug

Can you do the following?

1.
Describe how you manually enter  REG_MULTI_SZ in the registry using regedit?
Alternatively: can you create such a thing in HKCU\Software\Bug34876 and then export it (reg export HKCU\Software\Bug34876 bug34876.reg) and attach that here?
(I have never had to do such a thing with the registry, so I really have no clue.)

2.
Give a short fpc only program (preferrably also compilable with Delphi (so use AnsiString instead of String)) that reads the stringlist from the registry?

Issue 34876 (https://bugs.freepascal.org/view.php?id=34876) misses a test program (one that does not rely on Lazarus) and it would help if we could proof that the output is not compatible with Delphi.

Bart
Title: Re: TRegistry - cleanup and fixes
Post by: CCRDude on February 11, 2019, 12:33:58 pm
1.
Regedit:
a. navigate to the registry key,
b. right-click the value list,
c. select "New",
d. select "Multi-String Value",
e. double-click the new value,
f. enter two lines into the multi-line edit field of the editing dialog.

I'm also attaching the .reg file.

2.
That short program is in the repository above. Or are you referring to a program that would read the list "correctly" instead of using TRegistry?

Delphi XE doesn't even know registry data types like REG_MULTI_SZ until today I just learned:
http://docwiki.embarcadero.com/Libraries/Rio/en/System.Win.Registry.TRegistry_Methods
Title: Re: TRegistry - cleanup and fixes
Post by: Bart on February 11, 2019, 02:42:00 pm
Thanks, I figured it (regedit) out myself.
I wrote a new patch and attached it to the bugtracker.
The orginal patch had some flaws (see my notes in the bugtracker).

Bart
Title: Re: TRegistry - cleanup and fixes
Post by: Bart on February 11, 2019, 03:33:43 pm
I asked on the fpc-devel ML to take a look at the various bugreports about TRegistry/TRegIniFile.

Bart
Title: Re: TRegistry - cleanup and fixes
Post by: Bart on February 12, 2019, 11:33:04 am
One of the devels has promised to look into it, somewhere in the next week, so be patient.

Bart
Title: Re: TRegistry - cleanup and fixes
Post by: ASerge on February 13, 2019, 12:32:14 am
Problem: when using the string type in the functions of the TRegistry and Unicode API, there are problems with the correct conversion between the strings.
Attached the test with .reg file. You can expand by using your language and adding entries to .reg. The result is: if used LCL, there is no difference how to convert between UnicodeString and string. If not used LCL, the output is best convert via Utf8Encode, but in the input strings (names) needs to analyze the code page before deciding whether to use Utf8Decode.
Title: Re: TRegistry - cleanup and fixes
Post by: Bart on February 13, 2019, 09:43:25 am
This problem cannot be solved in as long as we keep using strings instead of either using UTF8 of UTF16 everywhere.

At some point a conversion will take place and if your current codepage is NOT Utf8 data loss can occur.
Lazarus programs should not suffer such a problem, all conversions should be lossless.

Personally I think that since the registry is just a wrapper around a Windows specific API we should make that interface use UnicodeString in all of it's string parameters.
This way it is clear, right from the start, that any conversion problem is on the side of the programmer if he uses ansistrings there.
Again: no problem here for Lazarus.
(Possible problem: TStrings does not support UnicodeStrings yet?)

Also IMHO we should drop the non-windows implementation of Registry and make it a Windows only package.
I cannot believe that any person of sound mind would use TRegistry or TRegIniFile on such a platform.
Better alternatives exists: TIniFile (a sort of native solution for *nix) or TXmlConfig.

I'm quite sure the fpc devels will not agree on that last point though.

Bart
Title: Re: TRegistry - cleanup and fixes
Post by: PascalDragon on February 13, 2019, 10:27:53 am
Also IMHO we should drop the non-windows implementation of Registry and make it a Windows only package.
I cannot believe that any person of sound mind would use TRegistry or TRegIniFile on such a platform.
Better alternatives exists: TIniFile (a sort of native solution for *nix) or TXmlConfig.

I'm quite sure the fpc devels will not agree on that last point though.
Correct. Because that would be a break of backwards compatibility and considering that we receive bug reports about this feature people are using it.
Title: Re: TRegistry - cleanup and fixes
Post by: Bart on February 13, 2019, 12:27:41 pm
Correct. Because that would be a break of backwards compatibility and considering that we receive bug reports about this feature people are using it.

Well, we could poll to see if anybody uses that at all?
We could deprecate it and remove it in 4.0?

Why would a *nix user expect that a TRegistry implementation exists at all?
It's like Windows users expecting to be able to use package cocoaint, dbus or libcups.

Bart
Title: Re: TRegistry - cleanup and fixes
Post by: lucamar on February 13, 2019, 12:46:39 pm
Why would a *nix user expect that a TRegistry implementation exists at all?

A *nix user wouldn't even think about it but a developer porting a Windows application making heavy use of it would probably be mighty thankful to have it. Just as a *nix user porting an app to Windows would appreciate having DBus there instead of having to learn how to do things with COM or whatever :)
Title: Re: TRegistry - cleanup and fixes
Post by: CCRDude on February 13, 2019, 12:50:08 pm
I can confirm that: as someone porting software from Windows to Linux and MacOS, I'm glad about that feature, since it allows me to port existing code easily, without having to port the settings storage to a platform independent solution first.

On the other hand, as a mid- to long-term solution for software I port, I prefer the mentioned alternatives, since even on Windows, they would add the benefit of easier conversion to a portable app, for example.
Title: Re: TRegistry - cleanup and fixes
Post by: lucamar on February 13, 2019, 12:54:55 pm
On the other hand, as a mid- to long-term solution for software I port, I prefer the mentioned alternatives, since even on Windows, they would add the benefit of easier conversion to a portable app, for example.

Yes, I think the same. In fact, IIRC, I have used the registry (in Windows) for config storage in just one app and that only because I was forced to do so (by the then boss).
Title: Re: TRegistry - cleanup and fixes
Post by: Bart on February 13, 2019, 02:17:49 pm
Well, the drawback is that if you fix TRegistry on Windows (so you probably are a Windows developer or at least have a Windows machine for development, and you have at least a rudimentary knowledge of TRegistry and the Windows API it uses), you may accidentally break the not-Windows part of it.
This will result in rejection of the patch unless the not-windows side is fixed.
And who is going to do that?

No devel uses TRegistry.
No Windows user uses the XML implementation of TRegistry.
No Linux user in his right mind uses TRegistry.
So, it's up to whoever fixed TRegistry on Windows to fix code he/she most likely knows nothing about, nor wishes to know anything about, nor wishes to use it before hell freezes over.

This is truly motivating for people supplying patches.

Note: I supplied patches for TRegistry and TRegIniFile, I don't even use TRegistry (let alone TRegIniFile) anywhere. I'm not going to try to fix things on the not-Windows side if that got broke (I just hope my patches did not).

Bart
Title: Re: TRegistry - cleanup and fixes
Post by: PascalDragon on February 14, 2019, 09:16:48 am
Correct. Because that would be a break of backwards compatibility and considering that we receive bug reports about this feature people are using it.

Well, we could poll to see if anybody uses that at all?
We could deprecate it and remove it in 4.0?

We receive bug reports for it, so there is no need for a poll as there clearly are users using it. And as the others mentioned: it eases porting.
Title: Re: TRegistry - cleanup and fixes
Post by: Bart on February 14, 2019, 10:53:01 am
We receive bug reports for it, so there is no need for a poll as there clearly are users using it. And as the others mentioned: it eases porting.

Fair enough.

Bart
Title: Re: TRegistry - cleanup and fixes
Post by: Bart on February 23, 2019, 05:27:52 pm
Well, we're getting somewhere:
r41325+41352: Fixed bug 35060, proper unicode-handling of registry-keynames
r41415: fix writing unicode strings in the Windows registry

Remaining issues:
#0035022 (https://bugs.freepascal.org/view.php?id=35022): TRegIni.WriteString writes to wrong Key (patch fixes also #0035023 (https://bugs.freepascal.org/view.php?id=35023) and #0033980 (https://bugs.freepascal.org/view.php?id=33980)).
#0034876 (https://bugs.freepascal.org/view.php?id=34876): TRegistry's rdMultiString treats Unicode strings as AnsiStrings (patch attached).
#0035132 (https://bugs.freepascal.org/view.php?id=35132) TRegistry.DeleteKey inconsistent behaviour Windows vs other platforms (patch and test attached). (DeleteKey on non-Windows acts as DeleteTree)

If all this is fixed I can write a TRegistry.DeleteTree implementation.
(My first attempt to do this remove the entire HKCU\Software section when I ran the code in Delphi for comparison. In the end I had to delete my account and recreate it...)

Bart