Author Topic: TRegistry - cleanup and fixes  (Read 7232 times)

ASerge

Re: TRegistry - cleanup and fixes
« Reply #15 on: February 10, 2019, 07:20:23 pm »
The "some not ansi key" string must be defined somewhere in your source code.
It's in the registry, not in the code! The key may be ANSI, but if the value names are not, the error appears.

Bart

« Reply #16 on: February 10, 2019, 07:26:14 pm »
The "some not ansi key" string must be defined somewhere in your source code.
It's in the registry, not in the code! The key may be ANSI, but if the value names are not, the error appears.

If you can open the key then ReadString will return a string with UTF8 encoding.
That should not be a problem.

Reading a value whose name is not ASCII, however, turns out to be a problem.

Code: Pascal
program notascii;

{$codepage cp1252}
{$mode objfpc}{$h+}
uses
  registry;

var
  R: TRegistry;
  S: String;
begin
  R := TRegistry.Create(KEY_READ);
  try
    R.RootKey := HKEY_CURRENT_USER;
    if not R.OpenKeyReadOnly('Software\XXXXXXXXXX') then
    begin
      writeln('OpenKey failed');
      exit;
    end;
    // The value name is not ASCII; the source file is saved as cp1252.
    S := R.ReadString('äëï');
    //S := R.ReadString('abc');
    writeln('S="',S,'"');
  finally
    R.Free;  // also runs on the early exit above
  end;
end.

Saved in Notepad with the default encoding and compiled from the command line with fpc trunk, it gives:
Code:
C:\Users\Bart\LazarusProjecten\bugs\Console\registry>notascii
S=""

Compiled with fpc 3.0.4, it gives:
Code:
C:\Users\Bart\LazarusProjecten\bugs\Console\registry>notascii
S="a-umlaut,e-umlaut,i-umlaut"

So, that definitely is a bug and should be reported.


Bart

engkin

« Reply #17 on: February 10, 2019, 07:42:17 pm »
The trunk version uses UTF8Decode:
Code: Pascal
function TRegistry.SysGetData(const Name: String; Buffer: Pointer;
          BufSize: Integer; Out RegData: TRegDataType): Integer;
Var
  u: UnicodeString;
  RD : DWord;

begin
  u := UTF8Decode(Name);
  FLastError:=RegQueryValueExW(fCurrentKey,PWideChar(u),Nil,@RD,Buffer,lpdword(@BufSize));
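
A minimal sketch of why that call can matter for the cp1252 test program, assuming the value name reaches the function as cp1252 bytes rather than UTF-8. The byte values, the program name and the comparison are illustrative assumptions, not code from the registry unit. It also assumes a platform where the RTL can convert from cp1252 (Windows, for instance).

```pascal
program utf8decodecheck;

{$mode objfpc}{$h+}

// Prints the UTF-16 code units of U as hex, so the two results can be compared.
procedure DumpUnits(const Caption: String; const U: UnicodeString);
var
  I: Integer;
begin
  write(Caption, ':');
  for I := 1 to Length(U) do
    write(' ', HexStr(Ord(U[I]), 4));
  writeln;
end;

var
  Name: RawByteString;
  ViaUtf8, ViaCodePage: UnicodeString;
begin
  // Assumption: 'äëï' encoded as cp1252 is the three bytes E4 EB EF.
  Name := #$E4#$EB#$EF;
  // Tag the bytes as cp1252 without converting them.
  SetCodePage(Name, 1252, False);

  // UTF8Decode ignores the tag and parses the bytes as UTF-8.
  ViaUtf8 := UTF8Decode(Name);
  // A plain conversion uses the string's dynamic code page instead.
  ViaCodePage := UnicodeString(Name);

  DumpUnits('UTF8Decode', ViaUtf8);
  DumpUnits('code page conversion', ViaCodePage);
  if ViaUtf8 <> ViaCodePage then
    writeln('The two conversions disagree for this name.');
end.
```

If the two dumps differ, the name handed to RegQueryValueExW is not the name stored in the registry, which would be consistent with the empty result from trunk.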

CCRDude

« Reply #18 on: February 10, 2019, 07:53:19 pm »
Apparently not, since you thought the presence of NULL BYTES in a UTF-16 string would cause items to be separated incorrectly.  IT DOES NOT.  NULL BYTES are fine in a UTF-16 string.  NULL CHARACTERS are not fine.  There is a difference!

That is not what I thought, but obviously, as a non-native speaker, I'm not able to convey what I thought.

What happened is simple - I had a REG_MULTI_SZ with two lines, "Hello" and "World", and using ReadStringList I got items "H", "e", "l", "l", "o" .... "d".

I did NOT think the presence of NULL BYTES in a UTF-16 string was causing the problem, but that the code interpreted the UTF-16 bytes as Ansi bytes.

marcov

« Reply #19 on: February 10, 2019, 08:21:45 pm »
Whine about it on the forum, apparently. r41267

Great, so I'll just stop trying to find out how to submit useful patches, and start whining more?

No, no. I just wanted to post a notice that I had committed it, and couldn't quickly come up with a text. I tried to be funny; nothing serious was meant by it.

Note though that I'm not really knowledgeable about registry and even less regini stuff, since I usually use TXMLConfig (even under Delphi).

Remy Lebeau

« Reply #20 on: February 10, 2019, 09:52:19 pm »
What happened is simple - I had a REG_MULTI_SZ with two lines, "Hello" and "World", and using ReadStringList I got items "H", "e", "l", "l", "o" .... "d".

The only way that can happen is if ReadStringList() read the list into a UTF-16 string and then parsed the raw byte octets as if they were ANSI instead of UTF-16.

I did NOT think the presence of NULL BYTES in a UTF-16 string was causing the problem

That is not what your original message suggested.  But so be it, let's just chalk it up to language differences.

but that the code interpreted the UTF-16 bytes as Ansi bytes.

Exactly.
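
To make the distinction concrete, here is a small sketch of splitting a REG_MULTI_SZ payload on NUL characters (16-bit zeros) rather than NUL bytes. SplitMultiSz is a hypothetical helper written for this illustration, not the TRegistry implementation, and the payload is built in memory instead of being read from the registry.

```pascal
program multiszsplit;

{$mode objfpc}{$h+}

uses
  Classes;

// Hypothetical helper: splits a REG_MULTI_SZ payload that is already held
// as UTF-16 code units. Items end at a NUL character (a 16-bit zero),
// not at a NUL byte. Text after the last terminator is ignored.
procedure SplitMultiSz(const Data: UnicodeString; List: TStrings);
var
  I, Start: Integer;
begin
  List.Clear;
  Start := 1;
  for I := 1 to Length(Data) do
    if Data[I] = WideChar(0) then
    begin
      if I = Start then
        Break;  // empty item: this is the terminator of the whole list
      List.Add(UTF8Encode(Copy(Data, Start, I - Start)));
      Start := I + 1;
    end;
end;

var
  Data: UnicodeString;
  Items: TStringList;
  I: Integer;
begin
  // "Hello" NUL "World" NUL NUL, the layout of a two-line REG_MULTI_SZ.
  Data := 'Hello';
  Data := Data + WideChar(0) + 'World' + WideChar(0) + WideChar(0);

  Items := TStringList.Create;
  try
    SplitMultiSz(Data, Items);
    for I := 0 to Items.Count - 1 do
      writeln(I, ': ', Items[I]);
  finally
    Items.Free;
  end;
end.
```

Walking the same buffer one byte at a time would hit a zero after every ASCII letter, which is how single-letter items such as "H", "e", "l" come about.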
Remy Lebeau
Lebeau Software - Owner, Developer
Internet Direct (Indy) - Admin, Developer (Support forum)

Bart

« Reply #21 on: February 10, 2019, 11:16:09 pm »
So, that definitely is a bug and should be reported.

Reported as Issue 35060.

Bart

CCRDude

« Reply #22 on: February 11, 2019, 08:21:59 am »
I've written a demonstration for the ReadStringList/WriteStringList issue because I seem to have expressed myself unclearly repeatedly :)

https://gitlab.com/ccrdude/freepascal-issue-34876-readmultistring-bug

The test code writes a correct two-line REG_MULTI_SZ (you can verify this using regedit.exe), but the output is 25 lines.

edit: ASerge has the fix in his huge list of fixes, for example.
« Last Edit: February 11, 2019, 08:31:00 am by CCRDude »

Bart

« Reply #23 on: February 11, 2019, 12:07:33 pm »
I've written a demonstration for the ReadStringList/WriteStringList issue because I seem to have expressed myself unclearly repeatedly :)

https://gitlab.com/ccrdude/freepascal-issue-34876-readmultistring-bug

Can you do the following?

1.
Describe how you manually enter a REG_MULTI_SZ in the registry using regedit.
Alternatively: can you create such a value in HKCU\Software\Bug34876, export it (reg export HKCU\Software\Bug34876 bug34876.reg), and attach that here?
(I have never had to do such a thing with the registry, so I really have no clue.)

2.
Give a short fpc-only program that reads the stringlist from the registry, preferably one that also compiles with Delphi (so use AnsiString instead of String).

Issue 34876 lacks a test program (one that does not rely on Lazarus), and it would help if we could prove that the output is not compatible with Delphi.
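
For orientation, a Lazarus-free reader of the kind asked for in point 2 might look like the sketch below. It assumes the key HKCU\Software\Bug34876 mentioned above and a value name 'Test', which is a made-up placeholder. It also assumes that TRegistry.ReadStringList takes a value name and a TStrings, as in the fpc version discussed in this thread. It is not the program from the linked repository.

```pascal
program readmultisz;

{$mode objfpc}{$h+}

uses
  Classes, registry;

var
  R: TRegistry;
  L: TStringList;
  ValueName: String;
  I: Integer;
begin
  ValueName := 'Test';  // hypothetical value name, adjust to the real one
  R := TRegistry.Create(KEY_READ);
  L := TStringList.Create;
  try
    R.RootKey := HKEY_CURRENT_USER;
    if R.OpenKeyReadOnly('Software\Bug34876') then
    begin
      // Read the REG_MULTI_SZ value into the list, one item per line.
      R.ReadStringList(ValueName, L);
      writeln('Item count: ', L.Count);
      for I := 0 to L.Count - 1 do
        writeln(I, ': "', L[I], '"');
    end
    else
      writeln('OpenKey failed');
  finally
    L.Free;
    R.Free;
  end;
end.
```

For a two-line value entered in regedit, an item count other than 2 would point at the parsing problem described in Issue 34876.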

Bart

CCRDude

« Reply #24 on: February 11, 2019, 12:33:58 pm »
1.
Regedit:
a. navigate to the registry key,
b. right-click the value list,
c. select "New",
d. select "Multi-String Value",
e. double-click the new value,
f. enter two lines into the multi-line edit field of the editing dialog.

I'm also attaching the .reg file.

2.
That short program is in the repository above. Or are you referring to a program that would read the list "correctly" instead of using TRegistry?

As I just learned, Delphi XE still doesn't know registry data types like REG_MULTI_SZ, even today:
http://docwiki.embarcadero.com/Libraries/Rio/en/System.Win.Registry.TRegistry_Methods

Bart

« Reply #25 on: February 11, 2019, 02:42:00 pm »
Thanks, I figured it (regedit) out myself.
I wrote a new patch and attached it to the bugtracker.
The original patch had some flaws (see my notes in the bugtracker).

Bart

Bart

« Reply #26 on: February 11, 2019, 03:33:43 pm »
I asked on the fpc-devel mailing list for someone to take a look at the various bug reports about TRegistry/TRegIniFile.

Bart

Bart

« Reply #27 on: February 12, 2019, 11:33:04 am »
One of the devels has promised to look into it sometime in the next week, so be patient.

Bart

ASerge

« Reply #28 on: February 13, 2019, 12:32:14 am »
Problem: when the TRegistry functions take the string type but call the Unicode API, the conversion between the string types is not always correct.
I attached a test with a .reg file. You can extend it by adding entries in your own language to the .reg file. The result: if the LCL is used, it makes no difference how you convert between UnicodeString and string. If the LCL is not used, output is best converted via Utf8Encode, but for input strings (names) the code page needs to be analyzed before deciding whether to use Utf8Decode.
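
The input side of that suggestion could be sketched as follows. NameToUnicode is a hypothetical helper, not something from the attached test or the registry unit. It only shows the idea of checking the dynamic code page before choosing Utf8Decode.

```pascal
program nameconv;

{$mode objfpc}{$h+}

// Hypothetical helper: picks the decoding based on the dynamic code page
// of the incoming name. RawByteString keeps the caller's code page tag.
function NameToUnicode(const Name: RawByteString): UnicodeString;
begin
  if StringCodePage(Name) = CP_UTF8 then
    Result := UTF8Decode(Name)      // the bytes are tagged as UTF-8
  else
    Result := UnicodeString(Name);  // let the RTL convert from the tagged code page
end;

var
  Utf8Name: UTF8String;
  U: UnicodeString;
begin
  Utf8Name := 'abc';                // stored with the UTF-8 code page tag
  U := NameToUnicode(Utf8Name);
  writeln('Code page of input: ', StringCodePage(Utf8Name));
  writeln('Length after conversion: ', Length(U));
end.
```

Whether the code page tag can be trusted depends on how the caller produced the string, so this is a starting point for the analysis rather than a complete answer.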
« Last Edit: February 13, 2019, 12:34:44 am by ASerge »

Bart

« Reply #29 on: February 13, 2019, 09:43:25 am »
This problem cannot be solved as long as we keep using strings instead of using either UTF8 or UTF16 everywhere.

At some point a conversion will take place and if your current codepage is NOT Utf8 data loss can occur.
Lazarus programs should not suffer such a problem, all conversions should be lossless.

Personally, I think that since the registry is just a wrapper around a Windows-specific API, we should make that interface use UnicodeString in all of its string parameters.
This way it is clear, right from the start, that any conversion problem is on the side of the programmer if he uses ansistrings there.
Again: no problem here for Lazarus.
(Possible problem: TStrings does not support UnicodeStrings yet?)

Also IMHO we should drop the non-windows implementation of Registry and make it a Windows only package.
I cannot believe that any person of sound mind would use TRegistry or TRegIniFile on such a platform.
Better alternatives exist: TIniFile (a sort of native solution for *nix) or TXmlConfig.

I'm quite sure the fpc devels will not agree on that last point though.

Bart