Topic: Krakozyabry (garbled text) - fear

Фролов

« on: April 18, 2017, 10:35:19 pm »
Hi. I switched from Delphi and started using this function:


Code: Pascal
function WriteFile(hFile: THandle; const Buffer; nNumberOfBytesToWrite: DWORD;
  var lpNumberOfBytesWritten: DWORD; lpOverlapped: POverlapped): boolean;
  stdcall; external kernel32 Name 'WriteFile';

function SaveToFile(FileName: string; Str: UnicodeString): Boolean;
var
  TFile: THandle;
  Size: DWORD;
begin
  Result := False;
  TFile := FileCreate(FileName);
  if TFile <> INVALID_HANDLE_VALUE then
  begin
    // Writes the raw UTF-16 code units of Str, two bytes per WideChar.
    Result := WriteFile(TFile, Pointer(Str)^, Length(Str) * SizeOf(WideChar), Size, nil);
    FileClose(TFile);
  end;
end;

However, this does not work as expected!

Every second character in the output is a NULL character, alternating with the expected one.

Sorry for my English.

Cyrax

« Reply #1 on: April 18, 2017, 10:41:41 pm »
Don't use Pointer(Str)^. Use Str[1] instead.

Фролов

« Reply #2 on: April 18, 2017, 11:00:43 pm »
Nothing has changed.

https://drive.google.com/open?id=0B0CNGfIF4x9zbzdiQ2JxVVlrWmM


The problem is with Unicode characters (for example, smileys).
« Last Edit: April 18, 2017, 11:33:55 pm by Фролов »

Martin_fr

« Reply #3 on: April 19, 2017, 02:22:26 am »
Lazarus uses UTF-8 (Delphi, I think, uses UTF-16).

If the text you load is UTF-16, you need to convert it, or use "WideString".

In UTF-16 the letter "A" is represented by 2 bytes: 65, 0 (in the little-endian byte order used on Windows).
So that is probably why you get #0 chars.
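
To see the byte layout for yourself, here is a minimal sketch that dumps the raw bytes of a UnicodeString. The program name is made up for illustration, and it assumes a little-endian machine such as x86 Windows, where the low byte of each 16-bit code unit comes first.

```pascal
program DumpUtf16Bytes;

{$mode objfpc}{$H+}

var
  S: UnicodeString;
  P: PByte;
  I: Integer;
begin
  S := 'A';
  // View the string's UTF-16 code units as raw bytes, without any conversion.
  P := PByte(Pointer(S));
  // Each WideChar occupies SizeOf(WideChar) = 2 bytes.
  for I := 0 to Length(S) * SizeOf(WideChar) - 1 do
    Write(P[I], ' ');
  WriteLn;
end.
```

For plain ASCII text every second byte is zero, which matches the NULL characters described in the first post.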

JuhaManninen

« Reply #4 on: April 19, 2017, 08:09:33 am »
I also think it works as expected. You used UnicodeString, WideChar and Pointer explicitly. You clearly wanted to write UTF-16. I believe Delphi would behave the same way.
If you want to write UTF-8, then use the "String" type and functions from the FPC / Lazarus libs.
Your code will be cross-platform and often compatible with Delphi at source code level.
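
As a rough illustration of the UTF-8 route, here is a minimal sketch. It assumes FPC 3.x; SaveAsUtf8 and the file name 'out.txt' are hypothetical names chosen for the example, not something from the posts above. It converts the text with UTF8Encode and writes the bytes through a TFileStream, so no Windows-specific call is needed.

```pascal
program SaveUtf8;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

// Hypothetical helper: converts UTF-16 text to UTF-8 and writes the bytes to a file.
procedure SaveAsUtf8(const FileName: string; const Str: UnicodeString);
var
  Bytes: UTF8String;
  Stream: TFileStream;
begin
  Bytes := UTF8Encode(Str);
  Stream := TFileStream.Create(FileName, fmCreate);
  try
    // Guard against the empty string: Bytes[1] is not valid in that case.
    if Length(Bytes) > 0 then
      Stream.WriteBuffer(Bytes[1], Length(Bytes));
  finally
    Stream.Free;
  end;
end;

begin
  SaveAsUtf8('out.txt', 'Hello');
end.
```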

Фролов

« Reply #5 on: April 19, 2017, 08:28:23 am »
Is there an option to enable UTF-16? I need it.

Thaddy

« Reply #6 on: April 19, 2017, 08:38:24 am »
{$mode delphiunicode}.....

But due to muddy choices in the past, don't get your hopes up with Lazarus. It simply doesn't work very well yet in that mode.
Also, FPC itself needs some work in its non-core libraries for UTF-16.
In the future this will obviously improve.
What is the specific reason you need UTF-16? UTF-8 works OK with Lazarus. You just need to get used to buffers of up to 4 bytes per character, compared to 1 byte with ANSI.

JuhaManninen

« Reply #7 on: April 19, 2017, 11:36:13 am »
Quote from Фролов: "Is there an option to enable UTF16? I need it"
Thaddy's answer is correct, but I am interested in what you want to achieve. In your example code you wanted to write a UTF-16 buffer to a file, and it worked. The "Pointer()" typecast bypasses all type checks and conversions, so the UTF-16 UnicodeString is written as is. Even the length was calculated correctly because of "SizeOf(WideChar)".
You were not happy with the zero bytes in the UTF-16 output, so apparently you wanted UTF-8.
Now you say you want to enable UTF-16. I am puzzled...
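
If the goal really is a UTF-16 file, one possible approach is sketched below. It rests on assumptions the thread does not confirm: that the program reading the file expects a byte order mark, and that the code runs on a little-endian machine. SaveAsUtf16LE and 'out16.txt' are hypothetical names.

```pascal
program SaveUtf16;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

// Hypothetical helper: writes a UTF-16 byte order mark followed by the raw code units.
procedure SaveAsUtf16LE(const FileName: string; const Str: UnicodeString);
const
  // FF FE marks the content as little-endian UTF-16.
  Bom: array[0..1] of Byte = ($FF, $FE);
var
  Stream: TFileStream;
begin
  Stream := TFileStream.Create(FileName, fmCreate);
  try
    Stream.WriteBuffer(Bom, SizeOf(Bom));
    // The code units are written in native byte order, which is
    // little-endian on x86; a big-endian target would need a byte swap.
    if Length(Str) > 0 then
      Stream.WriteBuffer(Str[1], Length(Str) * SizeOf(WideChar));
  finally
    Stream.Free;
  end;
end;

begin
  SaveAsUtf16LE('out16.txt', 'A');
end.
```

The zero bytes are still present in such a file. The mark only gives editors a hint to interpret the content as UTF-16 rather than as a single-byte encoding.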

Фролов

« Reply #8 on: April 19, 2017, 12:08:08 pm »
First of all, I enabled delphiunicode in the project settings.

It still does not work: the NULL characters appear as before.

I am attaching the source code; see lines 46 and 65.

Code: Pascal
  1. program Project1;
  2.  
  3. {$mode delphiunicode}
  4.  
  5. uses
  6.   PHPLexer,
  7.   Windows,
  8.   Classes,
  9.   SysUtils;
  10.  
  11.   function sprintf(S: PAnsiChar; const Format: PAnsiChar): integer; cdecl;
  12.   varargs; external 'msvcrt.dll';
  13.  
  14.  
  15. var
  16.   startTime, stopTime, iCounterPerSec: int64;
  17.   time: single;
  18.  
  19. var
  20.   SOpt: TArrOptionString;
  21.   BIdx: integer;
  22.  
  23.   OutStr: UnicodeString;  // string
  24.   Output: ansistring;
  25.  
  26.   TFile: THandle;
  27.   Size: DWORD;
  28. begin
  29.   SetMultiByteConversionCodePage(65001);
  30.  
  31.   try
  32.     BIdx := 0;
  33.     SetLength(SOpt, BIdx + 1);
  34.  
  35.     with SOpt[BIdx] do
  36.     begin
  37.  
  38.       LoadPHPFile(SOpt, BIdx, '1.php');
  39.  
  40.       if ISFile then
  41.       begin
  42.         QueryPerformanceCounter(startTime);
  43.         OutStr := '';
  44.         while not GetNextToken(SOpt, BIdx) do
  45.         begin
  46.           OutStr += IntToStr(Row + 2) + ') ' + GetTokenName(CurrentToken) +
  47.             '. Value(' + IntToStr(CurrentLenToken) + '):' + Value + #13#10;
  48.  
  49.         end;
  50.  
  51.         if QueryPerformanceCounter(stopTime) then
  52.         begin
  53.           QueryPerformanceFrequency(iCounterPerSec);
  54.  
  55.           time := (0 - startTime + stopTime) / iCounterPerSec;
  56.  
  57.           SetLength(Output, 30);
  58.  
  59.           SetLength(Output, sprintf(@Output[1], '%f sec.', time));
  60.  
  61.           writeln(Output);
  62.         end;
  63.  
  64.  
  65.         TFile := FileCreate('Lexem.txt');
  66.         if TFile <> INVALID_HANDLE_VALUE then
  67.         begin
  68.           WriteFile(TFile, OutStr[1], Length(OutStr) * SizeOf(widechar), Size, nil);
  69.           FileClose(TFile);
  70.         end;
  71.       end
  72.       else
  73.         writeln('File not found: ' + fileName);
  74.     end;
  75.  
  76.     Readln;
  77.     EXIT;
  78.   except
  79.     on E: Exception do
  80.     begin
  81.       writeln(E.ClassName, ': ', E.Message);
  82.       Readln;
  83.     end;
  84.   end;
  85.  
  86. end.

Фролов

« Reply #9 on: April 19, 2017, 12:27:50 pm »
I was very surprised that Lazarus was very fast compared to Delphi:

0.012515 against 0.009027

But the encoding is killing me.

ASerge

« Reply #10 on: April 19, 2017, 08:36:03 pm »
Quote from Фролов: "Each successive character is equal to the NULL character, alternating the desired one"
Strange. This code is working fine:
Code: Pascal
program Project1;

uses Windows, SysUtils;

function SaveToFile(const FileName: string; const Str: UnicodeString): Boolean;
var
  HFile: THandle;
  Unused: DWORD;
begin
  Result := False;
  HFile := FileCreate(FileName);
  if HFile <> INVALID_HANDLE_VALUE then
  try
    // Pointer(Str)^ passes the raw UTF-16 buffer without any conversion.
    Result := WriteFile(HFile, Pointer(Str)^, Length(Str) * SizeOf(WideChar), {%H-}Unused, nil);
  finally
    FileClose(HFile);
  end;
end;

var
  S: string = 'Строка по русски'; // "A string in Russian"
begin
  SetMultiByteConversionCodePage(CP_UTF8);
  SaveToFile('c:\temp\ИмяФайла.txt', S); // file name means "FileName"
end.

ASerge

« Reply #11 on: April 19, 2017, 08:57:16 pm »
Quote from Cyrax: "Don't use Pointer(Str)^. Use Str[1] instead."
I disagree:
Code: Pascal
program Project1;

{$RANGECHECKS ON}

procedure Dummy(const Buffer);
begin
end;

var
  S: string = '';
begin
  Dummy(Pointer(S)^); // Working fine, even for an empty string
{$IFNDEF Skip_Cyrax_Hint}
  Dummy(S[1]); // Runtime error 201 (range check error): S has no first character
{$ENDIF}
end.

JuhaManninen

« Reply #12 on: April 19, 2017, 10:05:12 pm »
Quote from ASerge: "Strange. This code is working fine:"
Code: Pascal
...
  SetMultiByteConversionCodePage(CP_UTF8);
...
That line changes a lot, as you probably knew already.
I don't quite understand why you guys want to write Windows-specific code with different explicit string types, and finally write out UTF-8 text, which would already be supported without any hassle.

HeavyUser

« Reply #13 on: April 19, 2017, 10:18:21 pm »
Quote from JuhaManninen: "I don't quite understand why you guys want to write Windows specific code with different explicit string types, and finally write out UTF-8 text which would already be supported without any hassle."
Because UTF-8 adds complexity to processing without any gains. UTF-8 is used only because it is mandatory; if it were not, even the files would be UTF-16.

ASerge

« Reply #14 on: April 19, 2017, 10:41:53 pm »
Quote from JuhaManninen: "I don't quite understand why you guys want to write Windows specific code with different explicit string types, and finally write out UTF-8 text which would already be supported without any hassle."
I don't want to. It is the answer for the topic starter. Without SetMultiByteConversionCodePage the code is invalid.

 
