
Topic: TDBMemo mistranslating ASCII 248

RedOctober

  • Sr. Member
  • ****
  • Posts: 450
TDBMemo mistranslating ASCII 248
« on: August 12, 2022, 05:23:41 pm »
Platform: 
Windows Server 2016, Lazarus 2.0.20, FPC 3.2.0, IBX 2.3-4
Firebird  v3.0.73
Database Character Set:  WIN1252

Lazarus Application problem field linkage:
-- Interface --
TDBMemo
TIBMemoField
TIBDataSet
Firebird BLOB field (CREATE DOMAIN DOM_NOTE AS BLOB SUB_TYPE TEXT SEGMENT SIZE 80 CHARACTER SET WIN1252;)
-- Backend --

Problem description:

The TDBMemo is unable to properly save a degree symbol (ASCII 248). When I enter it in the user interface by holding down the Alt key and typing 248, the degree symbol appears. When I save the record and reopen it, there is a funny capital A with a squiggle over the top of it, just to the left of the degree symbol. That unwanted A character is the problem. It should not be there.


Attempts to localize the problem:

When I use my database management tool (FireBolt), I can use an UPDATE statement to save the degree symbol to the same field successfully, and it then displays correctly in FireBolt. When I then refresh the record in my Lazarus app, the TDBMemo field also shows the degree symbol correctly.
I can successfully save and retrieve a degree symbol in a TIBString field, but in a TDBMemo I cannot. I get that extra funny A character when using a TDBMemo.
The above tests tell me that there is a problem only with saving from a TDBMemo to a TIBMemoField. Loading the TDBMemo from a TIBMemoField works correctly.
The problem is not with the Firebird database.

Attempts to fix the problem:
I examined the properties of the TIBMemoField and found one setting that I thought might be involved: Transliterate. It was set to OFF. I set it to ON, recompiled and re-ran my test. The problem persists. When saving, the funny A character appears in the backend field as well as in the TDBMemo control.


Question:
What do I need to do to get the TDBMemo to properly save a degree symbol, ASCII 248?

Thanks in advance for any help you can provide.

Jurassic Pork

  • Hero Member
  • *****
  • Posts: 1228
Re: TDBMemo mistranslating ASCII 248
« Reply #1 on: August 13, 2022, 02:12:53 am »
Hello,
Quote from RedOctober: "The TDBMemo is unable to properly save a degree symbol (ASCII 248)."
Are you sure that 248 is the degree symbol? On my computer it is the o-with-slash character: ø
I have no problem using this character in the TDBMemo field Note of my Firebird database "mushrooms", which comes with the Lazarus database examples (see attachments). But the charset of that database is UTF8.
Friendly, J.P
Jurassic computer : Sinclair ZX81 - Zilog Z80A à 3,25 MHz - RAM 1 Ko - ROM 8 Ko

RedOctober

Re: TDBMemo mistranslating ASCII 248
« Reply #2 on: August 13, 2022, 03:28:17 am »
Hi Jurassic, yes, I'm sure 248 is the degree symbol. Here it is: °

Also shown here on the ASCII chart:

https://theasciicode.com.ar/extended-ascii-code/degree-symbol-ascii-code-248.html

Also, in my original post, I said that my TDBEdit fields can properly store and retrieve this character.  I produce it by holding down my ALT key + 248 on the number pad.

Jurassic Pork

Re: TDBMemo mistranslating ASCII 248
« Reply #3 on: August 13, 2022, 08:48:36 am »
Quote from RedOctober: "Hi Jurassic, yes, I'm sure 248 is the degree symbol. Here it is: ° [...] Also, in my original post, I said that my TDBEdit fields can properly store and retrieve this character. I produce it by holding down my ALT key + 248 on the number pad."
° is ALT+0176
ø is ALT+0248
See the attachment.

Friendly, J.P

MarkMLl

  • Hero Member
  • *****
  • Posts: 6647
Re: TDBMemo mistranslating ASCII 248
« Reply #4 on: August 13, 2022, 09:19:50 am »
Quote from Jurassic Pork: "° is ALT+0176"

That is 176 in the Unicode encoding (U+00B0). The OP's description suggests that he is getting a UTF-8 sequence back from the database, even though he has not investigated what that sequence actually is ("funny capital A with a squiggle over top of it" is not a very good description of a problem).

There are multiple things involved here: data entry of the correct character, potential conversion by the database libraries on the way in, storage and retrieval by the backend database, potential conversion by the database libraries on the way out, and final display by the TDBMemo.
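
To make the "funny capital A" concrete, here is a minimal sketch of what happens when the two UTF-8 bytes of the degree sign (C2 B0) are read as Windows-1252. The function name ReadAsCp1252 is made up for this illustration, and the sketch only handles ASCII and the byte range $A0..$FF, where Windows-1252 uses the same value as the Unicode code point.

```pascal
program Utf8ReadAsCp1252;
{$mode objfpc}{$H+}

uses
  SysUtils;

// Lists the Unicode code point that each byte of S stands for when the
// bytes are read as Windows-1252. Only ASCII and $A0..$FF are handled,
// where the code point equals the byte value; $80..$9F is shown as '?'.
// (Hypothetical helper, written only for this illustration.)
function ReadAsCp1252(const S: RawByteString): string;
var
  i, B: Integer;
begin
  Result := '';
  for i := 1 to Length(S) do
  begin
    B := Ord(S[i]);
    if i > 1 then
      Result := Result + ' ';
    if (B < $80) or (B >= $A0) then
      Result := Result + 'U+' + IntToHex(B, 4)
    else
      Result := Result + '?';
  end;
end;

var
  DegreeUtf8: RawByteString;
begin
  // U+00B0 (degree sign) encoded as UTF-8 is the byte pair C2 B0.
  DegreeUtf8 := Chr($C2) + Chr($B0);
  // Prints: U+00C2 U+00B0
  WriteLn(ReadAsCp1252(DegreeUtf8));
end.
```

U+00C2 is "Â" and U+00B0 is "°", so a UTF-8 degree sign stored unconverted in a WIN1252 column would display as "Â°", which matches the symptom described in the first post.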

MarkMLl

p.s. There is really no such thing as "ASCII 248".

MT+86 & Turbo Pascal v1 on CCP/M-86, multitasking with LAN & graphics in 128Kb.
Pet hate: people who boast about the size and sophistication of their computer.
GitHub repositories: https://github.com/MarkMLl?tab=repositories

RedOctober

Re: TDBMemo mistranslating ASCII 248
« Reply #5 on: August 13, 2022, 04:41:02 pm »
Hi Mark and Jurassic,

The problem is not the generation of the degree symbol.  I am able to create it on my keyboard, so that is not the problem. I've done some more tracing and here is what I found:

Code: Pascal

// TIBStringField: this works correctly. It is included to show the
// difference between how a string field and a memo field are handled.
procedure TIBStringField.SetAsString(const Value: string);
var
  Buffer: PChar;
  s: RawByteString;
begin
  Buffer := nil;
  IBAlloc(Buffer, 0, DataSize);
  try
    s := Value;                                   // <- s = 'Ariel'#194#176  Value = 'Ariel°'
    if StringCodePage(s) <> CodePage then         // <- Always resolves to True. Database is CP 1252 and CodePage variable is 1252
      SetCodePage(s,CodePage,CodePage<>CP_NONE);  // <- Always executes. s = 'Ariel'#194#176  CodePage = 1252
    StrLCopy(Buffer, PChar(s), DataSize-1);       // <- Buffer = 'Ariel'#176  s = 'Ariel'#176
    if Transliterate then                         // <- Transliterate is False
      DataSet.Translate(Buffer, Buffer, True);    // <- Does not execute
    SetData(Buffer);                              // <- Buffer is 'Ariel'#176
  finally
    FreeMem(Buffer);
  end;
end;                                              // <- Data is saved correctly. This is a TIBStringField.

// TIBMemoField: this is where the data goes wrong.
procedure TIBMemoField.SetAsString(const AValue: string);
var
  s: RawByteString;
begin
  s := AValue;                                    // <- s = 'Ariel'#194#176  AValue = 'Ariel°'
  if StringCodePage(s) <> CodePage then           // <- Always resolves to True. Database is CP 1252 and CodePage variable is 1252
    SetCodePage(s,CodePage,CodePage<>CP_NONE);    // <- Always executes. s = 'Ariel'#194#176  CodePage = 1252
  inherited SetAsString(s);                       // <- s = 'Ariel'#176  PROBLEM HERE IN SetAsString()
end;                                              // <- Data is saved INCORRECTLY. This is a TIBMemoField.

Problem description:

The inherited SetAsString() changes 'Ariel'#176 back to 'Ariel'#194#176.

This function is part of the TBlobField type in the DB unit.

So, I still need help with this problem.


RedOctober

Re: TDBMemo mistranslating ASCII 248
« Reply #6 on: August 13, 2022, 04:58:51 pm »
I was able to make a work-around for this problem. I'm hoping someone can help with the root cause of this problem, because this is just a patch for now.

Code: SQL

SET TERM ^ ;
CREATE TRIGGER CLNS_LST_MOD
ACTIVE BEFORE UPDATE POSITION 2
ON CLNS
AS
BEGIN

  -- Work-around: strip the stray 'Â' (byte #194, the UTF-8 lead byte) from incoming text.
  -- Note that this also removes any legitimate 'Â' characters from the field.
  -- In a BEFORE UPDATE trigger only UPDATING is ever true; INSERTING is kept from the original.
  IF (INSERTING OR UPDATING) THEN
    BEGIN
      IF (NEW.ADNL_CNTCT_INFO IS DISTINCT FROM OLD.ADNL_CNTCT_INFO) THEN
        BEGIN
          IF (POSITION('Â' IN NEW.ADNL_CNTCT_INFO) > 0) THEN
            BEGIN
              NEW.ADNL_CNTCT_INFO = REPLACE(NEW.ADNL_CNTCT_INFO, 'Â', '');
            END
        END
    END

END^
COMMIT^
SET TERM ; ^

wp

  • Hero Member
  • *****
  • Posts: 11831
Re: TDBMemo mistranslating ASCII 248
« Reply #7 on: August 13, 2022, 05:34:58 pm »
Try an experiment: add a button to a form with the following OnClick handler.

Code: Pascal

procedure TForm1.Button1Click(Sender: TObject);
var
  s: String;
begin
  s := '°';
  // Shows the length of the string in bytes
  ShowMessage(IntToStr(Length(s)));
  // Uncomment to show the two bytes as hex
  //ShowMessage(IntToHex(ord(s[1]), 2) + ' ' + IntToHex(ord(s[2]), 2));
end;

The degree character is entered here via the keyboard as ALT+248. Running this displays the value 2, i.e. the entered symbol is 2 bytes long, which means it is UTF-8 encoded. When you activate the second ShowMessage in this code, you see that the bytes are "C2 B0". Checking with the Lazarus character map (menu "Edit") confirms that this is indeed '°' in UTF-8 encoding.

You mention that your database uses encoding 1252, but as we saw, you enter text in UTF-8 encoding. This must go wrong. To fix it, convert the entered string to CP1252 before posting it to the database. This is what the dataset's Translate method is for: it fires the OnTranslate event of the dataset, and there you can do the conversion by calling ConvertEncoding(s, FromEncoding, ToEncoding) with FromEncoding='utf8' and ToEncoding='cp1252' (requires unit LConvEncoding). The Transliterate property of the string field that relies on this must be set to true. Likewise, when you display strings from the database in Lazarus, you must do the opposite conversion. The conversion direction in the OnTranslate event handler is determined by the ToOEM parameter.
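
As a rough sketch of the conversion step only, the following console program calls ConvertEncoding from the LConvEncoding unit (part of the LazUtils package, so the project needs that dependency) in both directions. The wrapper names Utf8ToDb and DbToUtf8 are hypothetical, and the sketch deliberately does not show an OnTranslate handler, because the event signature depends on the dataset component in use.

```pascal
program Utf8Cp1252Sketch;
{$mode objfpc}{$H+}

uses
  SysUtils,
  LConvEncoding;  // from the LazUtils package

// UTF-8 text from the user interface -> CP1252 bytes for the database.
// (Hypothetical wrapper name, for illustration only.)
function Utf8ToDb(const Utf8Text: string): string;
begin
  Result := ConvertEncoding(Utf8Text, 'utf8', 'cp1252');
end;

// CP1252 bytes from the database -> UTF-8 text for the user interface.
// (Hypothetical wrapper name, for illustration only.)
function DbToUtf8(const DbText: string): string;
begin
  Result := ConvertEncoding(DbText, 'cp1252', 'utf8');
end;

var
  UiText, DbText: string;
begin
  // 'Ariel°' as UTF-8: the degree sign is the byte pair C2 B0.
  UiText := 'Ariel' + Chr($C2) + Chr($B0);
  DbText := Utf8ToDb(UiText);

  WriteLn('UTF-8 length:  ', Length(UiText));
  WriteLn('CP1252 length: ', Length(DbText));
  WriteLn('Round trip ok: ', DbToUtf8(DbText) = UiText);
end.
```

If the conversion works as intended, the CP1252 string is one byte shorter than the UTF-8 one, because the degree sign shrinks from two bytes to the single byte B0.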

See https://wiki.freepascal.org/Lazarus_Tdbf_Tutorial#Code_pages_and_string_encoding_issues (this is written for DBase but should apply to all database systems).

MarkMLl

Re: TDBMemo mistranslating ASCII 248
« Reply #8 on: August 13, 2022, 05:39:38 pm »
Quote from RedOctober: "The problem is not the generation of the degree symbol. I am able to create it on my keyboard, so that is not the problem."

What makes you so confident of that? Your intention is apparently to work in Win1252, but what the database is spitting back out at you looks like UTF-8. You need to use something specific to the database to find out what is actually going into the table. Once you have done that, start looking at the codepage setting of your code units and of the various entry and display components you are using.
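
On the Lazarus side, one way to see exactly what a string holds at each step is to print its dynamic code page and raw bytes rather than relying on how it is displayed. The following is only a sketch: HexBytes and DescribeString are made-up helper names, and the sample string stands in for a real value such as the text read from the memo field.

```pascal
program DescribeStringSketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

// Returns the bytes of S as space-separated hex, for example 'C2 B0'.
function HexBytes(const S: RawByteString): string;
var
  i: Integer;
begin
  Result := '';
  for i := 1 to Length(S) do
  begin
    if i > 1 then
      Result := Result + ' ';
    Result := Result + IntToHex(Ord(S[i]), 2);
  end;
end;

// Describes S: dynamic code page, length in bytes, and the raw bytes.
function DescribeString(const S: RawByteString): string;
begin
  Result := Format('codepage=%d length=%d bytes=[%s]',
    [StringCodePage(S), Length(S), HexBytes(S)]);
end;

var
  Sample: RawByteString;
begin
  // Stand-in for a real value; in an application this could be the
  // string read from or written to the field being investigated.
  Sample := 'Ariel' + Chr($C2) + Chr($B0);
  WriteLn(DescribeString(Sample));
end.
```

A trailing "C2 B0" means the degree sign is UTF-8 encoded at that point, while a single "B0" means it is already in Windows-1252.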

MarkMLl

RedOctober

Re: TDBMemo mistranslating ASCII 248
« Reply #9 on: August 13, 2022, 05:55:24 pm »
The trace in my earlier post shows that the problem is in the FreePascal unit "DB":

SetAsString() changes 'Ariel'#176 to 'Ariel'#194#176.

This function is part of the TBlobField type in the DB unit.

This has nothing to do with the database, which works correctly. I was even able to make a trigger that just cuts character #194 out of the incoming data. Going the other way (reading from the DB field into the Lazarus Windows .exe) works correctly. The problem is only on writing. SetAsString() must have a bug in it. I am not on the latest FPC, but I am reluctant to upgrade a project that has been underway for six years when this is the only bug.

 
