Topic: Firebird WIN1251 encoding problem and solution

zgabrovski

Firebird WIN1251 encoding problem and solution
« on: June 22, 2019, 11:12:14 am »
Hello, everybody! First of all, please excuse my bad English.
A week ago I started developing a simple helper application to let my customer do some additional tasks in his ERP system.
The ERP system is an old development, based on Delphi, the BDE, and Firebird as the database server.
The database encoding is set to "NONE", because of the BDE (or for some reason I don't know), but the string data is actually encoded as WIN1251. So, if I connect to the DB with FlameRobin or EMS Firebird Manager and set the connection encoding to WIN1251, everything looks perfect: the data is displayed with the correct encoding on screen, on both Ubuntu 18 and Windows 10.
But when I started my own development, I was very surprised that, instead of the expected Cyrillic characters, it displays question marks under Windows and garbage characters under Ubuntu.
First, I set the "WIN1251" encoding at the connection level, but nothing changed.
I did some research in the sources (db.pas, sqldb.pas, ibconnection.pp) and found the problem.
In sqldb.pas, the original code is:

Code: Pascal
procedure TSQLConnection.DoConnect;
var ConnectionCharSet: string;
begin
  inherited;

  // map connection CharSet to corresponding local CodePage
  // do not set FCodePage to CP_ACP if FCodePage = DefaultSystemCodePage
  // aliases listed here are commonly used, but not recognized by CodePageNameToCodePage()
  ConnectionCharSet := LowerCase(GetConnectionCharSet);
  case ConnectionCharSet of
    'utf8','utf-8','utf8mb4':
      FCodePage := CP_UTF8;
    'win1250','cp1250':
      FCodePage := 1250;
    'win1252','cp1252','latin1','iso8859_1':
      FCodePage := 1252;
    else
      begin
      FCodePage := CodePageNameToCodePage(ConnectionCharSet);
      if FCodePage = CP_NONE then
        FCodePage := CP_ACP;
      end;
  end;
end;

This obviously has no explicit mapping for the "WIN1251" encoding.

I made the following change:

Code: Pascal
procedure TSQLConnection.DoConnect;
var ConnectionCharSet: string;
begin
  inherited;

  // map connection CharSet to corresponding local CodePage
  // do not set FCodePage to CP_ACP if FCodePage = DefaultSystemCodePage
  // aliases listed here are commonly used, but not recognized by CodePageNameToCodePage()
  ConnectionCharSet := LowerCase(GetConnectionCharSet);
  case ConnectionCharSet of
    'utf8','utf-8','utf8mb4':
      FCodePage := CP_UTF8;
    'win1250','cp1250':
      FCodePage := 1250;
    // added: explicit mapping for WIN1251
    'win1251','cp1251':
      FCodePage := 1251;
    'win1252','cp1252','latin1','iso8859_1':
      FCodePage := 1252;
    else
      begin
      FCodePage := CodePageNameToCodePage(ConnectionCharSet);
      if FCodePage = CP_NONE then
        FCodePage := CP_ACP;
      end;
  end;
end;

After recompiling the FPC packages and the Lazarus IDE, the FCodePage of the IBConnection became 1251, as it should.

But that still did not fix the problem; I still got question marks and garbage characters.
After that, I found that the TStringField class has its own FCodePage, which was still set to CP_NONE regardless of the fix above.
Unfortunately, it is a private field, so I cannot change it at run time with a helper class.
I found that there is a CodePage property exposing that private field, but it is read-only. So I changed it in db.pas to:

Code: Pascal
    property CodePage : TSystemCodePage read FCodePage write FCodePage;


Then I did the following in my AfterOpen event handler:

Code: Pascal
procedure TMainForm.qGetCustomersAfterOpen(DataSet: TDataSet);
var
  i: Integer;
begin
  qGetCustomers.DisableControls;
  try
    for i := 0 to qGetCustomers.Fields.Count - 1 do
      // Guard the cast: a memo field is not a TStringField instance,
      // so a hard cast on it would be invalid.
      if (qGetCustomers.Fields[i].DataType in [ftString, ftMemo]) and
         (qGetCustomers.Fields[i] is TStringField) then
        // Compiles only with the writable CodePage property patched into db.pas
        TStringField(qGetCustomers.Fields[i]).CodePage := 1251;
  finally
    qGetCustomers.EnableControls;
  end;
end;

After this fix, all the data on the screen was OK.

Finally, I discovered that the root of the problem is in ibconnection.pp:

Code: Pascal
      // [var]char or blob column character set NONE or OCTETS overrides connection charset
      if ((TransType in [ftString, ftFixedChar]) and (PSQLVar^.sqlsubtype and $FF in [CS_NONE,CS_BINARY])) or
         ((TransType = ftMemo) and (PSQLVar^.relname_length>0) and (PSQLVar^.sqlname_length>0) and (GetBlobCharset(@PSQLVar^.relname,@PSQLVar^.sqlname) in [CS_NONE,CS_BINARY])) then
        FieldDefs.Add(PSQLVar^.AliasName, TransType, TransLen, TransPrec, (PSQLVar^.sqltype and 1)=0, False, i+1, CP_NONE)
      else
        AddFieldDef(FieldDefs, i+1, PSQLVar^.AliasName, TransType, TransLen, TransPrec, True, (PSQLVar^.sqltype and 1)=0, False);

In that case it adds a field def with the CP_NONE code page instead of calling the AddFieldDef method, which would add a field def with the correct 1251 code page taken from the IBConnection.
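
To illustrate why the code page carried by the field matters, here is a minimal standalone sketch (not taken from the FCL sources). It assumes that the bytes coming from the database are WIN1251 and uses two hypothetical sample bytes. A string left untagged cannot be converted correctly for display; once it is tagged as 1251, the RTL can convert it to UTF-8, which is what Lazarus controls expect. The program name and sample data are made up for the example.

```pascal
program CodePageTagDemo;
{$mode objfpc}{$H+}

uses
  {$ifdef unix}cwstring,{$endif} // widestring manager, needed for conversions on Unix
  SysUtils;

var
  Raw: RawByteString;
  i: Integer;
begin
  // Hypothetical sample data: the two bytes that spell "Да" in Windows-1251
  SetLength(Raw, 2);
  Raw[1] := #$C4;
  Raw[2] := #$E0;

  // Step 1: only tag the bytes as code page 1251 (Convert = False, data untouched)
  SetCodePage(Raw, 1251, False);

  // Step 2: convert from the tagged code page to UTF-8 (Convert = True)
  SetCodePage(Raw, CP_UTF8, True);

  // Dump the resulting bytes in hex so the result does not depend on the console encoding
  for i := 1 to Length(Raw) do
    Write(IntToHex(Ord(Raw[i]), 2), ' ');
  WriteLn;
end.
```

The printed bytes can be compared with the UTF-8 encoding of the sample text. If step 1 is skipped, the conversion in step 2 has no reliable source code page to start from, which, as far as I understand, is the situation a field created with CP_NONE ends up in.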

Can anybody explain why it adds the field def this way?

What does this code mean?
Code: Pascal
if ((TransType in [ftString, ftFixedChar]) and (PSQLVar^.sqlsubtype and $FF in [CS_NONE,CS_BINARY])) or
   ((TransType = ftMemo) and (PSQLVar^.relname_length>0) and (PSQLVar^.sqlname_length>0) and (GetBlobCharset(@PSQLVar^.relname,@PSQLVar^.sqlname) in [CS_NONE,CS_BINARY]))

I am ready to submit all these fixes to trunk, but I cannot log in to the bug tracker; it does not let me reset my password. If I try to register a new user with my old user name, it reports that the user name already exists.


   

valdir.marcos

Re: Firebird WIN1251 encoding problem and solution
« Reply #1 on: June 22, 2019, 01:40:31 pm »
Quote: "I am ready to submit all these fixes to trunk, but I cannot log in to the bug tracker; it does not let me reset my password. If I try to register a new user with my old user name, it reports that the user name already exists."
You can contact the website administrators to restore your account:

https://bugs.freepascal.org/
Contact administrator for assistance
webmaster (at) freepascal.org

https://lists.freepascal.org/mailman/listinfo
https://lists.lazarus-ide.org/listinfo