Recent

Author Topic: Firebird database with charset NONE  (Read 7350 times)

marclebrun

  • New Member
  • *
  • Posts: 16
Firebird database with charset NONE
« on: June 23, 2021, 08:20:20 am »
Hi  :)

There is a 20+ years old database in a company.
The server is Ubuntu 12.04 using Firebird 2.5.
And all the applications are running on Windows 10.

The database has been created without specifying any charset.
Code: MySQL  [Select][+][-]
  1. SELECT rdb$character_set_name FROM rdb$database; => "NONE"

There is an application written in Lazarus 1.4.4 using the old FBLib library from Alessandro Batisti https://github.com/graemeg/firebird-lib

When connecting, this application specifies a charset of ISO8859_1

Code: Pascal  [Select][+][-]
  1. DB := TFBLDatabase.Create(nil);
  2. DB.CharacterSet := 'ISO8859_1';
  3. ...

So far so good, but...

I want to migrate this application from Lazarus 1.4.4 to Lazarus 2.0.12.

No matter what charset I use when connecting, the result is the same : all characters with accents are replaced with question marks.
Exemple :
Code: [Select]
H?l?ne
J?r?me
Jo?l

I also tried using the TIBConnection from the Lazarus components but the result is exactly the same.

It looks like something has changed between Lazarus 1.4 and Lazarus 2.x regarding character encoding...?
I guess I will need to recreate this (big) database and find a way to convert the existing data from its original charset to UTF8.
Is it possible to convert the charset while restoring a backup ?
Is there another solution ?

Many thanks for your answers :D

Marc

Zvoni

  • Hero Member
  • *****
  • Posts: 627
One System to rule them all, One IDE to find them,
One Code to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------
People call me crazy, because i'm jumping out of perfectly fine aircraft

Fantablup

  • Full Member
  • ***
  • Posts: 136
Re: Firebird database with charset NONE
« Reply #2 on: June 23, 2021, 08:47:59 am »
Look at this link

http://www.destructor.de/firebird/gbak.htm

They added two repair fixes in 2.5
Look at "Repair malformed"

Hope it helps. I have not ried it.
Seems like the only solution.

marclebrun

  • New Member
  • *
  • Posts: 16
Re: Firebird database with charset NONE
« Reply #3 on: June 23, 2021, 09:00:53 am »
Thank you both for your replies.

I had already seen that discussion on stackoverflow but the database is so big that I cannot just recreate some columns and migrate the data...

I will check the page about gbak, I hope I will find something useful.

I also read about FBCopy, do you guys know if it works well for this kind of issue ?

Thanks  :D

Marc

Fantablup

  • Full Member
  • ***
  • Posts: 136
Re: Firebird database with charset NONE
« Reply #4 on: June 23, 2021, 09:08:45 am »
Before version 2.5, they had the fix outside Gbak, but has included it in 2.5.
It says that should fix with a database with character set NONE.

Zvoni

  • Hero Member
  • *****
  • Posts: 627
Re: Firebird database with charset NONE
« Reply #5 on: June 23, 2021, 09:20:00 am »
Thank you both for your replies.

I had already seen that discussion on stackoverflow but the database is so big that I cannot just recreate some columns and migrate the data...

Marc
Why not? Since it's a one-time thing...
As they wrote on SO:
Export the DDL,
tweak the DDL with the "new" charset
create the new Database
dump the old data (data only) from the old DB
import it into the new DB
Test/QA the new Database
Roll out
One System to rule them all, One IDE to find them,
One Code to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------
People call me crazy, because i'm jumping out of perfectly fine aircraft

marclebrun

  • New Member
  • *
  • Posts: 16
Re: Firebird database with charset NONE
« Reply #6 on: June 23, 2021, 10:11:02 am »
Why not? Since it's a one-time thing...

Because the database has 362 tables, and some of them have more than 200 fields, ...
I know it's not well designed  :)

What tool would you use to dump the data from the old DB to the new one ?

Zvoni

  • Hero Member
  • *****
  • Posts: 627
Re: Firebird database with charset NONE
« Reply #7 on: June 23, 2021, 10:22:50 am »
Since it's FireBird, FlameRobin should be the first try
Otherwise DBeaver

EDIT: 362 tables?
That's..... cute....
Just looking at the main schema of our IBM DB2 i see 1100 physical Tables (not counting views and such)
and some 700 Schemas, each with its own number of tables.....
« Last Edit: June 23, 2021, 10:25:49 am by Zvoni »
One System to rule them all, One IDE to find them,
One Code to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------
People call me crazy, because i'm jumping out of perfectly fine aircraft

devEric69

  • Hero Member
  • *****
  • Posts: 515
Re: Firebird database with charset NONE
« Reply #8 on: June 23, 2021, 11:37:00 am »
Hello,

To export (in *.sql text-file) just the DDL statements of the database structure (synopsis of the database's scheme), and then to export the structure and its data in order to reimported those data into a new database structure - for example - I use fbsql (incuded in IBX for Lazarus, an IT solution from MWA Software, dedicated to firebird development). It works well for me (but I've never tried it on such a big database).

Edit (important): I've only tested this solution with firebird 3 (I see @marclebrun, that you are still in version 2.5)!
« Last Edit: June 23, 2021, 11:48:55 am by devEric69 »
use: Linux 64 bits (Ubuntu 20.04 LTS).
Lazarus version: 2.0.4 (svn revision: 62502M) compiled with fpc 3.0.4 - fpDebug \ Dwarf3.

tonyw

  • Full Member
  • ***
  • Posts: 207
    • MWA Software
Re: Firebird database with charset NONE
« Reply #9 on: June 23, 2021, 01:09:35 pm »
 FBSQL is an example utility provided as source code with the IBX release. It probably won't do what you want without a small change to the source code.

What FBSQL can do is to dump an entire database as metadata plus data into a text file, blobs and arrays included. This means that your entire database can be rendered as text and hence edited.

What I suggest you do is to get hold of the IBX source and compile the FBISQL example, but changing line 407 of fbsql.lpr from

 FIBDatabase.Params.Values['lc_ctype'] := 'UTF8';

to

 FIBDatabase.Params.Values['lc_ctype'] := 'ISO8859_1';

This is so that your database is opened using its proper character set. You can then use this version to dump your database to a textfile using fbisql from the command line using (e.g.)

fbsql -A -g -u <USERNAME> -p <PASSWORD> -o <output file> <database connect string>

If the database is local then <database connect string> is just the path to the database, otherwise it should include the domain name of the server ':' database name - any valid connect string will do.

The above should dump the whole database to one big text file in UTF8 encoding.

You should now edit the text file to replace all "Character Set none" to "Character set UTF8". Also edit the create database statement to what is needed to define the replacement database.

Restore line 407 of fbisql.lpr to its original value and recompile. You can now create the replacement database using:

fbsql -b -e  -u <USERNAME> -p <PASSWORD> -i <file containing edited text dump of database>

fbsql should now read in the text dump and recreate your database for you.

If you are using FB4 then you should get IBX from the MWA software SVN repository. The zip has a bug in it affecting the post RC1 version of FB4 and which is only fixed in SVN.


devEric69

  • Hero Member
  • *****
  • Posts: 515
Re: Firebird database with charset NONE
« Reply #10 on: June 23, 2021, 04:26:15 pm »
Thank you very much for this FBSQL's charset tip, Tony.
« Last Edit: June 23, 2021, 04:35:44 pm by devEric69 »
use: Linux 64 bits (Ubuntu 20.04 LTS).
Lazarus version: 2.0.4 (svn revision: 62502M) compiled with fpc 3.0.4 - fpDebug \ Dwarf3.

tonyw

  • Full Member
  • ***
  • Posts: 207
    • MWA Software
Re: Firebird database with charset NONE
« Reply #11 on: June 23, 2021, 04:28:03 pm »
Thank you very much for this FBSQL's charset tip, Tony.
In the development version, I have already added a "-c" command line switch to allow the connection character set to be set at run time - just like ISQL.

devEric69

  • Hero Member
  • *****
  • Posts: 515
Re: Firebird database with charset NONE
« Reply #12 on: June 23, 2021, 04:37:44 pm »
In the development version, I have already added a "-c" command line switch to allow the connection character set to be set at run time - just like ISQL.

OK, that's good to know. For information, I'm using your tool (FBSQL) in order to create and manage differential *.sql patches.
use: Linux 64 bits (Ubuntu 20.04 LTS).
Lazarus version: 2.0.4 (svn revision: 62502M) compiled with fpc 3.0.4 - fpDebug \ Dwarf3.

zgabrovski

  • New Member
  • *
  • Posts: 33
Re: Firebird database with charset NONE
« Reply #13 on: June 26, 2021, 08:45:16 am »
Guys it is very easy to use a FB Database with no character set specified as is.
Simple set the TIBConnection property "UseConnectionCharSetIfNone" to true and then set your desired character set.
I had exactly the the same problem before and we developed this.

marclebrun

  • New Member
  • *
  • Posts: 16
Re: Firebird database with charset NONE
« Reply #14 on: June 27, 2021, 08:57:18 am »
Thank you all for your great help, I have made some very good progress :)

  • I've been able to extract the DDL using FlameRobin and recreate an empty database with charset ISO8859_1.
  • Then I wrote a "DataPump" tool which connects to the old database using charset NONE and to the new one using charset ISO8859_1, then it copies the data of each table. It takes some time to run, but it works.
  • As a result, I have a new database in which the data is encoded in ISO8859_1  :D  :D  :D

Now there is something I don't understand very well :

In FlameRobin, I'm using ISO8859_1 to connect to the new database, everything looks perfect.
Then I change the connection settings to use UTF8, I only have a warning that the charset is different from the database, but then everything still looks perfect. I didn't expect that.

BUT :

In my Lazarus application, things are different :
  • If I connect in ISO8859_1 the wrong characters appear.
  • If I connect in UTF8 then everything looks perfect.

That's strange, isn't it ?
Is there something I should be careful about ?

Again, many thanks to all for your great advice so far !!! :-D

Marc

 

TinyPortal © 2005-2018