Recent

Author Topic: [SOLVED] a MySQL utf8 database, 2 TStringField or not 2 TStringField?  (Read 5624 times)

eara

  • Jr. Member
  • **
  • Posts: 84
hi there,

I try to revert my db to MySql and i noticed something i  didn't new.

I have a table with utf8 character set (and utf8_unicode_ci collation).
Where table has a varchar[4] field, lets say code.
And i have records with greek letters in it.
E.g. 'ΑΒΓΔ'

Now i try to manipulate that fields from lazarus, with a
qry_
component that has a
qry_MyCode: TStringField;
and a TDBEdit that is bound to MyCode field.
The created TStringField has a size property of 4 that "seems" that was  correct.
But.when i read the greek letters 'ΑΒΓΔ' i only get 'ΑΒ' (2 instead of 4).
Debuging the code of
TStringField.GetValue()
function i see that a local variable Buf that holds the 'ΑΒΓΔ' value but is truncated with
buf[Size]:=#0;
command to Size chars, which results only the half of them.

After taking a look in 25801 i decided to try without TStringField,
so now i can read the 'ΑΒΓΔ' and write also 4 chars to it.

My questions now...

1. Am i safe without TStringField ? 

2. When will the TUF8StringField come out ?

FIY: i use Lazarus 1.2.0, SVN Revision:44303, i386-win32-win32/win64 on win 7 32bits.
« Last Edit: April 20, 2014, 01:54:03 am by eara »

taazz

  • Hero Member
  • *****
  • Posts: 5368
Re: a MySQL utf8 database, 2 TStringField or not 2 TStringField?
« Reply #1 on: April 09, 2014, 06:26:49 pm »
hi there,

I try to revert my db to MySql and i noticed something i  didn't new.

I have a table with utf8 character set (and utf8_unicode_ci collation).
Where table has a varchar[4] field, lets say code.
And i have records with greek letters in it.
E.g. 'ΑΒΓΔ'

Now i try to manipulate that fields from lazarus, with a
qry_
component that has a
qry_MyCode: TStringField;
and a TDBEdit that is bound to MyCode field.
The created TStringField has a size property of 4 that "seems" that was  correct.
But.when i read the greek letters 'ΑΒΓΔ' i only get 'ΑΒ' (2 instead of 4).
Debuging the code of
TStringField.GetValue()
function i see that a local variable Buf that holds the 'ΑΒΓΔ' value but is truncated with
buf[Size]:=#0;
command to Size chars, which results only the half of them.

After taking a look in 25801 i decided to try without TStringField,
so now i can read the 'ΑΒΓΔ' and write also 4 chars to it.

My questions now...

1. Am i safe without TStringField ? 

2. When will the TUF8StringField come out ?

FIY: i use Lazarus 1.2.0, SVN Revision:44303, i386-win32-win32/win64 on win 7 32bits.


OK speed reading here so if I missed something I apologize upfront.

1) what did you try and how you did it with out a tstringfield? (its impossible to do by the way).

2) There is no way you can avoid using a TStringField or in your case a TxxxxxField (might be a Twidestringfield) even if you do not create them they are created automaticaly for you when the TQUERY control is opened unless of course you already have other fields created in which case accessing the field should raise a "not found" exception.

Long story short, there is no way you will access fields through a TQUERY that are not represented in memory by TxxxxxField object.
Good judgement is the result of experience … Experience is the result of bad judgement.

OS : Windows 7 64 bit
Laz: Lazarus 1.4.4 FPC 2.6.4 i386-win32-win32/win64

eara

  • Jr. Member
  • **
  • Posts: 84
Re: a MySQL utf8 database, 2 TStringField or not 2 TStringField?
« Reply #2 on: April 09, 2014, 08:54:18 pm »
that was what i was afraid .... an underlying TStringField    :'(

The funny thing is that accessing the database field without creating a TStringField in lazarus, seems to work till now, and in my tests i saved values up to max size with greek letters without problems.

But when i implicit create the damned TStringField in IDE then i got the truncation from dataset.inc, where it puts a damned #0 in buf[size] that is not a good thing for my UTF8 string....

Another funny thing is that i noticed that the buf had the right data in dataset.inc before the truncation (that means that they where readed all of them), so i don't know how to go on ...

Should i
a) continue without implicit creating TStringFields in my TSQLQueries, or
b) should i wait for the magic TUTF8StringField to come out?

Other options ?
c) there is also the possibility of redesign MySQL database without utf8 encodings, but this is not a solution ... utf8 is a "must", specially if you want to share to internet the data from db.
d) Change my name to Mr. Smith and go as immigrant to a english-only land, and forget the utf8 internationalization problems....
e) Do another job....

taazz

  • Hero Member
  • *****
  • Posts: 5368
Re: a MySQL utf8 database, 2 TStringField or not 2 TStringField?
« Reply #3 on: April 10, 2014, 12:04:56 am »
since implicit or persistent fields as they are better known are not a requirement for your application you should go on with out using them, after all they come with their own set of problems mainly they do not go well with non persistent fields in the same dataset. Personaly because I want the ability to change the fields at run time as well as design time I never use persistent fields.

Now that I said that I would prefer if you could take the 6th or 7th road
f) create a TUTF8StringField and provide a patch to be included in fpc sqldb
g) patch the existing TStringField to work properly with UTF8 characters.

Good judgement is the result of experience … Experience is the result of bad judgement.

OS : Windows 7 64 bit
Laz: Lazarus 1.4.4 FPC 2.6.4 i386-win32-win32/win64

LacaK

  • Hero Member
  • *****
  • Posts: 691
Re: a MySQL utf8 database, 2 TStringField or not 2 TStringField?
« Reply #4 on: April 11, 2014, 11:59:17 am »
The funny thing is that accessing the database field without creating a TStringField in lazarus, seems to work till now, and in my tests i saved values up to max size with greek letters without problems.

Can you check TStringField.Size for:
a) persistent field, created in lazarus Fields editor (here you get truncated result)
b) if field is created by FreePascal (here you get correct results)

eara

  • Jr. Member
  • **
  • Posts: 84
Re: a MySQL utf8 database, 2 TStringField or not 2 TStringField?
« Reply #5 on: April 11, 2014, 01:34:13 pm »
@LacaK: what do you mean?
If you are talk to me: I already made some checks and i locate the place where the truncation is done in TStringField, and i think i refer to it, look in dataset.inc for Buf[size]:=#0; command in TStringField methods.
I tried also to figure out what object is used in case of auto created fields but had no sucess on it.

Right now, i continue my project without creating TStringField in IDE. I set up autocreated fields on runtime (at least things that i can change, like .DisplayLabel, .Origin, but not Size cause throws exception).

What i have made so far in my project that i started (2 months ago) in SQLite and now i restarted and clean-up in MySQL is:

a base dbForm as anchestor for all my db oriented forms that provides the following:

* a TSQLQuery
* Provides a PageControl with 2 pages:
  a) tabList that holds a dbgrid for listing records
  b) tabRecord that holds an empty panel and an empty PageControl for displaying details that is used on CRUD operations.
* Can been shown in 2 different modes:
  a) Select only record (without details)
  b) As CRUD form
* can be used as Master or Detail form (detail or master connected forms should be descedants).
* Provides also a simple Filter mechanism on whatever field supports the like operator in db
* Instead of bookmarks, it uses primary keys (can be located from it, or better can be provided by virtual methods from descedants)
   for locating the current record after a refresh.
* Updates grid's column captions, and some label captions with the displayLabel from bounded fields in case the caption of control is the same with the related field name, in order to minize design time with entering the same label again and again.
* Provides ApplyUpdates and Commit in AfterPost, AfterDelete events and some basic error handling (that can be improved), and also creates some extra functions that are used as event handlers (e.g. qry_AfterPost_BeforeCommit)
* Provides a mechanism for changing Sort Order. Order By clause can be manipulated from another dialog where user can change the field list order, ASC, DESC for all fields that are displayed in TDBGrid (and .Origin property is not empty). Information is saved in 2 LSB bits of .Tag property of TDBGrid columns.

a SettingDialog Form where all application settings can be (explicit) stored in 3 different sections (application,security,database)

if someone is interest on it...
« Last Edit: April 11, 2014, 01:40:39 pm by eara »

LacaK

  • Hero Member
  • *****
  • Posts: 691
Re: a MySQL utf8 database, 2 TStringField or not 2 TStringField?
« Reply #6 on: April 12, 2014, 10:11:38 pm »
There is only one TStringField. It does not depend if it is persistent field or is created at runtime.
If you describe, that persistent fields causes truncation and automaticaly created fields do not, then I am guessing, that there must be difference in some property.
And only property which comes into my mind is Size.
So after you have opened dataset f.e. SQLQuery1 can you please check:
  showmessagefmt('Size of field is: %d", [SQLQuery1.FieldByName('mystringfield').Size]);
Then create persistent field, open dataset and check same again.
I expect, that you will see different values.

eara

  • Jr. Member
  • **
  • Posts: 84
Re: a MySQL utf8 database, 2 TStringField or not 2 TStringField?
« Reply #7 on: April 13, 2014, 05:59:04 pm »
yeap, i see the light...  8)
with autocreated fields i get the triple size always 
with persistance fields i get the db size on some cases    %)
mmmm
something new that i noticed...
Currently i use the TSQLConnector.
There is the damned CharSet property which also plays its own role.
If i set the  .CharSet property to utf8 and then create the persistance fields then they have the (db-size)x3.
If i dont set it (leave it empty) and then create them, then they have the (db-size).
Now as for the autocreate fields... how the heck they got the x3 size always?
 >:D aha, is the TSQLConnector.AfterConnect where i run the SET NAMES 'utf8'
mystery solved... now i know that i should live with the x3 size in my application, until the UTF8StringField comes out...
that has some strange side effects in the edit controls that may lead to db errors size 2 long for field XYZ, but well lets forget it for the time...

by the way is it a great overhead if i use autocreated fields, instead of persistent ? (thats another story)

thank you all for your responses...

LacaK

  • Hero Member
  • *****
  • Posts: 691
Re: a MySQL utf8 database, 2 TStringField or not 2 TStringField?
« Reply #8 on: April 13, 2014, 07:51:53 pm »
by the way is it a great overhead if i use autocreated fields, instead of persistent ? (thats another story)

No

eara

  • Jr. Member
  • **
  • Posts: 84
Re: a MySQL utf8 database, 2 TStringField or not 2 TStringField?
« Reply #9 on: April 20, 2014, 01:53:15 am »
@Lacak: Thanks again.

And in order to close this topic, the finally accepted awser is ......

Jaein!  :P (or poke 0xFFAA,23)

P.S. always a damned string problem... char, char*, BSTR, char^2, ansi, mbcs, utf8, utf16, utf_sqrt(2) ascii , null terminated or not,  fixed or variant e.t.c .... brrrrrr...., mayby i should leave this and use pc only for games
« Last Edit: April 20, 2014, 02:04:34 am by eara »

 

TinyPortal © 2005-2018