Using 'Tools->Convert encoding of projects' option

btr0001

New member
Posts: 7

Using 'Tools->Convert encoding of projects' option

« on: November 03, 2011, 01:50:42 pm »

Hi!
I don't understand how to choose a project in the 'Convert encoding' form. The dropdown list shows some standard packages (ChmHelpPkg and others). I tried entering the path to my project file, but no files were found.
Please help!

howardpc

Hero Member
Posts: 4144

Re: Using 'Tools->Convert encoding of projects' option

« Reply #1 on: November 03, 2011, 05:47:38 pm »

You have to open the project first in the IDE (or if it is a new project, save it with a specific name somewhere).
Then you'll see the first option in the dropdown list is "Current Project", before the list of packages.
The dialog is not designed to let you navigate to as-yet unopened projects. You are restricted to the (one) project currently open in the IDE.

btr0001

New member
Posts: 7

Re: Using 'Tools->Convert encoding of projects' option

« Reply #2 on: November 09, 2011, 11:11:26 am »

OK, here is what I did.
First I created a new project in Delphi with a single label on the form. The label has a caption in Cyrillic. Then I converted this project using 'Tools->Convert Delphi project to Lazarus project'. Everything was converted correctly, and the caption was converted to UTF-8.
Next I created another project with no caption on the label but with an additional button. When I press the button, Cyrillic text is assigned to the label. So now the Cyrillic text is stored in the unit1.pas file instead of Form1.lfm. In this case the Cyrillic text was not converted to UTF-8. Then I used 'Tools->Convert encoding of projects' and pressed 'Update preview'. The 'Encoding' column showed that my file has the ISO-8859-1 code page. In reality it is cp1251. After that I searched 'Environment->Options' for a place to set the code page, but did not find one.
What should I do now?

JuhaManninen

Global Moderator
Hero Member
Posts: 4468
I like bugs.

Re: Using 'Tools->Convert encoding of projects' option

« Reply #3 on: November 09, 2011, 01:51:56 pm »

Could you attach a sample project?

Juha

Mostly Lazarus trunk and FPC 3.2 on Manjaro Linux 64-bit.

btr0001

New member
Posts: 7

Re: Using 'Tools->Convert encoding of projects' option

« Reply #4 on: November 09, 2011, 02:52:08 pm »

Attached project

sample_project.tar.gz (0.79 kB - downloaded 118 times.)

JuhaManninen

Global Moderator
Hero Member
Posts: 4468
I like bugs.

Re: Using 'Tools->Convert encoding of projects' option

« Reply #5 on: November 10, 2011, 10:46:20 am »

OK.
Your .pas file encoding is correct. You need to encode the string literal explicitly.
This is the original assignment:
Label1.Caption:='Ïðèâ³ò!';

I thought this would do the job, but it doesn't:
Label1.Caption:=UTF8Encode('Ïðèâ³ò!');

Maybe someone knows the right way.

Juha


btr0001

New member
Posts: 7

Re: Using 'Tools->Convert encoding of projects' option

« Reply #6 on: November 10, 2011, 03:08:46 pm »

I think you are not right.
The original assignment is
Label1.Caption:='Привіт!';
If you use Unicode in your browser you can see the difference.
As a *.pas file is a simple text file, it does not contain any information about its code page. I don't understand how you can know that my file encoding is correct.
In fact this file was created under MS Windows with cp1251. But I don't know how Lazarus can know about this. Another fact is that the first program (attached) was converted successfully. In that case the caption of Label1 was stored in the Unit1.dfm file as codes of cp1251 symbols:

Caption = #1055#1088#1080#1074#1110#1090'!'
Font.Charset = DEFAULT_CHARSET

I also cannot find in this Delphi project any other information about what DEFAULT_CHARSET is (it is cp1251), yet Lazarus knows this. How?

first_project.tar.gz (2.36 kB - downloaded 123 times.)

ludob

Hero Member
Posts: 1173

Re: Using 'Tools->Convert encoding of projects' option

« Reply #7 on: November 10, 2011, 03:50:20 pm »

When the file doesn't include any encoding information, the convert encoding tool uses the system default encoding. On Windows this is the code page returned by the Windows GetACP() function. As a result, you can only use the convert tool on a machine that uses cp1251 as its default encoding. The machine you are using is apparently configured as ISO-8859-1.
On my machine I also read "Label1.Caption:='Ïðèâ³ò!';", but my default Windows code page is cp1252, so it is normal that it doesn't show Cyrillic characters.
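
The effect described here can be reproduced outside the IDE. The following is a minimal Python sketch (Python is used only because its standard codecs make the byte-level behaviour easy to check; it is not part of Lazarus). It assumes the string literal was saved as cp1251 bytes, as btr0001 reports, and shows how those same bytes read under cp1252:

```python
# Bytes of the Cyrillic literal as an editor using cp1251 would have saved them.
raw = "Привіт!".encode("cp1251")

# The same bytes interpreted with two different single-byte code pages.
as_cp1252 = raw.decode("cp1252")   # wrong code page: mojibake
as_cp1251 = raw.decode("cp1251")   # right code page: original text

print(raw.hex(" "))   # cf f0 e8 e2 b3 f2 21
print(as_cp1252)      # Ïðèâ³ò!
print(as_cp1251)      # Привіт!
```

The bytes on disk are identical in both cases. Only the code page chosen to interpret them differs.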

JuhaManninen

Global Moderator
Hero Member
Posts: 4468
I like bugs.

Re: Using 'Tools->Convert encoding of projects' option

« Reply #8 on: November 10, 2011, 10:55:37 pm »

I am not an expert with character encodings. According to the converter your file has ISO-8859-1 encoding. See:

* Converting file /home/juha/SW/LazTest/sample_project/Unit1.pas *
Changed encoding from ISO-8859-1 to UTF-8
Replaced unit "Windows" with "LCLIntf, LCLType, LMessages" in uses section.

Now, the strange thing is that I see 'Ïðèâ³ò!' in Lazarus editor always, with or without the conversion. It looks like a bug. I think the string should be converted.
When I open the file with another editor (KWrite) I see 'Привіт!' which is the correct string.

As for how the code in Lazarus detects the file's encoding: it scans the file contents up to some length and guesses.
I didn't write the code and I don't even understand it. You must study the code yourself. See the GuessEncoding function.
I used it in the Delphi converter, but it may well have bugs.
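
To illustrate what "scans and guesses" can mean in general, here is a toy heuristic in Python. It is explicitly not the Lazarus GuessEncoding implementation: the function name `guess_encoding`, the 4096-byte sample limit, and the `system_default` parameter are all assumptions made for this sketch. It checks for a UTF-8 byte order mark, then checks whether the sample is valid UTF-8, and otherwise falls back to a default code page.

```python
import codecs

def guess_encoding(data: bytes, system_default: str = "cp1252", limit: int = 4096) -> str:
    """Toy heuristic only; NOT the Lazarus GuessEncoding code."""
    sample = data[:limit]  # inspect only the beginning of the file
    if sample.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False tolerates a multi-byte sequence cut off by the sample limit.
        decoder.decode(sample, final=len(data) <= limit)
        return "utf-8"
    except UnicodeDecodeError:
        # Not UTF-8. Single-byte code pages cannot be told apart by validity
        # alone, so the caller-supplied default is the only answer left.
        return system_default

print(guess_encoding("Привіт!".encode("utf-8")))             # utf-8
print(guess_encoding("Привіт!".encode("cp1251")))            # cp1252
print(guess_encoding("Привіт!".encode("cp1251"), "cp1251"))  # cp1251
```

Under this kind of heuristic a cp1251 file is reported as whatever the default happens to be, which would match the behaviour reported in this thread.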

The syntax:
Caption = #1055#1088#1080#1074#1110#1090'!'
is not really a character encoding but the .DFM form file's specific way of representing WideStrings.
You may want to see an old thread about the same topic:
http://lazarus.freepascal.org/index.php/topic,9045
Before that I didn't even know such syntax existed.
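
The numbers in that syntax are decimal Unicode code points, which is why they exceed 255 and cannot be single-byte cp1251 values. A small Python sketch makes this visible. The helper name `decode_dfm_string` is hypothetical, and the parser is simplified: it handles only `#<decimal>` codes and single-quoted pieces, not every form the .DFM format allows.

```python
import re

def decode_dfm_string(value: str) -> str:
    """Decode a .DFM string value such as #1055#1088'!' (simplified sketch)."""
    parts = []
    # The value is a run of #<decimal> character codes and '<quoted text>' pieces.
    for code, quoted in re.findall(r"#(\d+)|'((?:[^']|'')*)'", value):
        if code:
            parts.append(chr(int(code)))             # decimal Unicode code point
        else:
            parts.append(quoted.replace("''", "'"))  # '' is an escaped quote
    return "".join(parts)

print(decode_dfm_string("#1055#1088#1080#1074#1110#1090'!'"))  # Привіт!
```

Because the caption is stored as code points, no code page guess is needed to convert it, which may be why the first project converted cleanly.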

Somebody with more knowledge about character encodings could tell what can be expected from conversion.

Juha


avra

Hero Member
Posts: 2514

Re: Using 'Tools->Convert encoding of projects' option

« Reply #9 on: November 14, 2011, 11:43:18 am »

Quote from: JuhaManninen on November 10, 2011, 10:55:37 pm

Now, the strange thing is that I see 'Ïðèâ³ò!' in Lazarus editor always, with or without the conversion. It looks like a bug. I think the string should be converted.
When I open the file with another editor (KWrite) I see 'Привіт!' which is the correct string.

What font do you use in the Lazarus editor? Is it capable of showing Cyrillic characters? What happens when you use the same font in KWrite and Lazarus?

ct2laz - Conversion between Lazarus and CodeTyphon
bithelpers - Bit manipulation for standard types
pasettimino - Siemens S7 PLC lib

JuhaManninen

Global Moderator
Hero Member
Posts: 4468
I like bugs.

Re: Using 'Tools->Convert encoding of projects' option

« Reply #10 on: November 15, 2011, 01:51:56 pm »

Quote from: avra on November 14, 2011, 11:43:18 am

What font do you use in the Lazarus editor? Is it capable of showing Cyrillic characters? What happens when you use the same font in KWrite and Lazarus?

The font makes no difference. I tested by setting the font to the same "Monospace" in both KWrite and Lazarus.
Even if I save the file from one editor (KWrite or Lazarus) they still show the same difference.
So, clearly they show the exact same data in a different way. KWrite shows it right so there is a bug in Lazarus somewhere.

I still wonder what the char encoding converter does.
It claimed it converted from ISO-8859-1 to UTF-8 but it does not show visually.
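
As a rough illustration of what a conversion "from ISO-8859-1 to UTF-8" does to cp1251 bytes, here is a Python sketch. It does not reproduce the Lazarus converter or explain the difference between the two editors. The helper `transcode_to_utf8` is hypothetical and only shows the general effect of picking the wrong versus the right source code page.

```python
def transcode_to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Re-encode bytes from a given single-byte code page to UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")

# The literal as saved on the cp1251 machine (assumption taken from the thread).
raw = "Привіт!".encode("cp1251")

wrong = transcode_to_utf8(raw, "iso-8859-1")  # mistaken source code page
right = transcode_to_utf8(raw, "cp1251")      # actual source code page

print(wrong.decode("utf-8"))  # Ïðèâ³ò!
print(right.decode("utf-8"))  # Привіт!
```

Both results are valid UTF-8, so a converter can report success either way. With the wrong source code page the mojibake is simply preserved in the new encoding.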

Someone should make a bug report.

Juha



Author Topic: Using 'Tools->Convert encoding of projects' option (Read 12770 times)
