UTF8 Problems

JD

Hero Member
Posts: 1848

UTF8 Problems

« on: April 29, 2017, 12:37:09 pm »

Hi there everyone,

I have a problem with a case statement in an old project. {$codepage UTF8} is set near the beginning of the file. When I compile it, I get the errors shown below:

Code: Pascal

unit1.pas(38,34) Error: Constant and CASE types do not match
unit1.pas(38,34) Error: String expression expected
 

The line with the error is shown in the attached screenshot. Does anyone know how I can fix this problem?

Thanks,

JD

Highlighted error.png (7.96 kB, 535x227 - viewed 351 times.)


Windows - Lazarus 2.1/FPC 3.2 (built using fpcupdeluxe),
Linux Mint - Lazarus 2.1/FPC 3.2 (built using fpcupdeluxe)

mORMot; Zeos 8; SQLite, PostgreSQL & MariaDB; VirtualTreeView

wp

Hero Member
Posts: 11916

Re: UTF8 Problems

« Reply #1 on: April 29, 2017, 01:25:22 pm »

Remove the directive {$codepage utf8}. That fixed it when I reproduced your issue, although I don't understand why (Lazarus source files are UTF8 anyway, aren't they?).


Thaddy

Hero Member
Posts: 14371
Sensorship about opinions does not belong here.

Re: UTF8 Problems

« Reply #2 on: April 29, 2017, 01:25:33 pm »

case <string> is a ~~bug~~ feature of FPC and is strictly ANSI afaik. It isn't supposed to work with UTF8, because there IS no UTF8 support in the compiler itself.

« Last Edit: April 29, 2017, 01:27:32 pm by Thaddy »


Object Pascal programmers should get rid of their "component fetish" especially with the non-visuals.

wp

Hero Member
Posts: 11916

Re: UTF8 Problems

« Reply #3 on: April 29, 2017, 03:09:31 pm »

But why does the case statement work if the {$codepage utf8} is removed? If I open the file in Notepad++, it tells me that the file is UTF8. So the compiler is able to understand UTF8.

All this is very confusing, and I have always tried to avoid additional directives and "exotic" string types such as rawbytestring or the codepage aware strings declared at compile time etc. Using plain-old "string" without any directives works in 95%, probably even more, of all programs that I write.


Thaddy

Hero Member
Posts: 14371

Re: UTF8 Problems

« Reply #4 on: April 29, 2017, 03:31:01 pm »

No, the compiler follows the codepage of the underlying OS, which is probably CP863. Therefore it works, up to a point: for French, say, but not for all languages.
The same code will probably fail on platforms other than Windows, unless the strings hash the same.


marcov

Administrator
Hero Member
Posts: 11452
FPC developer.

Re: UTF8 Problems

« Reply #5 on: April 29, 2017, 05:43:04 pm »

So in addition to Thaddy, to state the obvious:

The compiler is compiled with ansistring(0), so on Windows that means the default Windows codepage (ANSI codepages like windows-125x, not OEM codepages like cp852 as Thaddy says, though they are mostly matched 1:1).

So anything inside the compiler that doesn't have special treatment (as literals do) will use that codepage.

Literals have a special pass-through that allows literals in varying codepages to be transferred to the final binary.

The choice for one-byte unicode/utf8 strings on Windows was doomed from the start, and should have been avoided, for exactly these reasons.


Thaddy

Hero Member
Posts: 14371

Re: UTF8 Problems

« Reply #6 on: April 29, 2017, 06:53:02 pm »

All these string types, indeed.
cp is probably 1252-1.

Maybe a documentation issue in FPC.

Then again, people should use case of <string> with care. I wish someone put that genie back in its bottle.


JD

Hero Member
Posts: 1848

Re: UTF8 Problems

« Reply #7 on: April 29, 2017, 11:23:45 pm »

Quote from: Thaddy on April 29, 2017, 06:53:02 pm

All these string types, indeed.
cp is probably 1252-1.

Maybe a documentation issue in FPC.

Then again, people should use case of <string> with care. I wish someone put that genie back in its bottle.

Thanks a lot Thaddy. I replaced the case statement with an if statement & I was able to proceed. Now I will do the same with all the other case statements in the project.
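For anyone reading later, the replacement could look roughly like the sketch below. The original case statement is only in the screenshot, so this assumes the option strings from the IndexStr example later in the thread; OptionIndex is a hypothetical helper name, not code from the project. How non-ASCII input compares at run time depends on the code page settings, which this sketch does not address.

```pascal
program IfChainSketch;

{$mode objfpc}{$H+}
{$codepage UTF8}

// Hypothetical helper (not from the original project): maps an option
// string to an index with a plain if/else chain instead of case-of-string.
// Returns -1 when nothing matches.
function OptionIndex(const S: string): Integer;
begin
  if S = 'Sondages' then
    Result := 0
  else if S = 'Internet' then
    Result := 1
  else if S = 'Bouche à oreille' then
    Result := 2
  else if S = 'Partenaire' then
    Result := 3
  else
    Result := -1;
end;

begin
  WriteLn(OptionIndex('Internet'));
end.
```

The chain is longer than a case statement, but each comparison is an ordinary string comparison, so the constant type check that the case statement trips over never comes into play.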

The funny thing is that in another project, I had used case statements with case of <variant> and that worked. I might add that the variant was the mORMot TDocVariant type.

The other strange thing is that this whole problem only came to light with Lazarus 1.6.4/FPC 3.0.2. The previous version, Lazarus 1.6.2/FPC 3.0, did not complain at all.

Cheers,

JD

« Last Edit: April 29, 2017, 11:50:59 pm by JD »


JD

Hero Member
Posts: 1848

Re: UTF8 Problems

« Reply #8 on: April 30, 2017, 12:08:03 am »

Using IndexStr also works with {$codepage UTF8}

Code: Pascal

procedure TForm1.Button1Click(Sender: TObject);
begin
  // IndexStr returns the zero-based position of Edit1.Text in the array, or -1 if it is not found
  case IndexStr(Edit1.Text, ['Sondages', 'Internet', 'Bouche à oreille', 'Partenaire']) of
    0 : ShowMessage('First option');
    1 : ShowMessage('Second option');
    2 : ShowMessage('Third option');
    3 : ShowMessage('Fourth option');
  end;
end;
 

NB: IndexStr is in StrUtils.pas
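One thing to watch: IndexStr returns -1 when the text matches none of the entries, and the comparison is case-sensitive. A minimal console sketch with an else branch for that case is below. DescribeOption is a hypothetical helper name used only for illustration, and no codepage directive is set in this sketch.

```pascal
program IndexStrSketch;

{$mode objfpc}{$H+}

uses
  StrUtils;

// Hypothetical helper (illustration only): looks the text up with IndexStr
// and handles the "not found" result (-1) in the else branch.
function DescribeOption(const S: string): string;
begin
  case IndexStr(S, ['Sondages', 'Internet', 'Bouche à oreille', 'Partenaire']) of
    0: Result := 'First option';
    1: Result := 'Second option';
    2: Result := 'Third option';
    3: Result := 'Fourth option';
  else
    Result := 'No match';
  end;
end;

begin
  WriteLn(DescribeOption('Internet'));
  WriteLn(DescribeOption('Something else'));
end.
```

Without the else branch, unmatched input simply falls through the case statement and nothing happens, which can be hard to spot in a form handler.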

JD

« Last Edit: April 30, 2017, 12:14:25 am by JD »
