
Topic: Is StringReplace UTF8?

josh

  • Sr. Member
  • ****
  • Posts: 434
Is StringReplace UTF8?
« on: March 22, 2017, 01:51:12 am »
Hi

I've been playing with the latest trunk and am getting a random crash that points to UTFUpCase.

I am not calling this anywhere, but I suspect it may be coming from StringReplace.

I have a string that contains binary data, so it can hold chr(0) up to chr(255). I use StringReplace to replace certain values in the string before it is output to a device.
This worked fine for many years, but now the routine randomly exits with a SIGSEGV error and the stack trace points to UTFUpCase.

Do any of the standard string routines now only support UTF-8?

Hope that makes sense.
Development Installation Lazarus 1.3, FPC 2.7.1,Windows 7/8 32/64, OSX, *nix

Test Environment Lazarus & FPC Trunk on Windows and OSX (Cocoa Mainly on OSX). Testing also Crosscompile windows to OSX.. 
Any posts made from 2015 will be based on Lazarus Trunk.

Girlbrush

  • Jr. Member
  • **
  • Posts: 65
Re: Is StringReplace UTF8?
« Reply #1 on: March 22, 2017, 08:43:51 am »
AFAIK, Strings are considered UTF8.

This might be relevant to you: http://wiki.freepascal.org/Lazarus_with_FPC3.0_without_UTF-8_mode

Although it might be better to not use String functions if you aren't actually dealing with Strings ;)
Getting back into programming after 8+ years.

wp

  • Hero Member
  • *****
  • Posts: 3894
Re: Is StringReplace UTF8?
« Reply #2 on: March 22, 2017, 10:19:13 am »
As usual: post code that demonstrates the issue. It is strange that you see a UTF8-related issue with StringReplace, which (in my understanding) does not care about string encoding at all. How is the string declared? What happens after the StringReplace? Maybe you ended up with byte combinations that are not valid UTF8 and thus cannot be displayed by the LCL?
Lazarus trunk / fpc 3.0.4 / all 32-bit on Win-10

Graeme

  • Hero Member
  • *****
  • Posts: 1414
    • Graeme on the web
Re: Is StringReplace UTF8?
« Reply #3 on: March 22, 2017, 11:07:47 am »
Quote from: josh on March 22, 2017, 01:51:12 am
I have a string that contains binary data, so it can hold chr(0) up to chr(255). I use StringReplace to replace certain values in the string before it is output to a device.
A terrible idea. Use a byte array or TStream to hold your data. The default String type was AnsiString in FPC 2.6.4 and earlier, and I believe it is now UTF-16 in FPC 3.x, so you are going to run into all kinds of problems if you don't change your code.
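
For illustration, here is a minimal sketch of the byte-array approach. ReplaceByte is a hypothetical helper (not an RTL routine), and escaping $00 as $1B $30 is an invented example, not the actual device protocol from the original post. It only assumes TBytes from SysUtils.

```pascal
program ByteReplaceDemo;
{$mode objfpc}{$H+}

uses
  SysUtils;  // TBytes, IntToHex

// Hypothetical helper: returns a copy of Data in which every byte equal
// to OldB is replaced by the byte sequence NewSeq. No string types and
// no encoding logic are involved.
function ReplaceByte(const Data: TBytes; OldB: Byte; const NewSeq: TBytes): TBytes;
var
  i, j, n, Count: Integer;
begin
  Result := nil;
  // First pass: count matches so the result can be sized once.
  Count := 0;
  for i := 0 to High(Data) do
    if Data[i] = OldB then
      Inc(Count);
  SetLength(Result, Length(Data) + Count * (Length(NewSeq) - 1));

  // Second pass: copy bytes, expanding each match into NewSeq.
  n := 0;
  for i := 0 to High(Data) do
  begin
    if Data[i] = OldB then
    begin
      for j := 0 to High(NewSeq) do
      begin
        Result[n] := NewSeq[j];
        Inc(n);
      end;
    end
    else
    begin
      Result[n] := Data[i];
      Inc(n);
    end;
  end;
end;

var
  Buf, Esc, Res: TBytes;
  b: Byte;
begin
  // Sample buffer containing $00 and $FF, which are awkward in strings.
  SetLength(Buf, 4);
  Buf[0] := $41; Buf[1] := $00; Buf[2] := $FF; Buf[3] := $00;

  // Assumed escape sequence, for demonstration only.
  SetLength(Esc, 2);
  Esc[0] := $1B; Esc[1] := $30;

  Res := ReplaceByte(Buf, $00, Esc);
  for b in Res do
    Write(IntToHex(b, 2), ' ');
  WriteLn;  // prints: 41 1B 30 FF 1B 30
end.
```

Because the data never passes through a string type, no code page conversion or UTF8 handling can touch it.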
--
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/

Cyrax

  • Hero Member
  • *****
  • Posts: 542
Re: Is StringReplace UTF8?
« Reply #4 on: March 22, 2017, 11:13:51 am »
Quote from: Graeme on March 22, 2017, 11:07:47 am
Quote from: josh on March 22, 2017, 01:51:12 am
I have a string that contains binary data, so it can hold chr(0) up to chr(255). I use StringReplace to replace certain values in the string before it is output to a device.
A terrible idea. Use a byte array or TStream to hold your data. The default String type was AnsiString in FPC 2.6.4 and earlier, and I believe it is now UTF-16 in FPC 3.x, so you are going to run into all kinds of problems if you don't change your code.

The only difference between AnsiString in 2.6.4 and in 3.0.2 is support for code pages and conversions between different code pages. FPC does not enable a default code page of its own; it uses whatever the OS (or the user) tells it to use.
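
A small sketch of how this can be inspected, assuming FPC 3.0 or later, where StringCodePage and DefaultSystemCodePage are available in the System unit. The printed values depend on the OS and on user settings, so no expected output is given.

```pascal
program CodePageDemo;
{$mode objfpc}{$H+}

var
  A: AnsiString;
  U: UTF8String;
begin
  A := 'abc';
  U := 'abc';
  // Code page the RTL picked up from the OS (or from a user override).
  WriteLn('DefaultSystemCodePage: ', DefaultSystemCodePage);
  // Code page tag reported for each string value.
  WriteLn('AnsiString: ', StringCodePage(A));
  WriteLn('UTF8String: ', StringCodePage(U));
end.
```

Assignments between strings with different declared code pages are where the conversions happen, which is one more reason to keep binary data out of string types.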

Thaddy

  • Hero Member
  • *****
  • Posts: 4638
Re: Is StringReplace UTF8?
« Reply #5 on: March 22, 2017, 11:20:31 am »
He declares things as string; in his context, and if he uses Lazarus, he should declare them as AnsiString.
And I agree with Graeme that using a string for binary data is and always has been a recipe for disaster.
Replace even the AnsiString with an array of byte.
"Logically, no number of positive outcomes at the level of experimental testing can confirm a scientific theory, but a single counterexample is logically decisive."

Remy Lebeau

  • Sr. Member
  • ****
  • Posts: 335
    • Lebeau Software
Re: Is StringReplace UTF8?
« Reply #6 on: March 24, 2017, 02:19:57 am »
Quote from: Graeme on March 22, 2017, 11:07:47 am
The default String type was AnsiString in FPC 2.6.4 and earlier, and I believe it is now UTF-16 in FPC 3.x

AFAIK, the default string type is still AnsiString unless you use {$MODE DelphiUnicode} or {$MODESWITCH UnicodeStrings}.
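
A quick way to check this on a given setup is a sketch like the following, assuming {$mode objfpc} with {$H+} (without {$H+}, String in that mode is ShortString rather than AnsiString). The program name is arbitrary.

```pascal
program StringTypeDemo;
{$mode objfpc}{$H+}
// Uncomment the next line to switch String to UnicodeString (UTF-16)
// and Char to WideChar:
// {$modeswitch unicodestrings}

begin
  // 1 when Char is AnsiChar (String = AnsiString),
  // 2 when Char is WideChar (String = UnicodeString).
  WriteLn('SizeOf(Char) = ', SizeOf(Char));
end.
```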
Remy Lebeau
Lebeau Software - Owner, Developer
Internet Direct (Indy) open source project - Admin, Developer

 
