
Author Topic: removing accents in string  (Read 14002 times)

cazzajay

  • Jr. Member
  • **
  • Posts: 94
removing accents in string
« on: May 10, 2011, 06:18:09 pm »
Hello,

I'm wondering if anyone could help me, please!

I am trying to format a string ("artist") and remove all the accented characters (in fact, replace them with a substring; in the example below, "z").

But my code below does not seem to work properly. There are numerous resources online on how to do this, but they are all for Delphi and do not work in Lazarus.

Thanks for any assistance!

Code: [Select]
 
var
  artist: string;
  examine: char;
  q: integer;
begin
  for q := 1 to length(artist) do
  begin
    examine := artist[q];
    if ord(examine) > 127 then
      stringreplace(artist, artist[q], 'z', [rfReplaceAll]);
  end;
end
     
« Last Edit: May 10, 2011, 06:20:10 pm by cazzajay »
Windows XP 32 bit / Lazarus 1.0.6 / FPC 2.6.0

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 9754
  • Debugger - SynEdit - and more
    • wiki
Re: removing accents in string
« Reply #1 on: May 10, 2011, 06:40:50 pm »
Please read http://wiki.lazarus.freepascal.org/LCL_Unicode_Support

If the content of your string variable comes from an edit field, a memo, or a similar control, it will be in UTF-8.
If it is loaded from a file, the encoding depends on the file.
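
To illustrate why this matters for the loop in the question, here is a minimal sketch (not from the original posts) that prints the bytes of a UTF-8 string. The sample text 'café' is an assumption chosen for illustration; the 'é' is written as its two UTF-8 bytes so the example does not depend on the encoding of the source file.

```pascal
program Utf8Bytes;
{$mode objfpc}{$H+}

var
  s: string;
  i: Integer;
begin
  // 'café' with the 'é' spelled out as its two UTF-8 bytes (195, 169).
  s := 'caf' + #195#169;

  // Length counts bytes, not visible characters: 5 bytes for 4 characters.
  WriteLn('Byte length: ', Length(s));

  // Indexing with s[i] yields single bytes, so the accented character
  // shows up as two separate values above 127.
  for i := 1 to Length(s) do
    Write(Ord(s[i]), ' ');
  WriteLn;
end.
```

A byte-by-byte test such as ord(s[i]) > 127 therefore fires twice for one accented character, which is why a per-byte replacement on a UTF-8 string produces two substitutes instead of one.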

Mando

  • Full Member
  • ***
  • Posts: 181
Re: removing accents in string
« Reply #2 on: May 10, 2011, 06:45:56 pm »
If you are using UTF-8 (the default for Lazarus), the accented chars use 2 bytes (195 + ...).
Try this function of mine:

Code: [Select]
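function ReplaceAccent1(aStr: String): string;
var
  i: integer;
begin
  // Convert the UTF-8 string to the system ANSI code page. On a single-byte
  // code page (such as Windows-1252) each accented char is then one byte.
  aStr := UTF8ToAnsi(aStr);
  // Replace every non-ASCII byte with 'z'.
  for i := 1 to length(aStr) do
    if aStr[i] > #127 then aStr[i] := 'z';
  // Convert back to UTF-8 for use with the LCL.
  result := AnsiToUTF8(aStr);
end;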

theo

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 1927
Re: removing accents in string
« Reply #3 on: May 10, 2011, 08:47:22 pm »
In addition to what has already been said here: StringReplace is a function. You are using it as if it were a procedure, so the function result (the modified string) is discarded and "artist" never changes.
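
A minimal sketch of the difference, assuming a plain ASCII sample string ('a-b-c' is made up for illustration) so that encoding does not get in the way:

```pascal
program ReplaceDemo;
{$mode objfpc}{$H+}

uses
  SysUtils;

var
  artist: string;
begin
  artist := 'a-b-c';

  // Called like a procedure: the returned string is thrown away,
  // so artist is left untouched.
  StringReplace(artist, '-', 'z', [rfReplaceAll]);
  WriteLn(artist);  // prints a-b-c

  // Assigning the function result back actually updates the variable.
  artist := StringReplace(artist, '-', 'z', [rfReplaceAll]);
  WriteLn(artist);  // prints azbzc
end.
```

Note that fixing this alone is not enough for the original loop: on a UTF-8 string, artist[q] is a single byte, not a whole accented character.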

cazzajay

  • Jr. Member
  • **
  • Posts: 94
Re: removing accents in string
« Reply #4 on: May 11, 2011, 09:35:19 am »
Works a treat, thanks!
And theo, I'll have a bit of a read about StringReplace to find out more!
Thanks all!

Bart

  • Hero Member
  • *****
  • Posts: 5265
    • Bart en Mariska's Webstek
Re: removing accents in string
« Reply #5 on: May 11, 2011, 11:16:48 am »
@Theo: your solution is rather less complex than the one I came up with for TMaskEdit, where I have a UTF8ToAscii function that replaces all non-lower-ASCII characters with a '?'.

@cazzajay: if you take a look at Utf8ToAscii() in TMaskEdit, you can use that as a framework to replace, for example, accented chars with their non-accented ASCII equivalents (ë -> e, ä -> a), etc.

Bart
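
As a rough illustration of that idea, here is a minimal sketch. It is not the TMaskEdit code; StripAccents and Latin1Base are hypothetical names introduced for this example. It assumes the input is UTF-8 and only handles the two-byte sequences that start with byte 195 (code points U+00C0 to U+00FF, which covers most Western European accented letters). Anything else outside ASCII becomes '?'.

```pascal
program StripAccentsDemo;
{$mode objfpc}{$H+}

const
  // Base letters for U+00C0..U+00FF in code point order (hypothetical table).
  // '?' marks characters with no single-letter ASCII equivalent.
  Latin1Base: string = 'AAAAAA?CEEEEIIII?NOOOOO?OUUUUY??' +
                       'aaaaaa?ceeeeiiii?nooooo?ouuuuy?y';

// Hypothetical helper: replaces accented Latin-1 letters in a UTF-8 string
// with their unaccented ASCII base letters.
function StripAccents(const s: string): string;
var
  i: Integer;
begin
  Result := '';
  i := 1;
  while i <= Length(s) do
  begin
    if (s[i] = #195) and (i < Length(s)) and (s[i + 1] in [#128..#191]) then
    begin
      // Lead byte 195 plus a continuation byte 128..191 encodes
      // U+00C0..U+00FF; the continuation byte selects the table entry.
      Result := Result + Latin1Base[Ord(s[i + 1]) - 127];
      Inc(i, 2);
    end
    else if s[i] < #128 then
    begin
      // Plain ASCII is copied unchanged.
      Result := Result + s[i];
      Inc(i);
    end
    else
    begin
      // Any other non-ASCII sequence: emit one '?' and skip the
      // continuation bytes that belong to it.
      Result := Result + '?';
      Inc(i);
      while (i <= Length(s)) and (s[i] in [#128..#191]) do
        Inc(i);
    end;
  end;
end;

begin
  // 'Björk' with 'ö' written as its UTF-8 bytes (195, 182).
  WriteLn(StripAccents('Bj' + #195#182 + 'rk'));  // prints Bjork
end.
```

The lookup table keeps the function independent of the system ANSI code page, at the cost of covering only one block of characters; extending it to other scripts would need further tables.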

cazzajay

  • Jr. Member
  • **
  • Posts: 94
Re: removing accents in string
« Reply #6 on: May 11, 2011, 03:00:43 pm »
I'm confused: how do I use a TMaskEdit component to change the chars to their non-accented equivalents? That would be ideal, rather than substituting a "wildcard"-style character for them.
