Author Topic: Slightly Annoying issue with Curly Quotes, both single and double.  (Read 722 times)

zxandris

I've got a routine where I'm trying to remove both left and right quotes, single and double (though I'm currently working on double), ”These“, from a document.  I've looked up the ASCII codes and I'm still not actually removing the darn things.  As best I can tell it should work, though I'm not entirely sure 0147 and 0148 are the right codes.  Below is the routine I'm using, which doesn't actually work.  I would be SO grateful if anyone could help me fix it.

Code: Pascal
Function TfrmMain.RemoveFancyQuotes(const str : String) : String;
var
   a : Integer;
   txt : String;
   lQ, rQ : Integer;
begin
     txt := str;
     lq := 0147;
     rq := 0148;
     for a := 1 to length(txt) do
     begin
//         if txt[a] = chr(225) then txt[a] := '"';
//         if txt[a] = chr(226) then txt[a] := '"';
         if txt[a] = '“' then txt[a] := '"';
         if txt[a] = '”' then txt[a] := '"';
         if ord(txt[a]) = lq then txt[a] := '"';
         if ord(txt[a]) = rq then txt[a] := '"';
         if txt[a] = chr(0147) then txt[a] := '"';
         if txt[a] = chr(0148) then txt[a] := '"';
     end;
     txt := utf8ToAnsi(txt);
     result := txt;
end;

As you can see I'm duplicating my effort in an attempt to remove the darn things.

Please, this is driving me crazy. It seems so simple to do, yet in reality it isn't working!

CJ
« Last Edit: March 04, 2024, 03:54:32 pm by zxandris »

Fibonacci

« Reply #1 on: March 04, 2024, 04:12:07 pm »
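Code: Pascal
function replacequotes(str: string): string;
var
  w: widestring;
  i: integer;
begin
  // Decode to UTF-16 so that each curly quote is a single WideChar.
  w := UTF8Decode(str);
  for i := 1 to length(w) do
    // U+201C and U+201D are the left and right double curly quotes.
    if (w[i] = WideChar($201C)) or (w[i] = WideChar($201D)) then w[i] := '"';
  result := UTF8Encode(w);
end;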

EDIT. Just doing "UTF8Encode(UTF8Decode(s))" replaces the quotes. The loop is unnecessary ;)
EDIT 2. Or not, that only happened in a console app. Use paweld's answer.
« Last Edit: March 04, 2024, 04:24:15 pm by Fibonacci »

paweld

« Reply #2 on: March 04, 2024, 04:13:49 pm »
In UTF-8, this quote takes up 3 bytes, so it can never match a single character of the string. Use StringReplace:
Code: Pascal
function TfrmMain.RemoveFancyQuotes(const str: String): String;
begin
  Result := str;
  // Pos takes the substring first, then the string to search in.
  if Pos('“', Result) > 0 then
    Result := StringReplace(Result, '“', '"', [rfReplaceAll]);
  if Pos('”', Result) > 0 then
    Result := StringReplace(Result, '”', '"', [rfReplaceAll]);
end;
or UTF8Length:
Code: Pascal
uses
  LazUTF8;

function TfrmMain.RemoveFancyQuotes(const str: String): String;
var
  i: Integer;
  s: String;
begin
  Result := '';
  // Loop over the code points of the input (Result is still empty here).
  for i := 1 to UTF8Length(str) do
  begin
    // UTF8Copy extracts one whole code point, which may span several bytes.
    s := UTF8Copy(str, i, 1);
    if (s = '“') or (s = '”') then
      Result := Result + '"'
    else
      Result := Result + s;
  end;
end;
Best regards / Pozdrawiam
paweld
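
To illustrate the "3 bytes" point, here is a minimal sketch that spells the quotes out as explicit UTF-8 byte sequences instead of pasting them into the source. It assumes the input string is UTF-8 encoded (the Lazarus default); the program name and the StraightenQuotes helper are made up for this example and are not from the thread. It also covers the single curly quotes mentioned in the topic title. For reference, 147 and 148 are the Windows-1252 codes for the double curly quotes, which is why they never show up as single bytes in a UTF-8 string.

```pascal
program FancyQuotes;
{$mode objfpc}{$H+}

uses
  SysUtils;

const
  // UTF-8 byte sequences spelled out explicitly, so the result does not
  // depend on how the compiler reads non-ASCII literals in the source file.
  LeftDouble  = #$E2#$80#$9C;  // U+201C
  RightDouble = #$E2#$80#$9D;  // U+201D
  LeftSingle  = #$E2#$80#$98;  // U+2018
  RightSingle = #$E2#$80#$99;  // U+2019

// Hypothetical helper (not from the thread): replaces curly quotes in a
// UTF-8 string with their plain ASCII counterparts.
function StraightenQuotes(const s: String): String;
begin
  Result := StringReplace(s, LeftDouble, '"', [rfReplaceAll]);
  Result := StringReplace(Result, RightDouble, '"', [rfReplaceAll]);
  Result := StringReplace(Result, LeftSingle, '''', [rfReplaceAll]);
  Result := StringReplace(Result, RightSingle, '''', [rfReplaceAll]);
end;

begin
  // One curly quote is three bytes, so a single txt[a] can never equal it.
  WriteLn(Length(LeftDouble));  // 3
  WriteLn(StraightenQuotes(LeftDouble + 'These' + RightDouble));  // "These"
  WriteLn(StraightenQuotes(LeftSingle + 'These' + RightSingle));  // 'These'
end.
```

Writing the bytes explicitly is only one option; pasting the characters as literals also works as long as the source file and the string share the same encoding.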

zxandris

« Reply #3 on: March 04, 2024, 04:25:49 pm »
Nice, thanks guys, both of those work fine. But this brings me to a similar, related issue, and it's an odd one.  Word will replace "Word - Word" with a long dash, " – ", between the words.  I would assume I could use StringReplace for that too, but copying and pasting the character into the editor isn't working: it always shows up as strange characters when I try to replace it, so I'm a little lost.  Does anyone know how to convert what I assume is an extended character set character into, say, a normal dash without issue?

paweld

« Reply #4 on: March 04, 2024, 04:37:46 pm »
Is that the point?
Code: Pascal
Result := StringReplace(Result, '—', '-', [rfReplaceAll]);
Best regards / Pozdrawiam
paweld
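
One thing that may be worth checking: the dash quoted in the question looks like an en dash (U+2013), while the literal in the snippet above looks like an em dash (U+2014). These are different code points, so replacing one leaves the other untouched. The sketch below handles both, again assuming UTF-8 strings and using explicit byte sequences to sidestep any editor or source-encoding problems. The program name and the ReplaceLongDashes helper are hypothetical, and this may not be the whole story for the problem described.

```pascal
program NormalizeDashes;
{$mode objfpc}{$H+}

uses
  SysUtils;

const
  // UTF-8 byte sequences for the two "long" dashes.
  EnDash = #$E2#$80#$93;  // U+2013
  EmDash = #$E2#$80#$94;  // U+2014

// Hypothetical helper (not from the thread): replaces en and em dashes
// in a UTF-8 string with a plain ASCII hyphen.
function ReplaceLongDashes(const s: String): String;
begin
  Result := StringReplace(s, EnDash, '-', [rfReplaceAll]);
  Result := StringReplace(Result, EmDash, '-', [rfReplaceAll]);
end;

begin
  WriteLn(ReplaceLongDashes('Word ' + EnDash + ' Word'));  // Word - Word
  WriteLn(ReplaceLongDashes('Word ' + EmDash + ' Word'));  // Word - Word
end.
```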

zxandris

« Reply #5 on: March 05, 2024, 08:56:03 am »

Hi there, I tried that and it didn't work. I've actually created a separate post for it, since this is so weird:

https://forum.lazarus.freepascal.org/index.php?topic=66488.new#new

 
