Author Topic: Strings and special characters removal  (Read 9151 times)

JLWest

  • Hero Member
  • *****
  • Posts: 1293
Strings and special characters removal
« on: February 14, 2019, 11:35:01 pm »
I'm processing a string through the following statement, which removes special characters.

if Str in [#3,#4,#5,#6,#7,#8,#9] then Str:= ' '; end;

However, I'm still getting some strange characters in the output, for example an 'RT' with a small 3.

I would like the output to be limited to 'A'..'Z' and '0'..'9'.
If a string contains anything else, I will store it in a listbox and examine it manually for correction.

I don't know how to write the statements.
Can someone give me an idea of where to look or what to search for?

Thanks.
FPC 3.2.0, Lazarus IDE v2.0.4
 Windows 10 Pro 32-GB
 Intel i7 770K CPU 4.2GHz 32702MB Ram
GeForce GTX 1080 Graphics - 8 Gig
4.1 TB

jamie

  • Hero Member
  • *****
  • Posts: 6128
Re: Strings and special characters removal
« Reply #1 on: February 14, 2019, 11:55:13 pm »
First of all, please use a different name instead of "str"; it is reserved for the Str procedure in the run-time library.

Anyways.

   if not mystr in ['A'..'Z', '0'..'9'] then MyStr := '';
The only true wisdom is knowing you know nothing

JLWest

  • Hero Member
  • *****
  • Posts: 1293
Re: Strings and special characters removal
« Reply #2 on: February 15, 2019, 12:05:29 am »
First of all, please use a different name instead of "str"; it is reserved for the Str procedure in the run-time library.

Anyways.

   if not mystr in ['A'..'Z', '0'..'9'] then MyStr := '';

I'll try it. I changed the name from Str to tStr.

Thanks

Blaazen

  • Hero Member
  • *****
  • Posts: 3237
  • POKE 54296,15
    • Eye-Candy Controls
Re: Strings and special characters removal
« Reply #3 on: February 15, 2019, 12:06:27 am »
Code: Pascal
if not mystr in ['A'..'Z', '0'..'9'] then MyStr := '';

This will not work for a string: mystr must be of type Char here, and "not" takes precedence over the "in" operator, so the set test needs parentheses.

Code: Pascal
var myStr: Char;
...
if not (mystr in ['A'..'Z', '0'..'9']) then MyStr := #0;

A string should be processed char by char, and take care if it is UTF-8.
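As a minimal sketch of the parenthesised test applied char by char (the names IsAllowed and Sample are made up for illustration, they are not from the posts above):

```pascal
program NotInDemo;
{$mode objfpc}{$H+}

const
  Allowed = ['A'..'Z', '0'..'9'];

// Hypothetical helper: True when c is an upper-case letter or a digit.
function IsAllowed(c: Char): Boolean;
begin
  Result := c in Allowed;
end;

var
  Sample: string;
  c: Char;
begin
  Sample := 'R3t';
  for c in Sample do
    // The parentheses matter: without them "not" would bind to c, not to the set test.
    if not (c in Allowed) then
      WriteLn('rejected: ', c)
    else
      WriteLn('kept: ', c);
end.
```

With the sample string 'R3t' this keeps 'R' and '3' and rejects the lower-case 't'.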
Lazarus 2.3.0 (rev main-2_3-2863...) FPC 3.3.1 x86_64-linux-qt Chakra, Qt 4.8.7/5.13.2, Plasma 5.17.3
Lazarus 1.8.2 r57369 FPC 3.0.4 i386-win32-win32/win64 Wine 3.21

Try Eye-Candy Controls: https://sourceforge.net/projects/eccontrols/files/

Kays

  • Hero Member
  • *****
  • Posts: 574
  • Whasup!?
    • KaiBurghardt.de
Re: Strings and special characters removal
« Reply #4 on: February 15, 2019, 03:03:15 am »
[…] mystr must be type char here […]
Well, then you just iterate over it:
Code: Pascal
{$mode objFPC}
{$longstrings on}
uses
        sysUtils;

function allCapsAndDigits(const s: string): string;
var
        c: char;
begin
        // start from an empty result; the function name stands for the result variable
        allCapsAndDigits := '';
        for c in s do
        begin
                // keep only digits and upper-case letters
                if c in ['0'..'9', 'A'..'Z'] then
                begin
                        appendStr(allCapsAndDigits, c);
                end;
        end;
end;
[…] If different I will store in a listbox and manually examine for correction. […]
Huh, if you're processing user input, you ideally check whether a valid character has been entered every time a key is pressed. This way you don't waste time storing and post-processing input you'll discard anyway.
Yours Sincerely
Kai Burghardt

dbannon

  • Hero Member
  • *****
  • Posts: 2791
    • tomboy-ng, a rewrite of the classic Tomboy
Re: Strings and special characters removal
« Reply #5 on: February 15, 2019, 04:30:37 am »
Kays, nice, I love recursion. (to understand recursion, first you need to understand recursion)

Anyway, perhaps another approach (that does not require a new function) might be -

Code: Pascal
    Index := 1;
    while Index <= length(MyStr) do
        if not (MyStr[Index] in ['0'..'9', 'A'..'Z']) then
            delete(MyStr, Index, 1)
        else inc(Index);

This will cleanly remove any UTF-8 bytes too.
Davo
Lazarus 3, Linux (and reluctantly Win10/11, OSX Monterey)
My Project - https://github.com/tomboy-notes/tomboy-ng and my github - https://github.com/davidbannon

JLWest

  • Hero Member
  • *****
  • Posts: 1293
Re: Strings and special characters removal
« Reply #6 on: February 15, 2019, 04:41:36 am »
Huh, if you're processing user input, you ideally check whether a valid character has been entered every time a key is pressed. This way you don't waste time storing and post-processing input you'll discard anyway.

No, I'm not processing input. I'm reading a file and saving each data record to a listbox, then going through the listbox and correcting the text.
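A rough sketch of that workflow, assuming the records are already in memory: every record that contains anything outside 'A'..'Z' and '0'..'9' is copied to a second list for manual review. The names IsClean, Records and Review are invented for the example, and the superscript three is assumed to be U+00B3, written here as its two UTF-8 bytes. In a Lazarus form, the Items property of a TListBox could take the place of Review.

```pascal
program FlagForReview;
{$mode objfpc}{$H+}
uses
  Classes;

// Hypothetical helper: True when S contains only '0'..'9' and 'A'..'Z'.
function IsClean(const S: string): Boolean;
var
  c: Char;
begin
  Result := True;
  for c in S do
    if not (c in ['0'..'9', 'A'..'Z']) then
      Exit(False);
end;

var
  Records, Review: TStringList;
  i: Integer;
begin
  Records := TStringList.Create;
  Review := TStringList.Create;
  try
    // Stand-in data for the lines read from the file.
    Records.Add('RT3');
    Records.Add('RT' + #$C2 + #$B3); // 'RT' plus the UTF-8 bytes of U+00B3 (assumed)
    Records.Add('AB12');

    // Collect every record that needs a manual look.
    for i := 0 to Records.Count - 1 do
      if not IsClean(Records[i]) then
        Review.Add(Records[i]);

    WriteLn(Review.Count, ' record(s) need review');
  finally
    Review.Free;
    Records.Free;
  end;
end.
```

Only the second sample record ends up in the review list.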

PascalDragon

  • Hero Member
  • *****
  • Posts: 5462
  • Compiler Developer
Re: Strings and special characters removal
« Reply #7 on: February 15, 2019, 09:00:57 am »
First of all, please use a different name instead of "str"; it is reserved for the Str procedure in the run-time library.

It's not reserved. Otherwise the compiler would complain. It simply might be confusing if one looks at the code.

Blaazen

  • Hero Member
  • *****
  • Posts: 3237
  • POKE 54296,15
    • Eye-Candy Controls
Re: Strings and special characters removal
« Reply #8 on: February 15, 2019, 01:58:31 pm »
Quote
No, I'm not processing input. I'm reading a file and saving each data record to a listbox, then going through the listbox and correcting the text.
And can there be UTF-8 characters, or is the file plain ASCII?

furious programming

  • Hero Member
  • *****
  • Posts: 858
Re: Strings and special characters removal
« Reply #9 on: February 15, 2019, 02:02:42 pm »
Kays, nice, I love recursion. (to understand recursion, first you need to understand recursion)

But this code does not use recursion. @Kays used the function name instead of the Result variable. The function can be written differently (in a simpler way) and will give the same result:

Code: Pascal
function ToAlphanumericString(const AString: String): String;
var
  Character: Char;
begin
  Result := '';

  for Character in AString do
    if Character in ['0'..'9', 'A'..'Z'] then
      Result += Character;
end;

Quote
Anyway, perhaps another approach (that does not require a new function) might be - […]

No, never use such a structure (Delete in a loop) because it is slow and unclear.
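If the string really has to be changed in place, one possible alternative to Delete in a loop is a single pass that copies the wanted characters forward and truncates once at the end. This is only a sketch; StripInPlace and Line are made-up names.

```pascal
program InPlaceFilter;
{$mode objfpc}{$H+}

// Hypothetical helper: compacts S in place, keeping only '0'..'9' and 'A'..'Z'.
procedure StripInPlace(var S: string);
var
  ReadPos, WritePos: Integer;
begin
  WritePos := 0;
  for ReadPos := 1 to Length(S) do
    if S[ReadPos] in ['0'..'9', 'A'..'Z'] then
    begin
      // Move each kept character to the next free slot at the front.
      Inc(WritePos);
      S[WritePos] := S[ReadPos];
    end;
  // Cut off the leftover tail in one step.
  SetLength(S, WritePos);
end;

var
  Line: string;
begin
  Line := 'A-1 b2_C';
  StripInPlace(Line);
  WriteLn(Line); // prints A12C
end.
```

Each character is visited once, whereas every Delete call has to shift the rest of the string.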
Lazarus 3.2 with FPC 3.2.2, Windows 10 — all 64-bit

Working solo on an arcade, action/adventure game in retro style (pixelart), programming the engine and shell from scratch, using Free Pascal and SDL. Release planned in 2026.

lucamar

  • Hero Member
  • *****
  • Posts: 4219
Re: Strings and special characters removal
« Reply #10 on: February 15, 2019, 02:24:34 pm »
Quote
No, I'm not processing input. I'm reading a file and saving each data record to a listbox, then going through the listbox and correcting the text.
And can there be UTF-8 characters, or is the file plain ASCII?

In this particular case it doesn't really matter whether there are UTF-8 characters: they'll be discarded anyway. Remember that every byte of a multi-byte UTF-8 character has its high (most significant) bit set, so it will not satisfy the condition "char in ['A'..'Z', '0'..'9']".

What would matter is whether the file can contain UTF-16 or 32-bit Unicode chars, or is in some other, not-so-respectful multibyte encoding.
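A small sketch to illustrate the point, assuming the "small 3" from the first post is U+00B3 (SUPERSCRIPT THREE), which UTF-8 encodes as the bytes $C2 $B3. KeepCapsAndDigits is a made-up name for the same kind of filter shown earlier in the thread.

```pascal
program Utf8BytesDemo;
{$mode objfpc}{$H+}

// Hypothetical filter: keeps only '0'..'9' and 'A'..'Z', byte by byte.
function KeepCapsAndDigits(const S: string): string;
var
  c: Char;
begin
  Result := '';
  for c in S do
    if c in ['0'..'9', 'A'..'Z'] then
      Result := Result + c;
end;

var
  Raw: string;
  i: Integer;
begin
  // 'RT' followed by the two UTF-8 bytes of U+00B3 (assumed input).
  Raw := 'RT' + #$C2 + #$B3;

  // Show each byte and whether its high bit is set.
  for i := 1 to Length(Raw) do
    WriteLn('byte ', i, ' = $', HexStr(Ord(Raw[i]), 2),
            ', high bit set: ', Ord(Raw[i]) >= $80);

  WriteLn(KeepCapsAndDigits(Raw)); // prints RT
end.
```

Both bytes of the superscript are at or above $80, so neither can fall inside 'A'..'Z' or '0'..'9', and the filter drops them without knowing anything about UTF-8.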
« Last Edit: February 15, 2019, 02:29:06 pm by lucamar »
Turbo Pascal 3 CP/M - Amstrad PCW 8256 (512 KB !!!) :P
Lazarus/FPC 2.0.8/3.0.4 & 2.0.12/3.2.0 - 32/64 bits on:
(K|L|X)Ubuntu 12..18, Windows XP, 7, 10 and various DOSes.

JLWest

  • Hero Member
  • *****
  • Posts: 1293
Re: Strings and special characters removal
« Reply #11 on: February 15, 2019, 03:09:55 pm »
Quote
No, I'm not processing input. I'm reading a file and saving each data record to a listbox, then going through the listbox and correcting the text.
And can there be UTF-8 characters, or is the file plain ASCII?

No, I think this is a UTF-8 file.

Zvoni

  • Hero Member
  • *****
  • Posts: 2327
Re: Strings and special characters removal
« Reply #12 on: February 15, 2019, 06:58:38 pm »
Errrr.... maybe a stupid question:
What's wrong with defining an array containing all illegal characters, then just looping through that array, firing off a ReplaceStr or StringReplace against the target string?
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad

JLWest

  • Hero Member
  • *****
  • Posts: 1293
Re: Strings and special characters removal
« Reply #13 on: February 15, 2019, 09:05:37 pm »
Errrr.... maybe a stupid question:
What's wrong with defining an array containing all illegal characters, then just looping through that array, firing off a ReplaceStr or StringReplace against the target string?

It probably would work, but I got it working fine with one of the solutions above.

dbannon

  • Hero Member
  • *****
  • Posts: 2791
    • tomboy-ng, a rewrite of the classic Tomboy
Re: Strings and special characters removal
« Reply #14 on: February 15, 2019, 10:25:55 pm »
Errrr.... maybe a stupid question:
What's wrong with defining an array containing all illegal characters, then just looping through that array, firing off a ReplaceStr or StringReplace against the target string?

He has a small set of acceptable characters. If the string is just ASCII, then there is another, not much bigger set of unacceptable ones. But it sounds like the string may be UTF-8 (or some other Unicode encoding?), and that makes the unacceptable list quite big, er, huge!
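For comparison, the blacklist idea could be sketched roughly like this, assuming the only unwanted characters are the control codes #3..#9 from the first post. The names Illegal and RemoveIllegal are invented for the example.

```pascal
program ReplaceIllegal;
{$mode objfpc}{$H+}
uses
  SysUtils;

const
  // Assumption: the complete list of unwanted characters is known up front.
  Illegal: array[0..6] of Char = (#3, #4, #5, #6, #7, #8, #9);

// Hypothetical helper: removes every listed character from S.
function RemoveIllegal(const S: string): string;
var
  c: Char;
begin
  Result := S;
  for c in Illegal do
    Result := StringReplace(Result, c, '', [rfReplaceAll]);
end;

begin
  WriteLn(RemoveIllegal('R'#7'T'#9'3')); // prints RT3
end.
```

This needs one full pass over the string per listed character, and anything missing from the list (such as the bytes of a UTF-8 character) passes through untouched, which is the drawback described above.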

Davo

 
