Topic: Determine if a char (Read 9506 times)

440bx

Re: Determine if a char
« Reply #15 on: December 12, 2018, 06:01:42 pm »
Quote: By testing against the set of 'A'..'Z' we'd do 26 comparisons; Handoko's model only does two?
As Engkin pointed out above, the test "c in ['A'..'Z']" generates only one comparison instead of two.  Smaller, faster and easier-to-read code.  Everything is good about it.

That said, it's important to note that the efficiency of the "in" construct depends on the range being contiguous (that's the best case).  If the set is made of multiple non-contiguous ranges, then it's probably best to look at the generated code to find out which form the compiler handles best. (Best overuse of the word best in a while.)

(FPC v3.0.4 and Lazarus 1.8.2) or (FPC v3.2.2 and Lazarus v3.2) on Windows 7 SP1 64bit.
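
To make the comparison above concrete, here is a minimal sketch that puts the single-range "in" test next to the two explicit comparisons, plus a set built from several non-contiguous ranges. The function names are hypothetical and not taken from the thread. The program only checks that the forms agree; it says nothing about which one is faster.

```pascal
program InRangeDemo;
{$mode objfpc}{$H+}

// All helper names below are hypothetical, for illustration only.

// Membership test against one contiguous range.
function IsUpperIn(c: Char): Boolean;
begin
  Result := c in ['A'..'Z'];
end;

// The same test written as two explicit comparisons.
function IsUpperCmp(c: Char): Boolean;
begin
  Result := (c >= 'A') and (c <= 'Z');
end;

// A set built from several non-contiguous ranges.
function IsAlnumIn(c: Char): Boolean;
begin
  Result := c in ['0'..'9', 'A'..'Z', 'a'..'z'];
end;

var
  c: Char;
begin
  // Verify that both single-range forms agree for every Char value.
  for c := Low(Char) to High(Char) do
    if IsUpperIn(c) <> IsUpperCmp(c) then
      WriteLn('Mismatch at #', Ord(c));

  WriteLn(IsUpperIn('Q'), ' ', IsUpperIn('q'));
  WriteLn(IsAlnumIn('7'), ' ', IsAlnumIn('$'));
end.
```

To see what the compiler actually emits for each form, the assembler output has to be inspected, for example by compiling with FPC's -al option, which keeps the assembler file with source lines interleaved.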

engkin

Re: Determine if a char
« Reply #16 on: December 12, 2018, 06:10:48 pm »
Quote: Yes, those work too.  I like setting "result" explicitly at the function entry.
That is a safe way.

Quote: The first alternative generates code that is basically equivalent (in code size and speed) to the code generated when setting result upfront.
I am not sure. I assumed it would be a tick or two faster when the string passes, as only one assignment happens. Both are identical when the string fails. But of course, the assignment upfront is the way to go.

Quote: I like the second alternative, very clean and maintainable, but it generates more code and it's also a smidgen slower.   I like the "c in ..." but I don't want to pay extra for it. ;)
I hope this will change in the future.

440bx

Re: Determine if a char
« Reply #17 on: December 12, 2018, 06:17:47 pm »
Quote: I am not sure. I assumed it would be a tick or two faster when the string passes, as only one assignment happens.
The "exit(false);" causes an assignment; because of that, it's basically the same thing, the assignment just happens in a different place. (See ETA though.)

Quote: I like the second alternative, very clean and maintainable, but it generates more code and it's also a smidgen slower.   I like the "c in ..." but I don't want to pay extra for it. ;)
Quote: I hope this will change in the future.
That would be very nice, looking forward to that. :)

ETA: you are absolutely correct that when the string passes, the code you presented will be a hair faster because the assignment isn't executed.
« Last Edit: December 12, 2018, 06:22:22 pm by 440bx »
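
The functions being compared were posted earlier in the thread and are not reproduced here, so the following is only a hypothetical reconstruction of the two styles under discussion: one sets Result at function entry, the other assigns only at the point of exit. The names AllUpperA and AllUpperB are made up for this sketch.

```pascal
program ResultDemo;
{$mode objfpc}{$H+}

// Hypothetical reconstruction; not the code originally posted in the thread.

// Variant A: set Result up front, leave early on the first bad character.
// A passing string causes two assignments (False, then True).
function AllUpperA(const s: string): Boolean;
var
  i: Integer;
begin
  Result := False;
  for i := 1 to Length(s) do
    if not (s[i] in ['A'..'Z']) then
      Exit;
  Result := True;
end;

// Variant B: assign only where the function leaves.
// A passing string causes one assignment; a failing string also causes one.
function AllUpperB(const s: string): Boolean;
var
  i: Integer;
begin
  for i := 1 to Length(s) do
    if not (s[i] in ['A'..'Z']) then
      Exit(False);
  Result := True;
end;

begin
  WriteLn(AllUpperA('HELLO'), ' ', AllUpperB('HELLO'));
  WriteLn(AllUpperA('HeLLO'), ' ', AllUpperB('HeLLO'));
end.
```

Both variants return the same answers. The difference discussed above is only where, and how often, the result is assigned.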

jamie

Re: Determine if a char
« Reply #18 on: December 12, 2018, 11:24:28 pm »
So, what happens when you have lower case letters in the file?
The only true wisdom is knowing you know nothing

Zath

Re: Determine if a char
« Reply #19 on: December 16, 2018, 10:52:02 am »
Quote: So, what happens when you have lower case letters in the file?
They will be regarded as false because they're not in the declared A..Z range.
The extra range a..z could be added, or perhaps all characters could be converted to upper case for the test?
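
Both suggestions can be sketched in a few lines. This is a minimal illustration with hypothetical function names; it relies on the standard UpCase function from the System unit, which maps 'a'..'z' to 'A'..'Z' and leaves other characters unchanged.

```pascal
program CaseDemo;
{$mode objfpc}{$H+}

// Hypothetical helper names, for illustration only.

// Option 1: add the lower-case range to the set.
function IsLetterTwoRanges(c: Char): Boolean;
begin
  Result := c in ['A'..'Z', 'a'..'z'];
end;

// Option 2: convert to upper case first, then test one range.
function IsLetterUpCase(c: Char): Boolean;
begin
  Result := UpCase(c) in ['A'..'Z'];
end;

var
  c: Char;
begin
  // The two options should agree for every Char value.
  for c := Low(Char) to High(Char) do
    if IsLetterTwoRanges(c) <> IsLetterUpCase(c) then
      WriteLn('Mismatch at #', Ord(c));

  WriteLn(IsLetterTwoRanges('q'), ' ', IsLetterUpCase('q'));
  WriteLn(IsLetterTwoRanges('5'), ' ', IsLetterUpCase('5'));
end.
```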

Thaddy

Re: Determine if a char
« Reply #20 on: December 16, 2018, 11:27:43 am »
https://www.freepascal.org/docs-html/rtl/sysutils/charinset.html
Code: Pascal
{$mode delphi}{$ifdef windows}{$apptype console}{$endif}
uses sysutils;
var i:integer;
    s:string ='aBn2345HnPll120$# ';
begin
  for i := 1 to length(s) do  //strings start at 1.
    writeln(i:4,s[i]:4,charinset(s[i],['A'..'Z','a'..'z']):6);
end.
Why did everybody miss CharInSet? It is inlined too.
« Last Edit: December 16, 2018, 11:44:06 am by Thaddy »
Specialize a type, not a var.

jamie

Re: Determine if a char
« Reply #21 on: December 16, 2018, 04:11:59 pm »
@Zath,
  That was a rhetorical question, a lure, and you fell for it, hook, line and sinker! :D

I just thought it was an obvious oversight.

Of course Thaddy is here to the rescue with a suggestion that will cure all.

Btw, I used to have a cat named Tabby!

Btw, isn't CharInSet only useful if the strings are WideString, and don't you need to specify that even in Delphi mode for backwards compatibility?


Thaddy

Re: Determine if a char
« Reply #22 on: December 16, 2018, 05:19:06 pm »
Quote: Btw, isn't CharInSet only useful if the strings are WideString, and don't you need to specify that even in Delphi mode for backwards compatibility?
In Delphi originally yes, but as you can see if you examine the source code (in both Delphi and FPC), it does the job.
It works for any string type supported by Free Pascal; the set parameter is of type TSysCharSet.
« Last Edit: December 16, 2018, 05:28:12 pm by Thaddy »

Zath

Re: Determine if a char
« Reply #23 on: December 17, 2018, 12:29:16 am »
Quote: @Zath, That was a rhetorical question, a lure, and you fell for it, hook, line and sinker! :D

@jamie
Swine!
I've been ill these last three weeks, so the obvious is easily missed lol.
The infection I've had really messes with your head.

Thaddy

Re: Determine if a char
« Reply #24 on: December 17, 2018, 11:18:57 am »
Both Jamie and Zath are missing the point: CharInSet is string type agnostic. I am amazed that everybody except me missed that....
And it answers the original question.

jamie

Re: Determine if a char
« Reply #25 on: December 17, 2018, 11:25:09 pm »
Yeah but I was called a Swine so I don't count!  :)

Zoran

Re: Determine if a char
« Reply #26 on: December 18, 2018, 12:14:23 am »
Quote: Why did everybody miss CharInSet? It is inlined too.

What is the advantage of
Code: Pascal
CharInSet(Ch, CSet)
over
Code: Pascal
Ch in CSet
?
It is not shorter, it is not more readable... what is the point of this function?

HeavyUser

Re: Determine if a char
« Reply #27 on: December 18, 2018, 12:17:25 am »
Quote:
Why did everybody miss CharInSet? It is inlined too.
What is the advantage of
Code: Pascal
CharInSet(Ch, CSet)
over
Code: Pascal
Ch in CSet
?
It is not shorter, it is not more readable... what is the point of this function?

"in" is confined to 255 values only (aka the ASCII range of characters); CharInSet can use any range of Unicode characters, but it will be slower.

engkin

Re: Determine if a char
« Reply #28 on: December 18, 2018, 03:26:43 am »
Quote: CharInSet can use any range of Unicode characters, but it will be slower.
The second parameter of CharInSet is of type:
Code: Pascal
type TSysCharSet = set of AnsiChar;

When the first parameter is of type WideChar:
Code: Pascal
Function CharInSet(Ch:WideChar;Const CSet : TSysCharSet) : Boolean;
begin
  result:=(Ch<=#$FF) and (ansichar(byte(ch)) in CSet);
end;

Notice Ch<=#$FF: any character above #$FF is rejected before the set is even consulted.
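
A short program can show the effect of that guard through the library function itself. This is a sketch assuming the WideChar overload quoted above is the one selected when a WideChar variable is passed; the code points chosen are arbitrary examples.

```pascal
program CharInSetWide;
{$mode objfpc}{$H+}
uses
  SysUtils;

var
  wc: WideChar;
begin
  // U+0041 ('A') fits in one byte, so it can be a member of the set.
  wc := WideChar($0041);
  WriteLn('U+0041 in [A..Z]: ', CharInSet(wc, ['A'..'Z']));

  // U+0416 (a Cyrillic letter) lies above $FF, so the guard rejects it
  // regardless of what the set contains.
  wc := WideChar($0416);
  WriteLn('U+0416 in [A..Z]: ', CharInSet(wc, ['A'..'Z']));
  WriteLn('U+0416 in full set: ', CharInSet(wc, [#0..#255]));
end.
```

Even a set containing every AnsiChar value does not match a character above $FF, which is why CharInSet cannot be used to test arbitrary Unicode ranges.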

Thaddy

Re: Determine if a char
« Reply #29 on: December 18, 2018, 09:04:08 am »
Correct. The base type of a set can have at most 256 elements, so a set of WideChar is not possible. A WideChar with a value <= 255 maps onto the AnsiChar set, so the test works for the set with both Unicode and Ansi characters. To be clearer: a higher Unicode char cannot be part of the set. It is quite nifty code but easily misinterpreted:
- A WideChar that satisfies the Ansi set has a high byte value of zero, hence $FF can be used as the comparison to determine a possible Ansi-compatible value.
- The byte cast picks the low byte of a word-sized (or larger) value and yields the actual AnsiChar mapping itself.
« Last Edit: December 18, 2018, 09:19:44 am by Thaddy »
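
The two points above can be illustrated by comparing the guarded test with a hypothetical unguarded one. This is only a sketch: TCharSet and both function names are made up here, and the guard is written as Ord(Ch) <= $FF, which is intended to be equivalent to the Ch <= #$FF comparison in the RTL code quoted earlier.

```pascal
program GuardDemo;
{$mode objfpc}{$H+}

type
  // Local stand-in with the same shape as TSysCharSet (assumption: set of AnsiChar).
  TCharSet = set of AnsiChar;

// Hypothetical variant WITHOUT the range guard, for comparison only.
// The byte cast keeps just the low byte of the character.
function UnguardedInSet(Ch: WideChar; const CSet: TCharSet): Boolean;
begin
  Result := AnsiChar(Byte(Ch)) in CSet;
end;

// Guarded variant: reject anything whose high byte is not zero first.
function GuardedInSet(Ch: WideChar; const CSet: TCharSet): Boolean;
begin
  Result := (Ord(Ch) <= $FF) and (AnsiChar(Byte(Ch)) in CSet);
end;

var
  wc: WideChar;
begin
  // U+0141 has high byte $01 and low byte $41, and $41 is 'A'.
  wc := WideChar($0141);
  WriteLn('Low byte: $', HexStr(Byte(wc), 2));
  WriteLn('Unguarded: ', UnguardedInSet(wc, ['A'..'Z']));
  WriteLn('Guarded:   ', GuardedInSet(wc, ['A'..'Z']));
end.
```

Without the guard, the truncated low byte of U+0141 would be mistaken for 'A'. With it, the character is rejected before the set is consulted.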