
Topic: Basic Strings, Fresh eyes please.

commodianus

  • New Member
  • *
  • Posts: 18
Basic Strings, Fresh eyes please.
« on: April 02, 2010, 08:22:13 pm »
Code:

ArtName := Copy(CurArt,Pos('">',CurArt)+2, Length(CurArt)-Pos('</a>',CurArt)+6);

 

CurArt, if it helps, will look like this:

Code:
<li><a href="index.cfm?id=37237">ZOCE, OUR LADY OF</a></li>
Though naturally, the link text changes.

The code worked, I thought, except that if the link text is longer than 9 characters, it gets cut off there. Suggestions? I think there's got to be a better way to deal with the last parameter.
« Last Edit: April 02, 2010, 08:46:13 pm by commodianus »

eny

  • Hero Member
  • *****
  • Posts: 1634
Re: Basic Strings, Fresh eyes please.
« Reply #1 on: April 02, 2010, 08:52:36 pm »
Use regular expressions for this.
All posts based on: Win10 (Win64); Lazarus 2.0.10 'stable' (x64) unless specified otherwise...

dfeher

  • New Member
  • *
  • Posts: 19
Re: Basic Strings, Fresh eyes please.
« Reply #2 on: April 02, 2010, 09:07:11 pm »
Try this:

Code:
ArtName := RightStr(CurArt, Length(CurArt) - Pos('">', CurArt) + 1);
ArtName := LeftStr(ArtName, Pos('</a>', ArtName));

I hope this works.
« Last Edit: April 02, 2010, 09:12:07 pm by dfeher »

theo

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 1927
Re: Basic Strings, Fresh eyes please.
« Reply #3 on: April 02, 2010, 09:12:40 pm »
Copy does not work like
Copy(String, FromPosition, ToPosition) but like
Copy(String, FromPosition, Count).

commodianus

  • New Member
  • *
  • Posts: 18
Re: Basic Strings, Fresh eyes please.
« Reply #4 on: April 02, 2010, 10:25:06 pm »
Quote from: theo
Copy does not work like
Copy(String, FromPosition, ToPosition) but like
Copy(String, FromPosition, Count).


Yeah I guess what I was trying to do is get the count based on where </a> was.
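Editor's note: one way to get the count from where </a> sits is to subtract the start position from the position of the closing tag. The sketch below is a hedged illustration, not the poster's code; the program name and the StartPos/EndPos variables are made up for the example, and it assumes both markers are present (Pos returns 0 when they are not, which this sketch does not handle). Note that Length(CurArt) - Pos('</a>', CurArt) in the original attempt measures what follows the link text rather than the link text itself, so that count does not grow with the name.

```pascal
program CopyCountDemo;
{$mode objfpc}{$H+}

const
  CurArt = '<li><a href="index.cfm?id=37237">ZOCE, OUR LADY OF</a></li>';

var
  StartPos, EndPos: Integer;  // hypothetical helper variables
  ArtName: String;
begin
  // First character of the link text: just past the '">' that closes the tag.
  StartPos := Pos('">', CurArt) + 2;
  // Position of the '<' that opens the closing '</a>' tag.
  EndPos := Pos('</a>', CurArt);
  // Copy takes a count, so pass the distance between the two positions.
  ArtName := Copy(CurArt, StartPos, EndPos - StartPos);
  WriteLn(ArtName);  // prints: ZOCE, OUR LADY OF
end.
```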

commodianus

  • New Member
  • *
  • Posts: 18
Re: Basic Strings, Fresh eyes please.
« Reply #5 on: April 02, 2010, 10:29:04 pm »
Works, though I can't help but think I was close to something the other way.

Code:

    ArtName := RightStr(CurArt, Length(CurArt) - Pos('">', CurArt)-1);
    ArtName := LeftStr(ArtName, Pos('</a>', ArtName)-1);       
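Editor's note: to see why the two-step version works, it can help to trace the intermediate values on the sample line. This is only a walkthrough sketch; the program name and the Tail variable are introduced for illustration, and LeftStr/RightStr are taken from SysUtils here.

```pascal
program TwoStepDemo;
{$mode objfpc}{$H+}
uses
  SysUtils;

const
  CurArt = '<li><a href="index.cfm?id=37237">ZOCE, OUR LADY OF</a></li>';

var
  Tail, ArtName: String;  // Tail is a hypothetical name for the intermediate result
begin
  // Step 1: keep everything to the right of the '">' marker.
  // Length(CurArt) is 59 and Pos('">', CurArt) is 32, so 59 - 32 - 1 = 26
  // characters are kept, starting at the first letter of the link text.
  Tail := RightStr(CurArt, Length(CurArt) - Pos('">', CurArt) - 1);
  WriteLn(Tail);     // prints: ZOCE, OUR LADY OF</a></li>

  // Step 2: keep everything to the left of '</a>'.
  // Pos('</a>', Tail) is 18, so the first 17 characters remain.
  ArtName := LeftStr(Tail, Pos('</a>', Tail) - 1);
  WriteLn(ArtName);  // prints: ZOCE, OUR LADY OF
end.
```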


eny

  • Hero Member
  • *****
  • Posts: 1634
Re: Basic Strings, Fresh eyes please.
« Reply #6 on: April 02, 2010, 11:12:30 pm »
Try using Andrey V. Sorokin's fantastic regular expression engine.
Impossible to find nowadays. I managed to find a French site where the full FPC compatible code is available: download from here.

For example:
Code:
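// Requires the RegExpr unit in the uses clause.
var reg: TRegExpr;
begin
  reg := TRegExpr.Create;
  try
    reg.Expression := '<a href="(.*?)">(.*?)</a>';
    if reg.Exec(CurArt) then
      ArtName := reg.Match[2];  // second capture group: the link text
  finally
    reg.Free;  // released even if Exec raises an exception
  end;
end;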
« Last Edit: April 02, 2010, 11:14:17 pm by eny »
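Editor's note: a complete console version of the same idea, as a hedged sketch. It assumes the RegExpr unit (which provides TRegExpr) is available to the compiler, as it is in current Free Pascal releases. The pattern is a variant, not eny's original: it uses negated character classes instead of the lazy `.*?` quantifier, so each group stops at the next quote or angle bracket. The program name and the Href variable are illustrative.

```pascal
program RegexLinkDemo;
{$mode objfpc}{$H+}
uses
  RegExpr;

const
  CurArt = '<li><a href="index.cfm?id=37237">ZOCE, OUR LADY OF</a></li>';

var
  Re: TRegExpr;
  Href, ArtName: String;
begin
  Href := '';
  ArtName := '';
  Re := TRegExpr.Create;
  try
    // Group 1: everything up to the closing quote of href.
    // Group 2: everything up to the next '<', i.e. the link text.
    Re.Expression := '<a href="([^"]*)">([^<]*)</a>';
    if Re.Exec(CurArt) then
    begin
      Href := Re.Match[1];
      ArtName := Re.Match[2];
    end;
  finally
    Re.Free;
  end;
  WriteLn(Href);     // prints: index.cfm?id=37237
  WriteLn(ArtName);  // prints: ZOCE, OUR LADY OF
end.
```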

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11458
  • FPC developer.
Re: Basic Strings, Fresh eyes please.
« Reply #7 on: April 03, 2010, 01:40:14 pm »
Code:

  reg.Expression := '<a href="(.*?)">(.*?)</a>';

Sorry, prefer something readable ;-)

eny

  • Hero Member
  • *****
  • Posts: 1634
Re: Basic Strings, Fresh eyes please.
« Reply #8 on: April 03, 2010, 04:35:28 pm »
Quote from: marcov
Sorry, prefer something readable

Better learn to read then  :'(

theo

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 1927
Re: Basic Strings, Fresh eyes please.
« Reply #9 on: April 03, 2010, 04:44:35 pm »
marcov is right, RegEx Syntax is a PITA and looks even more odd in a "pascal environment".

eny

  • Hero Member
  • *****
  • Posts: 1634
Re: Basic Strings, Fresh eyes please.
« Reply #10 on: April 03, 2010, 05:17:15 pm »
You're entitled to your opinion of course, as is marcov.

RE's are a powerful mechanism to solve many PITA parsing problems (like the one commodianus posted). In a "Pascal" or any other environment.
I'm simply showing another more robust, flexible, professional and future proof way to solve the given problem.
What you do with the information is up to you.
I don't intend to start a religious war about the pros and cons of regular expressions.
« Last Edit: April 03, 2010, 05:32:14 pm by eny »

Marc

  • Administrator
  • Hero Member
  • *
  • Posts: 2584
Re: Basic Strings, Fresh eyes please.
« Reply #11 on: April 03, 2010, 05:46:05 pm »
Regexes are a powerful tool to match on patterns. I use them myself too. And for the given case they might be OK.
But that doesn't mean they are easy to maintain or debug, given that they are a concatenation of gibberish.

Before you know it, you might end up with something like this (incorrect, and it should be on one line):
Code:
(([0-9a-f]{1,4}:){1,1}(:[0-9a-f]{1,4}){1,6})|
(([0-9a-f]{1,4}:){1,2}(:[0-9a-f]{1,4}){1,5})|
(([0-9a-f]{1,4}:){1,3}(:[0-9a-f]{1,4}){1,4})|
(([0-9a-f]{1,4}:){1,4}(:[0-9a-f]{1,4}){1,3})|
(([0-9a-f]{1,4}:){1,5}(:[0-9a-f]{1,4}){1,2})|
(([0-9a-f]{1,4}:){1,6}(:[0-9a-f]{1,4}){1,1})|
((([0-9a-f]{1,4}:){1,7}|:):)|
(:(:[0-9a-f]{1,4}){1,7})|
(((([0-9a-f]{1,4}:){6})(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3}))|
((([0-9a-f]{1,4}:){5}[0-9a-f]{1,4}:(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3}))|
(([0-9a-f]{1,4}:){5}:[0-9a-f]{1,4}:(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3})|
(([0-9a-f]{1,4}:){1,1}(:[0-9a-f]{1,4}){1,4}:(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3})|
(([0-9a-f]{1,4}:){1,2}(:[0-9a-f]{1,4}){1,3}:(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3})|
(([0-9a-f]{1,4}:){1,3}(:[0-9a-f]{1,4}){1,2}:(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3})|
(([0-9a-f]{1,4}:){1,4}(:[0-9a-f]{1,4}){1,1}:(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3})|
((([0-9a-f]{1,4}:){1,5}|:):(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3})|
(:(:[0-9a-f]{1,4}){1,5}:(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3})
« Last Edit: April 03, 2010, 05:48:48 pm by Marc »
//--
{$I stdsig.inc}
//-I still can't read someones mind
//-Bugs reported here will be forgotten. Use the bug tracker

theo

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 1927
Re: Basic Strings, Fresh eyes please.
« Reply #12 on: April 03, 2010, 06:24:42 pm »
Quote from: eny
RE's are a powerful mechanism to solve many PITA parsing problems (like the one commodianus posted). In a "Pascal" or any other environment.

I use them at times when writing PHP code.
But when writing Pascal code, I look for faster and more readable solutions first.

eny

  • Hero Member
  • *****
  • Posts: 1634
Re: Basic Strings, Fresh eyes please.
« Reply #13 on: April 03, 2010, 06:48:18 pm »
Of course RE's should be used when appropriate.
Marc's example perfectly shows how not to use RE's.

@theo: If I look at this code:
Code:
    ArtName := RightStr(CurArt, Length(CurArt) - Pos('">', CurArt)-1);
    ArtName := LeftStr(ArtName, Pos('</a>', ArtName)-1);   
I have to read it a couple of times to understand what really happens. It's the same as with Italian: I can read it but I don't understand a word of it! RE's are an elegant way to get rid of this text-parsing code.
Less code means fewer chances for mistakes and errors and lower maintenance costs.

theo

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 1927
Re: Basic Strings, Fresh eyes please.
« Reply #14 on: April 03, 2010, 07:34:45 pm »
Quote from: eny
@theo: If I look at this code:

I'd do this job like this. It looks understandable, reusable and clean to me:

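Code:
uses strutils;

// Returns the text between the first StartS and the next EndS,
// or '' if either delimiter is missing.
function ExtractBetween(const AString, StartS, EndS: String): String;
var
  Pos1, Pos2: Integer;
begin
  Result := '';
  Pos1 := Pos(StartS, AString);
  if Pos1 > 0 then
  begin
    // Search for EndS only after StartS, so the two delimiters cannot overlap.
    Inc(Pos1, Length(StartS));
    Pos2 := PosEx(EndS, AString, Pos1);
    if Pos2 > 0 then
      Result := Copy(AString, Pos1, Pos2 - Pos1);
  end;
end;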

procedure TForm1.Button1Click(Sender: TObject);
begin
 Caption:=ExtractBetween('<li><a href="index.cfm?id=37237">ZOCE, OUR LADY OF</a></li>','">','</a>');
end;     
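Editor's note: a console sketch of how such a helper behaves on input that is missing a delimiter, since scraped HTML is rarely uniform. The function body here is a slight variant of the one above (it starts the search for the end marker just after the start marker), and the test strings and program name are made up for illustration.

```pascal
program ExtractBetweenDemo;
{$mode objfpc}{$H+}
uses
  StrUtils;

// Returns the text between the first StartS and the next EndS,
// or '' if either delimiter is missing.
function ExtractBetween(const AString, StartS, EndS: String): String;
var
  Pos1, Pos2: Integer;
begin
  Result := '';
  Pos1 := Pos(StartS, AString);
  if Pos1 = 0 then
    Exit;
  // Move past the start delimiter, then look for the end delimiter.
  Inc(Pos1, Length(StartS));
  Pos2 := PosEx(EndS, AString, Pos1);
  if Pos2 > 0 then
    Result := Copy(AString, Pos1, Pos2 - Pos1);
end;

begin
  // Both delimiters present.
  WriteLn('[', ExtractBetween('<a href="x">NAME</a>', '">', '</a>'), ']');  // prints: [NAME]
  // End delimiter missing.
  WriteLn('[', ExtractBetween('<a href="x">NAME', '">', '</a>'), ']');      // prints: []
  // Start delimiter missing.
  WriteLn('[', ExtractBetween('no markup here', '">', '</a>'), ']');        // prints: []
  // Empty link text.
  WriteLn('[', ExtractBetween('<a href="x"></a>', '">', '</a>'), ']');      // prints: []
end.
```

An empty result is ambiguous between "empty link text" and "delimiter not found"; a caller that needs to tell the two apart would need a different signature, for example a Boolean result with an out parameter.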

 
