
Author Topic: IPv4 Regular Expression - boundary issues  (Read 4935 times)

Gizmo

  • Hero Member
  • *****
  • Posts: 831
IPv4 Regular Expression - boundary issues
« on: June 11, 2012, 11:38:05 pm »
Using the regexpr unit, I am using the following reg exp to find IPv4 values:

Code:
IPExpressionMain.Expression := '\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b';
   

It works well, except when other characters butt up against the start or end of an IP address. For example:

IP Address 192.168.1.1 is found
IP Address192.168.1.1 is not found
IP Address 192.168.1.1is not found

How do I change my expression so that it finds the address in the last two cases?

Ted

IPguy

  • Sr. Member
  • ****
  • Posts: 385
Re: IPv4 Regular Expression - boundary issues
« Reply #1 on: June 12, 2012, 01:38:12 am »
To quote another member of the forum: "Google is your friend".

http://answers.oreilly.com/topic/318-how-to-match-ipv4-addresses-with-regular-expressions/

Note that 192.168.001.001 is a valid IP address, as is 192.168.1.0.



Gizmo

  • Hero Member
  • *****
  • Posts: 831
Re: IPv4 Regular Expression - boundary issues
« Reply #2 on: June 12, 2012, 11:38:35 am »
To quote the link (which is useful, by the way):

Quote
If you want to find IP addresses within longer text, use one of the regexes that begin and end with the word boundaries \b.

As you can see, I have already tried that, which is why I asked the question.

Ted

BigChimp

  • Hero Member
  • *****
  • Posts: 5740
  • Add to the wiki - it's free ;)
    • FPCUp, PaperTiger scanning and other open source projects
Re: IPv4 Regular Expression - boundary issues
« Reply #3 on: June 12, 2012, 11:46:02 am »
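IIRC, \b matches a word boundary: a zero-width position between a word character (letter, digit or underscore) and a non-word character or the edge of the string. In "Address192" the "s" and the "1" are both word characters, so there is no boundary between them and the match fails.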
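
A quick way to see why the \b version misses the last two examples from the first post is to test a fixed address with \b on both sides. The sketch below uses Python's re module purely for illustration (it is not the regexpr unit, though \b has the same word-boundary meaning in most engines); the names pattern, samples and results are made up for this example.

```python
import re

# \b is zero-width: it matches only between a word character
# (letter, digit, underscore) and a non-word character or a string edge.
pattern = re.compile(r"\b192\.168\.1\.1\b")

samples = [
    "IP Address 192.168.1.1 is found",
    "IP Address192.168.1.1 is not found",   # "s" then "1": no boundary
    "IP Address 192.168.1.1is not found",   # "1" then "i": no boundary
]

# One boolean per sample: does the \b-wrapped pattern match anywhere?
results = [bool(pattern.search(s)) for s in samples]

for s, hit in zip(samples, results):
    print(hit, "|", s)
```

Only the first sample matches, because it is the only one where the address has a non-word character (a space) on both sides.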

You'll have to strip out the \b at the beginning and end of your regex. I don't know what false positives you'll get, but... you asked for it.
Thinking a bit further, you probably want to disallow numeric characters at either end, so you could add that.
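
One way to express "no digit on either side" is with negative lookarounds in place of \b. The following is a minimal sketch in Python's re module, not the regexpr unit; whether the regexpr unit supports (?<!...) and (?!...) is an assumption that would need checking, and if it does not, consuming groups such as (^|[^0-9]) and ($|[^0-9]) around the octets are a possible substitute. OCTET, IPV4 and find_ipv4 are hypothetical names for this example.

```python
import re

# Same octet alternation as the expression in the first post,
# written as a non-capturing group so findall() returns whole matches.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"

# (?<![0-9]) : the address must not be preceded by a digit
# (?![0-9])  : the address must not be followed by a digit
# Letters and other characters may touch the address on either side.
IPV4 = re.compile(r"(?<![0-9])" + OCTET + r"(?:\." + OCTET + r"){3}(?![0-9])")


def find_ipv4(text):
    """Return every dotted-quad address found in text."""
    return IPV4.findall(text)


print(find_ipv4("IP Address192.168.1.1 is not found"))
print(find_ipv4("IP Address 192.168.1.1is not found"))
print(find_ipv4("value 192.168.1.256 is out of range"))
```

The first two calls each find 192.168.1.1, while the third finds nothing, since 256 is not a valid octet and the digit guards stop the engine from settling for 192.168.1.25 or 192.168.1.2.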

 
