Hey. I saw this post yesterday, and since I recently decided to take on regex, I figured it would be perfect practice.
I will show what I have built and then compare it to the code wims provided.
I use RegexBuddy to build and test the regexes, but I also throw one into mIRC occasionally to make sure the results match up.
Here is the code I have come up with so far. I will still be working on it to try and see if I can improve on it.
\b(?:(?:htt|ft)ps?://(?:www|ftp\.)?|www|ftp\.).*(?:\.[a-z]{2,4})(?::\d+)?(?:/\w+(?:/\w*/*)*|(?:\.[a-z]{2,4})|\?\S*)*\b
It responds to the URL in every one of the following lines.
Wims' code is
\b(\^@\S+|www\.\S+|http://\S+|irc\.\S+|irc://\S+|\w+(?:[\.-]\w+)?@\w+(?:[\.-]\w+)?\.[a-z]{2,4})\b
And responds to
But misses
hello ftp.host.com:21/directory/path/file.ext hello
hello ftp://ftp.host.com:21/directory/path/file.ext hello
hello ftp://host.com:21/directory/path/file.ext hello
And lastly, among the FTP ones:
ftp://ftp.com hello
hello ftp://user:password@host.ext:21/path/ hello
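A quick way to double-check both patterns against the FTP lines listed above is a small script. This is only a sketch: it uses Python's re module rather than mIRC, on the assumption that it treats these particular constructs the same way PCRE does, and the names MINE, WIMS, FTP_LINES and first_match are made up for the example.

```python
import re

# The two patterns exactly as posted above.
MINE = re.compile(
    r"\b(?:(?:htt|ft)ps?://(?:www|ftp\.)?|www|ftp\.).*(?:\.[a-z]{2,4})"
    r"(?::\d+)?(?:/\w+(?:/\w*/*)*|(?:\.[a-z]{2,4})|\?\S*)*\b"
)
WIMS = re.compile(
    r"\b(\^@\S+|www\.\S+|http://\S+|irc\.\S+|irc://\S+"
    r"|\w+(?:[\.-]\w+)?@\w+(?:[\.-]\w+)?\.[a-z]{2,4})\b"
)

# The FTP test lines from the post.
FTP_LINES = [
    "hello ftp.host.com:21/directory/path/file.ext hello",
    "hello ftp://ftp.host.com:21/directory/path/file.ext hello",
    "hello ftp://host.com:21/directory/path/file.ext hello",
    "ftp://ftp.com hello",
    "hello ftp://user:password@host.ext:21/path/ hello",
]


def first_match(pattern, line):
    """Return the text of the first match in line, or None if there is none."""
    found = pattern.search(line)
    return found.group(0) if found else None


for line in FTP_LINES:
    print(line)
    print("  mine:", first_match(MINE, line))
    print("  wims:", first_match(WIMS, line))
```

Traced by hand, the first pattern should pick up the FTP URL in each line (leaving off the trailing slash in the last one, since the closing \b cannot sit between a slash and a space). The second should find nothing in the first four lines, and in the last one only the piece that looks like an email address, password@host.ext.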
I feel like I should say thanks, because I really did learn a LOT making that regex. I'm sure someone more experienced in regex could find some problems with it, but it works, and I feel like I have a much better grasp on regex altogether.
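One likely problem is the greedy .* in the middle of the pattern: it can run past the end of the URL and keep going until the last thing on the line that looks like a dot plus two to four letters. The sketch below shows that on a made-up test line, next to a hypothetical variant that swaps .* for \S* so the match cannot cross a space. It is written in Python's re module on the assumption that it behaves like PCRE here, and the variant is one possible tweak, not a tested replacement.

```python
import re

# The pattern as posted above.
ORIGINAL = re.compile(
    r"\b(?:(?:htt|ft)ps?://(?:www|ftp\.)?|www|ftp\.).*(?:\.[a-z]{2,4})"
    r"(?::\d+)?(?:/\w+(?:/\w*/*)*|(?:\.[a-z]{2,4})|\?\S*)*\b"
)

# Hypothetical variant: identical except that `.*` became `\S*`.
NO_SPACES = re.compile(
    r"\b(?:(?:htt|ft)ps?://(?:www|ftp\.)?|www|ftp\.)\S*(?:\.[a-z]{2,4})"
    r"(?::\d+)?(?:/\w+(?:/\w*/*)*|(?:\.[a-z]{2,4})|\?\S*)*\b"
)

# Made-up test line: a URL followed later by a file name.
line = "hello www.host.com and then readme.txt hello"

print("original:", ORIGINAL.search(line).group(0))
print("variant: ", NO_SPACES.search(line).group(0))
```

Here the original should grab everything from www.host.com through readme.txt, while the variant should stop at www.host.com. The FTP lines from earlier in the post should still come out the same with the variant.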
If I update the code, I'll throw the new code in a new reply. I plan on seeing whether I can still make it better, find any problems in it, or just make it smaller. I also have a friend that's decent with regex, so I plan on running the line by him whenever I see him. First, though, I need food and to let my eyes regain some focus. lol.
Good luck.