parse a string (check if URL is in a string)

Posted By: Solo1

parse a string (check if URL is in a string) - 15/07/08 02:34 PM

Hi people

What is the most efficient way to check if a URL is in a string? I have a feeling it may be a regex. Can anyone help me?

Thanks in advance
Posted By: Wims

Re: parse a string (check if URL is in a string) - 15/07/08 05:38 PM

Have you searched this forum?
Look here: http://forums.mirc.com/ubbthreads.php?ub...true#Post192127
Posted By: Solo1

Re: parse a string (check if URL is in a string) - 15/07/08 07:40 PM

Hi
It does not look like it's going to help me, as the OP of that post wanted something else. I do not know how to implement it, as I know very little regex. Does anyone else perhaps have a suggestion?

Thank you kindly
Posted By: Solo1

Re: parse a string (check if URL is in a string) - 16/07/08 08:37 AM

Hi people
I have tried searching the forum again and cannot find a satisfactory post. For example, this is not a reliable way of checking, and that check is meant to ban advertisers on a channel, although it would ban anyone typing www. All I am looking for is a reliable check that returns true if a URL is in a string.

Thanks
Posted By: WhipLash

Re: parse a string (check if URL is in a string) - 16/07/08 08:40 AM

I'd say it depends on the type of URL you wish to process. You could get the regex to look for 'http://' or 'www.', but it is not necessary to use regex per se. Once you've worked out what the regex should search for, it shouldn't be too hard.

Why not just use:
Code:
if (http:// isin $1-) || (www. isin $1-) { Do something here }


The regex I got is:
Code:
$regex(test, $1-, http://.+|www\..+)


The regex will not match a bare 'www.' or 'http://'; it needs something more after the prefix, such as 'www.somethingmorehere' or 'http://this could be a web address'.
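
To make the difference between the two checks concrete, here is a minimal sketch in Python rather than mIRC script. It assumes Python's re module behaves like mIRC's regex engine for this simple pattern, and the function names are made up for illustration.

```python
import re

# Pattern copied from the $regex call above. Python's re module stands in
# for mIRC's regex engine here, which is an assumption.
URL_HINT = re.compile(r"http://.+|www\..+")

def has_url_substring(text: str) -> bool:
    """Plain substring test, analogous to the isin check."""
    return "http://" in text or "www." in text

def has_url_regex(text: str) -> bool:
    """Regex test: the prefix must be followed by at least one more character."""
    return URL_HINT.search(text) is not None

for sample in ["visit www.mirc.com", "http://", "ends with www."]:
    print(repr(sample), has_url_substring(sample), has_url_regex(sample))
```

The substring test accepts all three samples, while the regex rejects the two that stop right after the prefix. Neither check proves that what follows the prefix is a real address.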
Posted By: Solo1

Re: parse a string (check if URL is in a string) - 16/07/08 11:15 AM

Hi
Thanks for your reply. The problem with your first solution is that if the string contains just http:// or "this is a www string", it will return true even though technically there is no URL in the string.
Your regex is a little better but not perfect. Is there a better, more reliable way?
For now I will use your regex.

Thanks
Posted By: WhipLash

Re: parse a string (check if URL is in a string) - 16/07/08 12:09 PM

You see, I'm not sure what you plan on using the script for. Adding more to the regex is not easy considering the range of possibilities in web addresses today. I'll play with it some more though and get back to you. :)
Posted By: Wims

Re: parse a string (check if URL is in a string) - 16/07/08 01:12 PM

Code:
alias isurl return $iif($regex($1-,/\b(\^@\S+|www\.\S+|http://\S+|irc\.\S+|irc://\S+|\w+(?:[\.-]\w+)?@\w+(?:[\.-]\w+)?\.[a-z]{2,4})\b/gi),$iif($prop,$regml($v1),$true))

$isurl(www.mirc.com) returns $true
$isurl(www.mirc.com www.GaisGa.com).2 returns www.GaisGa.com
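
For anyone who wants to try the pattern outside mIRC, here is a rough Python equivalent of the alias. It is only a sketch: is_url and its n parameter are hypothetical names standing in for $isurl and its .N property, and re.IGNORECASE plus findall are assumed to be close enough to the /gi flags.

```python
import re

# Pattern copied from the alias above; re.IGNORECASE stands in for the /i flag
# and findall() for the /g flag (both are assumptions about equivalence).
WIMS_PATTERN = re.compile(
    r"\b(\^@\S+|www\.\S+|http://\S+|irc\.\S+|irc://\S+"
    r"|\w+(?:[\.-]\w+)?@\w+(?:[\.-]\w+)?\.[a-z]{2,4})\b",
    re.IGNORECASE,
)

def is_url(text, n=None):
    """Return True if anything matches, or the n-th match (1-based) if n is given."""
    matches = WIMS_PATTERN.findall(text)
    if not matches:
        return False
    if n is None:
        return True
    # An out-of-range n gives None instead of raising an IndexError.
    return matches[n - 1] if 1 <= n <= len(matches) else None

print(is_url("www.mirc.com"))
print(is_url("www.mirc.com www.GaisGa.com", 2))
```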
Posted By: Solo1

Re: parse a string (check if URL is in a string) - 16/07/08 02:39 PM

Whoa, that looks complicated. :D I will try it, thanks
Posted By: Typos

Re: parse a string (check if URL is in a string) - 17/07/08 09:10 AM

Hey. I saw this post yesterday, and since I have recently decided to take on regex, I figured this would be perfect practice.
I will show what I have built and then compare it to the code Wims provided.
I use RegexBuddy to build and test the regexes, but I also throw one into mIRC occasionally to make sure the results match up.
Here is the code I have come up with so far. I will keep working on it to see if I can improve it.
Quote:
\b(?:(?:htt|ft)ps?://(?:www|ftp\.)?|www|ftp\.).*(?:\.[a-z]{2,4})(?::\d+)?(?:/\w+(?:/\w*/*)*|(?:\.[a-z]{2,4})|\?\S*)*\b

It matches the URL in every one of the following lines.
Quote:
www.sss.com/hello.html hello
hello and welcome to www.regexbuddy.com hello
http://www.regexbuddy.com/ hello
Yes, http://www.regexbuddy.com/index.html is a link!
https://www.regexbuddy.com/index.html?source=library
You can download RegexBuddy at http://www.regexbuddy.com/download.html.
hello http://host.com:21/directory/path/
hello ftp.host.com:21/directory/path/file.ext hello
hello ftp://ftp.host.com:21/directory/path/file.ext hello
hello ftp://host.com:21/directory/path/file.ext hello
hello ftp://user:password@host.ext:21/path/ hello
And lastly in ftp's is ftp://ftp.com hello
hello http://user:password@host.ext/path hello
hello http://user:password@host.ext:21/path/directory/hello.html?1=2aaa hello


Wims' code is
Quote:
\b(\^@\S+|www\.\S+|http://\S+|irc\.\S+|irc://\S+|\w+(?:[\.-]\w+)?@\w+(?:[\.-]\w+)?\.[a-z]{2,4})\b

And responds to
Quote:
www.sss.com/hello.html hello
hello and welcome to www.regexbuddy.com hello
http://www.regexbuddy.com/ hello
Yes, http://www.regexbuddy.com/index.html is a link!
https://www.regexbuddy.com/index.html?source=library
You can download RegexBuddy at http://www.regexbuddy.com/download.html.
hello http://host.com:21/directory/path/
hello http://user:password@host.ext/path hello
hello http://user:password@host.ext:21/path/directory/hello.html?1=2aaa hello

But misses
Quote:
hello ftp.host.com:21/directory/path/file.ext hello
hello ftp://ftp.host.com:21/directory/path/file.ext hello
hello ftp://host.com:21/directory/path/file.ext hello
And lastly in ftp's is ftp://ftp.com hello
hello ftp://user:password@host.ext:21/path/ hello
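
A comparison like this can also be scripted. Below is a small Python harness as a sketch, assuming Python's re module treats both patterns the same way RegexBuddy and mIRC do and that both are meant to be case-insensitive. It prints the exact text each pattern matches on a few of the sample lines, which is worth checking because a pattern can match only a fragment of a URL.

```python
import re

# Both patterns are copied from the quotes above. Case-insensitive matching
# is an assumption for the first one (the second was posted with /gi).
TYPOS = re.compile(
    r"\b(?:(?:htt|ft)ps?://(?:www|ftp\.)?|www|ftp\.).*(?:\.[a-z]{2,4})(?::\d+)?"
    r"(?:/\w+(?:/\w*/*)*|(?:\.[a-z]{2,4})|\?\S*)*\b",
    re.IGNORECASE,
)
WIMS = re.compile(
    r"\b(\^@\S+|www\.\S+|http://\S+|irc\.\S+|irc://\S+"
    r"|\w+(?:[\.-]\w+)?@\w+(?:[\.-]\w+)?\.[a-z]{2,4})\b",
    re.IGNORECASE,
)

SAMPLES = [
    "hello and welcome to www.regexbuddy.com hello",
    "hello http://host.com:21/directory/path/",
    "hello ftp://host.com:21/directory/path/file.ext hello",
    "And lastly in ftp's is ftp://ftp.com hello",
]

def matched_text(pattern, line):
    """Return the first substring the pattern matches, or None."""
    m = pattern.search(line)
    return m.group(0) if m else None

for line in SAMPLES:
    print(repr(line))
    print("  first pattern: ", matched_text(TYPOS, line))
    print("  second pattern:", matched_text(WIMS, line))
```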


I feel like I should say thanks because I really did learn a LOT making that regex. I'm sure someone more experienced in regex could find some problems with it, but it works and I feel like I have a much better grasp of regex altogether.

If I update the code I'll throw the new code in a new reply. I plan on trying to make it better, find any problems in it, or just make it smaller. I also have a friend that's decent with regex, so I plan on running the line by him whenever I see him. First though I need food and to let my eyes regain some focus. lol.

Good luck.
Posted By: Wims

Re: parse a string (check if URL is in a string) - 17/07/08 11:15 AM

I simply extracted the pattern from the link I gave above, and it simply doesn't match "ftp".
Posted By: Typos

Re: parse a string (check if URL is in a string) - 17/07/08 11:46 AM

I have to admit, I was wondering why you decided to include irc and not ftp.

Sorry if it seemed like I was singling out your regex, but it was the only other good one in this thread, so naturally I compared mine to it. Besides, like I said, I've just recently taken on regex on a more serious level, so I cannot promise my regex is error free, and I also believe it could be smaller.
Posted By: hixxy

Re: parse a string (check if URL is in a string) - 17/07/08 01:06 PM

(?:www|ftp\.)?|www|ftp\.)

In this section of your regex you should be matching . outside of the brackets. At the moment that will match "www" or "ftp.", not "www." or "ftp." like it should.

It would also be a good idea to match a single digit directly after "www" so that it will work with things like "www2.google.com"

Updated regex:

Code:
\b(?:(?:htt|ft)ps?://(?:(?:www\d?|ftp)\.)?|(?:www\d?|ftp)\.).*(?:\.[a-z]{2,4})(?::\d+)?(?:/\w+(?:/\w*/*)*|(?:\.[a-z]{2,4})|\?\S*)*\b
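
The grouping point can be checked in isolation. This is a small Python sketch using only the fragment in question; the names ORIGINAL, DOT_OUTSIDE and full are made up for the illustration.

```python
import re

# In the original fragment the escaped dot belongs only to the ftp branch.
ORIGINAL = re.compile(r"(?:www|ftp\.)")
# Moving the dot outside the group makes it apply to both branches;
# \d? additionally allows a single digit after "www".
DOT_OUTSIDE = re.compile(r"(?:www\d?|ftp)\.")

def full(pattern, text):
    """True if the pattern matches the whole of text."""
    return pattern.fullmatch(text) is not None

for text in ["www", "www.", "ftp.", "www2."]:
    print(repr(text), full(ORIGINAL, text), full(DOT_OUTSIDE, text))
```

Because | has the lowest precedence inside a group, "www|ftp\." reads as "www" or "ftp.", never as "www." or "ftp.".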
Posted By: Typos

Re: parse a string (check if URL is in a string) - 18/07/08 06:26 AM

Thank you very much for catching that, hixxy. I couldn't see the code in your post because it was in a little code box so short that I couldn't read the text, so I fixed the regex myself. While I was at it I also decided to get rid of the second www|ftp part. I hope I got the digit idea right; it did pass testing, so I'm sure it's probably exactly what you did. I think I'll use Firefox next time I come here so that doesn't happen again.
Quote:
\b(?:(?:(?:htt|ft)ps?://)|(?:www\d?|ftp)\.).*(?:\.[a-z]{2,4})(?::\d+)?(?:/\w+(?:/\w*/*)*|(?:\.[a-z]{2,4})|\?\S*)*\b
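
As a quick sanity check of this version, here is a short Python sketch. It assumes Python's re module handles the pattern the same way mIRC does and that matching is meant to be case-insensitive; first_url is a hypothetical helper name.

```python
import re

# Pattern copied from the quote above; re.IGNORECASE is an assumption.
FINAL = re.compile(
    r"\b(?:(?:(?:htt|ft)ps?://)|(?:www\d?|ftp)\.).*(?:\.[a-z]{2,4})(?::\d+)?"
    r"(?:/\w+(?:/\w*/*)*|(?:\.[a-z]{2,4})|\?\S*)*\b",
    re.IGNORECASE,
)

def first_url(line):
    """Return the first substring the pattern matches, or None."""
    m = FINAL.search(line)
    return m.group(0) if m else None

for line in [
    "see www2.google.com now",
    "hello ftp.host.com:21/directory/path/file.ext hello",
    "the www is big",
]:
    print(repr(line), "->", first_url(line))
```

One thing to keep in mind is that the greedy .* runs up to the last dot-plus-letters on the line, so with two URLs in one message this would likely return a single long match rather than two separate ones.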

I just got on the PC for the first time today, so I will be looking at this more to see how else I can improve it, like I said I would in my earlier post.

I'm sure you noticed I use quote boxes for my code. They don't shrink to an unusable size on me like the code boxes do. Very, very strange.

Posted By: hixxy

Re: parse a string (check if URL is in a string) - 18/07/08 03:08 PM

Yep, that's very similar to what I posted. :)
© 2022 mIRC Discussion Forums