Joined: Sep 2003
Posts: 261
S
Fjord artisan
OP Offline
Hi, it's me again. How would I go about using regex to match domain names, but not subdomains or full URLs?

Domain: blah.com
Subdomain: sub.blah.com
Full Url: http://sub.blah.com

I'm trying this right now:
Code:
$regex(domain,%text,/(\w+)\.(\w+)/g)
 


But it gives me subdomains too. Thanks in advance.

Joined: Feb 2005
Posts: 342
R
Fjord artisan
Offline
Code:
$regex(%text,/([^\/.]+\.(?:com|org|net|ru|sk))(?=\b|\/)/gi)


Typing:

//echo -ag $regex(http://www.test.com/ http://test2.com/ http://a.b.c.test3.com/,/([^\/.]+\.(?:com|org|net|ru|sk))(?=\b|\/)/gi) -- $regml(1) -- $regml(2) --- $regml(3)

Will return: 3 -- test.com -- test2.com --- test3.com

Joined: Sep 2003
Posts: 261
S
Fjord artisan
OP Offline
Hrm, yeah, um. :(

Say someone types: hey someone go to http://www.blah.com.

I want it to return that it found no matches.

If someone types: hey someone go to www.blah.com or me.blah.com.

I want it to return that it found no matches.

But, if someone typed: Hey someone go to blah.com.

I want it to return that it found blah.com. :(

But thanks for the help, what you provided has pointed me further in the right direction.
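
For anyone landing here with the same three cases (full URL rejected, subdomain rejected, bare domain accepted), one possible approach is to guard the match with lookarounds. This is only a minimal sketch: the alias name baredomains and the short TLD list are made up for illustration, and it assumes the PCRE engine behind $regex accepts a lookbehind as well as the lookahead used earlier in the thread.

```mirc
alias baredomains {
  ; hypothetical alias, not from the thread: lists domains that are not part of a URL or a subdomain
  var %text = $1-
  ; the lookbehind rejects a name that follows a word character, a dot or a slash
  ; the lookahead rejects a TLD that runs on into another label (blah.com.au) or word (blah.company)
  var %n = $regex(bare,%text,/(?<![\w.\/])(\w+\.(?:com|net|org))(?!\.?\w)/gi)
  echo -ag %n bare domain(s) found.
  var %i = 1
  while (%i <= %n) {
    echo -ag %i $regml(bare,%i)
    inc %i
  }
}
```

With this, /baredomains Hey someone go to blah.com. should report blah.com, while the http://www.blah.com. and www.blah.com or me.blah.com. examples should report nothing. A sentence-ending period is tolerated because the lookahead only fails when the dot is followed by another word character.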

Joined: Sep 2003
Posts: 261
S
Fjord artisan
OP Offline
OK, I'm getting closer to world domination, err, I mean to what I want out of this script. I can pull out the individual full URLs, subdomains and domains, but whatever trails them as part of the URL doesn't get included. How do I get it to attach the rest of the URL?

If a URL is: http://www.blah.com/ieatworms.html

it gives me http://www.blah.com
and doesn't add the /ieatworms.html

What regex pattern do I add after it to get that?
I'm using this so far for full URL matching:
Code:
 
$regex(fullurl,the text,/((?:http|https|ftp)://+(?:|\w+\.)\w+\.(?:com|net|org|ca|au|co.uk|name|us|biz|info))(?:\b|\/)/g)
 

Joined: Sep 2003
Posts: 261
S
Fjord artisan
OP Offline
Solved it! Here's the code I used that may help someone else in the future:
Code:
alias regextest { 
  var %text = door.net <a href="http://www.url.com/cowheadcheese.html"> http://cat.dog.net/dogcatcher/frogsnatcher/ http://car.mart.org chorn.com that.mouse.net ftp://hrm.farkwad.com butter.bowl.turkey myspace.com/scorpwanna http://horse.net scorpwanna.com sub.domain.co.uk ford car.org
  ;var %text = $1-

  echo $color(info) -aeg ---------------------------------
  ;looking only for: *://this.that.com
  var %fullurl = $regex(fullurl,%text,/((?:http|https|ftp):\/\/+(?:|\w+\.)\w+\.(?:com|net|org|ca|au|co\.uk|name|us|biz|info)(?:[^\s]*|\/+\w[^\s]*))/ig)

  ;below built for only [fullurl]
  var %i = 1
  var %blah = $regml(fullurl,0)
  echo -ag %fullurl possible full url name(s) found.
  while (%i <= %blah) {
    echo -ag %i $regml(fullurl,%i)
    var %urlreplace = $replace($regml(fullurl,%i),:,$chr(32) $+ colon $+ $chr(32),.,$chr(32) $+ dot $+ $chr(32))
    %text = $replace(%text,$regml(fullurl,%i),%urlreplace)
    inc %i
  }

  echo $color(info) -ag ---------------------------------
  ;looking only for: this.that.com
  var %subdomain = $regex(subdomain,%text,/((?:^|\s)+\w+\.\w+\.(?:com|net|org|ca|au|co\.uk|name|us|biz|info)(?:[^\s]*|\/+\w[^\s]*))/ig)

  ;below built for only [subdomain]
  var %i = 1
  var %blah = $regml(subdomain,0)
  echo -ag %subdomain possible subdomain name(s) found.
  while (%i <= %blah) {
    echo -ag %i $regml(subdomain,%i)
    var %urlreplace = $replace($regml(subdomain,%i),.,$chr(32) $+ dot $+ $chr(32))
    %text = $replace(%text,$regml(subdomain,%i),%urlreplace)
    inc %i
  }

  echo $color(info) -ae ---------------------------------
  ;looking only for: that.com
  var %domain = $regex(domain,%text,/((?:^|\s)+\w+\.(?:com|net|org|ca|au|co\.uk|name|us|biz|info)(?:[^\s]*|\/+\w[^\s]*))/ig)

  ;below built for only [domain]
  var %i = 1
  var %blah = $regml(domain,0)
  echo -ag %domain possible domain name(s) found.
  while (%i <= %blah) {
    echo -ag %i $regml(domain,%i)
    var %urlreplace = $replace($regml(domain,%i),.,$chr(32) $+ dot $+ $chr(32))
    %text = $replace(%text,$regml(domain,%i),%urlreplace)
    inc %i
  }
  echo -aeg %fullurl url(s) found, %subdomain possible subdomain name(s) found and %domain possible domain name(s) found.
  ;return %text
}
 


Which returns:
---------------------------------
-
5 possible full url name(s) found.
1 http://www.url.com/cowheadcheese.html">
2 http://cat.dog.net/dogcatcher/frogsnatcher/
3 http://car.mart.org
4 ftp://hrm.farkwad.com
5 http://horse.net
---------------------------------
2 possible subdomain name(s) found.
1 that.mouse.net
2 sub.domain.co.uk
-
---------------------------------
-
5 possible domain name(s) found.
1 door.net
2 chorn.com
3 myspace.com/scorpwanna
4 scorpwanna.com
5 car.org
-
5 url(s) found, 2 possible subdomain name(s) found and 5 possible domain name(s) found.
-

Damn, I feel good now. :)
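
One thing to note in the output above: full URL match 1 keeps the trailing "> from the HTML attribute, because [^\s]* runs on until the next space. If that is unwanted, one option is to stop the tail at quotes and angle brackets as well. A sketch only, with a shortened TLD list, a cut-down sample text and a made-up alias name urltail:

```mirc
alias urltail {
  ; hypothetical alias: same idea as the fullurl step, but the tail also stops at quotes and angle brackets
  var %text = <a href="http://www.url.com/cowheadcheese.html"> http://cat.dog.net/dogcatcher/frogsnatcher/
  var %n = $regex(fullurl,%text,/((?:https?|ftp):\/\/(?:\w+\.)?\w+\.(?:com|net|org)[^\s"<>]*)/ig)
  var %i = 1
  while (%i <= %n) {
    echo -ag %i $regml(fullurl,%i)
    inc %i
  }
}
```

Running /urltail should list http://www.url.com/cowheadcheese.html without the "> and leave the second URL unchanged.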

