extracting urls from multiline text using reg exps #138798 06/01/06 07:25 PM
Aenei (OP)
Joined: Nov 2005
Posts: 42
I have a block of text in a multiline edit and I am trying to get any/all URLs contained therein and pass them to an alias.

I've got a loop which reads each line of the edit fine and puts it into a variable %text. I then have a fairly comprehensive regex that I think should catch pretty much any URL. For now I echo the match, as I am still testing, but I will be sending it to an alias or a hash table somewhere in due course.
Code:
if $regex($xdid(paste_diag,20,%i).text,/((?:ftp:\/\/|https?:\/\/|www\d?\.)[^<>\.\s]+(?:\.[^<>\.\s]+)+(?:\/[^<>\.\s]+)*)/g) {
  echo -s found: $regml(1)
}

The problem comes (obviously) when I have more than one URL in a line. I'm hoping there's a neater way than checking how many there are and then fetching each of them with $regml, which seems a bit inefficient. Additionally, unlikely as it is, my understanding is that this wouldn't work if there were more than 10 occurrences in a line anyway, due to how $regml is coded? I suppose I could use some kind of while (url is in the line) loop with a $regsub. Any ideas?
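
For reference, the "count them and read each $regml" approach described above could look roughly like the sketch below. The alias name url_list and the regex name urls are made up for illustration, and the sketch assumes that $regml(urls,0) reports how many captures the last named $regex stored. It does not address whether a cap on stored captures applies in a given mIRC version.

Code:
alias url_list {
  ; hypothetical helper: $1- is one line of text to scan
  ; "urls" names this regex so its captures can be read back with $regml(urls,N)
  if ($regex(urls,$1-,/((?:ftp:\/\/|https?:\/\/|www\d?\.)[^<>\.\s]+(?:\.[^<>\.\s]+)+(?:\/[^<>\.\s]+)*)/g)) {
    var %j = 1
    ; assumption: $regml(urls,0) is the number of captured strings
    while (%j <= $regml(urls,0)) {
      echo -s found: $regml(urls,%j)
      inc %j
    }
  }
}

Calling it as /url_list see http://a.example.com and www.b.example.org should then echo one "found:" line per URL, since the pattern has a single capturing group per match.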

Re: extracting urls from multiline text using reg exps #138799 07/01/06 01:57 PM
OuttaControlX
Joined: Feb 2004
Posts: 15
Maybe it's easier to use tokens instead... here's something to maybe give you an idea you can use. See what I mean:

Code:
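Alias URL_Puller {
  ; %huh is assumed to already hold the line of text to scan
  var %i = 1
  ; walk the line one space-delimited token (ASCII 32) at a time
  While (%i <= $numtok(%huh,32)) {
    ; test only the current token, so each URL ends up in $regml(1)
    if $regex($gettok(%huh,%i,32),/((?:ftp:\/\/|https?:\/\/|www\d?\.)[^<>\.\s]+(?:\.[^<>\.\s]+)+(?:\/[^<>\.\s]+)*)/) {
      echo -s found: $regml(1)
    }
    inc %i
  }
}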
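
For comparison, the other idea raised in the first post, a while loop combined with $regsub, might be sketched as follows. This is only a sketch under assumptions: url_strip_loop, %text and %tmp are made-up names, and it assumes $regsub accepts an empty replacement so that each pass removes the URL it has just reported.

Code:
alias url_strip_loop {
  ; hypothetical helper: $1- is one line of text to scan
  var %text = $1-
  var %tmp
  ; no /g flag here: each pass matches only the first URL still left in %text
  while ($regex(%text,/((?:ftp:\/\/|https?:\/\/|www\d?\.)[^<>\.\s]+(?:\.[^<>\.\s]+)+(?:\/[^<>\.\s]+)*)/)) {
    echo -s found: $regml(1)
    ; cut that first URL out of %text so the next pass finds the following one
    %tmp = $regsub(%text,/((?:ftp:\/\/|https?:\/\/|www\d?\.)[^<>\.\s]+(?:\.[^<>\.\s]+)+(?:\/[^<>\.\s]+)*)/,,%text)
  }
}

Because only $regml(1) is ever read, this variant never has to index past the first capture, at the cost of running the regex once more per URL.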