|
Joined: Mar 2008
Posts: 24
Ameglian cow
|
OP
Due to the complexity of the HTML code (20 chunks of data in 60,000 characters), I decided to use a simple Perl script to extract the data. It works like a charm, but I'd still prefer to keep everything in mIRC. Too bad mIRC has so many variable limitations. No idea why!
I guess I should learn how to do all the socket programming in Perl; then it won't be such a mess.
|
|
|
|
Joined: Mar 2008
Posts: 24
Ameglian cow
|
OP
There are 20 chunks of this code in one line of the HTML. Does this look like something easy to manipulate in mIRC? I need only 7 fields from each chunk, but they aren't easy to match:
<div class="results" style="width: 395px;"><table><tr><td class="sImage"><a href=" profile.php?id=825603469" onclick="(new Image()).src = '/ajax/ct.php?app_id=7906852977&action_type=3&post_form_id=170baa3572aa7ee38b8b9f6fa0f04676&position=3&' + Math.random();return true"><img size="s" uid="825603469" linked="no" src=" http://profile.ak.facebook.com/profile5/473/88/s825603469_7283.jpg" alt="Sarah Bockus" title=" Sarah Bockus" /></a></td><td class="sInfo"><font class="bigger" style="line-height: 16px;"><b><a href="profile.php?id=825603469" onclick="(new Image()).src = '/ajax/ct.php?app_id=7906852977&action_type=3&post_form_id=170baa3572aa7ee38b8b9f6fa0f04676&position=3&' + Math.random();return true">Sarah Bockus</a></b></font> <div class="blurb">":)"I will fight to the end...lol"</div><table style="line-height: 16px; margin-top: 3px; color: #898989;"><tr><td width="40px">Price:</td><td><b style="color: #006600;"> $23,378</b></td></tr><tr><td>Bling:</td><td><b style="color: #006600;"> $52,606</b></td></tr><td>Hotties:</td><td><font style="color: red;"> 12</font></td></table></td><td class="actions" style="width: 145px;"><div class="navLink"><a class="submenu" clicktoshowdialog="my_dialog" onclick="FBML.clickToShowDialog("app7906852977_my_dialog");fbjs_sandbox.instances.a7906852977.bootstrap();return fbjs_dom.eventHandler.call([fbjs_dom.get_instance(this,7906852977),function(a7906852977_event) {a7906852977_getFavor(825603469);},7906852977],new fbjs_event(event));return false"><div class="nav">Add to Favorites</div></a></div><div class="navLink"><a class="submenu" clicktoshowdialog="my_dialog" onclick="FBML.clickToShowDialog("app7906852977_my_dialog");fbjs_sandbox.instances.a7906852977.bootstrap();return fbjs_dom.eventHandler.call([fbjs_dom.get_instance(this,7906852977),function(a7906852977_event) {a7906852977_getPoke(825603469);},7906852977],new fbjs_event(event));return false"><div class="nav">Poke Sarah!</div></a></div><div class="navLink"><a class="submenu" href="gifts.php?id=825603469" onclick="(new Image()).src = '/ajax/ct.php?app_id=7906852977&action_type=3&post_form_id=170baa3572aa7ee38b8b9f6fa0f04676&position=3&' + Math.random();return true"><div class="nav">Give a Present</div></a></div><div class="navLink"><a class="submenu" clicktoshowdialog="my_dialog" onclick="FBML.clickToShowDialog("app7906852977_my_dialog");fbjs_sandbox.instances.a7906852977.bootstrap();return fbjs_dom.eventHandler.call([fbjs_dom.get_instance(this,7906852977),function(a7906852977_event) {a7906852977_getInfo(825603469);},7906852977],new fbjs_event(event));return false"><div class="nav">Buy for <b style="color: #006600;"> $26,728</b></div></a></div></td></tr></table></div></b></div></a></div></td></tr></table></div>
|
|
|
|
Joined: Sep 2005
Posts: 2,881
Hoopy frood
|
I copied and pasted that chunk of text into a file and then read it into a binvar to write this code, but with some small modifications it should work no matter how you fill the variable.
alias parseprofile {
  ; Read the whole file into the binary variable &data.
  bread test.txt 0 $file(test.txt) &data
  var %pointer = $bfind(&data,1,profile.php?id=), %profile, %profileimage, %profiletitle, %profileprice, %profilebling, %profilehotties, %profilebuyfor
  ; Each profile chunk begins at a profile.php?id= link.
  while (%pointer) {
    ; Profile link: from the pointer up to the closing quote.
    if ($bfind(&data,%pointer,")) {
      %profile = $bvar(&data,$+(%pointer,-,$calc($v1 - 1))).text
      %pointer = $v1 + 1
    }
    ; Image URL: skip the 5 characters of src=" and read up to the next quote.
    %pointer = $bfind(&data,%pointer,src="http://profile.) + 5
    if ($bfind(&data,%pointer,")) {
      %profileimage = $bvar(&data,$+(%pointer,-,$calc($v1 - 1))).text
      %pointer = $v1
    }
    ; Name: skip the 7 characters of title=" and read up to the next quote.
    %pointer = $bfind(&data,%pointer,title=") + 7
    if ($bfind(&data,%pointer,")) {
      %profiletitle = $bvar(&data,$+(%pointer,-,$calc($v1 - 1))).text
      %pointer = $v1
    }
    ; Price: from the dollar sign after the label up to the next tag.
    %pointer = $bfind(&data,%pointer,Price:)
    %pointer = $bfind(&data,%pointer,$)
    if ($bfind(&data,%pointer,<)) {
      %profileprice = $bvar(&data,$+(%pointer,-,$calc($v1 - 1))).text
      %pointer = $v1
    }
    ; Bling: same pattern as the price.
    %pointer = $bfind(&data,%pointer,Bling:)
    %pointer = $bfind(&data,%pointer,$)
    if ($bfind(&data,%pointer,<)) {
      %profilebling = $bvar(&data,$+(%pointer,-,$calc($v1 - 1))).text
      %pointer = $v1
    }
    ; Hotties: the number follows the 6 characters of red;">.
    %pointer = $bfind(&data,%pointer,Hotties:)
    %pointer = $bfind(&data,%pointer,red;">) + 6
    if ($bfind(&data,%pointer,<)) {
      %profilehotties = $bvar(&data,$+(%pointer,-,$calc($v1 - 1))).text
      %pointer = $v1
    }
    ; Buy for: same pattern as the price.
    %pointer = $bfind(&data,%pointer,Buy for)
    %pointer = $bfind(&data,%pointer,$)
    if ($bfind(&data,%pointer,<)) {
      %profilebuyfor = $bvar(&data,$+(%pointer,-,$calc($v1 - 1))).text
      %pointer = $v1
    }
    echo -a * Profile: %profile
    echo -a * Profile image: %profileimage
    echo -a * Profile title: %profiletitle
    echo -a * Profile price: %profileprice
    echo -a * Profile bling: %profilebling
    echo -a * Profile hotties: %profilehotties
    echo -a * Profile buy for: %profilebuyfor
    ; Jump to the next chunk; $bfind returns 0 when none is left, which ends the loop.
    %pointer = $bfind(&data,%pointer,profile.php?id=)
  }
}
Of course you can do whatever you want with the variables.
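To illustrate the point about filling the variable some other way, here is a minimal sketch that loads a binvar from plain text with /bset -t instead of /bread, then pulls out one field using the same $bfind and $bvar pattern as the alias above. The alias name bfinddemo, the variable &demo, and the shortened sample text are made up for the example.

```mirc
alias bfinddemo {
  ; Fill a binary variable from plain text rather than from a file.
  ; The sample text is a shortened, hypothetical piece of one chunk.
  bset -t &demo 1 <td>Hotties:</td><td><font style="color: red;">12</font></td>
  ; Find the label, then step over the 6 characters of red;"> that precede the number.
  var %start = $bfind(&demo,1,Hotties:)
  %start = $bfind(&demo,%start,red;">) + 6
  ; The field ends just before the next tag.
  var %end = $bfind(&demo,%start,<)
  if (%end) {
    echo -a * Hotties: $bvar(&demo,$+(%start,-,$calc(%end - 1))).text
  }
}
```

Typing /bfinddemo should echo the number that sits between the font tags. Text containing a dollar sign or percent sign would need escaping in the /bset line, which is one reason reading from a file or socket is easier for the real page.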
|
|
|
|
Joined: Mar 2008
Posts: 24
Ameglian cow
|
OP
Oh wow, it works, I'm quite impressed! The only thing is, I got only 19 of the 20 results output. This is because sometimes a profile is private and there's no name included, so this is missing from the code: alt="name here" title="name here". Also, the URL for those without a title is http://static. and not http://profile. Still trying to figure out how to fix that.
Last edited by tparry; 24/03/08 02:41 PM.
|
|
|
|
Joined: Sep 2005
Posts: 2,881
Hoopy frood
|
If you paste the chunk for a private profile I can take a look.
|
|
|
|
Joined: Mar 2008
Posts: 24
Ameglian cow
|
OP
I got it working, thanks!
alias parseprofile {
  bread hfs.recent.txt 0 $file(hfs.recent.txt) &data
  var %pointer = $bfind(&data,1,profile.php?id=)
  while (%pointer) {
    ; Declared inside the loop so every field starts empty for each chunk.
    var %profile, %profileimage, %profiletitle, %profileprice, %profilebling, %profilehotties, %profilebuyfor
    if ($bfind(&data,%pointer,")) {
      %profile = $bvar(&data,$+(%pointer,-,$calc($v1 - 1))).text
      %pointer = $v1 + 1
      ; Match any image host, since private profiles use http://static. instead of http://profile.
      %pointer = $bfind(&data,%pointer,src="http://) + 5
      if ($bfind(&data,%pointer,")) {
        %profileimage = $bvar(&data,$+(%pointer,-,$calc($v1 - 1))).text
        %pointer = $v1
        ; Only public profiles carry a title attribute with the name.
        if (http://profile isin %profileimage) {
          %pointer = $bfind(&data,%pointer,title=") + 7
          if ($bfind(&data,%pointer,")) {
            %profiletitle = $bvar(&data,$+(%pointer,-,$calc($v1 - 1))).text
            %pointer = $v1
          }
        }
        else { %profiletitle = "PRIVATE" }
        %pointer = $bfind(&data,%pointer,Price:)
        %pointer = $bfind(&data,%pointer,$)
        if ($bfind(&data,%pointer,<)) {
          %profileprice = $bvar(&data,$+(%pointer,-,$calc($v1 - 1))).text
          %pointer = $v1
          %pointer = $bfind(&data,%pointer,Bling:)
          %pointer = $bfind(&data,%pointer,$)
          if ($bfind(&data,%pointer,<)) {
            %profilebling = $bvar(&data,$+(%pointer,-,$calc($v1 - 1))).text
            %pointer = $v1
            %pointer = $bfind(&data,%pointer,Hotties:)
            %pointer = $bfind(&data,%pointer,red;">) + 6
            if ($bfind(&data,%pointer,<)) {
              %profilehotties = $bvar(&data,$+(%pointer,-,$calc($v1 - 1))).text
              %pointer = $v1
              %pointer = $bfind(&data,%pointer,Buy for)
              %pointer = $bfind(&data,%pointer,$)
              if ($bfind(&data,%pointer,<)) {
                %profilebuyfor = $bvar(&data,$+(%pointer,-,$calc($v1 - 1))).text
                %pointer = $v1
                echo -a * Profile: %profile
                echo -a * Profile image: %profileimage
                echo -a * Profile title: %profiletitle
                echo -a * Profile price: %profileprice
                echo -a * Profile bling: %profilebling
                echo -a * Profile hotties: %profilehotties
                echo -a * Profile buy for: %profilebuyfor
              }
            }
          }
        }
      }
    }
    %pointer = $bfind(&data,%pointer,profile.php?id=)
  }
}
|
|
|
|
Joined: Oct 2003
Posts: 3,918
Hoopy frood
|
What's wrong with just matching the first href="" for the profile link, the first title="" for the name, etc.?
- argv[0] on EFnet #mIRC - "Life is a pointer to an integer without a cast"
|
|
|
|
Joined: Sep 2005
Posts: 2,881
Hoopy frood
|
He said there are 20 of those chunks in one line of HTML. Matching only the first result would ignore the other 19 profiles.
|
|
|
|
Joined: Oct 2004
Posts: 8,330
Hoopy frood
|
If the lines are too long, then you need binary variables. That said, you *can* manipulate binary variables within mIRC without writing them to a file. Using $bvar allows you to do this. If you check each line for your matchtext, you may not even need to actually work with lines over the ~500 character limit. If that is the case, it won't be very difficult at all to handle the data. If you give a *small* sample of the data that you're actually trying to grab from the site, it would be easy to show you how to match it and use it.
Invision Support #Invision on irc.irchighway.net
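As a rough sketch of handling the data without a file, the event below reads each piece of incoming socket data straight into a binary variable and checks it for the match text with $bfind. The socket name hfsdemo and the variable name &buf are assumptions for the example, and the socket is assumed to have been opened elsewhere with /sockopen.

```mirc
; hfsdemo is a hypothetical socket name opened elsewhere with /sockopen.
on *:SOCKREAD:hfsdemo:{
  if ($sockerr > 0) { return }
  ; Reading into a &binvar takes raw bytes, so long lines are not a problem.
  sockread &buf
  ; $sockbr is the number of bytes the last /sockread returned.
  while ($sockbr > 0) {
    if ($bfind(&buf,1,profile.php?id=)) {
      echo -a * Profile link found at byte $v1 of this read
    }
    sockread &buf
  }
}
```

This only inspects one read at a time, so a match that straddles two reads would be missed, and the binvar does not survive between events. For a page where all 20 chunks sit on one line, collecting the data first and parsing it afterwards, as in the alias earlier in the thread, is likely the simpler route.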
|
|
|
|
Joined: Mar 2008
Posts: 24
Ameglian cow
|
OP
"If the lines are too long, then you need binary variables. That said, you *can* manipulate binary variables within mIRC without writing them to a file. Using $bvar allows you to do this. If you check each line for your matchtext, you may not even need to actually work with lines over the ~500 character limit. If that is the case, it won't be very difficult at all to handle the data. If you give a *small* sample of the data that you're actually trying to grab from the site, it would be easy to show you how to match it and use it."
I posted a sample of 1 chunk a few replies ago. You can copy/paste that 20 times on the same line to replicate the data I receive from the webpage. Much appreciated.
|
|
|
|
Joined: Oct 2004
Posts: 8,330
Hoopy frood
|
Sorry, ignore my reply. I didn't notice the second page and replied to the post at the end of the first page.
Invision Support #Invision on irc.irchighway.net
|
|
|
|
|