Hmph @ WebData get tricky #189549 08/11/07 07:59 PM
Lpfix5 (OP)
Hoopy frood
Joined: Aug 2005
Posts: 1,052
While trying to set up quick, easy-to-read access to the newspaper, I've run into an issue with the careers section of the site.

I'm trying to strip 4 basic things out of this webdata, and I keep running into a wall: no matter how many regex attempts I make, the data I'm trying to receive is not being echoed back to me.

Here's what I'm trying to get out of this link.

http://ospreycareers.com/results.asp?sea...bmit=New+Search

In the middle column, where the red data is:

A) Job Title
B) Date
C) Source

For example, the first one: MACHINE OPERATORS: Fabricating... 11/8/2007 Sault Ste Marie

Here's the base code I've got to start with. It just echoes the data it receives, but I can't find the specific information I need in it. (I'm not sure whether connecting through a socket causes a problem getting the info from that site, but hopefully not.)

Code:
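; /news_career closes any old connection and opens a new one to the careers site
alias news_career sockclose news_career | sockopen news_career ospreycareers.com 80

on 1:sockopen:news_career:{
  ; send the HTTP request once the socket is connected
  .sockwrite -n $sockname GET /results.asp?search_type=quick&kw=&city=Sault+Ste+Marie&submit=New+Search HTTP/1.1
  .sockwrite -n $sockname HOST: ospreycareers.com
  ; a blank line ends the request headers
  .sockwrite -n $sockname $crlf
}

on 1:sockread:news_career:{
  if ($sockerr > 0) return 
  ; read one line of the response and echo it as-is
  var %x | sockread -fn %x
  echo -a %x 
}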


Again, I've tried numerous ways to strip the data, yet it still isn't fetching the proper area of the dump.

Any help would be greatly appreciated.


Code:
if $reality > $fiction { set %sanity Sane }
Else { echo -a *voices* }
Re: Hmph @ WebData get tricky [Re: Lpfix5] #189614 09/11/07 06:35 PM
Lpfix5 (OP)
www.ospreycareers.com/results.asp?search_type=quick&kw=&city=Sault+Ste+Marie

It seems the page would not pull up correctly with submit=New+Search in the URL, but I'm still left with the same issue.


Re: Hmph @ WebData get tricky [Re: Lpfix5] #189641 10/11/07 02:37 AM
deegee
Fjord artisan
Joined: Jun 2006
Posts: 508
I'd say you need to send cookie data; neither of the links above yields any search results if no cookie exists for that site.
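
For anyone following along, a rough way to see which cookies the site hands out is to request only the headers and echo every Set-Cookie line. This is just a sketch: the cookie_peek name is made up for the example, and it assumes the server sets its cookies on a plain HEAD request for the front page.

```mirc
; Sketch only: "cookie_peek" is a hypothetical name used for this example.
; /cookie_peek opens a connection and lists the Set-Cookie headers it gets back.
alias cookie_peek {
  sockclose cookie_peek
  sockopen cookie_peek www.ospreycareers.com 80
}

on *:sockopen:cookie_peek:{
  if ($sockerr) { echo -a * cookie_peek: connection failed | return }
  ; HEAD asks for the response headers only, without the page body
  sockwrite -n $sockname HEAD / HTTP/1.0
  sockwrite -n $sockname Host: www.ospreycareers.com
  ; a blank line ends the request
  sockwrite $sockname $crlf
}

on *:sockread:cookie_peek:{
  if ($sockerr) return
  var %line
  sockread %line
  ; keep reading while complete lines are available in the buffer
  while ($sockbr) {
    if (Set-Cookie:* iswm %line) echo -a %line
    sockread %line
  }
}
```

Whatever name=value pairs show up there are what a later request would need to send back in its own Cookie: header.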

Re: Hmph @ WebData get tricky [Re: deegee] #189643 10/11/07 03:50 AM
Lpfix5 (OP)
OMG, I forgot, you're right: cookies!!! I guess I wasn't hungry enough to think about that.


Re: Hmph @ WebData get tricky [Re: Lpfix5] #189705 11/11/07 03:43 AM
Lpfix5 (OP)
Code:
alias news_career sockclose news_career | sockopen news_career ospreycareers.com 80

on 1:sockopen:news_career:{
  .sockwrite -n $sockname GET /results.asp?search_type=quick&kw=&city=Sault+Ste+Marie HTTP/1.1
  .sockwrite -n $sockname HOST: ospreycareers.com
  ; one Cookie header with name=value pairs only; expires and path are
  ; Set-Cookie attributes and are not sent back to the server
  .sockwrite -n $sockname COOKIE: np%5Furl=www%2Esaultstar%2Ecom; ASPSESSIONIDSASCATTB=EAMACMEBNHDADKKGAPJGNGLL
  .sockwrite -n $sockname $crlf
}

on 1:sockread:news_career:{
  if ($sockerr > 0) return 
  var %x | sockread -fn %x
  if (onmouseover isin %x) {
    echo -a %x
  }
}


I need the UPPERCASE text only: for example, return THIS IS WHATEVER and ignore Jidaida Aaojkodaodad. What I mean is that, from the onmouseover text, I only want the tokens that are in uppercase.


Re: Hmph @ WebData get tricky [Re: Lpfix5] #189712 11/11/07 06:34 AM
deegee
Say what? Not all items have UPPERCASE text.
Quote:
• CAA South Central Ontario is looking for a ... « "CAA"
• FOOD SERVICES<BR><BR>HOUSEKEEPER Full Time ... « "FOOD SERVICES" and "HOUSEKEEPER"
• <BR><BR>Electrical Inspectors <BR><BR>Dryden ... « none.

Is what you're after what you see on the website without the mouseover box? (Job title, Date, Source)

Re: Hmph @ WebData get tricky [Re: deegee] #189754 11/11/07 04:58 PM
Lpfix5 (OP)
Oh, some are not uppercase; I didn't notice. I'm actually trying to get the Job Title, Date and Source.


Re: Hmph @ WebData get tricky [Re: Lpfix5] #189757 11/11/07 05:37 PM
deegee
Try this. It also gets the cookie data first and stores it for 3600 secs; you can make that longer, but I'm not sure how long the cookie will stay valid.
Code:
on *:sockopen:news_career:{
  if $sockerr { echo -ac info * Sockerr (news_career): $sock($sockname).wsmsg | return }

  ; Check if you have a variable with cookie data
  if $($+(%,$sockname,.cookie),2) {
    ; if so go ahead and request the page
    sockwrite -n $sockname GET /results.asp?search_type=quick&kw=&city=Sault+Ste+Marie HTTP/1.0
    sockwrite -n $sockname HOST: www.ospreycareers.com
    sockwrite $sockname Cookie: $v1 $+ $str($lf,2)
    return
  }
  ; else request headers only (for the cookie)
  sockwrite -n $sockname HEAD / HTTP/1.0
  sockwrite $sockname HOST: www.ospreycareers.com $+ $str($lf,2)
}

on *:sockread:news_career:{
  if $sockerr { echo -ac info * Sockerr (news_career): $sock($sockname).wsmsg | return }
  var %a | sockread %a

  ; Check if you have a cookie
  if !$($+(%,$sockname,.cookie),2) {
    ; if not, check for the cookie data
    if *Set-Cookie: ASPSESSIONID* iswm %a {
      ; set it to a variable
      set -u3600 $+(%,$sockname,.cookie) np%5Furl=www%2Esaultstar%2Ecom; $gettok(%a,2,32)
      ; and start over
      sockclose $sockname | news_career
    }
  }
  ; else read the page ;)

  ; IF you have a job title...
  if %JobTitle {
    ; add the date and location data
    if *<td class="rowSep"* iswm %a { set -e %JobTitle %JobTitle $regsubex(%a,/\t|<.*?>/g,) }
    ; if it's the end of the item, display the item & unset the variable
    elseif *</tr>* iswm %a { echo -a %JobTitle | unset %JobTitle }
  }

  ; else check incoming data for a job title & set it to a variable
  elseif *onMouseout="hideddrivetip()">* iswm %a { set -e %JobTitle $gettok($gettok(%a,1,60),2,62) }
}
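
To make the two parsing steps above easier to follow, here is a small stand-alone sketch that runs them on sample lines. The parse_demo name and both HTML lines are invented for illustration (the real page markup may differ); only the $gettok and $regsubex expressions are taken from the script.

```mirc
; Sketch only: "parse_demo" and the sample HTML are made up for this example.
alias parse_demo {
  var %title = <a href="job.asp" onMouseout="hideddrivetip()">MACHINE OPERATORS</a>
  var %cell = <td class="rowSep">11/8/2007</td>

  ; token 1 when splitting on "<" (ASCII 60) drops the closing tag,
  ; then token 2 when splitting on ">" (ASCII 62) is the link text
  echo -a Title: $gettok($gettok(%title,1,60),2,62)

  ; delete every tab and every <...> tag, leaving only the cell text
  echo -a Cell: $regsubex(%cell,/\t|<.*?>/g,)
}
```

Typing /parse_demo should echo "Title: MACHINE OPERATORS" and "Cell: 11/8/2007", which are the same pieces the sockread event appends to %JobTitle one line at a time.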


Re: Hmph @ WebData get tricky [Re: deegee] #189830 12/11/07 03:30 AM
Lpfix5 (OP)
Nice, thanks.

