Joined: Oct 2005
Posts: 827
P
pouncer Offline OP
Hoopy frood
How do I deal with stuff like this?

bvars?

What the server sends back is a massive XML reply. Here's a shrunk-down example; it's actually a reply containing all the contacts on my MSN list.

Each contact's info sits inside a <Contact> ... </Contact> section, so you can imagine how big this reply is if my MSN contact list has well over 100 people.

Code:
HTTP/1.1 200 OK
Date: Fri, 11 Nov 2005 23:55:09 GMT
Server: Microsoft-IIS/6.0
P3P:CP="BUS CUR CONo FIN IVDo ONL OUR PHY SAMo TELo"
Cache-Control: private, max-age=0
Content-Type: text/xml; charset=utf-8
Content-Length: 2207

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
    <soap:Header>
        <ServiceHeader xmlns="http://www.msn.com/webservices/AddressBook">
            <Version>11.01.0922.0000</Version>
        </ServiceHeader>
    </soap:Header>
    <soap:Body>
        <ABFindAllResponse xmlns="http://www.msn.com/webservices/AddressBook">
            <ABFindAllResult>
                <contacts>
                    <Contact>
                        <contactId> Removed </contactId>
                        <contactInfo>
                            <annotations>
                                <Annotation>
                                    <Name>MSN.IM.MBEA</Name>
                                    <Value>0</Value>
                                </Annotation>
                                <Annotation>
                                    <Name>MSN.IM.GTC</Name>
                                    <Value>1</Value>
                                </Annotation>
                                <Annotation>
                                    <Name>MSN.IM.BLP</Name>
                                    <Value>0</Value>
                                </Annotation>
                            </annotations>
                            <contactType>Me</contactType>
                            <quickName>Q</quickName>
                            <passportName> Removed </passportName>
                            <IsPassportNameHidden>false</IsPassportNameHidden>
                            <displayName>Inky | Hello, World from WLM</displayName>
                            <puid>0</puid>
                            <CID>0</CID>
                            <IsNotMobileVisible>false</IsNotMobileVisible>
                            <isMobileIMEnabled>false</isMobileIMEnabled>
                            <isMessengerUser>false</isMessengerUser>
                            <isFavorite>false</isFavorite>
                            <isSmtp>false</isSmtp>
                            <hasSpace>true</hasSpace>
                            <spotWatchState>NoDevice</spotWatchState>
                            <birthdate>0001-01-01T00:00:00.0000000-08:00</birthdate>
                            <primaryEmailType>ContactEmailPersonal</primaryEmailType>
                            <PrimaryLocation>ContactLocationPersonal</PrimaryLocation>
                            <PrimaryPhone>ContactPhonePersonal</PrimaryPhone>
                            <IsPrivate>false</IsPrivate>
                            <Gender>Unspecified</Gender>
                            <TimeZone>None</TimeZone>
                        </contactInfo>
                        <propertiesChanged />
                        <fDeleted>false</fDeleted>
                        <lastChange>2005-11-11T15:55:03.2600000-08:00</lastChange>
                    </Contact>
                </contacts>
                <ab>
                    <abId>00000000-0000-0000-0000-000000000000</abId>
                    <abInfo>
                        <ownerPuid>0</ownerPuid>
                        <OwnerCID>0</OwnerCID>
                        <ownerEmail> Removed </ownerEmail>
                        <fDefault>true</fDefault>
                        <joinedNamespace>false</joinedNamespace>
                    </abInfo>
                    <lastChange>2005-11-11T15:55:03.2600000-08:00</lastChange>
                    <DynamicItemLastChanged>2005-11-09T09:16:56.2970000-08:00</DynamicItemLastChanged>
                    <createDate>2003-07-14T15:46:20.6500000-07:00</createDate>
                    <propertiesChanged />
                </ab>
            </ABFindAllResult>
        </ABFindAllResponse>
    </soap:Body>
</soap:Envelope>


Anyway, what I want to do is parse out every contact's email, which is contained within the <passportName> tags. Can anyone help me get started, please?

Last edited by pouncer; 07/09/09 10:24 PM.
Joined: Jul 2008
Posts: 236
S
Fjord artisan
Offline
Binary variables, or hash tables... or both :) Hash tables are pretty useful...

Have you tried looking for a library that parses XML streams?

Joined: Oct 2003
Posts: 3,918
A
Hoopy frood
Offline
This is trivial to deal with.. it's no different than any other HTML scraping. Given that the <passportName>...</passportName> is conveniently located on one line, all you have to do is read the socket line by line and check for it.

Code:
on *:SOCKREAD:mysock: {
  sockread %line
  if ($regex(%line, /<passportName>(.+?)<\/passportName>/)) {
    echo -a Found another contact: $regml(1)
  }
}


If you need more than just one of those tags, well, it gets slightly more complicated, but you can follow the same general rule of matching each line individually.

If you're not comfortable with that, look for an mIRC XML library. There are a couple of options, but I'm not too familiar with them.


- argv[0] on EFnet #mIRC
- "Life is a pointer to an integer without a cast"
Joined: Oct 2005
Posts: 827
P
pouncer Offline OP
Hoopy frood
Odd. That only seemed to echo 7 emails, when the account has well over 1000 contacts :|

Joined: Jul 2006
Posts: 4,145
W
Hoopy frood
Offline
Maybe because his regex only matches once per line: if more than one email is present in a line, it won't catch the rest. Try:

Code:
on *:SOCKREAD:mysock: {
  var %line, %a
  sockread %line
  ; /g makes $regex collect every match on the line, not just the first
  if ($regex(%line, /<passportName>(.+?)<\/passportName>/g)) {
    %a = $regml(0)
    while (%a) { echo -a contact: $regml(%a) | dec %a }
  }
}


#mircscripting @ irc.swiftirc.net == the best mIRC help channel
Joined: Jul 2008
Posts: 236
S
Fjord artisan
Offline
XML is not always so convenient. Really, that regex should be /<passportName>([^<]+)<\/passportName>/... This would explain why he's only getting 7 matches.

There are more than a couple of options, providing you're willing to compile a DLL. Google "parsing XML streams in C/C++". It may not be "built in", but it's likely a faster, and more complete (so you won't have to write any other regular expressions) way to parse streams of XML.

edit: just noticed you're using non-greedy match, but still...

Last edited by s00p; 09/09/09 04:31 AM.
Joined: Oct 2003
Posts: 3,918
A
Hoopy frood
Offline
The regular expression /<passportName>(.+?)<\/passportName>/ is correct. The //g modifier is not needed (there is only one match per call).

Getting 7 of 1000 means there's a problem reading the data, OR perhaps the syntax changes after the 7th and the regex breaks (it could break if the data starts splitting over multiple lines).


- argv[0] on EFnet #mIRC
- "Life is a pointer to an integer without a cast"
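
For anyone following the /g discussion above, here is a small sketch that shows what the modifier changes when several contacts share one line. The alias name gdemo and the two addresses are made up for illustration; $regex and $regml are the standard identifiers already used in this thread.

```mirc
; Hypothetical test alias: type /gdemo to run it
alias gdemo {
  ; two contacts on a single line (addresses are invented)
  var %line = <passportName>a@example.com</passportName><passportName>b@example.com</passportName>
  ; without the g modifier, $regex stops after the first match
  echo -a single: $regex(%line, /<passportName>(.+?)<\/passportName>/)
  ; with the g modifier, every match is captured and $regml(0) is the total
  echo -a global: $regex(%line, /<passportName>(.+?)<\/passportName>/g)
  var %i = 1
  while (%i <= $regml(0)) {
    echo -a contact %i $+ : $regml(%i)
    inc %i
  }
}
```

If the server really sends one contact per line, both forms find the same thing. The difference only matters once several <passportName> elements arrive in the same read.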
Joined: Jan 2007
Posts: 1,156
D
Hoopy frood
Offline
I use binvars if a single line is too long.

For this just grab the info you need and store it in a hash table.

Joined: Sep 2005
Posts: 2,881
H
Hoopy frood
Offline
No offence but if you don't know how to parse a string you're not going to be able to write a messenger client.

You will be back here asking hundreds of questions before this project is up.

Joined: Oct 2005
Posts: 827
P
pouncer Offline OP
Hoopy frood
Guys, the string is over 100,000 characters long (it's one whole XML document sent as a single line, too big for mIRC to read in one line).
That's what the problem is.

Content-Length: 1340022

I need to somehow loop from char 0 to 1340022 and read all occurrences of <passportName>email</passportName>.

Could anyone show me how this could be done using binvars?

Last edited by pouncer; 09/09/09 09:35 PM.
Joined: Sep 2005
Posts: 2,881
H
Hoopy frood
Offline
You need something like this in the sockread event:

Code:
sockread &data
while ($sockbr) sockread -f &data


Then after that you need to loop with $bfind() to get all occurrences.
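
A minimal sketch of such a $bfind() loop, assuming the data to search is already sitting in a binary variable. The alias name findpassports and the way it takes the binvar name as $1 are made up for illustration; $bfind, $bvar, $calc and $len are standard identifiers.

```mirc
; Hypothetical alias: /findpassports &binvar
; Echoes the text between every <passportName> ... </passportName> pair.
alias findpassports {
  var %open = <passportName>
  var %close = </passportName>
  var %start, %end
  ; $bfind returns the position of the match, or 0 if there is none
  var %hit = $bfind($1, 1, %open).text
  while (%hit) {
    %start = $calc(%hit + $len(%open))
    ; opening tag is the last thing in the buffer: nothing more to search
    if (%start > $bvar($1, 0)) return
    %end = $bfind($1, %start, %close).text
    ; no closing tag in this buffer: the value may continue in the next chunk
    if (!%end) return
    if (%end > %start) echo -a contact: $bvar($1, %start, $calc(%end - %start)).text
    ; keep searching after the value that was just reported
    %hit = $bfind($1, %end, %open).text
  }
}
```

It would be called as /findpassports &data after a read. Note that this only sees what is in the binvar at that moment, so a tag that straddles two reads is missed unless the chunks are joined first.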

Joined: Oct 2005
Posts: 827
P
pouncer Offline OP
Hoopy frood
Code:
on *:SOCKREAD:membership: {
  sockread &data

  while ($sockbr) sockread -f &data

  echo -a $bvar(&data, 1-).text
}


It gives me * /echo: insufficient parameters

Last edited by pouncer; 10/09/09 07:18 PM.
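
A likely cause of that error: the while loop only stops once a read returns no bytes, and each /sockread replaces the contents of &data, so by the time /echo runs the binvar is empty and $bvar() returns nothing. Even with data in it, 1.3 MB of text is far more than /echo can take in one go. One possible fix, sketched below, is to use each chunk inside the loop. Appending the chunks to a file is an assumption made here for illustration, and membership.xml is a made-up filename; the file will also contain the HTTP headers, and a stale copy from an earlier run should be removed before the socket is opened.

```mirc
on *:SOCKREAD:membership: {
  if ($sockerr) return
  sockread &data
  while ($sockbr) {
    ; &data holds only the bytes from the most recent read, so use it here
    echo -a read $sockbr bytes
    ; -1 -1 appends the whole chunk to the end of the (hypothetical) file
    bwrite membership.xml -1 -1 &data
    sockread &data
  }
}
```

Once the socket closes, the saved file can be searched in pieces instead of trying to hold the whole reply in one variable.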
Joined: Jan 2007
Posts: 1,156
D
Hoopy frood
Offline
This is how I do it.
Code:
on *:sockread:b_rlist.*:{
  if ($sockerr > 0) return
  sockread -n &brL
  if ($sockbr = 0) return
  echo -s . $bvar(&brL,1-).text
}

Joined: Apr 2003
Posts: 342
M
Fjord artisan
Offline
Use cURL to download the XML document to a file, then use the file handling routines (/fopen, /fseek, $fgetc) to parse through the file.


Beware of MeStinkBAD! He knows more than he actually does!
