pouncer
How do I deal with stuff like this? Binary variables (bvars)?

What the server sends back is a massive XML reply. Here's a trimmed-down example; the real reply contains all the contacts on my MSN list.

Each contact's info sits inside a <Contact> ... </Contact> section, so you can imagine how big this reply gets when my MSN contact list has well over 100 people.

Code:
HTTP/1.1 200 OK
Date: Fri, 11 Nov 2005 23:55:09 GMT
Server: Microsoft-IIS/6.0
P3P:CP="BUS CUR CONo FIN IVDo ONL OUR PHY SAMo TELo"
Cache-Control: private, max-age=0
Content-Type: text/xml; charset=utf-8
Content-Length: 2207

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
    <soap:Header>
        <ServiceHeader xmlns="http://www.msn.com/webservices/AddressBook">
            <Version>11.01.0922.0000</Version>
        </ServiceHeader>
    </soap:Header>
    <soap:Body>
        <ABFindAllResponse xmlns="http://www.msn.com/webservices/AddressBook">
            <ABFindAllResult>
                <contacts>
                    <Contact>
                        <contactId> Removed </contactId>
                        <contactInfo>
                            <annotations>
                                <Annotation>
                                    <Name>MSN.IM.MBEA</Name>
                                    <Value>0</Value>
                                </Annotation>
                                <Annotation>
                                    <Name>MSN.IM.GTC</Name>
                                    <Value>1</Value>
                                </Annotation>
                                <Annotation>
                                    <Name>MSN.IM.BLP</Name>
                                    <Value>0</Value>
                                </Annotation>
                            </annotations>
                            <contactType>Me</contactType>
                            <quickName>Q</quickName>
                            <passportName> Removed </passportName>
                            <IsPassportNameHidden>false</IsPassportNameHidden>
                            <displayName>Inky | Hello, World from WLM</displayName>
                            <puid>0</puid>
                            <CID>0</CID>
                            <IsNotMobileVisible>false</IsNotMobileVisible>
                            <isMobileIMEnabled>false</isMobileIMEnabled>
                            <isMessengerUser>false</isMessengerUser>
                            <isFavorite>false</isFavorite>
                            <isSmtp>false</isSmtp>
                            <hasSpace>true</hasSpace>
                            <spotWatchState>NoDevice</spotWatchState>
                            <birthdate>0001-01-01T00:00:00.0000000-08:00</birthdate>
                            <primaryEmailType>ContactEmailPersonal</primaryEmailType>
                            <PrimaryLocation>ContactLocationPersonal</PrimaryLocation>
                            <PrimaryPhone>ContactPhonePersonal</PrimaryPhone>
                            <IsPrivate>false</IsPrivate>
                            <Gender>Unspecified</Gender>
                            <TimeZone>None</TimeZone>
                        </contactInfo>
                        <propertiesChanged />
                        <fDeleted>false</fDeleted>
                        <lastChange>2005-11-11T15:55:03.2600000-08:00</lastChange>
                    </Contact>
                </contacts>
                <ab>
                    <abId>00000000-0000-0000-0000-000000000000</abId>
                    <abInfo>
                        <ownerPuid>0</ownerPuid>
                        <OwnerCID>0</OwnerCID>
                        <ownerEmail> Removed </ownerEmail>
                        <fDefault>true</fDefault>
                        <joinedNamespace>false</joinedNamespace>
                    </abInfo>
                    <lastChange>2005-11-11T15:55:03.2600000-08:00</lastChange>
                    <DynamicItemLastChanged>2005-11-09T09:16:56.2970000-08:00</DynamicItemLastChanged>
                    <createDate>2003-07-14T15:46:20.6500000-07:00</createDate>
                    <propertiesChanged />
                </ab>
            </ABFindAllResult>
        </ABFindAllResponse>
    </soap:Body>
</soap:Envelope>


Anyway, what I want to do is parse out every contact's email, which is contained within the <passportName> tags. Can anyone help me get started, please?

Last edited by pouncer; 07/09/09 10:24 PM.
s00p
Binary variables, or hash tables... or both :) Hash tables are pretty useful.

Have you tried looking for a library that parses XML streams?

Joined: Oct 2003
Posts: 3,641
A
Hoopy frood
This is trivial to deal with; it's no different from any other HTML scraping. Given that each <passportName>...</passportName> pair is conveniently located on one line, all you have to do is read the socket line by line and check for it.

Code:
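on *:SOCKREAD:mysock: {
  ; read one line from the socket into a local variable
  var %line
  sockread %line
  ; the capture group holds whatever sits between the two tags
  if ($regex(%line, /<passportName>(.+?)<\/passportName>/)) {
    echo -a Found another contact: $regml(1)
  }
}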


If you need more than just one of those tags, well, it gets slightly more complicated, but you can follow the same general rule of matching each line individually.

If you're not comfortable with that, look for an mIRC XML library. There are a couple of options, but I'm not too familiar with them.

pouncer
Odd. That only echoed 7 emails, and the account has well over 1000 contacts :|

Joined: Jul 2006
Posts: 4,037
W
Hoopy frood
Maybe that's because his regex only matches once: if more than one email is present in a line, it won't catch the rest. Try:

Code:
on *:SOCKREAD:mysock: {
  var %line, %a
  sockread %line
  if ($regex(%line, /<passportName>(.+?)<\/passportName>/g)) {
    %a = $regml(0)
    while (%a) { echo -a contact: $regml(%a) | dec %a }
  }
}


s00p
XML is not always so convenient. Really, that regex should be /<passportName>([^<]+)<\/passportName>/... This would explain why he's only getting 7 matches.

There are more than a couple of options, providing you're willing to compile a DLL. Google "parsing XML streams in C/C++". It may not be "built in", but it's likely a faster, and more complete (so you won't have to write any other regular expressions) way to parse streams of XML.

edit: just noticed you're using non-greedy match, but still...

Last edited by s00p; 09/09/09 04:31 AM.
Joined: Oct 2003
Posts: 3,641
A
Hoopy frood
The regular expression /<passportName>(.+?)<\/passportName>/ is correct. The //g modifier is not needed (there is only one match per call).

Getting 7 of 1000 means there's a problem reading the data, or perhaps the syntax changes after the 7th and the regex breaks (it could break if the data starts splitting over multiple lines).

Joined: Jan 2007
Posts: 1,155
D
Hoopy frood
I use binvars if a single line is too long.

For this just grab the info you need and store it in a hash table.
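
A rough sketch of the hash table side, assuming a table called contacts and two made-up alias names (add_contact and list_contacts); none of these names come from the posts above. Using the address itself as the item name means a contact that shows up twice is only stored once.

```mirc
; Hypothetical helper: /add_contact <address>
; The -m switch creates the "contacts" table on first use.
alias add_contact {
  if ($1 != $null) hadd -m contacts $1 1
}

; Hypothetical helper: echo every stored address.
alias list_contacts {
  ; nothing to list if the table was never created
  if (!$hget(contacts)) return
  var %i = 1
  var %total = $hget(contacts,0).item
  while (%i <= %total) {
    echo -a contact: $hget(contacts,%i).item
    inc %i
  }
}
```

In the sockread event you would then call add_contact with the matched address instead of echoing it, and run /list_contacts once the reply has been read.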

Joined: Sep 2005
Posts: 2,630
H
Hoopy frood
No offence but if you don't know how to parse a string you're not going to be able to write a messenger client.

You will be back here asking hundreds of questions before this project is up.

pouncer
Guys, the string is over 100,000 characters long (it's one whole XML document sent as a single line, which is too big for mIRC to read in one line). That's what the problem is.

Content-Length: 1340022

I need to somehow loop from char 0 to 1340022 and read all occurrences of <passportName>email</passportName>.

Could anyone show me how this could be done using binvars?

Last edited by pouncer; 09/09/09 09:35 PM.
Joined: Sep 2005
Posts: 2,630
H
Hoopy frood
You need something like this in the sockread event:

Code:
sockread &data
while ($sockbr) sockread -f &data


Then after that you need to loop with $bfind() to get all occurrences.
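
The $bfind() loop could look roughly like the sketch below. It is only an illustration under a few assumptions: the alias name scan_contacts is made up, &data is assumed to have been filled earlier in the same script run, and it is assumed to hold the complete reply rather than just the last chunk read.

```mirc
; Hypothetical alias: echoes every <passportName> value held in &data.
; Assumes &data already contains the whole reply.
alias scan_contacts {
  var %open = <passportName>
  var %close = </passportName>
  var %end, %len
  var %start = $bfind(&data,1,%open).text
  while (%start) {
    ; step past the opening tag to the first byte of the value
    inc %start $len(%open)
    %end = $bfind(&data,%start,%close).text
    ; no closing tag means the data is cut off, so stop
    if (!%end) break
    %len = $calc(%end - %start)
    if (%len > 0) echo -a contact: $bvar(&data,%start,%len).text
    ; carry on searching after the closing tag
    %start = $bfind(&data,$calc(%end + $len(%close)),%open).text
  }
}
```

Call it from the same event that filled &data, since binary variables are discarded once the script finishes. A tag that straddles two separate reads would be missed unless the reads are joined first.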

pouncer
Code:
on *:SOCKREAD:membership: {
  sockread &data

  while ($sockbr) sockread -f &data

  echo -a $bvar(&data, 1-).text
}


It gives me * /echo: insufficient parameters

Last edited by pouncer; 10/09/09 07:18 PM.
Joined: Jan 2007
Posts: 1,155
D
Hoopy frood
This is how I do it.
Code:
on *:sockread:b_rlist.*:{
  ; bail out on a socket error
  if ($sockerr > 0) return
  sockread -n &brL
  ; nothing was read this time, so there is nothing to show
  if ($sockbr == 0) return
  echo -s . $bvar(&brL,1-).text
}

MeStinkBAD
Use cURL to download the XML document to a file, then use the file handling routines (/fopen, /fseek, $fgetc) to parse through the file.

