#138165 24/12/05 11:35 PM
Joined: Oct 2005
Posts: 1,741
G
Hoopy frood
OP Offline
I'm attempting to make a script that communicates with a web server via the POST method. The script works almost perfectly, except for one problem: when I send POST data longer than 124 characters, the server simply closes the connection rather than responding with the expected data. At first glance it would seem that the server is limiting the data length, but I know that is not the case. I used an HTTP sniffer to capture the exact data that the actual requesting web page sends to the server, then copied that data and sent it verbatim using sockwrite.

Here is the data sent by the website and by my script:

###
POST /tr HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, application/x-shockwave-flash, */*
Referer: http://world.altavista.com/tr
Accept-Language: en-ca
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50215)
Host: world.altavista.com
Content-Length: 134
Connection: Keep-Alive
Cache-Control: no-cache
Cookie: BX=3d18ip51qr43n&b=2

doit=done&intl=1&tt=urltext&trtext=hello+this+is+a+really+long+message+hello+this+is+a+really+long+message&lp=en_fr&btnTrTxt=Translate
###

The data sent by both programs is exactly the same, including the extra $crlf, etc.

Does anyone have an idea of why my socket version will not work when it is exactly the same as the website version? If you need to see the code, let me know.

-genius_at_work

Joined: Feb 2004
Posts: 2,019
Hoopy frood
Offline
Did you change the value of the Content-Length header to match the length of your longer data, or did you keep sending the same header value?

As a small test I did this, and it worked fine:

Code:
alias alta {
  sockopen alta world.altavista.com 80
}

on *:sockopen:alta:{
  if ($sockerr) return
  var %sw = sockwrite -tn $sockname
  var %data = doit=done&intl=1&tt=urltext&trtext=hello+this+is+a+really+long+message+hello+ $+ $&
    this+is+a+really+long+message+this+is+an+even+longer+message+so+lets+see+what+happens&lp=en_fr&btnTrTxt=Translate
  %sw POST /tr HTTP/1.1
  %sw Referer: http://world.altavista.com/tr
  %sw Content-Type: application/x-www-form-urlencoded
  %sw Host: world.altavista.com
  %sw Content-Length: $len(%data)
  %sw Connection: close
  %sw
  %sw %data
}

on *:sockread:alta:{
  if ($sockerr) return
  var %t
  sockread %t
  echo -s > %t
}


Gone.
Joined: Oct 2005
Posts: 1,741
G
Hoopy frood
OP Offline
My code was essentially the same as yours, but it still wouldn't accept long strings. Here is the code that I have right now:
Code:
alias babel {
  if (!$1) {
    echo -a Syntax: /babel <words to translate here, max 700 chars>
    return
  }

  set %babel $left($1-,700)
  if ($sock(babel)) sockclose babel
  sockopen babel world.altavista.com 80
  if ($sockerr) babelfail
  .timerbabel 1 30 babelfail
}

alias babelfail {
  echo -a Error: Unable to connect to babelfish
  .timerbabel off
  unset %babel
  sockclose babel
}

on *:SOCKOPEN:babel:{
  if ($sockerr) {
    babelfail
    return
  }
  if (!%babel) {
    sockclose babel
    echo -a Error: No text to translate
    return
  }

  var %type = en_fr
  var %data = doit=done&intl=1&tt=urltext&trtext= $+ $replace(%babel,$chr(32),+) $+ &lp= $+ %type $+ &btnTrTxt=Translate
  var %write = sockwrite -tn $sockname

  %write POST /tr HTTP/1.1
  %write Referer: http://world.altavista.com/tr
  %write Content-Type: application/x-www-form-urlencoded
  %write Host: world.altavista.com
  %write Content-Length: $len(%data)
  %write Connection: close
  %write
  %write %data
}

on *:SOCKCLOSE:babel:{
  echo -s Error: Babelfish closed the connection
  .timerbabel off
  unset %babel
}

on *:SOCKREAD:babel:{
  var %sread
  sockread %sread

  var %regex = /<td bgcolor=white class=s><div style=padding:10px;>(.*)<\/div><\/td>/Si
  if ($regex(babel,%sread,%regex)) {
    echo -a " $+ %babel $+ " translates to " $+ $regml(babel,1) $+ "
    .timerbabel off
    unset %babel
    sockclose $sockname
  }
}


Here are examples of usage:

/babel hello
"hello" translates to "bonjour"

/babel I like cat food
"I like cat food" translates to "J'aime des aliments pour chats"

/babel Twas the night before christmas and all through the house not a creature was stirring
Error: Babelfish closed the connection


-genius_at_work

Joined: Feb 2004
Posts: 2,019
Hoopy frood
Offline
The error lies in your regex. You still get the correct response; this is what I got:

<td bgcolor=white class=s><div style=padding:10px;>Twas la nuit avant Noël et tout par la maison pas que une créature
remuait</div></td>

(that's on two lines)

Since your regex doesn't match, the script never closes the socket itself, so the sockclose event triggers when the server closes the connection, leading you to believe it didn't work.


Gone.
Joined: Oct 2005
Posts: 1,741
G
Hoopy frood
OP Offline
Thanks FiberOPtics. I should have caught that myself. I changed the SOCKREAD event, and now it works fine.

Code:
on *:SOCKREAD:babel:{
  var %sread
  sockread %sread

  if (%babelcontinue) %babelt = %babelt %sread

  var %regex = /<td bgcolor=white class=s><div style=padding:10px;>(.+)$/Si
  if ($regex(babel,%sread,%regex)) {
    set %babelt $regml(babel,1)
    set %babelcontinue 1
  }

  var %regex = /(.+)<\/div><\/td>/Si
  if ($regex(babelb,%babelt,%regex)) {
    set %babelt $regml(babelb,1)
    set %babelcontinue 0
    var %babeldone 1
  }

  if (%babeldone) {
    echo 4 -a " $+ %babel $+ " 
    echo 4 -a ...translates to...
    echo 4 -a " $+ %babelt $+ "
    .timerbabel off
    unset %babel*
    sockclose $sockname
  }
}


-genius_at_work

Last edited by genius_at_work; 25/12/05 04:08 AM.
Joined: Feb 2004
Posts: 2,019
Hoopy frood
Offline
No problem.

By the way, the $replace(%babel,$chr(32),+) that you have there to urlencode your string is not sufficient, as it won't deal with special characters (every char aside from alphanumerics and a few other chars) that need encoding to hex. Look on google to find out exactly which chars need encoding and which do not.


Gone.
Joined: Oct 2005
Posts: 1,741
G
Hoopy frood
OP Offline
I read that I needed to replace the non-alnum characters, but the code seems to work without it (on that particular site at least).

I tried using a regsub similar to this to convert all non-alnum chars (excluding +'s) to their hex equivalent, but it doesn't seem to work. The \1 isn't evaluating in the identifiers.

$regsub(babel,%babel,/([^+\w])/Sg,$base($asc(\1),10,16))

Any idea what I'm doing wrong there?

-genius_at_work

Joined: Feb 2004
Posts: 2,019
Hoopy frood
Offline
Well, multiple things, and by the time I've shown you all the problems with $regsub, you'll realize $regsub isn't the way to go for this, rather a while loop with $regex and $replace.

  • $regsub takes as last parameter a variable which will contain the substituted string.

    //var %a, %b = $regsub(babel,<string>,/<regex>/,<substitution>,%a)
  • As on any other occasion, before mIRC calls a command/identifier, it checks whether it contains parameters that need evaluation. Just like //echo -a $mid($me,1,-1) first evaluates $me before calling $mid, mIRC will first evaluate your $base before calling $regsub.

    //var %a, %b = $regsub(babel,"one"+!=+'two',/([^+\w])/Sg,$base($asc(\1),10,16),%a) | echo -a %a

    -> Result: 5Cone5C+5C5C+5Ctwo5C

    You immediately notice that the only hex char here is 5C, even though it's obvious we used different non-alphanumerics. What happened?

    mIRC evaluated $base, but before evaluating base it checked to see if it needed to evaluate any parameters in $base, which it found with $asc(\1)

    //echo -a $asc(\1) == $asc(\)

    -> Result: 92 == 92

    As you can see, $asc takes the first char and does the ascii conversion on it.

    Now $base has all its parameters evaluated, and will call the following: $base(92,10,16)

    //echo -a $base(92,10,16)

    -> Result: 5C

    There, we found out what goes wrong: our $base got evaluated before we wanted it to. Instead of $asc(\), we wanted it to evaluate $asc("), etc.
  • So what can we do about it? We can escape the $base and $asc identifiers to prevent them from evaluating now by prefixing them with a \.

    //var %a, %b = $regsub(babel,"one"+!=+'two',/([^+\w])/Sg,\$base(\$asc(\1),10,16),%a) | echo -a %a

    -> Result: $base($asc("),10,16)one$base($asc("),10,16)+$base($asc(!),10,16)$base($asc(=),10,16)+$base($asc('),10,16)two$base($asc('),10,16)

    Now that doesn't look right, but we're already a step closer to our final goal. We see now that the correct chars are put in $asc, and that neither $asc nor $base has evaluated yet. We also see that the $base calls are pasted against each other, so we need to add in some spaces to let them evaluate.

    //var %a, %b = $regsub(babel,"one"+!=+'two',/([^+\w])/Sg,$chr(32) \$base(\$asc(\1),10,16) $chr(32),%a) | echo -a %a

    Result: $base($asc("),10,16) one $base($asc("),10,16) + $base($asc(!),10,16) $base($asc(=),10,16) + $base($asc('),10,16) two $base($asc('),10,16)

    That's looking better again, each $base will be able to evaluate when we use $eval, but there is one last thing that we notice. $base($asc("),10,16) one $base($asc("),10,16) <-- because of the spaces, the end result will be something like 22 one 22, but we'd want it to be 22one22.

    So let's add some $+'s in there to concatenate the strings back together once $base has evaluated, but again we will need to escape them from evaluating now.

    //var %a, %b = $regsub(babel,"one"+!=+'two',/([^+\w])/Sg,$chr(32) \$+ \$base(\$asc(\1),10,16) \$+ $chr(32),%a) | echo -a %a

    Result: $+ $base($asc("),10,16) $+ one $+ $base($asc("),10,16) $+ + $+ $base($asc(!),10,16) $+ $+ $base($asc(=),10,16) $+ + $+ $base($asc('),10,16) $+ two $+ $base($asc('),10,16) $+

    That looks about good, so let's go to the final step.
  • //var %a, %b = $regsub(babel,"one"+!=+'two',/([^+\w])/Sg,$chr(32) \$+ \$base(\$asc(\1),10,16) \$+ $chr(32),%a) | echo -a $eval(%a,2)

    Result: 22one22+213D+27two27

    That looks quite good!

    However... there are two important downsides to this:

    1) The short string "one"+!=+'two' must be converted to the much longer string $+ $base($asc("),10,16) $+ one $+ $base($asc("),10,16) $+ + $+ $base($asc(!),10,16) $+ $+ $base($asc(=),10,16) $+ + $+ $base($asc('),10,16) $+ two $+ $base($asc('),10,16) $+. Since mIRC limits how long a string can be (the "string too long" error), you will hit that limit much sooner than you would with alternative methods.

    2) $eval(<params>,2) will force evaluation of depth two on the string. This means that a string containing characters in mIRC with a special meaning, will suddenly be evaluated, when we don't want this.

    //var %a = what's $!pi ? | echo -a %a ** $eval(%a,2)

    Result: what's $pi ? ** what's 3.14159265358979323846 ?

    Needless to say, we wanted to see the left side of the **, not the right side.

    There are of course possible ways around it, but it is all more a justification not to use $regsub for something like this.
  • On a final note, if you want to urlencode a string, it's not enough to change the characters to hex; each hex value must be preceded by a % sign. So let's replace the first $chr(32) with a %

    //var %a, %b = $regsub(babel,"one"+!=+'two',/([^+\w])/Sg,% \$+ \$base(\$asc(\1),10,16) \$+ $chr(32),%a) | echo -a $eval(%a,2)

    Result: "one"+%21%3D+%27two%27 <-- Forum bug, "one" should be % 22 one % 22 without the spaces.
  • Even though we finally arrived at the desired result, the code that we used to get there is hackish and ugly to say the least, so let's take a look at an alternative method.

    Code:
    alias urlencode {
      var %u = $replace($1-,%,% $+ 25), %s = $ticks
      while ($regex(%s,%u,/([^% \w])/g)) %u = $replace(%u,$regml(%s,1),% $+ $base($asc($regml(%s,1)),10,16,2))
      return $replace(%u,$chr(32),+)
    }


    //echo -a $urlencode("one" != 'two')

    --> "one"+%21%3D+%27two%27 <-- Forum bug, "one" should be % 22 one % 22 without the spaces.

    Note that a regex isn't necessary, you could also do an ordinary while loop, looping through each character of the string, and replacing it with a hex char where necessary.

    I'm also not implying that this is the ultimate urlencoder, but it will do for general purposes. Note that a few non-alphanumerics don't strictly need converting, but in the current version they are converted anyway.
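
    For reference, here is a rough sketch of the ordinary while loop mentioned above, without any regex. The alias name urlencode2 and its variable names are made up for illustration and are not part of the code earlier in this post. It assumes single-byte characters, and it encodes every non-alphanumeric (including the underscore, which the regex version leaves alone), so treat it as a starting point rather than a finished encoder.

```mirc
; Hypothetical alias (illustration only): urlencode with a plain while loop.
; Assumes single-byte characters; multi-byte text would need per-byte handling.
alias urlencode2 {
  var %in = $1-
  var %out, %i = 1, %a
  while (%i <= $len(%in)) {
    ; Work with the character code rather than the character itself,
    ; so a lone space never has to be stored in a variable.
    %a = $asc($mid(%in,%i,1))
    ; Space becomes + (character code 43).
    if (%a == 32) %out = %out $+ $chr(43)
    ; Digits and ASCII letters are copied through unchanged.
    elseif ((%a isnum 48-57) || (%a isnum 65-90) || (%a isnum 97-122)) %out = %out $+ $chr(%a)
    ; Everything else becomes % followed by a two-digit hex value.
    else %out = %out $+ $chr(37) $+ $base(%a,10,16,2)
    inc %i
  }
  return %out
}
```

    It can be tried the same way as the regex version, e.g. //echo -a $urlencode2("one" != 'two'). The loop only ever appends to the output, so nothing is re-scanned, but it is still subject to the same string length limit as any other mIRC variable.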


EDIT: Damn these forums, there is yet another bug, for some reason it doesn't want to display the % 22 one % 22 instead it shows it as "one" grrrrr!!! So in the last two results, the first part should not be "one" but % 22 one % 22 without the spaces. Even though I type it correctly, the boards transform it, so when I edit my post, it doesn't even show it in its original form anymore, but converted to "one".


Gone.
Joined: Oct 2005
Posts: 1,741
G
Hoopy frood
OP Offline
Thanks for the help again, FiberOPtics. I haven't used the $regsub very often in the past, but I figured it wasn't working due to the evaluation in mIRC. Your workaround worked perfectly.

-genius_at_work

