Posted By: Knoeki Gzip decompression - 13/01/10 09:21 PM
It'd be very nice if mIRC were able to decompress gzipped data coming in from a socket, since APIs for sites generally send back data in gzip-compressed format, and as far as I know there is currently no way to handle this.

Alternatively, if someone knows a workaround, I'm interested. :_)
Posted By: Zed Re: Gzip decompression - 17/01/10 05:12 PM
I have found a solution, but it means modifying 2 bytes in mirc.exe. This disables "deflate" checksum verification and allows you to feed $decompress() with gzip-compressed data (after a bcopy).
The validity of the decompressed data can then be verified with $crc(), but normal deflate format will lose any checksum verification.

If you're connecting over HTTP, another solution can be to simply use:
Accept-Encoding: deflate
instead of:
Accept-Encoding: gzip,deflate

If the site(s) you connect to support deflate, that's enough. But most sites only support gzip.
Posted By: s00p Re: Gzip decompression - 18/01/10 02:17 PM
Uhh, while we're at it, bzip2 would be nice? cab, zip, rar and ace? OK maybe a few of those are patented :P
Posted By: Riamus2 Re: Gzip decompression - 18/01/10 04:43 PM
Typically, if you want to work with a format that mIRC doesn't natively support, you need to either write the binary data to a file and then run another program to handle the decompression, or else write/use some form of DLL that will do the work for you. mIRC can't support everything out there.

That said, gzip *is* a very common format and would probably be worthwhile to include.

I may be wrong, but I think you can usually just not include gzip in the accept sockwrite and most sites will send it as text instead of gzip so that you don't need to worry about decompressing anything. Might be slightly slower getting the data, but shouldn't be too noticeable if you're just socketing a page and not a large file.
Posted By: Zed Re: Gzip decompression - 18/01/10 05:37 PM
A little remark about gzip and mirc. Gzip is an encapsulation of compressed data. This compressed part is exactly what mirc reads and writes with $compress() $decompress(), except that it uses zlib deflate encapsulation. Note: zlib/deflate encapsulation is 6 bytes, and gzip encapsulation is 18 bytes (with no filename)
Both formats, gzip and deflate, use encapsulation with checksum verification (ADLER32 for deflate, and CRC32 for gzip). These methods are incompatible, so you can't simply dress gzip as deflate since you can't convert CRC32 to ADLER32. I've debugged mirc, and have found the location where a call to ADLER32 is done. By deactivating it I can feed $decompress() with deflate data or with gzip dressed as deflate. For gzip, I can then verify the CRC32 of the decompressed data. But for native deflate data, since $adler32() doesn't exist, verification is definitively lost.
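
The byte counts and checksums described above can be checked outside mIRC. Below is a minimal sketch in Python, used here only because its zlib binding exposes the raw stream; it is not mIRC code, and the names payload, inflate_raw, zlib_stream and gzip_stream are illustrative. It assumes a gzip stream with no filename or other optional header fields, as in the 18-byte figure above.

```python
import gzip
import zlib

payload = b"hello hello hello hello"

zlib_stream = zlib.compress(payload, 6)   # the format HTTP calls "deflate"
gzip_stream = gzip.compress(payload)      # no filename stored in the header


def inflate_raw(data):
    """Inflate a bare DEFLATE stream (negative wbits: no header, no checksum)."""
    return zlib.decompress(data, wbits=-zlib.MAX_WBITS)


# zlib envelope: 2-byte header + 4-byte big-endian Adler-32 = 6 bytes
zlib_body = zlib_stream[2:-4]
zlib_adler = int.from_bytes(zlib_stream[-4:], "big")

# gzip envelope: 10-byte header + 4-byte CRC-32 + 4-byte length
# (both little-endian) = 18 bytes
gzip_body = gzip_stream[10:-8]
gzip_crc = int.from_bytes(gzip_stream[-8:-4], "little")
gzip_len = int.from_bytes(gzip_stream[-4:], "little")

# Both bodies are plain DEFLATE data that inflate to the same payload.
print(inflate_raw(zlib_body) == payload, inflate_raw(gzip_body) == payload)
# The two trailers carry different checksums of the same payload.
print(zlib_adler == zlib.adler32(payload), gzip_crc == zlib.crc32(payload))
```

Since Adler-32 and CRC-32 are both computed over the uncompressed data, one can only be derived from the other after decompressing, which is why simply re-wrapping a gzip body in a zlib envelope fails the built-in check.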

Now, why doesn't mirc support gzip encoding?
mirc uses its own embedded version of the zlib library. This library natively supports both formats (encoding and decoding in gzip and zlib/deflate format). It should be easy to enable gzip awareness in mirc.
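
As an illustration of the zlib library handling both formats, here is a small Python sketch. It uses Python's zlib binding, not mIRC's embedded copy, so it only shows what the library itself offers; whether mIRC's build exposes the same switch is an assumption. The wbits values are the documented way to select the expected envelope, and the variable names are illustrative.

```python
import gzip
import zlib

payload = b"<resp>example body</resp>" * 4
as_zlib = zlib.compress(payload)   # zlib/deflate envelope
as_gzip = gzip.compress(payload)   # gzip envelope

# Default window bits: expects a zlib envelope only.
from_zlib = zlib.decompress(as_zlib, wbits=zlib.MAX_WBITS)

# Adding 16 tells zlib to expect a gzip envelope and check its CRC-32.
from_gzip = zlib.decompress(as_gzip, wbits=16 + zlib.MAX_WBITS)

# Adding 32 makes zlib detect the envelope automatically.
auto_zlib = zlib.decompress(as_zlib, wbits=32 + zlib.MAX_WBITS)
auto_gzip = zlib.decompress(as_gzip, wbits=32 + zlib.MAX_WBITS)

# Feeding gzip data to the zlib-only mode fails, which matches what
# $decompress() does with gzip input.
try:
    zlib.decompress(as_gzip)
    gzip_rejected = False
except zlib.error:
    gzip_rejected = True

print(from_zlib == from_gzip == auto_zlib == auto_gzip == payload, gzip_rejected)
```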
Posted By: Riamus2 Re: Gzip decompression - 18/01/10 05:57 PM
Good explanation of it, though I wasn't referring to using a built-in mIRC function when I suggested decompressing it by writing to a file and using another program to decompress it or using a DLL for the purpose. Still, if the functionality is already there, but "disabled", I don't see why it can't just get enabled for use. Should be easy enough to do.

I still think you can remove gzip accepted file types and avoid the issue in most situations.
Posted By: Knoeki Re: Gzip decompression - 21/01/10 02:57 PM
Thanks for the replies! :_)

Originally Posted By: Riamus2

I may be wrong, but I think you can usually just not include gzip in the accept sockwrite and most sites will send it as text instead of gzip so that you don't need to worry about decompressing anything. Might be slightly slower getting the data, but shouldn't be too noticeable if you're just socketing a page and not a large file.


Well, I've tried that. Problem is, some sites *force* it, such as the Discogs API.
Posted By: Zed Re: Gzip decompression - 21/01/10 05:02 PM
Originally Posted By: Knoeki
Well, I've tried that. Problem is, some sites *force* it, such as the Discogs API.


What http headers are you sending?
I've never seen a server which forces an encoding that was not allowed by the browser.

Anyway, if you don't mind patching mirc.exe and using my "solution", you could use gzip encoding. Which version are you using?
Posted By: Knoeki Re: Gzip decompression - 22/01/10 06:52 AM
Originally Posted By: Zed
Originally Posted By: Knoeki
Well, I've tried that. Problem is, some sites *force* it, such as the Discogs API.


What http headers are you sending?
I've never seen a server which forces an encoding that was not allowed by the browser.


This is the exact bit of code I'm using:

Code:
alias discogs {
    sockclose discogs
    set %discogs.get release/15
    set %discogs.count 1
    sockopen discogs www.discogs.com 80
}

on *:SOCKOPEN:discogs: {
    sockwrite -n $sockname GET $+(/,%discogs.get,?f=xml&api_key=6757824fd9) HTTP/1.1
    sockwrite -n $sockname Host: www.discogs.com
    sockwrite -n $sockname User-Agent: Mozilla/5.0 
    sockwrite -n $sockname Connection: keep-alive
    sockwrite -n $sockname Accept-Encoding: deflate
    sockwrite -n $sockname $crlf
}

on *:SOCKREAD:discogs: {
    sockread %raw
    echo -s %raw
} 


Okay, it doesn't literally *force* me to use gzip, but it does tell me the client doesn't accept gzip encoding, instead of the data I want to retrieve.

Quote:
Anyway, if you don't mind patching mirc.exe and using my "solution", you could use gzip encoding. Which version are you using?


I'm using 6.35. Somehow I have my doubts about patching though, I'm somewhat afraid it will break things.
Posted By: Riamus2 Re: Gzip decompression - 22/01/10 12:31 PM
Originally Posted By: Knoeki
I'm using 6.35. Somehow I have my doubts about patching though, I'm somewhat afraid it will break things.


I don't think it's really acceptable in the EULA anyhow. :)

It's strange that it won't work without gzip. Unfortunately I don't know if there's any way around that since I don't know all the tricks with http headers.
Posted By: Zed Re: Gzip decompression - 22/01/10 01:37 PM
Originally Posted By: Knoeki
Okay, it doesn't literally *force* me to use gzip, but it does tell me the client doesn't accept gzip encoding, instead of the data I want to retrieve.

Can you give me a sample URL which replies with this error?

Originally Posted By: Knoeki
Quote:
Anyway, if you don't mind patching mirc.exe and using my "solution", you could use gzip encoding. Which version are you using?

I'm using 6.35. Somehow I have my doubts about patching though, I'm somewhat afraid it will break things.

It's up to you. The modification only changes 2 bytes. And it doesn't target anything else.

The last solution, if the server response can't be fixed for "deflate" requests, is to remove the "Accept-Encoding:" header from your requests.
Posted By: Knoeki Re: Gzip decompression - 24/01/10 08:14 PM
Originally Posted By: Zed
Originally Posted By: Knoeki
Okay, it doesn't literally *force* me to use gzip, but it does tell me the client doesn't accept gzip encoding, instead of the data I want to retrieve.

Can you give me a sample URL which replies with this error?


Of course:

Code:
alias discogs {
    sockclose discogs
    set %discogs.get release/15
    set %discogs.count 1
    sockopen discogs www.discogs.com 80
}

on *:SOCKOPEN:discogs: {
    sockwrite -n $sockname GET $+(/,%discogs.get,?f=xml&api_key=6757824fd9) HTTP/1.1
    sockwrite -n $sockname Host: www.discogs.com
    sockwrite -n $sockname User-Agent: Mozilla/5.0 
    sockwrite -n $sockname Connection: keep-alive
    sockwrite -n $sockname Accept-Encoding: gzip
    sockwrite -n $sockname $crlf
}

on *:SOCKREAD:discogs: {
    sockread %raw
    echo -s %raw
}



Quote:
Originally Posted By: Knoeki
Quote:
Anyway, if you don't mind patching mirc.exe and using my "solution", you could use gzip encoding. Which version are you using?

I'm using 6.35. Somehow I have my doubts about patching though, I'm somewhat afraid it will break things.

It's up to you. The modification only changes 2 bytes. And it doesn't target anything else.

The last solution, if the server response can't be fixed for "deflate" requests, is to remove the "Accept-Encoding:" header from your requests.


That won't work either; some sites really, really want you to accept gzip encoding.
Posted By: argv0 Re: Gzip decompression - 25/01/10 10:37 AM
The workaround, if it was missed in the discussion, is to write the raw data to a file and run a gunzip program on it. Alternatively, you can use your own gzip wrapper dll file instead of calling an external exe, though it would be functionally the same.

Posted By: Zed Re: Gzip decompression - 26/01/10 04:23 PM
Originally Posted By: Knoeki
that won't work either, some sites really really want you to accept gzip encoding.


OK, I see.

Here is how to patch mirc.exe 6.35
- make a backup copy of mirc.exe
- with a hex editor (you can find a free one here: http://frhed.sourceforge.net/ ) edit mirc.exe (it must not be running).
- change byte at position 6354C: from 85 to 33 (context: C4 1C 33 C9 85 C0 0F 95)
- change byte at position 1D6752: from 74 to EB (context: 46 18 74 09 C7 41 18 90)
- save file

How to use the new gzip capability?
I have a remote script that handles http requests (as many concurrent requests as you like). Each socket has its own name and its own hash table (same name). The hash table is populated with a binvar item ('headers') for the headers, and another one ('ret') for the body.
When the socket is closed, here is what decompression looks like:

Code:

    ; Run this only if there was no HTTP or socket error.
    ; Load the stored response headers into the binvar &tmphead.
    noop $hget($sockname,headers,&tmphead)

    if ($bfind(&tmphead,1,Content-Encoding: deflate)) && ($hget($sockname,ret,&_dec)) {
      if ($decompress(&_dec,b)) hadd -b $sockname ret &_dec
    }
    elseif ($bfind(&tmphead,1,Content-Encoding: gzip)) && ($hget($sockname,ret,&_dec)) {
      var %crc = $bvar(&_dec, $calc($ifmatch -7) ,4), %i = 4, %crc2
      while %i { %crc2 = %crc2 $+ $base( $gettok(%crc,%i,32) ,10,16,2) | dec %i }
      if (%crc2) {
        bcopy &_dec 3 &_dec 11 -1
        bset &_dec 1 120 156
        if ($decompress(&_dec,b)) && ($crc(&_dec,1) == %crc2) hadd -b $sockname ret &_dec
      }
      else hadd $sockname ret
    }
    bunset &_dec


Notes:
- gzip compressed data has to be dressed as deflate data. crc32 is extracted and verified after decompression.
- standard deflate ($compress()) format has lost its CRC capability on decompression.
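
For comparison, the same "strip the gzip envelope, inflate, then verify CRC32" idea can be sketched in Python without any patching. This is only an illustration of the technique, not part of the mIRC script; gunzip_manual is a hypothetical helper name. Unlike the script above, which assumes a fixed 10-byte header, the sketch also skips the optional header fields.

```python
import gzip
import struct
import zlib

# Bits of the gzip header flag byte (byte 4 of the header).
FHCRC, FEXTRA, FNAME, FCOMMENT = 2, 4, 8, 16


def gunzip_manual(data: bytes) -> bytes:
    """Unwrap one gzip member by hand and verify its CRC-32 trailer."""
    if data[:2] != b"\x1f\x8b" or data[2] != 8:
        raise ValueError("not a gzip stream using DEFLATE")
    flags = data[3]
    pos = 10                                   # fixed part of the header
    if flags & FEXTRA:                         # 2-byte length + extra field
        (xlen,) = struct.unpack_from("<H", data, pos)
        pos += 2 + xlen
    if flags & FNAME:                          # zero-terminated file name
        pos = data.index(b"\x00", pos) + 1
    if flags & FCOMMENT:                       # zero-terminated comment
        pos = data.index(b"\x00", pos) + 1
    if flags & FHCRC:                          # 2-byte header checksum
        pos += 2

    # Negative wbits: raw DEFLATE, so no built-in checksum is applied.
    inflater = zlib.decompressobj(wbits=-zlib.MAX_WBITS)
    body = inflater.decompress(data[pos:]) + inflater.flush()

    # Whatever follows the DEFLATE stream is the 8-byte gzip trailer.
    trailer = inflater.unused_data[:8]
    if not inflater.eof or len(trailer) < 8:
        raise ValueError("truncated gzip stream")
    crc, size = struct.unpack("<II", trailer)  # both little-endian
    if zlib.crc32(body) != crc or (len(body) & 0xFFFFFFFF) != size:
        raise ValueError("CRC-32 or length mismatch")
    return body


blob = gzip.compress(b"example response body")
print(gunzip_manual(blob))
```

The checksum comparison at the end plays the same role as the $crc() test in the script: the inflate step itself no longer checks anything, so the caller has to.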
Posted By: MeStinkBAD Re: Gzip decompression - 07/02/10 08:54 AM
You people do know of CURL right?
Posted By: tontito Re: Gzip decompression - 27/02/10 01:40 AM
Yes, but it would be interesting to have the native support on mirc.

Best regards
Posted By: FroggieDaFrog Re: Gzip decompression - 07/09/17 12:34 AM
Even though this request is 7 years old, I'd like to see it implemented, especially now with the overwhelming rise of HTTP. Many HTTP servers are forcing GZIP compression even if it's not requested, and currently mIRC does not have a suitable way to decompress such received data.

mIRC already supports the underlying methodology via INFLATE/DEFLATE, aside from the checksum performed after decompression (adler32 for inflate; crc32 for gzip).

I suggest either of the following for gzip decompression:
Code:
;; .gzip to indicate gzip data is being decompressed
$decompress(file|&bvar, b).gzip


Code:
;; n: disable the checksum after decompression
;;     It would be on the scripter to remove (gzip-specific) headers
;;     and to calculate & verify the checksum after decompression
$decompress(file|&bvar, n)



This feature request has had quite a bit of support over the years:
https://forums.mirc.com/ubbthreads.php/ubb/showflat/Number/216390
https://forums.mirc.com/ubbthreads.php/ubb/showflat/Number/99485
https://forums.mirc.com/ubbthreads.php/ubb/showflat/Number/74303
https://forums.mirc.com/ubbthreads.php/ubb/showflat/Number/46462