#264831 15/01/19 07:27 PM
Joined: Aug 2003
Posts: 319
Pan-dimensional mouse
OP Offline
$zip looks like a VERY useful enhancement, so thanks for adding it. However I am unclear what range of compressed file formats it supports (other than .zip)?

I have a particular requirement for reading .rar files as well as .zip.

How difficult would it be to convert this feature to use libarchive instead so that it would also include support for reading .rar and other formats as documented at https://github.com/libarchive/libarchive/wiki/LibarchiveFormats/ ?

Protopia #264839 16/01/19 12:53 PM
Joined: Dec 2002
Posts: 5,411
Hoopy frood
Offline
$zip() only supports zip files, with or without AES, and no other formats.

I did try libarchive, unfortunately I just could not make it work. Everything from compilation issues to not being able to figure out how to use the API to zip files with AES. I eventually gave up :-] I may look at it again in future.

Khaled #264852 18/01/19 12:27 AM
Joined: Feb 2003
Posts: 2,812
Hoopy frood
Offline
Would it be possible to add AES support to $encode/$decode() as well? I assume your crypto lib for SSL and now ZIP could have their functions repurposed.


Well. At least I won lunch.
Good philosophy, see good in bad, I like!
Raccoon #264853 18/01/19 02:04 AM
Joined: Jan 2004
Posts: 2,127
Hoopy frood
Offline
A couple suggestions for $zip

1. The 'o' switch shouldn't be required when the export location already exists as a folder name; it should be required only for overwriting actual files, which shouldn't include the dir-name entry inside the zip. As it stands, it's impossible to unzip into an existing folder without using the 'o' switch, which puts you at risk of overwriting an actual file. The error message in this case is also confusing: this isn't an invalid-parameters situation, just as it isn't invalid parameters when /copy is used without -o and the output already exists.

2. Have a switch or .prop which reveals the contents of the zip to a script. Since $zip is possibly interacting with zips created with 7zip or other utilities, it's likely someone would receive a .zip via DCC which has unknown contents. This snippet is just a quick attempt to read the contents of a zip, so it won't work on all .zip files. For example, it doesn't attempt to read the filesize from a zip64-format archive holding files whose original size is larger than 4 GB, and it won't touch a zip which has a non-zero-length comment at the very end of the file. It saves the info to a hashtable, and also returns a tab-delimited list of filenames when called as an identifier.
Code:
; Syntax: $ziplist(zipname,hashtablename)
; Frees then recreates the hashtable, adding one item per zip entry (items 1..N).
; Returns a tab-delimited list of the filenames (directory entries are left out of the list).
; Hashtable format: item = counter_integer, data = <filesize|DIR> <filename>
ziplist {
  var %size $file($1).size , %list , %ptr 1 , %counter 0 , %totsize 0
  if (%size !isnum 100-) { var %err Invalid zip | goto fail }
  bread $qt($1) $calc(%size -6) 6 &maroon.ziplist
  var %comlen $bvar(&maroon.ziplist,5).word , %cdirloc $bvar(&maroon.ziplist,1).long
  if (%comlen != 0) { var %err non-zero comment | goto fail }
  var %cdirsize $calc(%size - %cdirloc -6)
  if (%cdirsize !isnum 5-65536) { var %err invalid central dir | goto fail }
  bread $qt($1) $calc(%cdirloc) $calc(%cdirsize +7) &maroon.ziplist
  if ($bvar(&maroon.ziplist,0) != $calc(%cdirsize +6)) { var %err invalid zip | goto fail }
  while ($bvar(&maroon.ziplist,%ptr).long == 33639248) {
    var %modtime $base($bvar(&maroon.ziplist,$calc(%ptr +12)).word,10,16,4)
    var %moddate $base($bvar(&maroon.ziplist,$calc(%ptr +14)).word,10,16,4)
    var %crc     $base($bvar(&maroon.ziplist,$calc(%ptr +16)).long,10,16,8)
    var %arcsize $bvar(&maroon.ziplist,$calc(%ptr +20)).long
    var %orisize $bvar(&maroon.ziplist,$calc(%ptr +24)).long
    var %fnamlen $bvar(&maroon.ziplist,$calc(%ptr +28)).word
    var %xtralen $bvar(&maroon.ziplist,$calc(%ptr +30)).word
    var %comtlen $bvar(&maroon.ziplist,$calc(%ptr +32)).word
    var %intattr $bvar(&maroon.ziplist,$calc(%ptr +36)).word
    var %extattr $bvar(&maroon.ziplist,$calc(%ptr +38)).long
    inc %totsize %orisize
    inc %counter | if (%counter == 1) { hfree -w $2 | hmake $2 | echo -s $ $+ ziplist contents listing of $1 }
    hadd $2 %counter $iif($isbit(%extattr,5),DIR,%orisize) $bvar(&maroon.ziplist,$calc(%ptr + 46),%fnamlen).text
    if (!$isbit(%extattr,5)) var %list $addtok(%list,$bvar(&maroon.ziplist,$calc(%ptr + 46),%fnamlen).text,9)
    inc %ptr $calc(46+ %fnamlen + %xtralen + %comtlen)
  }
  if (%counter == 0) { var %err no contents found | goto fail }
  echo -sc info2 *ziplist %counter items for total size %totsize found in $1 and listed in hashtable $2
  return %list
  :fail | echo -sc info2 *$ziplist error: %err syntax: $ $+ ziplist(zipname,hashtable) | halt
}
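
For comparison outside mIRC, here is a minimal Python sketch of the same kind of listing. It relies on the standard `zipfile` module, which reads the central directory itself and copes with zip64 sizes and trailing archive comments, the two cases the snippet above skips. The function names and the sample archive contents are hypothetical, chosen only for illustration.

```python
import io
import zipfile


def list_zip(source):
    """Return (name, size_or_DIR) pairs for every entry in a zip archive.

    `source` may be a path or a file-like object.
    """
    entries = []
    with zipfile.ZipFile(source) as archive:
        for info in archive.infolist():
            # Directory entries are reported as "DIR", mirroring the
            # hashtable format used by the $ziplist snippet.
            size = "DIR" if info.is_dir() else info.file_size
            entries.append((info.filename, size))
    return entries


def make_sample_zip():
    """Build a small in-memory zip (hypothetical contents) with a comment."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("docs/", "")
        archive.writestr("docs/readme.txt", "hello world")
        archive.comment = b"archive comment"
    buffer.seek(0)
    return buffer
```

Calling `list_zip(make_sample_zip())` yields one pair per entry, with the folder entry marked as `DIR` and the file reported by its original size.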

Raccoon #264855 18/01/19 09:11 AM
Joined: Dec 2002
Posts: 5,411
Hoopy frood
Offline
Quote:
Would it be possible to add AES support to $encode/$decode() as well? I assume your crypto lib for SSL and now ZIP could have their functions repurposed.

This is not possible as the AES support is built into the zip feature of the libzip package. It is not something that mIRC handles. I might be able to add AES support to $encode/$decode separately in future.

maroon #264858 18/01/19 12:49 PM
Joined: Dec 2002
Posts: 5,411
Hoopy frood
Offline
Quote:
The 'o' switch shouldn't be required when the export location exists as a foldername

I personally want the current behaviour. If I am extracting a zip to an existing folder, it could overwrite files or change folder structure. So I chose to make $zip() fail if the file or folder exists, unless 'o' is used. That is the purpose of 'o' in this case. If you need a different behaviour, I am afraid you will need to script it.

As for the invalid parameter error, I will change this in the next version to report a file error.
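
The "script it yourself" route could look roughly like the following Python sketch: check which archive members would replace an existing file, and only extract when there are none. This is an illustration of the idea, not how $zip() works internally; `overwrite_conflicts` and `extract_if_safe` are hypothetical names.

```python
import os
import tempfile
import zipfile


def overwrite_conflicts(zip_path, dest_dir):
    """List archive members that already exist as files under dest_dir."""
    conflicts = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # an existing folder is not an overwrite
            target = os.path.join(dest_dir, *info.filename.split("/"))
            if os.path.isfile(target):
                conflicts.append(info.filename)
    return conflicts


def extract_if_safe(zip_path, dest_dir):
    """Extract into dest_dir only when no existing file would be replaced."""
    conflicts = overwrite_conflicts(zip_path, dest_dir)
    if conflicts:
        return False, conflicts
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
    return True, []
```

With this split, extracting into an existing but empty folder succeeds, while a second extraction into the same folder is refused and reports the files that would have been overwritten.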

Khaled #264861 18/01/19 06:53 PM
Joined: Jan 2004
Posts: 2,127
Hoopy frood
Offline
$zip is failing to extract files if the filename in the central dir is different than in the local header, but that's probably a good thing.

But $zip should filter filenames containing ..

This trick doesn't seem to work when the filename begins with a leading slash, nor when it begins with a drive letter, but it does work when the filename contains ..

file doesn't exist:
Code:
//echo -a $isfile(Scripts\script999.mrc)

create dummy file:
Code:
/write -c zzzzzzzzzzzzzzzzzzzzzzzzzzzz.mrc test

add it into a zip:
Code:
//remove test.zip | echo -a $zip(test.zip,c,zzzzzzzzzzzzzzzzzzzzzzzzzzzz.mrc)

alter filename inside zip:
Code:
//var %a test\..\..\Scripts\script999.mrc | bread test.zip 0 999 &v | bset -t &v 31 %a | bset -t &v 115 %a | bwrite -c test2.zip 0 999 &v

extract file from zip to non-existing foldername:
Code:
//rmdir nosuchfolder2 | echo -a $zip(test2.zip,e,nosuchfolder2) | echo -a $file(Scripts\script999.mrc).size

The unzipped file now exists in the Scripts folder. Now replace that file with a larger file and extract again; this silently overwrites the existing file, and even creates the file if the Scripts\ subfolder doesn't exist:
Code:
//rmdir nosuchfolder2 | copy -o $qt(mirc.ini) scripts\script999.mrc | echo -a $file(Scripts\script999.mrc).size | echo -a $zip(test2.zip,e,nosuchfolder2) | echo -a $file(Scripts\script999.mrc).size

maroon #264864 19/01/19 10:00 AM
Joined: Dec 2002
Posts: 5,411
Hoopy frood
Offline
Quote:
$zip is failing to extract files if the filename in the central dir is different than in the local header

This is due to how libzip handles this issue, so either way it would not be possible for mIRC to change it.

Quote:
But $zip should filter filenames containing ..

Ye gods. Thanks for spotting this! I never considered that the zip format would allow ".." in path names. I am not sure how best to handle this though. Should mIRC ignore specific files that contain ".." in their path but extract all others? Or should it assume that any zip file that contains ".." in a path name is malicious? I would opt to fail the entire zip file since, if it contains "..", it was most likely intended to be malicious and may contain other malicious files. This change will be in the next beta.
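
The "fail the entire zip" policy can be sketched in a few lines of Python. This is only an illustration of the check, assuming the rule is "reject the archive if any entry name has a `..` path component"; the helper names are hypothetical and this is not mIRC's actual implementation.

```python
import io
import zipfile


def has_parent_refs(source):
    """Return True if any entry name contains a '..' path component."""
    with zipfile.ZipFile(source) as archive:
        for name in archive.namelist():
            # Treat both slash styles as separators before splitting.
            parts = name.replace("\\", "/").split("/")
            if ".." in parts:
                return True
    return False


def make_zip(names):
    """Build an in-memory zip with the given (hypothetical) entry names."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in names:
            archive.writestr(name, "test")
    buffer.seek(0)
    return buffer
```

Splitting on separators rather than searching for the substring `..` means a harmless name such as `a..b.txt` is not flagged, while `test/../../Scripts/script999.mrc` is.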

maroon #264865 19/01/19 01:36 PM
Joined: Feb 2003
Posts: 2,812
Hoopy frood
Offline


Well. At least I won lunch.
Good philosophy, see good in bad, I like!
Raccoon #264882 23/01/19 05:59 AM
Joined: Jan 2004
Posts: 2,127
Hoopy frood
Offline
$zip l

Observing that filenames in the 'l' listing have the / slashes translated to \ format, but the subdir/ entry still has the slash as forward. Keeping a slash at the tail of a subdir entry is useful, identifying it as different from a zero size file.

Observing that the .size property for N=0 is being ignored, and would be useful if it returned total filesize instead of total item count.

This may be a limitation of the zip library, but it would be useful to have a parameter to alter the default compression level. For example specifying method=store when it's expected that the file can't be compressed, but is being zipped for either the password or for the crc32 verification of integrity.

Observing that the zip library isn't doing a fallback to storage when the file can't be compressed. This simplistic example creates a zip which compresses a 256-byte file to 261 bytes instead of storing as 256.

Code:
//var %size 256 | bset &v 1 $regsubex($str(x,%size),/x/g,$calc(\n -1) $chr(32)) | bwrite ascii.dat 0 %size &v | echo -a $zip(ascii.zip,c,ascii.dat)
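
The fallback being asked for can be illustrated with a short Python sketch: compress with raw deflate, and keep the original bytes when deflate does not make them smaller. The 256-byte input matches the contents of ascii.dat above; the function names are hypothetical and this says nothing about what the zip library does internally.

```python
import zlib


def raw_deflate(data, level=6):
    """Compress with raw deflate (no zlib header), as used inside zip entries."""
    compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    return compressor.compress(data) + compressor.flush()


def choose_method(data, level=6):
    """Return ("store", data) when deflate does not shrink the data,
    otherwise ("deflate", compressed)."""
    compressed = raw_deflate(data, level)
    if len(compressed) >= len(data):
        return "store", data
    return "deflate", compressed


# Bytes 0..255 once each: nothing repeats, so deflate cannot shrink it.
incompressible = bytes(range(256))
```

For `incompressible`, `choose_method` picks "store" because the deflate output is larger than the input, while highly repetitive data still goes through deflate.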



Before $zip becomes established, it could be useful for the success return value to be the number N of files added to the zip, instead of copying $compress, which only reports whether its single file was compressed or not.

Confirming that 'e' is doing as you said: it's not extracting anything from a zip containing ../ or ..\. Not sure if it's intentional that in this case 'l' with N=0 reports 0. Not sure if 'l' should at least report filenames so people know why it failed.

maroon #264883 23/01/19 10:00 AM
Joined: Dec 2002
Posts: 5,411
Hoopy frood
Offline
Quote:
Observing that filenames in the 'l' listing have the / slashes translated to \ format, but the subdir/ entry still has the slash as forward. Keeping a slash at the tail of a subdir entry is useful, identifying it as different from a zero size file.

This is actually what is being returned by libzip. mIRC is not making any changes to the results. I could make mIRC fix the slashes but then it wouldn't be what is in the zip file. On the other hand, not fixing the slashes means that scripts have to deal with both \ and / ... This change will be in the next beta.

Quote:
it would be useful to have a parameter to alter the default compression level

I would rather avoid adding more complexity. At today's internet speeds, I would generally opt for faster compression. Especially with $zip(), which does not support multiple CPU threads, etc., so it is going to be relatively slow anyway. I have set $zip() to use the fastest compression level by default.

Quote:
Not sure if it's intentional that in this case 'l' with N=0 reports 0

It is. The zip file is failed at the very earliest stage if it contains "..", so all zip-related features will see a failure. Another option is to handle this like 7zip: in some contexts, 7zip will convert ".." to underscores, in other contexts it will remove "..\" completely from the path (which can result in an empty file/folder name if all it contains is "..\..\..\"), and so on. I could make $zip() do all of this but if a zip file contains a zip exploit, I would rather just fail it consistently across all $zip() features.

Khaled #270668 19/08/22 11:59 PM
Joined: Jan 2004
Posts: 2,127
Hoopy frood
Offline
More feedback on $zip

I see the .crc property I suggested is in effect but not documented. But it's being returned as a base10 integer.

Suggestion #1: Unless for internal use, have $zip().crc return the crc as 8 hex digits like $crc does.
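
As an illustration of the conversion being requested, here is a small Python sketch that turns a base-10 CRC-32 into 8 zero-padded hex digits. The $ziplist snippet earlier in this thread does the same job in mIRC with $base(N,10,16,8). The function names here are hypothetical.

```python
import zlib


def crc_hex(value):
    """Format a CRC-32 given as a base-10 integer as 8 uppercase hex digits."""
    return format(value & 0xFFFFFFFF, "08X")


def crc_of_bytes(data):
    """Compute the CRC-32 of some bytes and return it in 8-digit hex form."""
    return crc_hex(zlib.crc32(data))
```

The zero padding matters for small values: `crc_hex(255)` gives `000000FF` rather than `FF`.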

--

From brute force I see .cm .em .idx also return values. Through trial and error, it seems that .em returns 256 if it's been encrypted by aes, and otherwise is always 0. I have another program which can create encrypted zips using 3 different methods. $zip().em returns for the 3 methods:

256 = encrypted using aes256
256 = encrypted using aes128
0 = encrypted using pkzip 2.0 method
0 = not encrypted

I know the zip format supports a wide range of encryption methods, so it's unreasonable to expect a different .em value for them all. $zip can extract the aes128 and aes256 methods but not the pkzip 2.0 method.

Suggestion #2: Have $zip().em return some other value besides 0 and 256 for "sorry extraction switch 'e' will fail because encryption method not supported".
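
A rough Python sketch of how the three cases could be told apart from the fields stored in the zip itself. It assumes the WinZip AES convention (compression method 99 plus an extra field with header ID 0x9901 whose strength byte is 1, 2 or 3) and treats any other encrypted entry as "pkzip-or-other". The function name and return labels are hypothetical, and this is not a description of what $zip().em does.

```python
import struct

AES_METHOD = 99          # compression-method value used by WinZip AES entries
AES_EXTRA_ID = 0x9901    # extra-field header ID carrying the AES parameters
AES_STRENGTHS = {1: 128, 2: 192, 3: 256}


def encryption_kind(flag_bits, method, extra):
    """Classify an entry as 'none', 'pkzip-or-other', or 'aes-<bits>'."""
    if not flag_bits & 0x1:          # bit 0 of the general-purpose flags = encrypted
        return "none"
    if method != AES_METHOD:
        return "pkzip-or-other"
    offset = 0
    while offset + 4 <= len(extra):
        header_id, size = struct.unpack_from("<HH", extra, offset)
        body = extra[offset + 4 : offset + 4 + size]
        if header_id == AES_EXTRA_ID and len(body) >= 5:
            # body layout: version (2), vendor "AE" (2), strength (1), real method (2)
            strength = AES_STRENGTHS.get(body[4])
            return "aes-%d" % strength if strength else "aes-unknown"
        offset += 4 + size
    return "aes-unknown"
```

In Python's `zipfile`, the three inputs correspond to `ZipInfo.flag_bits`, `ZipInfo.compress_type` and `ZipInfo.extra`, so a script could warn in advance that extraction of a non-AES encrypted entry is likely to fail.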

--

the .cm seems to be compression method, and a zip created by $zip always returns 8. For zips created by others I've got it to return '0' when the file is stored instead of compressed, or return '6' from obsolete methods used by pkzip v1.1. Not sure yet what kind of practical benefit that .cm will be for users.

At first, .idx was always returning 0, until I figured out that when a .zip has more than 1 file in it, .idx is always 1 less than the N parameter used with the 'l' switch in $zip, so this probably won't be much help for users who already know the N parameter they passed to $zip.

--

Suggestion #3

$zip().mtime so users can see what's the timestamp of the file inside the zip.

Suggestion #4

Change current behavior which is to always give the extracted files the timestamp matching the time of extraction instead of the .mtime contained inside the zip. $zip correctly attaches the timestamp when compressing the file, it just doesn't use it when extracting.

Trivia note for interested users: when a file is added to a .zip, the timestamp is rounded down to an even number of seconds, because the date and time are each stored inside the .zip as 16-bit values and there are only 5 bits for the seconds, so the value stored is $calc(seconds//2)
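
The bit layout behind that rounding can be sketched in Python as follows. The field widths are the standard MS-DOS date/time packing used by the zip format (7 bits year since 1980, 4 bits month, 5 bits day; 5 bits hour, 6 bits minute, 5 bits seconds/2). The function names are hypothetical.

```python
def pack_dos_datetime(year, month, day, hour, minute, second):
    """Pack a timestamp into the two 16-bit fields stored in a zip entry."""
    dos_date = ((year - 1980) << 9) | (month << 5) | day
    # Only 5 bits are available for seconds, so half the value is stored.
    dos_time = (hour << 11) | (minute << 5) | (second // 2)
    return dos_date, dos_time


def unpack_dos_datetime(dos_date, dos_time):
    """Reverse pack_dos_datetime; seconds come back as an even number."""
    return (
        (dos_date >> 9) + 1980,
        (dos_date >> 5) & 0x0F,
        dos_date & 0x1F,
        dos_time >> 11,
        (dos_time >> 5) & 0x3F,
        (dos_time & 0x1F) * 2,
    )
```

Round-tripping a time with an odd number of seconds, such as 23:59:37, returns 23:59:36, which is the rounding described above.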

Suggestion #5

You didn't think it's useful to allow a choice of compression level, but there are plenty of situations where it's useful. For example, when creating a .zip for the purpose of long-term storage, or as an attachment where some email systems limit the size of attachments. There can be a wide range of filesizes depending on the utility being used, and I'm not sure what levels are offered by the zip library being used here. My file manager offers a choice of levels 1 through 9, and from comparing how it and $zip handle the current versions.txt...

280814 = filemanager's level 1
282496 = created by $zip(test.zip,co,versions.txt)
268436 = filemanager's level 2
...
230087 = filemanager's level 9

... which shows a 1-vs-9 difference of 18% for this particular file. These ranges of sizes look similar to the level of compression from $compress using 'l' with N=1 thru 9. Assuming versions.txt is in $mircdir :

//var %i 1 | while (%i isnum 1-9) { bread versions.txt 0 9999999 &v | echo -a $compress(&v,bl $+ %i) level %i : $bvar(&v,0) | inc %i }
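
The same kind of level-by-level comparison can be reproduced outside mIRC with Python's `zlib`, as a rough sketch. The sample data below is a hypothetical stand-in for versions.txt, so the sizes it produces are not comparable to the figures quoted above.

```python
import zlib


def sizes_by_level(data):
    """Return {level: compressed_size} for zlib levels 1 through 9."""
    return {level: len(zlib.compress(data, level)) for level in range(1, 10)}


# Hypothetical stand-in for versions.txt: repetitive, text-like data.
sample = b"".join(
    b"%d. Fixed a bug in item %d of the script editor.\r\n" % (i, i * 7)
    for i in range(5000)
)


def report(data):
    """Format one line per level, similar to the /echo loop above."""
    return ["level %d : %d" % (level, size)
            for level, size in sorted(sizes_by_level(data).items())]
```

Running `report(sample)` gives nine lines, one per level, which makes it easy to see how much extra saving the slower levels buy for a given file.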

I can't see a slightly slower compression time being a problem, unless someone is really doing some heavy-duty zipping, like trying to compress 100 megabytes at the same time. Level 1 could always remain the default.

Suggestion #6

You mentioned not wanting to add complexity to $zip, so I assume that precludes being able to selectively add an additional file to an existing .zip, or extracting only a single file from a multi-file zip. Those features are usually associated with complex switches for doing things like adding only when a file is newer, or adding only files within a range of timestamps. Perhaps a compromise would be the ability to use a list-file, which would be the scripter's job to create based on their own needs, such as limiting the filesize added to the zip, date range, etc. Then there could be a switch like '@' which would make the 'c' switch see parm3 as a text listfile containing the relative/absolute filenames to be added, instead of the parameter being a filename or foldername. That would avoid the need to create copies of files in the add-to-zip folder and then delete the copies when done. That @list feature has been in zips for a long time, so I assume the zip library still supports it.

Suggestion #7

Have a way to add-or-extract the files without including the parent folder as if it's a subfolder. For example, create $mircdirZIPTEST and put versions.txt in it, then create a subfolder TEST below that and put aliases.ini in it. Then create zip:

echo -a $zip(test.zip,co,ZIPTEST)

The files are added to the zip as

ZIPTEST\versions.txt
ZIPTEST\test\aliases.ini

When you extract the files, there's now no way to do so without including 'ZIPTEST' as part of the extraction path. That means that deleting the ZIPTEST tree and then extracting with $zip(test.zip,e,ZIPTEST) results in the files being extracted to:

$mircdir\ZIPTEST\ZIPTEST\versions.txt
$mircdir\ZIPTEST\ZIPTEST\test\aliases.ini

... instead of extracting versions.txt to the folder location from where it was added to the zip.
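
To make the requested behaviour concrete, here is a Python sketch that zips a folder's contents with names stored relative to the folder itself, so the parent folder name is not part of any entry. It only illustrates the idea with Python's `zipfile`; `zip_folder_contents` is a hypothetical helper and implies nothing about what the zip library behind $zip supports.

```python
import os
import tempfile
import zipfile


def zip_folder_contents(folder, zip_path):
    """Zip everything under `folder`, storing names relative to `folder` itself.

    Returns the sorted list of entry names that ended up in the archive.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(folder):
            for name in files:
                full = os.path.join(root, name)
                # arcname drops the parent folder from the stored path.
                archive.write(full, arcname=os.path.relpath(full, folder))
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(archive.namelist())
```

For the ZIPTEST layout described above, the entries would be stored as `versions.txt` and `test/aliases.ini`, so extracting into ZIPTEST puts each file back where it came from.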

