#265150 12/03/19 11:47 PM
Joined: Jan 2004
Posts: 1,361
L
Hoopy frood
OP Online
Bug: .redirect property contains original url when the response contains no redirect.

Bug: Trying to cancel a download with either $urlget(%id,c) or $urlget(%url,c) first returns 1 then crashes mIRC.

"Bug": Downloading to binvar is slow, ~2 MB/s after 10 seconds compared to 60 MB/s for download to file.

Enhancement request: Allow verbs other than GET/POST - PUT/PATCH/DELETE etc.

Joined: Jul 2006
Posts: 4,185
W
Hoopy frood
Offline
Improvement:

Add a switch to allow redirection to be followed with depth, -dN with N = 0 for infinite redirection, or N > 0 for N redirection

It would be nice to be able to keep the socket alive if the server answers with a keep alive header.
The syntax would become:
Code:
$urlget([id],url,gpfbrtcdN,target,alias,headers,body)
If "id" is specified (it is easy to tell apart from a URL), the socket already open for that id is reused. Otherwise, either use the passed id as the returned id if possible (erroring out if it is already in use), or ignore the id parameter and create a new id.


#mircscripting @ irc.swiftirc.net == the best mIRC help channel
Joined: Jul 2014
Posts: 34
S
Ameglian cow
Offline
Bug: $urlget(http://usr:pass@host:port,gf,target,alias) fails

compared to:
curl --request GET --url http://usr:pass@host:port

Joined: Dec 2002
Posts: 5,490
Hoopy frood
Offline
Quote:
Bug: .redirect property contains original url when the response contains no redirect.

Ah, right. I was wavering between calling this ".urlfinal" to indicate the final URL or ".redirect" to represent a redirect. Another option was to simply update .url to store the final URL. But I think .redirect makes the most sense. The .redirect property will be changed in the next beta to be empty unless a redirect takes place.

Quote:
Bug: Trying to cancel a download with either $urlget(%id,c) or $urlget(%url,c) first returns 1 then crashes mIRC.

I have not been able to reproduce this yet. I have tried starting multiple $urlget()s and cancelling them repeatedly and mIRC has not crashed yet. Can you show me the $urlget() call you are using to initiate the download?

Quote:
"Bug": Downloading to binvar is slow, ~2 MB/s after 10 seconds compared to 60 MB/s for download to file.

Hmm. There is only one download routine. The only difference between file and &binvar is that file is written to during the download, whereas &binvar is set at the end. So, technically, file should be slower. In my tests, I actually had to make "file" cache downloads up to a certain amount because Windows Anti-Virus was scanning the file after each write. Can you provide two $urlget() calls that reproduce this issue?

Joined: Dec 2002
Posts: 5,490
Hoopy frood
Offline
Quote:
It would be nice to be able to keep the socket alive if the server answers with a keep alive header.
The syntax would become:

I'm afraid this is not possible. Each download is completely independent/encapsulated and cannot be re-used.

Joined: Dec 2002
Posts: 5,490
Hoopy frood
Offline
Quote:
Bug: $urlget(http://usr:pass@host:port,gf,target,alias) fails

I have not been able to reproduce an issue with this. I tried the above call on a password protected http folder and it passed authentication and downloaded the file without any issues. Do you mean to say that usr:pass is not working? Or that there is an issue with :port? Or something else?

Joined: Feb 2003
Posts: 2,812
Hoopy frood
Offline
Is there a means to handle response headers with a Content-Disposition recommended filename, so that filename can be used or later /renamed? eg, http://example.com/get?100 -> somefile.mp3


Well. At least I won lunch.
Good philosophy, see good in bad, I like!
Joined: Dec 2002
Posts: 5,490
Hoopy frood
Offline
Quote:
Add a switch to allow redirection to be followed with depth, -dN with N = 0 for infinite redirection, or N > 0 for N redirection

Currently, $urlget() gives up after 10 redirects. It does not detect cyclical redirections. As far as I know, most browsers have a redirect limit of between 10 and 20 redirects. Instead of adding an option for this, I would rather make it behave in a standard way. I could increase the limit to 20 but 10 seems reasonable? I would not want to allow infinite redirects.
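
For illustration only, here is a minimal Python sketch of a redirect follower that enforces a fixed limit and also detects cycles, which the post says $urlget() does not currently do. It is not mIRC's implementation: `follow_redirects` and `next_location` are hypothetical names, and `next_location` stands in for a single HTTP request.

```python
from typing import Callable, Optional

MAX_REDIRECTS = 10  # the limit mentioned in the post above


def follow_redirects(
    url: str,
    next_location: Callable[[str], Optional[str]],
    max_redirects: int = MAX_REDIRECTS,
) -> str:
    """Return the final URL, or raise if the chain loops or is too long.

    next_location is a hypothetical stand-in for one HTTP request: it
    returns the redirect target for a URL, or None for a final response.
    """
    seen = {url}
    redirects = 0
    while True:
        target = next_location(url)
        if target is None:
            return url  # no redirect: this is the final URL
        if target in seen:
            raise RuntimeError(f"redirect loop detected at {target}")
        if redirects == max_redirects:
            raise RuntimeError(f"gave up after {max_redirects} redirects")
        redirects += 1
        seen.add(target)
        url = target


# Usage: a dict simulates the server's redirect table.
chain = {"http://a/": "http://b/", "http://b/": "http://c/"}
print(follow_redirects("http://a/", chain.get))
```

Tracking visited URLs makes a loop fail on the first repeat instead of burning through the whole redirect budget.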

Joined: Dec 2002
Posts: 5,490
Hoopy frood
Offline
Quote:
Is there a means to handle response headers with a Content-Disposition recommended filename, so that filename can be used or later /renamed? eg, http://example.com/get?100 -> somefile.mp3

Nope. $urlget() allows you to specify no dir/filename, a dir, a filename, or a dir + filename. In all cases, it will either use the dir/filename you specify or will determine these itself from your DCC folders, the URL path, the redirect path, etc. If resume is not used, it will add an incrementing number to the filename if it already exists. For example: //echo result: $urlget(https://www.mirc.com/get.php,gf,,alias).

Update: there is an issue where making repeated calls to the above non-resume $urlget() results in some calls failing due to identical filenames being used. Fixing this required moving the non-resume filename check to a different point in the download. This change will be in the next version.

Joined: Feb 2003
Posts: 2,812
Hoopy frood
Offline
Consider this example. Again, it's the Content-Disposition response header that's intended for a server-specified filename. It's the correct method that replaces the redirect method.

https://www.oldtimeradiodownloads.com/download/get_file/79609

glenn-miller-glenn-millers-music-39-06-13-first-song-at-sundown.mp3

Code:
cache-control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
cf-ray: 4b6d5b87aa8c98ad-LAX
content-disposition: attachment; filename=glenn-miller-glenn-millers-music-39-06-13-first-song-at-sundown.mp3
content-transfer-encoding: Binary
content-type: application/octet-stream
date: Wed, 13 Mar 2019 10:34:50 GMT
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
expires: Thu, 19 Nov 1981 08:52:00 GMT
pragma: public
server: cloudflare
set-cookie: PHPSESSID=f221d67c468d8d1037b0ba215d52cb7f; expires=Thu, 14-Mar-2019 05:35:15 GMT; Max-Age=86400; path=/
set-cookie: countPage=34
set-cookie: session_database=dfefb85ec34e0dd0cf90214b4c1eccabfcb716c5%7E5c8896946d9840-67444800; path=/
status: 200
x-powered-by: PHP/5.6.30


Joined: Jan 2004
Posts: 1,361
L
Hoopy frood
OP Online
The crash happens only with a binvar target, I have PM'd you links to two large files for testing

Also, where are these temp files in case I now have many partial files downloaded due to crash testing?

Code:
; Call /urlget.test %url

alias urlget.test {
  var %url = $iif($1,$1,http://localhost/)
  bset -t &header 1 Test: Header
  bset -t &body 1 foo1=bar1&foo2=bar2

  var %id = $urlget(%url,gb,&target.dat,urlget.callback,&header) | ; Change from binvar to file for full speed
  echo 4 -ag %id
  timer 1 5 urlget.callback %id
  ;timer 1 6 echo 4 -ag here: $!urlget( %id ,c)    | ; Uncomment to crash on binvar test
  timers
}

alias urlget.callback {
  var %id = $1

  echo -agi9 url      $urlget(%id).url
  echo -agi9 redirect $urlget(%id).redirect
  echo -agi9 method   $urlget(%id).method
  echo -agi9 type     $urlget(%id).type
  echo -agi9 target   $urlget(%id).target
  echo -agi9 alias    $urlget(%id).alias
  echo -agi9 id       $urlget(%id).id
  echo -agi9 state    $urlget(%id).state
  echo -agi9 size     $urlget(%id).size
  echo -agi9 resume   $urlget(%id).resume
  echo -agi9 rcvd     $urlget(%id).rcvd
  echo -agi9 time     $urlget(%id).time
  echo -agi9 reply    $urlget(%id).reply

  echo 4 -agi9 speed    $bytes($calc($urlget(%id).rcvd * 1000 / $urlget(%id).time)).suf $+ /s

  if ($urlget(%id).type == binvar) && ($bvar($urlget(%id).target)) {
    echo -agi9 response $bvar($urlget(%id).target,1-3000).text
  }
}

Joined: Dec 2002
Posts: 5,490
Hoopy frood
Offline
Quote:
Consider this example. Again, it's the Content-Disposition response header that's intended for a server specified filename.

Thanks, I am aware of this header, however the WinInet Query Info page lists HTTP_QUERY_CONTENT_DISPOSITION as obsolete. Puzzling. I will see if I can add support for it in the next version.
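
As a rough illustration of what Content-Disposition support involves, the Python sketch below pulls the suggested filename out of a header value and strips any directory part, so a server cannot steer the download outside the target folder. It is an assumption-laden sketch, not how mIRC or WinInet handles the header; `suggested_filename` is a hypothetical helper name.

```python
import ntpath
from email.message import Message
from typing import Optional


def suggested_filename(content_disposition: str) -> Optional[str]:
    """Return the filename suggested by a Content-Disposition value, if any."""
    msg = Message()
    msg["Content-Disposition"] = content_disposition
    name = msg.get_filename()  # reads the "filename" parameter
    if not name:
        return None
    # Keep only the last path component; ntpath treats both / and \ as
    # separators, which matches Windows path rules.
    return ntpath.basename(name) or None


# Usage with the header value from the example response in this thread.
header = ("attachment; filename=glenn-miller-glenn-millers-music-"
          "39-06-13-first-song-at-sundown.mp3")
print(suggested_filename(header))
```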

Joined: Dec 2002
Posts: 5,490
Hoopy frood
Offline
Quote:
Also, where are these temp files in case I now have many partial files downloaded due to crash testing?

If you are telling $urlget() to save the result to a &binvar, no files are created. It will save the result to the &binvar.

If you are downloading a large file to a &binvar, I expect it would be very easy for your system to run out of memory, resulting in a crash.

Joined: Jul 2014
Posts: 34
S
Ameglian cow
Offline
usr:pass seems to be the problem. I just tested sending the Authorization header as follows:
Code:
bset -t &header 1 Authorization: Basic $encode(usr:pass,m)
$urlget(http://localhost:port/,gf,&target,noop,&header)


and it works

Last edited by SykO; 14/03/19 05:01 AM.
Joined: Dec 2002
Posts: 5,490
Hoopy frood
Offline
Quote:
usr:pass seems to be the problem

Thanks for testing this out. Unfortunately, I have not been able to reproduce this issue yet.

I tested this feature by creating a password protected folder on a website (through cPanel, htaccess, etc.). When I called $urlget() with the correct user:pass, it downloaded the page. When I called it with the wrong user:pass or none at all, it failed.

If I use SmartSniff to look at the packets sent/received, it shows the correct Authorization Basic header being sent.

If I then send the header using /bset, as in your example, it sends the same header.

Both methods work for me. The difference is that $urlget() currently uses WinInet to handle the authorization.

Which version of Windows are you using?
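
For reference, the two approaches being compared should end up sending the same header. The Python sketch below shows how the user:pass part of a URL maps to a Basic Authorization header. It is only an illustration of the header format: `basic_auth_header` is a hypothetical helper, and the host and port in the usage line are placeholders.

```python
import base64
from typing import Optional
from urllib.parse import unquote, urlsplit


def basic_auth_header(url: str) -> Optional[str]:
    """Build the Basic Authorization header implied by user:pass in a URL."""
    parts = urlsplit(url)
    if parts.username is None:
        return None  # URL carries no credentials
    user = unquote(parts.username)
    password = unquote(parts.password or "")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Authorization: Basic {token}"


# Usage: localhost:8080 is a placeholder host and port.
print(basic_auth_header("http://usr:pass@localhost:8080/"))
```

Comparing this output with a packet capture is one way to check whether the header on the wire matches the credentials in the URL.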

Joined: Dec 2002
Posts: 5,490
Hoopy frood
Offline
Quote:
The crash happens only with a binvar target, I have PM'd you links to two large files for testing

Thanks, this issue has been fixed for the next version.

Joined: Dec 2002
Posts: 5,490
Hoopy frood
Offline
Quote:
"Bug": Downloading to binvar is slow, ~2 MB/s after 10 seconds compared to 60 MB/s for download to file.

I narrowed this down to the use of realloc() to repeatedly extend memory to store the downloaded bytes. The larger the memory, the slower realloc() gets. Pre-allocating large chunks helps a little.

Switching to a linked list implementation to store downloaded bytes makes it fast, however this leads to another problem - it needs to be reassembled at the end, which means first allocating contiguous memory to store the entire download in the binvar, effectively requiring double the amount of memory during the process.

(Currently, the memory pointer allocated during the download is assigned directly to the binvar structure, so no extra memory is needed)

In short, there does not seem to be an ideal solution to this - if we want to make the download available as a &binvar, we can opt for fast speed, double memory use, or slow speed, low memory use. Or we could just remove &binvar support and let scripters save to a file and load it as a &binvar if they need to.

Joined: Feb 2003
Posts: 2,812
Hoopy frood
Offline
I'd go for a compromise where, if the download is larger than say 1 megabyte, then the download goes to a temp file and is loaded back into &binvar when complete. You decide when realloc() becomes too clumsy and slow -- 1 MB? 32 MB?

&binvar is going to be most handy for people performing page scraping, where the html/xml they're scraping never even approaches 1 MB in size.
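
The compromise described here could look roughly like the Python sketch below: small downloads stay in memory, and once a threshold is crossed the data moves to a temp file and is read back at the end. This only illustrates the idea and says nothing about how mIRC would do it; `collect_download` is a hypothetical name and the 1 MB default is the figure floated above.

```python
import io
import tempfile
from typing import Iterable, Tuple

SPILL_THRESHOLD = 1024 * 1024  # 1 MB, the figure floated in the post above


def collect_download(
    chunks: Iterable[bytes], threshold: int = SPILL_THRESHOLD
) -> Tuple[bytes, bool]:
    """Collect chunks in memory, spilling to a temp file past the threshold.

    Returns (data, spilled), where spilled says whether a temp file was used.
    """
    memory = io.BytesIO()
    spill = None  # temp file, created only once the threshold is crossed
    try:
        for chunk in chunks:
            if spill is None and memory.tell() + len(chunk) > threshold:
                spill = tempfile.TemporaryFile()
                spill.write(memory.getvalue())
                memory.seek(0)
                memory.truncate()  # release the in-memory copy
            (spill if spill is not None else memory).write(chunk)
        if spill is None:
            return memory.getvalue(), False
        spill.seek(0)
        return spill.read(), True  # one final allocation for the result
    finally:
        memory.close()
        if spill is not None:
            spill.close()
```

With this shape, page-scraping downloads never touch the disk, while large downloads avoid repeatedly growing one buffer.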


Joined: Aug 2003
Posts: 320
P
Pan-dimensional mouse
Offline
I would imagine there are two use cases, depending on whether there is a Content-Length field in the response header:

If we know what the content size is from the header, we should be able to allocate a &binvar of the right size from the start.

It is only when there is no content-length header that we potentially need to allocate memory several times or have multiple copies in memory.

P.S. It might be sensible to extend $urlget to include a maximum size, after which the download is terminated. Sometimes you are only interested in the <head> part of a web page. Sometimes you only want to look at the beginning of a file to determine its content type. It would also help avoid situations where someone downloads a file without realising that it is way too big to fit into a &binvar or way too big to download in a reasonable timeframe.
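
Both ideas in this post can be sketched together in Python: allocate once when Content-Length is known, grow only when it is not, and stop early at an optional size cap. This is an illustration under assumptions, not a proposal for the actual $urlget() internals; `read_body`, `content_length` and `max_bytes` are hypothetical names.

```python
from typing import Iterable, Optional


def read_body(
    chunks: Iterable[bytes],
    content_length: Optional[int] = None,
    max_bytes: Optional[int] = None,
) -> bytes:
    """Collect a response body, preallocating when the size is known."""
    # Work out how many bytes we are willing to keep.
    wanted = content_length
    if max_bytes is not None:
        wanted = max_bytes if wanted is None else min(wanted, max_bytes)

    if content_length is not None:
        buf = bytearray(wanted)  # single allocation, no regrowth
        filled = 0
        for chunk in chunks:
            take = chunk[: wanted - filled]
            buf[filled:filled + len(take)] = take
            filled += len(take)
            if filled >= wanted:
                break  # cap or declared length reached
        return bytes(buf[:filled])

    buf = bytearray()  # unknown size: the buffer has to grow as data arrives
    for chunk in chunks:
        if wanted is not None:
            chunk = chunk[: wanted - len(buf)]
        buf += chunk
        if wanted is not None and len(buf) >= wanted:
            break  # cap reached, stop reading
    return bytes(buf)
```

For example, `read_body(chunks, max_bytes=4096)` would keep only the first 4 KB, enough to inspect a page's <head> or sniff a file type.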

Joined: Jan 2004
Posts: 1,361
L
Hoopy frood
OP Online
Allow use of the HEAD method and an option to call the alias after every state change (including after headers are received)? This way you can make some decisions without the need to have an arbitrary /timer

Edit: States similar to here? https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest/readyState

Last edited by Loki12583; 14/03/19 08:38 PM.