$urlget bugs / discussion #265150
12/03/19 11:47 PM
Loki12583 (OP)
Hoopy frood
Joined: Jan 2004
Posts: 1,271
Bug: .redirect property contains original url when the response contains no redirect.

Bug: Trying to cancel a download with either $urlget(%id,c) or $urlget(%url,c) first returns 1 then crashes mIRC.

"Bug": Downloading to binvar is slow, ~2 MB/s after 10 seconds compared to 60 MB/s for download to file.

Enhancement request: Allow HTTP verbs other than GET/POST, such as PUT, PATCH and DELETE.

Re: $urlget bugs / discussion [Re: Loki12583] #265152
12/03/19 11:59 PM
Wims
Hoopy frood
Joined: Jul 2006
Posts: 3,515
France
Improvement:

Add a switch, -dN, to control how many redirects are followed: N = 0 for unlimited redirects, or N > 0 for at most N redirects.

It would be nice to be able to keep the socket alive if the server answers with a keep-alive header.
The syntax would become:
Code:
$urlget([id],url,gpfbrtcdN,target,alias,headers,body)
If "id" is specified (it can easily be distinguished from a URL), the socket already in use for that id is reused. Otherwise, either use the "passed" id as the "returned" id if possible (and error out if it is already in use), or just ignore the id parameter and create a new id.


Looking for a good help channel about mIRC? Check #mircscripting @ irc.swiftirc.net
Re: $urlget bugs / discussion [Re: Loki12583] #265153
13/03/19 04:56 AM
SykO
Ameglian cow
Joined: Jul 2014
Posts: 27
Bug: $urlget(http://usr:pass@host:port,gf,target,alias) fails

compared to:
curl --request GET --url http://usr:pass@host:port

Re: $urlget bugs / discussion [Re: Loki12583] #265154
13/03/19 09:28 AM
Khaled
Hoopy frood
Joined: Dec 2002
Posts: 4,449
London, UK
Quote:
Bug: .redirect property contains original url when the response contains no redirect.

Ah, right. I was wavering between calling this ".urlfinal" to indicate the final URL or ".redirect" to represent a redirect. Another option was to simply update .url to store the final URL. But I think .redirect makes the most sense. The .redirect property will be changed in the next beta to be empty unless a redirect takes place.

Quote:
Bug: Trying to cancel a download with either $urlget(%id,c) or $urlget(%url,c) first returns 1 then crashes mIRC.

I have not been able to reproduce this yet. I have tried starting multiple $urlget()s and cancelling them repeatedly and mIRC has not crashed yet. Can you show me the $urlget() call you are using to initiate the download?

Quote:
"Bug": Downloading to binvar is slow, ~2 MB/s after 10 seconds compared to 60 MB/s for download to file.

Hmm. There is only one download routine. The only difference between file and &binvar is that file is written to during the download, whereas &binvar is set at the end. So, technically, file should be slower. In my tests, I actually had to make "file" cache downloads up to a certain amount because Windows Anti-Virus was scanning the file after each write. Can you provide two $urlget() calls that reproduce this issue?

Re: $urlget bugs / discussion [Re: Wims] #265155
13/03/19 09:31 AM
Khaled
Quote:
It would be nice to be able to keep the socket alive if the server answers with a keep-alive header.
The syntax would become:

I'm afraid this is not possible. Each download is completely independent/encapsulated and cannot be re-used.

Re: $urlget bugs / discussion [Re: SykO] #265156
13/03/19 09:33 AM
Khaled
Quote:
Bug: $urlget(http://usr:pass@host:port,gf,target,alias) fails

I have not been able to reproduce an issue with this. I tried the above call on a password protected http folder and it passed authentication and downloaded the file without any issues. Do you mean to say that usr:pass is not working? Or that there is an issue with :port? Or something else?

Re: $urlget bugs / discussion [Re: Khaled] #265157
13/03/19 09:40 AM
Raccoon
Hoopy frood
Joined: Feb 2003
Posts: 2,600
Is there a means to handle response headers with a Content-Disposition recommended filename, so that the filename can be used or the file later /renamed? E.g., http://example.com/get?100 -> somefile.mp3


Well. At least I won lunch.
Good philosophy, see good in bad, I like!
Re: $urlget bugs / discussion [Re: Wims] #265158
13/03/19 09:43 AM
Khaled
Quote:
Add a switch, -dN, to control how many redirects are followed: N = 0 for unlimited redirects, or N > 0 for at most N redirects.

Currently, $urlget() gives up after 10 redirects. It does not detect cyclical redirections. As far as I know, most browsers have a redirect limit of between 10 and 20 redirects. Instead of adding an option for this, I would rather make it behave in a standard way. I could increase the limit to 20, but 10 seems reasonable? I would not want to allow infinite redirects.
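
To make the redirect limit discussed above concrete, here is a minimal sketch in Python (not mIRC script, and not how $urlget() is implemented). The function name `follow_redirects` and the `redirects` table standing in for the server are hypothetical. It stops after a fixed number of hops and, unlike the current behaviour described above, also reports a loop as soon as a URL repeats.

```python
from typing import Dict, Optional, Tuple

MAX_REDIRECTS = 10  # the limit mentioned above; 20 would work the same way


def follow_redirects(start: str,
                     redirects: Dict[str, str],
                     limit: int = MAX_REDIRECTS) -> Tuple[Optional[str], str]:
    """Walk a redirect table and return (final_url, reason).

    `redirects` is a hypothetical stand-in for the server: it maps a URL
    to the Location it would redirect to. A URL that is not a key is
    treated as a normal (non-redirect) response.
    """
    seen = {start}
    url = start
    hops = 0
    while url in redirects:
        if hops == limit:
            return None, "too many redirects"
        url = redirects[url]
        hops += 1
        if url in seen:
            # Cycle detected: give up early instead of burning all hops.
            return None, "redirect loop"
        seen.add(url)
    return url, "ok"
```

With a fixed cap, a loop is still caught eventually; the `seen` set only makes the failure faster and the reason clearer.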

Re: $urlget bugs / discussion [Re: Raccoon] #265159
13/03/19 09:54 AM
Khaled
Quote:
Is there a means to handle response headers with a Content-Disposition recommended filename, so that the filename can be used or the file later /renamed? E.g., http://example.com/get?100 -> somefile.mp3

Nope. $urlget() allows you to specify no dir/filename, a dir, a filename, or a dir + filename. In all cases, it will either use the dir/filename you specify or will determine these itself from your DCC folders, the URL path, the redirect path, etc. If resume is not used, it will add an incrementing number to the filename if it already exists. For example: //echo result: $urlget(https://www.mirc.com/get.php,gf,,alias).

Update: there is an issue where making repeated calls to the above non-resume $urlget() results in some calls failing due to identical filenames being used. Fixing this required moving the non-resume filename check to a different point in the download. This change will be in the next version.
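
As an illustration of the "incrementing number" behaviour for non-resume downloads, here is a small Python sketch. The `name (N).ext` pattern and the function name `unique_name` are assumptions made for the example; the thread does not say which numbering format mIRC uses. The set of existing names stands in for a directory listing.

```python
import os
from typing import Set


def unique_name(filename: str, existing: Set[str]) -> str:
    """Return `filename`, or a numbered variant if it is already taken.

    The "name (N).ext" pattern is an assumption for illustration only.
    """
    if filename not in existing:
        return filename
    stem, ext = os.path.splitext(filename)
    n = 1
    # Try (1), (2), ... until a free name is found.
    while f"{stem} ({n}){ext}" in existing:
        n += 1
    return f"{stem} ({n}){ext}"
```

For example, `unique_name("get.php", {"get.php"})` yields `get (1).php`. When several downloads start at once, the chosen name also has to be added to `existing` straight away, otherwise two calls can pick the same name.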

Re: $urlget bugs / discussion [Re: Khaled] #265160
13/03/19 10:35 AM
Raccoon
Consider this example. Again, it's the Content-Disposition response header that's intended for a server-specified filename. It's the correct method that replaces the redirect method.

https://www.oldtimeradiodownloads.com/download/get_file/79609

glenn-miller-glenn-millers-music-39-06-13-first-song-at-sundown.mp3

Code:
cache-control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
cf-ray: 4b6d5b87aa8c98ad-LAX
content-disposition: attachment; filename=glenn-miller-glenn-millers-music-39-06-13-first-song-at-sundown.mp3
content-transfer-encoding: Binary
content-type: application/octet-stream
date: Wed, 13 Mar 2019 10:34:50 GMT
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
expires: Thu, 19 Nov 1981 08:52:00 GMT
pragma: public
server: cloudflare
set-cookie: PHPSESSID=f221d67c468d8d1037b0ba215d52cb7f; expires=Thu, 14-Mar-2019 05:35:15 GMT; Max-Age=86400; path=/
set-cookie: countPage=34
set-cookie: session_database=dfefb85ec34e0dd0cf90214b4c1eccabfcb716c5%7E5c8896946d9840-67444800; path=/
status: 200
x-powered-by: PHP/5.6.30
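
For reference, pulling the suggested filename out of a header like the one above could look roughly like this in Python (a sketch only, using the standard library's `email.message.Message` parser rather than anything mIRC or WinInet provides; `suggested_filename` is a made-up name). It also strips any directory part, since a server-supplied path should not be trusted.

```python
import posixpath
from email.message import Message
from typing import Optional


def suggested_filename(content_disposition: str) -> Optional[str]:
    """Extract the server-suggested filename from a Content-Disposition value."""
    msg = Message()
    msg["Content-Disposition"] = content_disposition
    name = msg.get_filename()  # reads the "filename" parameter, if any
    if not name:
        return None
    # Never trust a path from the server: keep only the last component.
    name = posixpath.basename(name.replace("\\", "/"))
    return name or None
```

If the header is missing or has no filename parameter, the function returns `None` and the caller would fall back to a name derived from the URL.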


Re: $urlget bugs / discussion [Re: Khaled] #265161
13/03/19 12:26 PM
Loki12583 (OP)
The crash happens only with a binvar target. I have PM'd you links to two large files for testing.

Also, where are these temp files in case I now have many partial files downloaded due to crash testing?

Code:
; Call /urlget.test %url

alias urlget.test {
  var %url = $iif($1,$1,http://localhost/)
  bset -t &header 1 Test: Header
  bset -t &body 1 foo1=bar1&foo2=bar2

  var %id = $urlget(%url,gb,&target.dat,urlget.callback,&header) | ; Change from binvar to file for full speed
  echo 4 -ag %id
  timer 1 5 urlget.callback %id
  ;timer 1 6 echo 4 -ag here: $!urlget( %id ,c)    | ; Uncomment to crash on binvar test
  timers
}

alias urlget.callback {
  var %id = $1

  echo -agi9 url      $urlget(%id).url
  echo -agi9 redirect $urlget(%id).redirect
  echo -agi9 method   $urlget(%id).method
  echo -agi9 type     $urlget(%id).type
  echo -agi9 target   $urlget(%id).target
  echo -agi9 alias    $urlget(%id).alias
  echo -agi9 id       $urlget(%id).id
  echo -agi9 state    $urlget(%id).state
  echo -agi9 size     $urlget(%id).size
  echo -agi9 resume   $urlget(%id).resume
  echo -agi9 rcvd     $urlget(%id).rcvd
  echo -agi9 time     $urlget(%id).time
  echo -agi9 reply    $urlget(%id).reply

  echo 4 -agi9 speed    $bytes($calc($urlget(%id).rcvd * 1000 / $urlget(%id).time)).suf $+ /s

  if ($urlget(%id).type == binvar) && ($bvar($urlget(%id).target)) {
    echo -agi9 response $bvar($urlget(%id).target,1-3000).text
  }
}

Re: $urlget bugs / discussion [Re: Raccoon] #265162
13/03/19 02:52 PM
Khaled
Quote:
Consider this example. Again, it's the Content-Disposition response header that's intended for a server specified filename.

Thanks, I am aware of this header, however the WinInet Query Info page lists HTTP_QUERY_CONTENT_DISPOSITION as obsolete. Puzzling. I will see if I can add support for it in the next version.

Re: $urlget bugs / discussion [Re: Loki12583] #265170
14/03/19 12:49 AM
Khaled
Quote:
Also, where are these temp files in case I now have many partial files downloaded due to crash testing?

If you are telling $urlget() to save the result to a &binvar, no files are created. It will save the result to the &binvar.

If you are downloading a large file to a &binvar, I expect it would be very easy for your system to run out of memory, resulting in a crash.

Re: $urlget bugs / discussion [Re: Khaled] #265173
14/03/19 04:54 AM
SykO
usr:pass seems to be the problem. I have just tested sending the Authorization header as follows:
Code:
bset -t &header 1 Authorization: Basic $encode(usr:pass,m)
$urlget(http://localhost:port/,gf,&target,noop,&header)


and it works
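
For comparison outside mIRC, the same header can be built like this in Python (a sketch; `basic_auth_header` is a made-up helper name). It does what the `$encode(usr:pass,m)` call above does: Base64-encode `user:password` and prefix it with `Basic`.

```python
import base64


def basic_auth_header(user: str, password: str) -> str:
    """Build an HTTP Basic Authorization header line."""
    raw = f"{user}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Authorization: Basic {token}"
```

`basic_auth_header("usr", "pass")` gives `Authorization: Basic dXNyOnBhc3M=`, which can be compared against what a packet sniffer shows for the `usr:pass@host` form of the URL.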

Last edited by SykO; 14/03/19 05:01 AM.
Re: $urlget bugs / discussion [Re: SykO] #265182
14/03/19 04:02 PM
Khaled
Quote:
usr:pass seems to be the problem

Thanks for testing this out. Unfortunately, I have not been able to reproduce this issue yet.

I tested this feature by creating a password protected folder on a website (through cPanel, htaccess, etc.). When I called $urlget() with the correct user:pass, it downloaded the page. When I called it with the wrong user:pass or none at all, it failed.

If I use SmartSniff to look at the packets sent/received, it shows the correct Authorization Basic header being sent.

If I then send the header using /bset, as in your example, it sends the same header.

Both methods work for me. The difference is that $urlget() currently uses WinInet to handle the authorization.

Which version of Windows are you using?

Re: $urlget bugs / discussion [Re: Loki12583] #265185
14/03/19 04:32 PM
Khaled
Quote:
The crash happens only with a binvar target. I have PM'd you links to two large files for testing.

Thanks, this issue has been fixed for the next version.

Re: $urlget bugs / discussion [Re: Loki12583] #265189
14/03/19 06:24 PM
Khaled
Quote:
"Bug": Downloading to binvar is slow, ~2 MB/s after 10 seconds compared to 60 MB/s for download to file.

I narrowed this down to the use of realloc() to repeatedly extend memory to store the downloaded bytes. The larger the memory, the slower realloc() gets. Pre-allocating large chunks helps a little.

Switching to a linked list implementation to store downloaded bytes makes it fast, however this leads to another problem - it needs to be reassembled at the end, which means first allocating contiguous memory to store the entire download in the binvar, effectively requiring double the amount of memory during the process.

(Currently, the memory pointer allocated during the download is assigned directly to the binvar structure, so no extra memory is needed)

In short, there does not seem to be an ideal solution to this - if we want to make the download available as a &binvar, we can opt for fast speed, double memory use, or slow speed, low memory use. Or we could just remove &binvar support and let scripters save to a file and load it as a &binvar if they need to.
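
To put rough numbers on the speed side of this trade-off, the Python sketch below counts how many bytes get moved while a buffer grows. It is a model, not a measurement of mIRC: it assumes the worst case where every growth step moves the whole buffer (as realloc() may do when it cannot extend in place), and `bytes_copied` is a made-up name. It compares growing by exactly one chunk each time with doubling the capacity.

```python
def bytes_copied(total: int, chunk: int, doubling: bool) -> int:
    """Count bytes copied while growing a buffer to hold `total` bytes.

    Worst-case model (an assumption): every growth step moves the whole
    buffer to a new block.
    """
    capacity = 0
    used = 0
    copied = 0
    while used < total:
        n = min(chunk, total - used)
        if used + n > capacity:
            if doubling:
                capacity = max(used + n, capacity * 2)
            else:
                capacity = used + n
            copied += used  # existing bytes are moved to the new block
        used += n
    return copied
```

With `total=1024` and `chunk=1`, growing one chunk at a time copies 523776 bytes in this model, while doubling copies 1023. The cost of doubling is that up to half of the final block may be unused, so it sits between the "slow, low memory" and "fast, double memory" options above.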

Re: $urlget bugs / discussion [Re: Khaled] #265190
14/03/19 06:45 PM
Raccoon
I'd go for a compromise where, if the download is larger than, say, 1 megabyte, the download goes to a temp file and is loaded back into the &binvar when complete. You decide when realloc() becomes too clumsy and slow -- 1 MB? 32 MB?

&binvar is going to be most handy for people performing page scraping, where the HTML/XML they're scraping never even approaches 1 MB in size.
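
A rough Python sketch of that compromise, for illustration only: chunks stay in memory until a threshold is crossed, then everything moves to a temp file and is read back at the end. The class name `SpillBuffer` and the 1 MB default are assumptions. Python's standard library has `tempfile.SpooledTemporaryFile`, which works along the same lines.

```python
import tempfile
from typing import List


class SpillBuffer:
    """Collect download chunks in memory, moving to a temp file past a limit."""

    def __init__(self, limit: int = 1024 * 1024) -> None:
        self.limit = limit      # hypothetical threshold (1 MB, as suggested above)
        self.size = 0
        self.on_disk = False
        self._chunks: List[bytes] = []
        self._file = None

    def write(self, chunk: bytes) -> None:
        self.size += len(chunk)
        if not self.on_disk and self.size > self.limit:
            # Threshold crossed: flush what we have to a temp file.
            self._file = tempfile.TemporaryFile()
            for c in self._chunks:
                self._file.write(c)
            self._chunks = []
            self.on_disk = True
        if self.on_disk:
            self._file.write(chunk)
        else:
            self._chunks.append(chunk)

    def finish(self) -> bytes:
        """Return the whole download as one contiguous bytes object."""
        if self.on_disk:
            self._file.seek(0)
            data = self._file.read()
            self._file.close()
            return data
        return b"".join(self._chunks)
```

Small page scrapes never touch the disk, while a large download only needs one contiguous allocation when it is loaded back at the end.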


Re: $urlget bugs / discussion [Re: Raccoon] #265193
14/03/19 07:19 PM
Protopia
Fjord artisan
Joined: Aug 2003
Posts: 227
UK
I would imagine there are two use cases, depending on whether there is a Content-Length field in the response header:

If we know what the content size is from the header, we should be able to allocate a &binvar of the right size from the start.

It is only when there is no content-length header that we potentially need to allocate memory several times or have multiple copies in memory.
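
The two cases could be sketched like this in Python (an illustration of the idea, not of mIRC's internals; `collect` is a made-up name and the chunks stand in for data arriving from the socket). When the size is known, one buffer is allocated up front and filled in place; otherwise the buffer has to grow as data arrives.

```python
from typing import Iterable, Optional


def collect(chunks: Iterable[bytes], content_length: Optional[int]) -> bytearray:
    """Assemble a download, pre-allocating when Content-Length is known."""
    if content_length is None:
        # Size unknown: the buffer has to grow as data arrives.
        buf = bytearray()
        for chunk in chunks:
            buf.extend(chunk)
        return buf

    buf = bytearray(content_length)  # single allocation up front
    pos = 0
    for chunk in chunks:
        end = pos + len(chunk)
        if end > content_length:
            raise ValueError("server sent more data than Content-Length")
        buf[pos:end] = chunk  # same-length slice assignment, no resize
        pos = end
    if pos != content_length:
        raise ValueError("download ended before Content-Length was reached")
    return buf
```

The checks matter because Content-Length comes from the server and cannot be assumed to match what is actually sent.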

P.S. It might be sensible to extend $urlget to include a maximum size, after which the download is terminated. Sometimes you are only interested in the <head> part of a web page. Sometimes you only want to look at the beginning of a file to determine its content type. This would also help avoid situations where someone downloads a file without realising that it is way too big to fit into a &binvar or way too big to download in a reasonable timeframe.
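
A maximum-size option along those lines might behave like the following Python sketch (hypothetical; `read_limited` and the `truncated` flag are names invented for the example). It keeps at most `max_bytes` and tells the caller whether the download was cut short.

```python
from typing import Iterable, List, Tuple


def read_limited(chunks: Iterable[bytes], max_bytes: int) -> Tuple[bytes, bool]:
    """Read at most `max_bytes`; return (data, truncated)."""
    parts: List[bytes] = []
    remaining = max_bytes
    for chunk in chunks:
        if len(chunk) > remaining:
            parts.append(chunk[:remaining])
            # Limit reached: stop here; a real client would cancel the download.
            return b"".join(parts), True
        parts.append(chunk)
        remaining -= len(chunk)
    return b"".join(parts), False
```

Returning the flag lets a script tell "the file really was this small" apart from "the limit was hit".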

Re: $urlget bugs / discussion [Re: Protopia] #265196
14/03/19 08:28 PM
Loki12583 (OP)
Allow use of the HEAD method and an option to call the alias after every state change (including after headers are received)? This way you can make some decisions without the need to have an arbitrary /timer

Edit: States similar to here? https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest/readyState

Last edited by Loki12583; 14/03/19 08:38 PM.