OP
Nutrimatic drinks dispenser
Joined: Jul 2004
Posts: 8 |
If someone posts a URL like this:

Re: /proc/ kcore: <http://lists.debian.org/debian-user/2001/05/msg00197.html>;

clicking on it sends http://lists.debian.org/debian-user/2001/05/msg00197.html>; to the browser. This should be fixed because > is not a valid URL character, so the link should stop there instead of running on through the ;
|
Hoopy frood
Joined: Dec 2002
Posts: 2,985 |
A semicolon can be part of a URL though, common in query strings. I'd be blaming the poster of the URL for not editing out the <>'s.
|
Ameglian cow
Joined: Aug 2003
Posts: 29 |
I do not believe the '>' character is a valid character in a URL, so mIRC should stop before it and open the correct URL. This is a fairly common way for people to post links. It'd be nice to see it fixed.
|
Hoopy frood
Joined: Dec 2002
Posts: 2,985 |
Yes it is an invalid character but clicking on the link in the original post demonstrates that perhaps it should be the browser that filters it out instead of mIRC. No browser I used specifically rejected the URL.
I'll reiterate that people should post their URLs accurately instead of blaming mIRC for it. Given that there are valid characters on either side of the >, mIRC would have to work harder to process the link the way you want it to. I also dispute your claim that posting URLs in a chatroom with < and > added is common. Just because one sees something a lot doesn't make it common. I have yet to see any link posted this way in 6 years of IRC. In fact, the only place I have seen this is in a link to an attachment in old versions of Microsoft Outlook.
|
OP
Nutrimatic drinks dispenser
Joined: Jul 2004
Posts: 8 |
It is an IRC bot that returns URLs like so:

<sifu> unruly: Search took 0.364308 seconds: CNN.com - Better search results than Google ? - Jan. 5, 2004: <http://www.cnn.com/2004/TECH/internet/01/05/seeing.search1.ap/>; Yahoo Renews With Google , Changes Results: <http://searchenginewatch.com/sereport/article.php/2165081>; Major Search Engines and Directories: <http://searchenginewatch.com/links/article.php/2156221>; )

I have asked the developers of the bot to change that because mIRC doesn't parse the URLs properly. They say that mIRC is broken because it violates the RFC: RFC 1738, section 2.2, paragraph 7, states that < and > are used to delimit URLs:

The characters "<" and ">" are unsafe because they are used as the delimiters around URLs in free text;

Before I filed this bug I didn't know there was an actual RFC covering this, and I was on the side that the bot should change. Now I will hold this as a bug and maintain that mIRC is broken because it doesn't follow the URL standard correctly.
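To make the delimiter idea concrete, here is a minimal Python sketch (not mIRC code, and not how mIRC or the bot actually work) that treats "<" and ">" as delimiters and pulls out only what sits between them. The pattern and the name delimited_urls are assumptions made up for this illustration; the sample line is the bot output quoted above.

```python
import re

# Assumption: a delimited URL is "<", then an http(s) URL containing no
# whitespace or angle brackets, then ">". Anything after ">" is ignored.
DELIMITED_URL = re.compile(r"<(https?://[^<>\s]+)>")

def delimited_urls(text):
    """Return the URLs that appear between < and > in free text."""
    return DELIMITED_URL.findall(text)

bot_line = (
    "<sifu> unruly: Search took 0.364308 seconds: "
    "CNN.com - Better search results than Google ? - Jan. 5, 2004: "
    "<http://www.cnn.com/2004/TECH/internet/01/05/seeing.search1.ap/>; "
    "Yahoo Renews With Google , Changes Results: "
    "<http://searchenginewatch.com/sereport/article.php/2165081>; "
    "Major Search Engines and Directories: "
    "<http://searchenginewatch.com/links/article.php/2156221>;"
)

for url in delimited_urls(bot_line):
    print(url)
```

The nick in <sifu> is not picked up because it does not start with http:// or https://, and none of the three URLs carries the trailing >; that ends up in the browser today.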
Last edited by sweede; 05/08/04 05:16 PM.
|
OP
Nutrimatic drinks dispenser
Joined: Jul 2004
Posts: 8 |
is this bug going to be fixed or not?
|
Bowl of petunias
Joined: Nov 2004
Posts: 2 |
Yes, please fix this, it's very annoying to have users complain about something that they think is broken in our bot, when in reality we are just abiding by the RFC for it. As per the RFC, <> are specifically designated as URL delimiters. So, whether or not it is common is irrelevant. The RFC is a well-defined standard, whereas for us (the developers of supybot, the bot in question) to attempt to code to mesh with a closed-source IRC client is simply unfeasible. We go with the most reasonable document we can, and that's the RFC. To expect us to code against a moving target is silly, which is what coding against an undocumented, not-at-all transparent source of information such as a closed-source IRC client would be. The RFC lays out in black and white what defines a URL and what does not, so why not abide by it?
|
Nutrimatic drinks dispenser
Joined: Nov 2004
Posts: 8 |
Quote:
Yes it is an invalid character but clicking on the link in the original post demonstrates that perhaps it should be the browser that filters it out instead of mIRC. No browser I used specifically rejected the URL.

The browser does not reject the URL as a convenience to the end user, so they don't have to type the hex-encoded equivalent. This is also the same reason browsers accept spaces in a URL. By your logic, mIRC should also allow spaces in URLs.

Quote:
I'll reiterate that people should post their URL's accurately instead of blaming mIRC for it. Given that there are valid characters either side of the > means that mIRC would then have to work harder to process the link the way you want it to.

People are posting their URLs correctly. < and > are specified in the RFC[1] as unsafe because they are used as in-text delimiters for a URL. There are valid characters on either side of a space, too. Why does mIRC stop parsing when it sees a space? The amount of extra work that mIRC would have to do should be negligible. Although, since I can't look at the source code, I can't fairly comment on the complexity or attempt to make a patch for you to use.

Quote:
I also dispute your claim that posting URLs in a chatroom with < and > added is common. Just because one sees something alot doesn't make it common. I am yet to see any link posted this way in 6 years of IRC.

Just because one doesn't see something a lot doesn't make it uncommon. The fact that the Uniform Resource Locator RFC specifically addresses this issue should be reason enough to convince you that this should be fixed.

[1] http://www.faqs.org/rfcs/rfc1738.html

To quote the RFC:

Unsafe:
Characters can be unsafe for a number of reasons. The space character is unsafe because significant spaces may disappear and insignificant spaces may be introduced when URLs are transcribed or typeset or subjected to the treatment of word-processing programs. The characters "<" and ">" are unsafe because they are used as the delimiters around URLs in free text; the quote mark (""") is used to delimit URLs in some systems. The character "#" is unsafe and should always be encoded because it is used in World Wide Web and in other systems to delimit a URL from a fragment/anchor identifier that might follow it. The character "%" is unsafe because it is used for encodings of other characters. Other characters are unsafe because gateways and other transport agents are known to sometimes modify such characters. These characters are "{", "}", "|", "\", "^", "~", "[", "]", and "`".
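As a rough illustration of the quoted passage, the Python sketch below lists the characters it names as unsafe and shows the "hex-encoded equivalent" a browser would otherwise need. The helper names are made up for this example, and leaving "#" and "%" unencoded is an assumption made here because they usually already carry meaning in a URL (fragment marker and escape prefix).

```python
# Characters the quoted RFC 1738 passage calls unsafe.
UNSAFE = set(' <>"#%{}|\\^~[]`')

# Assumption for this sketch: "#" and "%" are left alone when encoding,
# since encoding them would change the meaning of most real URLs.
ENCODE = UNSAFE - {"#", "%"}

def unsafe_chars(candidate):
    """Return the unsafe characters in candidate, in order of first appearance."""
    seen = []
    for ch in candidate:
        if ch in UNSAFE and ch not in seen:
            seen.append(ch)
    return seen

def percent_encode_unsafe(candidate):
    """Replace each character in ENCODE with its %XX hex form."""
    return "".join(
        "%{:02X}".format(ord(ch)) if ch in ENCODE else ch for ch in candidate
    )

print(unsafe_chars("<http://a b>"))
print(percent_encode_unsafe("http://www.example.com>example"))
```

The second call yields http://www.example.com%3Eexample, which is the form a browser builds when it is handed the stray ">" as part of the link.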
|
Hoopy frood
Joined: Dec 2002
Posts: 2,985 |
It doesn't convince me. I haven't seen any website in recent times constructed with text hyperlinks surrounded by the < and > signs. One would be wasting energy doing it anyway, since you should be typing &lt; and &gt; in place of those characters in an HTML document. Why should Khaled rejig mIRC to cater for such an outdated practice? Try fixing the stupid way the bot quoted URLs instead of asking others to fix bugs that don't exist.
As far as I know, the makers of mIRC claim compliance only with RFC1459, the IRC related RFC. As mIRC is an IRC client then this is probably the most appropriate standard to comply with.
|
Bowl of petunias
Joined: Nov 2004
Posts: 2 |
Surely you can't be serious. The IRC RFC (1459) has nothing to do with this discussion. The IRC RFC discusses IRC, whereas we are discussing URLs. Therefore, the URL RFC is the appropriate standard to use. And just because mIRC doesn't claim compliance with it doesn't mean it shouldn't shoot for it. There is no better standard for what defines a URL than that. All of the web browsers and web servers around the world aim to comply with it, why not mIRC? The fix can't be that difficult, and it has the added bonus of bringing it into the modern age (read: finally up to date with the 90s) of what defines a URL.
|
Fjord artisan
Joined: Feb 2004
Posts: 206 |
Quote:
... The IRC RFC (1459) has nothing to do with this discussion. The IRC RFC discusses IRC, whereas we are discussing URLs. Therefore, the URL RFC is the appropriate standard to use. And just because mIRC doesn't claim compliance with it doesn't mean it shouldn't shoot for it. [...] All of the web browsers and web servers around the world aim to comply with it, why not mIRC?

This is mIRC - so RFC 1459 has everything to do with it. Are you suggesting that, because my car's manufacturer has a website, my car has to be compliant with the URL RFC :tongue: ?

More seriously, how many more RFCs do you expect mIRC to become compliant with, just so that some person can script some bots? mIRC is primarily for chatting - that funny thing called conversation - and the fact that someone may want to highlight a URL they have seen recently is incidental (cut and paste can be a wonderful thing). The URL RFC is not necessary for conversation!

However, on the upside - perhaps this could be a feature suggestion - compliance with <listed> RFCs - and the reason why this would be good in mIRC (not trying to turn mIRC into something it was never meant to be!).

Cheers, DK
Darwin_Koala
Junior Brat, In-no-cent(r)(tm) and original source of DK-itis!
|
Hoopy frood
Joined: Dec 2002
Posts: 2,985 |
I am quite serious. If you'd bother to sit back and look at how websites are developed nowadays you'd understand that quoting URLs inside <>'s is obsolete. I cannot find one single website made in the last few years that quotes URLs in that way. Further to that, mIRC is not a web browser or have you forgotten that?
By the way, we are almost 5 years out of the "90s".
|
Nutrimatic drinks dispenser
Joined: Nov 2004
Posts: 8 |
Quote:
This is mIRC - so RFC 1459 has everything to do with it.

Since we aren't talking about how mIRC interfaces with the IRC server, RFC 1459 has nothing to do with it.

Quote:
Are are you suggesting that, because my car's manufacturer has a website that my car has to be compliant with the URL RFC :tongue: ?

If your car offers a way to recognize URLs and perform an action on them, yes.

Quote:
More seriously, how many more RFCs do you expect mIRC to become compliant with, just so that some person can script some bots?

As many as are necessary for the features that mIRC offers to be fully functional. This has more to do with a feature which mIRC offers being buggy than with scripting a bot. mIRC offers a feature that makes URLs clickable so that you do not have to copy/paste them, yet it does not properly recognize URLs even though there is a concrete standard specifying how to recognize URLs.
|
OP
Nutrimatic drinks dispenser
Joined: Jul 2004
Posts: 8 |
I'm confused: why are you bothering the discussion with HTML issues? Does mIRC interpret HTML in its interface? Oh, it doesn't? Well then, who cares if HTML doesn't use < > as a delimiter.

Above was posted:

The characters "<" and ">" are unsafe because they are used as the delimiters around URLs in free text; the quote mark (""") is used to delimit URLs in some systems. The character "#" is unsafe and should always be encoded because it is used in World Wide Web and in other systems to delimit a URL from a fragment/anchor identifier that might follow it. The character "%" is unsafe because it is used for encodings of other characters. Other characters are unsafe because gateways and other transport agents are known to sometimes modify such characters. These characters are "{", "}", "|", "\", "^", "~", "[", "]", and "`".

All of those characters listed cause the same problem in mIRC.

If I post a URL like {http://www.example.com} and click on it in mIRC, it works the way it should. If I post a URL like [http://www.example.com] and click on it, it works the way it should. If there is ANY other character besides a space after the delimiter (the RFC defines it as such), mIRC adds that character to the URL string and sends it to the browser (any browser exhibits the same problem).

So, keeping this in mind, let's look at a valid URL:

<a href="http://www.example.com">example</a>

Now, first it should be noted that the RFC isn't being followed at all (not that we didn't already know that), because that URL isn't clickable (the URL would be delimited by the "). However, if we instead use

"http://www.example.com">example

it sends http://www.example.com>example to the browser (notice that the " is stripped).

{www.foo.com}bar
<www.foo.com>joe
|www.foo.com|frank
[www.foo.com]tom
"www.foo.com"bar
[www.foo.com];
'www.foo.com';

All cause the exact same problem and all ONLY happen in mIRC. So again, tell us how this isn't a mIRC problem?
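For comparison, here is a small Python sketch of a recognizer that ends a link at any of the delimiter characters listed above, run against the same test strings. It is only an assumption of how such a rule could look, not a description of mIRC's parser; the name first_url is made up, and the single quote is included only because one of the test strings uses it (it is not in the RFC's list).

```python
import re

# Assumption: a link starts with http://, https:// or www. and ends at
# whitespace or at any of these delimiter characters: < > " ' { } | [ ]
CANDIDATE = re.compile(r"""(?:https?://|www\.)[^\s<>"'{}|\[\]]+""")

def first_url(text):
    """Return the first link-like run in text, or None if there is none."""
    match = CANDIDATE.search(text)
    return match.group(0) if match else None

samples = [
    "{www.foo.com}bar",
    "<www.foo.com>joe",
    "|www.foo.com|frank",
    "[www.foo.com]tom",
    '"www.foo.com"bar',
    "[www.foo.com];",
    "'www.foo.com';",
]

for sample in samples:
    print(sample, "->", first_url(sample))

print(first_url('"http://www.example.com">example'))
```

Every sample comes out as www.foo.com, and the last line prints http://www.example.com without the trailing >example.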
|
Nutrimatic drinks dispenser
Joined: Nov 2004
Posts: 8 |
Quote:
If you'd bother to sit back and look at how websites are developed nowadays you'd understand that quoting URLs inside <>'s is obsolete.

Why do you keep bringing up website design? This has nothing to do with website design. Nowhere in the RFC does it say "This RFC applies only to website content." This has to do with mIRC properly recognizing URLs so that a feature it offers will not be buggy.
|
Pikka bird
Joined: Apr 2003
Posts: 16 |
Let's see here. First off, mIRC is an IRC program, not HTTP. Therefore it follows the IRC RFC. The URL clicking was added just as a convenience. It's not a bug, because it takes whatever characters come after http: on an IRC channel; if you're posting URLs on IRC with < in them, etc., then you are at fault, NOT mIRC. IRC is IRC, NOT HTTP... the URL click is a convenience, that's all. This is NOT a bug. mIRC should not be following ANY web RFCs for something said on IRC. Just think about that... If it followed every RFC for everything out there other than IRC, your display would look pretty trashed in the way mIRC would have to display things.
|
OP
Nutrimatic drinks dispenser
Joined: Jul 2004
Posts: 8 |
Quote:
Yes it is an invalid character but clicking on the link in the original post demonstrates that perhaps it should be the browser that filters it out instead of mIRC. No browser I used specifically rejected the URL.

I just now realized why you said that. ubbthreads has a similar issue with its parsing of URLs.

If I post http://www.example.com it will be clickable, because UBB parses it and makes it a URL.
<http://www.example.com> is not clickable.
<http://www.example.com>joe isn't clickable, as it shouldn't be (it is being argued that it should find text in < > as a URL).

Now, if we use http://www.example.com>; or even http://www.example.com> then, lo and behold, it's clickable and returns an invalid URL (adding the >; or >). UBB also doesn't follow the RFC properly. All of this text is typed in as plain text, with no fancy formatting or UBB code or anything.

By the way, what browser do you use? This does the same in IE and Firefox, the two most popular browsers out there.

Last but not least, this is an example of a valid URL parser correctly parsing text: http://sourceforge.net/forum/message.php?msg_id=2860017
|
OP
Nutrimatic drinks dispenser
Joined: Jul 2004
Posts: 8 |
We are not talking about the HTTP RFC; we are talking about the URL RFC.
IRC is not HTTP, you're right; no one is going to disagree there. But if mIRC is going to make URLs clickable, it should follow the URL RFC.
Also, URLs are clickable without the http://
If mIRC followed every RFC that applied to its features, then mIRC would be a better program. It would not be trashy, because no RFC defines how your display should look and feel.
|
Pikka bird
Joined: Apr 2003
Posts: 16 |
By the way, just so you are aware, and to keep people HAPPY: here is a small script that will strip out the < and > and ; if the line contains http. Use it and modify it to your needs, and then you won't have a reason to complain.

on ^*:TEXT:*http*:#: {
  ; strip out <
  var %cleantext = $remove($1-,$chr(60))
  ; strip out >
  var %cleantext = $remove(%cleantext,$chr(62))
  ; strip out the ;
  var %cleantext = $remove(%cleantext,$chr(59))
  ; show the cleaned line the standard way, then suppress the default display
  echo $chan $timestamp < $+ $nick $+ > %cleantext
  halt
}

Now when someone says:

hello <http://www.dragonflu.com>; how you like?

it will display as:

[time] <nick> hello http://www.dragonflu.com how you like?

It's set to display text the standard way mIRC does, with timestamp, etc. If you don't want the timestamp, just remove the $timestamp. As I said, this is a SIMPLE script, made simple so people won't get confused.
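One thing to keep in mind with the script above: $remove takes out every <, > and ; on the line, including a semicolon that is really part of a URL (as pointed out earlier in the thread, semicolons are common in query strings). Below is a hedged Python sketch of a narrower cleanup that only unwraps <URL> and drops a single ; directly after the closing bracket. It is not mIRC code; the pattern, the name clean_line and the example.com address are assumptions made up for this illustration.

```python
import re

# Assumption: only text of the form <URL> (optionally followed by one ";")
# is rewritten; every other <, > or ; on the line is left untouched.
WRAPPED = re.compile(r"<((?:https?://|www\.)[^<>\s]+)>;?")

def clean_line(text):
    """Unwrap <URL> and drop one ';' that directly follows the '>'."""
    return WRAPPED.sub(r"\1", text)

print(clean_line("hello <http://www.dragonflu.com>; how you like?"))
print(clean_line("see <http://example.com/a;b=1>; ok"))
```

The first line prints the same text the script is meant to display (hello http://www.dragonflu.com how you like?), while the second keeps the semicolon inside the URL, which a blanket removal would delete.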
|
Nutrimatic drinks dispenser
Joined: Nov 2004
Posts: 8 |
Quote:
The url clicking was added just as a convenience, its not a bug cause it takes what characters are after http: on an irc channel, if your posting urls on IRC with < in them, etc then you are at fault NOT mIRC.

So, you don't care how buggy extra features are as long as mIRC knows how to talk to the IRC server, since those are just conveniences. After all, IRC is IRC, not a GUI or being able to script your client or any of the other conveniences mIRC provides. Wrong! You would be bitching right along with us if there was a bug in the GUI. If you're going to add a feature for the convenience of your users, it should *not* be buggy. If it is, you should suck it up and fix the problem.

Quote:
mIRC should not be following ANY web RFC's for something said on IRC

You should educate yourself before you speak. The RFC is not specific to browsing the web. It is specific to describing the format of a valid URL. More than just web browsers need to know what a valid URL is, hence the RFC.

Quote:
just think about that... If it followed every RFC for everything out there other than IRC your display would look pretty trashed in the way mirc would have to display things

That makes absolutely no sense. How would conforming to relevant RFCs make your display "pretty trashed in the way mirc would have to display things"?
|