|
Joined: Dec 2002
Posts: 36
Ameglian cow
Define inconsistent state. Assume a user has had the channel #test_ä in his NickServ autojoin list for a year. When he identifies, NickServ uses an SVSJOIN to make the user join his channel #test_ä in an ISO charset. The user now has a channel window open and sees the other users in the channel. But since mIRC 7 sends only UTF-8, this user:
- cannot send PRIVMSGs to the channel
- cannot part the channel (the window closes, but since mIRC sends a PART #test_ä in UTF-8, the user is still joined)
- cannot WHOIS users in this channel if chmode +s is used
I would call this an inconsistent state. You can simulate this with /raw -n join #test_ä.
We have many users who ask us in the support channels what has happened, because they do not understand; they just updated a client. With mIRC's autoupdate, the user gets no warning. That's a serious problem in my opinion.
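To make the mismatch concrete, here is a minimal Python sketch (not mIRC's or any server's actual code) of why the PART never matches: the same visible name produces different bytes in ISO-8859-1 and in UTF-8, and the server compares bytes. The helper name `part_command` is made up for illustration.

```python
# Sketch only: a server treats a channel name as a byte string, so two
# encodings of the same visible name are two different channels.

def part_command(channel: str, encoding: str) -> bytes:
    """Build the raw PART line a client would send in the given encoding."""
    return f"PART {channel}".encode(encoding) + b"\r\n"

name = "#test_ä"
joined_as = name.encode("iso-8859-1")  # bytes used when the user was joined
sent_as = name.encode("utf-8")         # bytes a UTF-8-only client sends

print(joined_as)             # b'#test_\xe4'
print(sent_as)               # b'#test_\xc3\xa4'
print(joined_as == sent_as)  # False: the PART names a different channel
```

The window closes on the client side, but the server never sees a PART for the channel the user is actually in.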
cu
TC / Mario
|
|
|
|
Joined: Dec 2002
Posts: 5,476
Hoopy frood
Thanks for your comments mths.
I think it would probably make sense for mIRC to hide ISO channels in /list. It may even make sense to /part an auto-joined ISO channel and automatically /join the UTF-8 version. That would migrate many users to UTF-8 automatically.
As you know, my main hope is that this change will lead to a gradual change to UTF-8. I realize it will be frustrating initially but there does not seem to be another option - codepages will continue to be used, with all of their limitations and issues, without this change.
Regarding administrators: the only way to resolve this would be to add a UTF-8 enable/disable option. It is not quite that simple however - UTF-8 handling is embedded in a number of routines, such as file input/output, so it would have to be disabled in many places.
|
|
|
|
Joined: Dec 2002
Posts: 36
Ameglian cow
"I think it would probably make sense for mIRC to hide ISO channels in /list. It may even make sense to /part an auto-joined ISO channel and automatically /join the UTF-8 version. That would migrate many users to UTF-8 automatically."
What? mIRC is not the one and only client; we have many users who use other OSes and other clients. Migrating an existing channel to UTF-8 means that all users have to do it. All web-based clients have to be updated, and all Java clients. That's not easy for a normal user without experience in that.
I'd like a workaround like this. It's only an idea, but maybe it could work: if mIRC detects a source or destination that contains ISO characters, this channel (or maybe nickname) must be enclosed with a special character that survives the translation to UTF-8 and back. Let it be Ctrl-I (#29), for example. Then you only have to change the part that receives the input and the part that sends it to the server. Using Ctrl-I would have a nice side effect: the name would be displayed in italics, so the user could tell it apart from UTF-8 names.
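The marker idea above can be sketched roughly as follows in Python. This is only an illustration of the proposal, not anything mIRC does: the function names `from_wire` and `to_wire` are hypothetical, and ISO-8859-1 is assumed as the only legacy charset.

```python
# Sketch of the proposed workaround: names that are not valid UTF-8 are
# wrapped in a marker character when received, and the marker tells the
# sending side to restore the original bytes.

MARK = "\x1d"  # character 29, the Ctrl-I code suggested in the post

def from_wire(raw: bytes) -> str:
    """Decode a name from the server, marking it if it is not UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return MARK + raw.decode("iso-8859-1") + MARK

def to_wire(name: str) -> bytes:
    """Encode a name for the server, honouring the marker if present."""
    if len(name) >= 2 and name.startswith(MARK) and name.endswith(MARK):
        return name[1:-1].encode("iso-8859-1")
    return name.encode("utf-8")

legacy = b"#test_\xe4"
print(to_wire(from_wire(legacy)) == legacy)  # True: bytes survive the round trip
```

One limitation of this sketch: a legacy name whose bytes happen to form valid UTF-8 would be treated as UTF-8, so the detection is a heuristic.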
cu
TC / Mario
|
|
|
|
Joined: Oct 2003
Posts: 3,918
Hoopy frood
Any workaround will ultimately be a hack. Ideally mIRC should NOT be encoding the TARGET portion of any IRC command, as this violates the IRC protocol (a target is a unique 8-bit encoded, space and comma delimited identifier and must be represented exactly in its original form). Of course, it's fairly difficult to differentiate what part of a command is a target and how it should be re-encoded, given that the destination encoding is unknown unless the specific target was listed by the server.
Basic joining/parting and messaging might work (because mIRC knows the original target's bytestring), but scripts would still not work. For instance, "/msg # hello" could not work because the # result will be encoded as UTF-8 in the script. There's no real way to support this except by re-adding codepage support, or a much more complex/comprehensive encoding layer, which Khaled has stated is extremely unlikely to happen. Without script support, the basic functionality isn't really worth supporting. Many users will still complain that "my script does not work in #CHANNELNAME", and the problem will persist until they're on a UTF-8 channel, so this isn't exactly a solution. All that work in supporting codepaged channels just to tell users to use UTF-8 to get a proper IRC experience anyway? Doesn't seem worth it to me.
A much better long-term strategy is indeed to migrate people over to UTF-8, which is why, although subversive and sneaky, I think this might be a good idea. Khaled outlined this strategy in this post, and you should probably read it to understand where he stands on this issue.
Most clients are already using UTF-8 anyway, so mIRC is not really alone. I'd bet that a good 80% of the users in these codepaged channels are on some version of mIRC, so this is actually not an issue of there being other clients out there to deal with, but rather the fact that most other clients already handle UTF-8 and mIRC 6.x was lagging behind.
And indeed, the results on Undernet support this:
(19:59:43) -> [#québec] VERSION
(19:59:43) [ElrickPOT22 VERSION reply]: mIRC v6.35 Khaled Mardam-Bey
(19:59:44) [Mazda3 VERSION reply]: mIRC v6.31 Khaled Mardam-Bey
(19:59:44) [Mayerz16V VERSION reply]: mIRC v6.14 Khaled Mardam-Bey
(19:59:44) [asiatiQboys VERSION reply]: mIRC v6.35 Khaled Mardam-Bey
(19:59:44) [G28Qc VERSION reply]: mIRC v6.35 Khaled Mardam-Bey
(19:59:44) [mezcal VERSION reply]: mIRC v6.35 Khaled Mardam-Bey
(19:59:44) [sam--- VERSION reply]: mIRC v6.35 Khaled Mardam-Bey
(19:59:44) [{bottine} VERSION reply]: mIRC v6.35 Khaled Mardam-Bey
(19:59:44) [Egli VERSION reply]: mIRC v6.35 Khaled Mardam-Bey
(19:59:44) [argus VERSION reply]: mIRC v6.16 Khaled Mardam-Bey
(19:59:44) [cuish VERSION reply]: mIRC v6.35 Khaled Mardam-Bey
(19:59:44) [Ju|iee VERSION reply]: mIRC v6.35 Khaled Mardam-Bey
(19:59:44) [^Ph4nt0M^ VERSION reply]: mIRC v6.21 Khaled Mardam-Bey
(19:59:44) [rawInDaHouse VERSION reply]: mIRC v6.35 Khaled Mardam-Bey
(19:59:44) [BodimancheH VERSION reply]: mIRC v6.21 Khaled Mardam-Bey
(19:59:44) [C[0_O]LMuSt VERSION reply]: mIRC v6.03 Khaled Mardam-Bey
(19:59:44) [ZombieBOT VERSION reply]: mIRC32 v5.91 K.Mardam-Bey
(19:59:44) [oli-1979 VERSION reply]: mIRC v6.35 Khaled Mardam-Bey
(19:59:44) [homme-40-pq VERSION reply]: mIRC v6.35 Khaled Mardam-Bey
(19:59:44) [mokimo VERSION reply]: mIRC v6.2 Khaled Mardam-Bey
(19:59:44) [_LeN0IR VERSION reply]: Miranda IM 0.8.27.0 (IRC v.0.8.27.0 Unicode), (c) 2003-09 J.Persson, G.Hazan
(19:59:44) [LooseCannon VERSION reply]: mIRC v6.3 Khaled Mardam-Bey
(19:59:44) [Emy_OUT VERSION reply]: mIRC v6.35 Khaled Mardam-Bey
(19:59:44) [M4XL4T3RR3UR VERSION reply]: mIRC v6.35 Khaled Mardam-Bey
(19:59:44) [xavier3 VERSION reply]: mIRC v6.16 Khaled Mardam-Bey
(19:59:44) [Sol28QC VERSION reply]: mIRC v6.31 Khaled Mardam-Bey
(19:59:44) [broly VERSION reply]: mIRC v6.34 Khaled Mardam-Bey
(19:59:44) [wonderfield VERSION reply]: mIRC v6.35 Khaled Mardam-Bey
(19:59:44) [SwEdEn-_- VERSION reply]: mIRC v6.35 Khaled Mardam-Bey
(19:59:44) [MrHyde VERSION reply]: mIRC v6.35 Khaled Mardam-Bey
(19:59:44) [Cammalleri VERSION reply]: mIRC v6.31 Khaled Mardam-Bey
(19:59:45) [Philippe25 VERSION reply]: mIRC v6.35 Khaled Mardam-Bey
(19:59:45) [LeVampire VERSION reply]: mIRC v6.35 Khaled Mardam-Bey
(19:59:45) [Luc_Brien VERSION reply]: mIRC v6.35 Khaled Mardam-Bey
(19:59:46) [Vegetal VERSION reply]: mIRC v6.35 Khaled Mardam-Bey
Frankly, servers can do a lot to help people in this process by redirecting them to UTF-8, but server implementors tend to be uninterested in moving this process along. In fact, you can read the beta threads regarding unicode to find an exchange with a server implementor about how "encoding support" is meant to be the client's responsibility, not the server's. Note that this goes against your previous post that "encoding should be handled by the server", and therein lies the vicious cycle. Server developers think it's the client's responsibility; client developers think it's the server's responsibility. mIRC is taking the step of trying to settle the dispute once and for all by sticking to a specific implementation.
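For anyone who wants to repeat this kind of count on their own channel, here is a rough Python sketch that tallies VERSION replies in the log format shown above. The regular expression and the names `client_family` and `tally` are assumptions made for this example; only four of the quoted replies are included as sample data.

```python
import re
from collections import Counter

# Matches lines such as "(19:59:44) [nick VERSION reply]: mIRC v6.35 ..."
REPLY = re.compile(r"\[(?P<nick>.+) VERSION reply\]: (?P<version>.+)$")

def client_family(version: str) -> str:
    """Classify a VERSION string as mIRC (including mIRC32) or other."""
    return "mIRC" if version.startswith("mIRC") else "other"

def tally(lines):
    """Count replies per client family, ignoring lines that are not replies."""
    counts = Counter()
    for line in lines:
        match = REPLY.search(line)
        if match:
            counts[client_family(match.group("version"))] += 1
    return counts

sample = [
    "(19:59:43) [ElrickPOT22 VERSION reply]: mIRC v6.35 Khaled Mardam-Bey",
    "(19:59:44) [ZombieBOT VERSION reply]: mIRC32 v5.91 K.Mardam-Bey",
    "(19:59:44) [_LeN0IR VERSION reply]: Miranda IM 0.8.27.0 (IRC v.0.8.27.0 Unicode), (c) 2003-09 J.Persson, G.Hazan",
    "(19:59:46) [Vegetal VERSION reply]: mIRC v6.35 Khaled Mardam-Bey",
]

counts = tally(sample)
print(counts["mIRC"], counts["other"])  # 3 1
```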
- argv[0] on EFnet #mIRC - "Life is a pointer to an integer without a cast"
|
|
|
|
Joined: Dec 2002
Posts: 155
Vogon poet
This is a bigger issue than it appears to be, especially for languages for which an ANSI encoding has always been enough. Right now, I'm sure that at least one popular channel is having trouble with this on Undernet, where channel registration is not taken lightly.
In any case, at least Khaled is trying to take a step forward by adding full UTF-8 support. However, I'm not sure that it's a good idea to make mIRC so that it can no longer interact with ANSI-coded channels within an instance of an IRC server. I agree with mths: this can be a security issue. I hope Khaled can come up with a convenient solution for this.
What I'm sure about is that this is a bigger issue than many people think, and this transition is going to be very bothersome. Right now, I'd rather stick with 6.35, disabling and enabling UTF-8 display and encoding as I see fit.
Last edited by Strider; 09/08/10 12:12 AM.
|
|
|
|
Joined: Oct 2003
Posts: 3,918
Hoopy frood
I think you're making it into a bigger issue than it is. I've just been doing surveys of networks like Undernet and QuakeNet, and I'm seeing maybe 5% of the registered channels with UTF-8 incompatible channel names. It might be a really big deal to those 5%, and I completely understand why and sympathize, but those users can always stick to 6.35 if they need this support. The other 90+% of users will greatly benefit from UTF-8.
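A survey like this can be approximated by checking which raw channel names fail to decode as UTF-8. The Python sketch below is not necessarily how the numbers above were obtained; the function names and the sample list are invented for illustration, and pure ASCII names count as compatible because ASCII is a subset of UTF-8.

```python
def is_utf8_compatible(raw_name: bytes) -> bool:
    """Return True if the raw channel name is valid UTF-8."""
    try:
        raw_name.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

def incompatible_share(raw_names) -> float:
    """Fraction of channel names that are not valid UTF-8."""
    names = list(raw_names)
    if not names:
        return 0.0
    bad = sum(1 for name in names if not is_utf8_compatible(name))
    return bad / len(names)

# Hypothetical sample: one legacy ISO-8859-1 name out of four.
channels = [b"#mirc", b"#help", "#québec".encode("utf-8"), b"#test_\xe4"]
print(incompatible_share(channels))  # 0.25
```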
The security side of this is relatively bogus. Besides EFnet, there are really no networks without services support. More importantly, opers aren't drones. If a user registers the unicode equivalent of an existing codepage channel name in order to gain control of a channel, an oper can quickly find out whether the use is legitimate or not and decide whether to revoke the registration.
- argv[0] on EFnet #mIRC - "Life is a pointer to an integer without a cast"
|
|
|
|
Joined: Feb 2003
Posts: 3,432
Hoopy frood
"Thanks for your comments mths.
I think it would probably make sense for mIRC to hide ISO channels in /list. It may even make sense to /part an auto-joined ISO channel and automatically /join the UTF-8 version. That would migrate many users to UTF-8 automatically."
I don't know if that sounds like a good idea. This means a lot of people need to rewrite/update their servers, and they need to do it for one program people use to chat with. There are many others out there, and I for one believe this will force many users to stick with an outdated program (mIRC v6.x). And as argv0 said, "QuakeNet, and I'm seeing maybe 5% of the registered channels with UTF-8 incompatible channel names." I use åäö in my language, and many channels I'm in use some of those letters, and QuakeNet is only one network. Think of all the other networks too; there you find many more channels and users joining channels with UTF-8 incompatible channel names.
if ($me != tired) { return } | else { echo -a Get a pot of coffee now $+($me,.) }
|
|
|
|
Joined: Oct 2004
Posts: 8,330
Hoopy frood
Just as a note on those channels, the ops can easily join the UTF8 version of those channels and slowly work to move people to the UTF8 version of the channel. This would really be a good idea even if mIRC adds some easy way to continue using the non-UTF8 channels. It would be easy to set up the UTF8 channel and then ask people to move over there. The ops can then leave a "bot" in the channel that will tell anyone who joins the non-UTF8 channel to use the UTF8 channel instead and could even /invite the user to the correct channel. Eventually, more clients are likely to start using unicode and you'll start running into more problems with users unable to access the non-UTF8 channels. It would be in everyone's best interest to start making the move.
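A redirect bot along those lines could look roughly like the Python sketch below. It only builds the raw lines to send and leaves out the connection handling; the channel name, the notice text, and the handler name `on_join` are placeholders, and the bot is assumed to work on raw bytes so that it can sit in the legacy channel at all.

```python
# Sketch of a redirect bot's join handler. The bot stays in the legacy
# (ISO-8859-1) channel and points newcomers at the UTF-8 channel.

LEGACY = "#test_ä".encode("iso-8859-1")  # example channel, legacy bytes
MODERN = "#test_ä".encode("utf-8")       # same visible name in UTF-8

def on_join(nick: bytes, channel: bytes) -> list[bytes]:
    """Return the raw IRC lines to send when `nick` joins `channel`."""
    if channel != LEGACY:
        return []
    return [
        b"NOTICE " + nick + b" :This channel has moved; please join the UTF-8 version.\r\n",
        b"INVITE " + nick + b" " + MODERN + b"\r\n",
    ]

for line in on_join(b"Mario", LEGACY):
    print(line)
```

Whether the INVITE is accepted depends on the bot's status in the UTF-8 channel and on the channel modes set there.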
Yes, changing the servers themselves takes more work, but changing a channel is fairly easy to do and considering you're keeping the same name (just encoded differently), you aren't sacrificing anything.
It was much more work for me to update the Invision script to support unicode/mIRC 7.1 than it would be for me to move everyone in one of my channels to a unicode version of the channel if I had such a channel. And I would have already made that move if I had that kind of channel. It would have been done before mIRC 7.1 was ever released because it was very well known to anyone following beta that this was coming.
Invision Support #Invision on irc.irchighway.net
|
|
|
|
Joined: Oct 2003
Posts: 3,918
Hoopy frood
sparta:
I did not only test one network. I understand there are probably "worse" choices in terms of network, however QuakeNet, being the largest network, has the widest random sample population. You need only take a stats 101 course to realize that 5% of a large enough random sample size means it's likely to be 5% in other places too (except for specialty networks or hyper-regional ones).
For comparison, I just tested RusNet, which I guessed would have a much higher percentage, but the numbers are closer to 10-12%, ie. high, but not as high as I thought. This is, of course, a biased sample, and probably one of the worst-case scenarios. I should also point out that RusNet has (experimental) server-side utf-8 support, so it's not as big a deal there either. Finally, putting this all in context, RusNet only has 5,000 users and 6,000 channels. 10% of that number is small change.
But again, even if there are more than 10%, the vast majority of this 5% (or 10%, or 15%?) is still mIRC clients. Of all the channels I polled, almost 95% were mIRC. Other clients I noticed were: kVIRC (once), x-chat (once), Miranda IM (once, and ironic because it was a "Unicode build"), and irssi 0.8.15 (latest version, maybe 15 times), and a handful of eggdrops. irssi was the second largest number, but still paled in comparison to the amount of mIRC clients that I counted.
The rejoin function Khaled proposed is to avoid rewriting servers. mIRC is unable to join these codepage encoded channels, so they already have to stick to 6.35 anyway. Migrating the users that *do* upgrade seems like the sane solution, rather than doing nothing.
On a sidenote, if you think 6.35 is outdated, you should see the clients these users use. On some channels, the average version was 6.2 (and went as low as 5.9). I don't think they will be complaining about having to continue using their version. Clearly, these aren't the users who will be upgrading to 7.1 any time soon.
- argv[0] on EFnet #mIRC - "Life is a pointer to an integer without a cast"
|
|
|
|
Joined: Aug 2010
Posts: 1
Mostly harmless
uh, did I get this right? There is currently no way to join ISO channels using mIRC anymore?
To be honest, this is a reason for me to switch clients. It may very well be ideologically right to settle on UTF-8 in 2010, but is it also practical? I am using an irssi proxy in a Linux Latin-1 environment, and besides not being able to join ISO channels, I am also not able to receive query messages anymore.
Maybe one should add a note next to the mIRC download button: "Beware: Cannot handle ISO-8859-x anymore!"
|
|
|
|
Joined: Oct 2004
Posts: 8,330
Hoopy frood
The main reason to even use 7.1 versus 6.35 is for unicode support. If you don't want or need it, then 6.35 works just fine and you're not going to miss out on all that much... yet. As newer versions come out with additional features, you'll slowly start missing out on a lot by not using valid UTF8 channels.
Invision Support #Invision on irc.irchighway.net
|
|
|
|
Joined: Feb 2006
Posts: 12
Pikka bird
I think mIRC really needs to keep exactly what the server sends. I.e. if there's an invite with a channel name that is not UTF8 encoded, it should join the non-utf8 channel. And whenever that channel name is used it should be used in the non-utf8 way so messages to the channel etc. actually arrive.
You cannot expect network admins to upgrade all their ircds quickly just due to you dropping non-utf8 support. As you might know, ircd upgrades require restarts of the ircd (unless they are very modular, but some of the biggest networks like Quakenet, Undernet and Gamesurge use non-modular ircds), which results in dropping users. This usually means updates can't be done quickly because servers have to be removed from the DNS round-robins and then you have to wait a few days for the server to become less populated. Assume you have 15 servers and you want to wait 3 days after removing them from the RR before you upgrade them, to keep the number of users quitting in a netsplit low. Then, if you upgrade 3 servers at once, it will take you about 2 weeks until you've upgraded all servers. That is two weeks in which mirc7 users will be pissed about not being able to join their channels (switching them to utf8 is not an option, since then mirc6 users will not be able to join without an invite or without using copy&paste to get into the utf8 channel). And with irc being less popular thanks to all this social network crap etc., I don't think any network wants to lose users due to this.
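The two-week estimate follows from simple arithmetic, sketched here in Python. The assumptions are that batches are upgraded one after another, that each batch needs the full waiting period, and that the restart itself takes negligible time; `rollout_days` is a name chosen for this example.

```python
import math

def rollout_days(servers: int, batch_size: int, wait_days: int) -> int:
    """Days needed when batches are drained and upgraded one after another."""
    batches = math.ceil(servers / batch_size)
    return batches * wait_days

# 15 servers, 3 at a time, 3 days of draining per batch: 5 batches.
print(rollout_days(15, 3, 3))  # 15, i.e. roughly two weeks
```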
So I think mIRC should be smart. When joining a channel manually it should check if either the UTF8 or the non-UTF8 version exists. If the non-UTF8 version exists and the UTF8 version doesn't, it should use the non-UTF8 version. This would avoid issues with legacy channel names while actively promoting UTF8, as newly created channels would use UTF8 encoding.
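The selection rule proposed here can be written down as a short Python sketch. How a client would learn which channels exist (for example from /list output) is left open and is an assumption of the example, as are the function name `choose_join_target` and the default legacy encoding.

```python
def choose_join_target(name: str, existing: set[bytes],
                       legacy_encoding: str = "iso-8859-1") -> bytes:
    """Pick the byte form of `name` to JOIN, preferring UTF-8."""
    utf8 = name.encode("utf-8")
    try:
        legacy = name.encode(legacy_encoding)
    except UnicodeEncodeError:
        return utf8  # name cannot be written in the legacy charset at all
    # Fall back to the legacy bytes only if that is the sole existing version.
    if legacy in existing and utf8 not in existing:
        return legacy
    return utf8

print(choose_join_target("#test_ä", {b"#test_\xe4"}))  # b'#test_\xe4'
print(choose_join_target("#test_ä", set()))            # b'#test_\xc3\xa4'
```

When both versions exist, this rule always picks the UTF-8 one, and it gives the user no way to choose the UTF-8 channel while only the legacy one exists.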
|
|
|
|
Joined: Oct 2004
Posts: 8,330
Hoopy frood
It really isn't that simple. For one, both versions can exist. For another, if it did that, you could never choose to join the UTF8 version of the channel if the non-UTF8 version existed. Those are just two examples of why you can't do it the way you suggest. It would also be a pain to manage all of the events that need to track whether to use a UTF8 or non-UTF8 channel name.
There may be options available for IRCops that will let them manage the "outdated" channel names at some point. In the meantime, mIRC 6.35 works just fine. Almost the entire reason to upgrade to mIRC 7.1 is FOR Unicode support. If you don't want or need it, there's not much reason to upgrade from 6.35 at this time. IRCops can easily stick to that without any issues until either mIRC offers an option for them to handle those non-UTF8 channels/nicks or they update their servers to no longer use non-UTF8.
Invision Support #Invision on irc.irchighway.net
|
|
|
|
Joined: Jul 2006
Posts: 248
Fjord artisan
Any channels or servers that use only code page support and not UTF8 support will have issues until they start supporting UTF8. You will have to stick to 6.35 until support for UTF8 is added to those channels and/or servers.
By the way, the IRC protocol standard document clearly states that there is no character set defined; in particular, this means payload text is raw octets (8 bits, sic!) and conversion to characters is up to clients. So, you cannot accuse any server of not supporting UTF-8 (or any other code page).
The main reason to upgrade to mIRC 7.1 is for unicode support and if you aren't going to use it, there's not much reason to upgrade anyhow.
For me, it was a kind of race condition on losing focus under Windows 7 with Aero enabled. (mIRC 6.35 had more time on CPU than IE 8, heh)
Sorry to interrupt the UTF-8 praises, but the theoretical maximum of 491 bytes (:n!u@h PRIVMSG #c :<491 bytes of payload><CR><LF>) holds only half as many characters (or a third for eastern IRCers). That's why people speaking unsupported languages are in no hurry to make their IRC UTF-8 compliant.
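The 491-byte figure and the halving can be checked with a few lines of Python. This is a sketch assuming the 512-byte line limit from the spec and the minimal prefix used in the post; the Cyrillic sample word is an arbitrary choice for the example.

```python
MAX_LINE = 512  # limit from the IRC spec, including the trailing CR LF

prefix = b":n!u@h PRIVMSG #c :"  # shortest plausible prefix, as in the post
payload_budget = MAX_LINE - len(prefix) - len(b"\r\n")
print(payload_budget)  # 491

# The same text costs twice as many bytes in UTF-8 as in a Cyrillic codepage.
text = "привет"
print(len(text.encode("cp1251")))  # 6
print(len(text.encode("utf-8")))   # 12
```

With a realistic nick, user, host, and channel name the prefix is longer, so the usable payload is smaller than 491 bytes in practice.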
|
|
|
|
Joined: Oct 2003
Posts: 3,918
Hoopy frood
"By the way, IRC protocol standard document clearly states what there is no character set defined, ... So, you cannot accuse any server in not supporting UTF-8"
You can accuse them all you want. You're misinterpreting the spec. The RFC states that the specification defines no specific encoding, i.e. it is undefined by the spec and can be left up to the implementation of the client and/or the server. Just because the spec does not say "you must support encoding X" does not mean a server implementor cannot be accused of not supporting encoding X. The HTTP specification does not force user-agents to support the PUT method, for instance, but the lack of support for PUT in popular browsers has been a huge hindrance to REST adoption over the last few years. Even though they can cite the spec all they want, web browser developers are still to blame for this. It's their responsibility (just like it is with servers) to ensure that the spec is implemented in a way that is most useful to its users.
Encoding is a large problem for all IRC clients. Supporting encodings on the server is the easiest way to solve this problem. It should be done. Clinging to the literal spec that was written when computers had less than a megabyte of RAM is stifling. And people wonder why IRC is declining in popularity. Hint: it has nothing to do with webcam support; the problem is much more basic than that.
"Sorry to interrupt UTF-8 praises, but theoretical maximum of 491 bytes (:n!u@h PRIVMSG #c :<491 bytes of payload><CR><LF>) reduces by 2 times"
SJIS, the popular encoding for those "eastern ircers", also uses 2 bytes, so you're reducing the byte count there too. And again, you're taking the spec way too literally here. The 512 byte length has always been an arbitrary value imposed by implementations of the time; again, this was written in 1993, when 0.5kb was a lot of data. 512 is a common value supported by some servers since it's listed in the spec, but not all servers follow this.
If UTF-8 adoption increased (and it has), servers would adjust accordingly. It would be similar to the way mIRC adjusted its line length in scripts from 900 to 4096 recently.
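The SJIS comparison can be checked the same way. In this small Python sketch, the helper `byte_cost` and the sample text are chosen for the example; it shows that Japanese text already costs 2 bytes per character in Shift JIS, so UTF-8 adds about half again rather than tripling the size.

```python
def byte_cost(text: str, encodings: list[str]) -> dict[str, int]:
    """Number of bytes `text` occupies in each of the given encodings."""
    return {enc: len(text.encode(enc)) for enc in encodings}

costs = byte_cost("日本語", ["shift_jis", "utf-8"])
print(costs)  # {'shift_jis': 6, 'utf-8': 9}
```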
- argv[0] on EFnet #mIRC - "Life is a pointer to an integer without a cast"
|
|
|
|
|