Joined: Jan 2003
Posts: 20
Ameglian cow
OP
A user entered one of our rooms today with a very odd nickname.
In Unicode: ι̲̅и̲̅т̲̅э̲̅и̲̅ѕ̲̅ιfι̲̅э̲̅đ̲̅Ł̲̅о̲̅ģ̲̅ι̲̅ς̲̅
In UTF-8: ι̲̅и̲̅т̲̅Ñ̲̅и̲̅ѕ̲̅ιfι̲̅Ñ̲̅đ̲̅Å̲̅о̲̅ģ̲̅ι̲̅ς̲̅
Yeah, I know you most likely can't read it here, but it displays fine in the mIRC nicklist.
The problem I am noticing is that mIRC isn't giving the entire UTF-8 value as the $nick so none of the events are displaying correctly.
Value displayed: ι̲и̅т̲Ñ̅и̲ѕ̅ιfι̅ÑÌ…Ä‘Ì…Å̅о̅ģÌ
The correct value has a length of 191 characters, while the processed value has only 110. This prevents the name from being encoded and decoded properly. The only way I have found so far to display the data is to connect to the server via a socket connection and display events manually as they are received, instead of sending them back to the local server to be processed by mIRC. Is there another way of handling this so that I can still use the mIRC events?
* Note: This also causes issues when the user joins or leaves the channel. mIRC does not remove them from the nicklist, and if they rejoin, it adds the shortened value to the nicklist while the original is still listed, since it did not register the user leaving.
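To illustrate the mismatch between character count and byte count, here is a minimal Python sketch (not mIRC script). It assumes, without any confirmation of how mIRC actually works, that a name gets cut at a byte limit. `SAMPLE_NICK`, the helper names and the limit of 11 bytes are all hypothetical and chosen only for the demonstration.

```python
# Hypothetical sample: a Greek iota followed by two combining marks,
# repeated three times, similar in shape to the nickname in this post.
SAMPLE_NICK = "\u03b9\u0332\u0305" * 3


def utf8_length(text: str) -> int:
    """Number of bytes the text occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))


def truncate_bytes(text: str, limit: int) -> bytes:
    """Cut the UTF-8 encoding at a byte limit, ignoring character boundaries."""
    return text.encode("utf-8")[:limit]


def as_single_byte_text(raw: bytes) -> str:
    """Render raw bytes one character per byte (Latin-1), as a non-UTF-8 display would."""
    return raw.decode("latin-1")


def is_valid_utf8(raw: bytes) -> bool:
    """True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


code_points = len(SAMPLE_NICK)          # counted as Unicode code points
byte_count = utf8_length(SAMPLE_NICK)   # counted as UTF-8 bytes
cut = truncate_bytes(SAMPLE_NICK, 11)   # assumed byte limit, splits a character
cut_is_valid = is_valid_utf8(cut)

print(code_points, byte_count, cut_is_valid)
print(as_single_byte_text(cut))
```

Every character in this sample needs two bytes, so the byte count is twice the code point count, and a cut at an odd byte offset leaves a sequence that no longer decodes as UTF-8.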
Last edited by Tewl; 27/09/07 05:27 PM.
Joined: Oct 2004
Posts: 8,330
Hoopy frood
That sounds like mIRC has a limit on nick lengths since most networks would never allow that many characters for a nick. You might want to test this with a non-UTF nick that is the same length and see if you have the same problem. If so, I'd put it in as a bug.
Invision Support #Invision on irc.irchighway.net
Joined: Oct 2006
Posts: 48
Ameglian cow
Since when was
ι̲̅и̲̅т̲̅э̲̅и̲̅ѕ̲̅ιfι̲̅э̲̅đ̲̅Ł̲̅о̲̅ģ̲̅ι̲̅ς̲̅
a valid nickname? I thought only
a-z, A-Z, 0-9 {}[]()`^.-_
were valid?
Joined: Jan 2006
Posts: 468
Fjord artisan
Some networks allow Chinese/Malaysian characters in nicknames, others allow Arabic. But I have never seen Unicode text like this before.
Joined: Jan 2003
Posts: 20
Ameglian cow
OP
Here is what the name looks like. The problem I am having isn't just with the limit on the length of nicknames, but on channel names as well. The server I am on is much like MSN was after they went to webchat, allowing multiple languages and long channel names. Back before UTF-8 support was added to mIRC, we would use hash tables to create references to the long names (channels and nicknames), which was fine because the names didn't display correctly anyway. lol. Anyway, I would really like to display the nicknames, and maybe even the channel names, in full length if possible.
Joined: Oct 2004
Posts: 8,330
Hoopy frood
I would suggest putting the request in as either a feature suggestion or a bug. A network that allows such long nicks and/or channels is very rare, so it really isn't worth taking too much time to make it work in mIRC. Yet, I have a feeling the length limit won't be difficult to change, so it might be possible to fix without too much work.
Joined: Jan 2003
Posts: 20
Ameglian cow
OP
The thing is, it does limit the number of characters in a nickname, but the length is based on the Unicode value, not the UTF-8, which can be much longer. I don't know what mIRC shortens the names to, though =\
Last edited by Tewl; 29/09/07 05:43 PM.
Joined: Sep 2007
Posts: 2
Bowl of petunias
One Unicode character can take up to 4 bytes in UTF-8 (i.e. up to 4 characters if you count it as ASCII). XChat has already adjusted for this issue; it would be nice if mIRC adjusted for it too.
"- Support receiving 2048 bytes per line from server and dcc-chat, so we can support 512 UTF-8 characters that some servers now send."
http://66.102.9.104/search?q=cache:0SwBOBN15BMJ:www.winfuture-forum.de/lofiversion/index.php%3Ft56045.html+xchat+unicode+2048&hl=en&ct=clnk&cd=3
The quote says "512 UTF-8 characters", but it means 512 Unicode characters encoded as UTF-8.
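The arithmetic behind those numbers can be checked with a short Python sketch. The sample characters are arbitrary picks for each width, and `utf8_width` and `worst_case_line_bytes` are hypothetical helper names, not part of any IRC client.

```python
def utf8_width(char: str) -> int:
    """Number of bytes a single Unicode character needs in UTF-8."""
    return len(char.encode("utf-8"))


def worst_case_line_bytes(max_chars: int) -> int:
    """Bytes needed if every character takes the 4-byte maximum."""
    return max_chars * 4


# One arbitrary example for each possible width (1 to 4 bytes).
samples = ["A", "\u03b9", "\u20ac", "\U0001f600"]
widths = [utf8_width(c) for c in samples]

print(widths)
print(worst_case_line_bytes(512))
```

A 512-character line therefore needs at most 2048 bytes, which matches the figure in the quote.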
Joined: Mar 2006
Posts: 395
Pan-dimensional mouse
Tewl, The UTF-8 setup in mIRC does not fix the nick length issue.
AFAIK the only way to do this would be to create references to the full nickname, as seen in Vincula etc.
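As a rough idea of what such a reference table does, here is a Python sketch (not mIRC script, and not how Vincula is actually implemented). The class name, the `Long<n>` alias scheme and the 10-byte limit in the example are all assumptions made for illustration.

```python
class NickAliasTable:
    """Maps over-long nicknames to short aliases and back (hypothetical sketch)."""

    def __init__(self, max_bytes: int = 30):
        self.max_bytes = max_bytes      # assumed byte limit of the client
        self._alias_by_full = {}        # full nickname -> short alias
        self._full_by_alias = {}        # short alias -> full nickname
        self._counter = 0

    def alias_for(self, full_nick: str) -> str:
        """Return the nickname itself if it fits, otherwise a stable short alias."""
        if len(full_nick.encode("utf-8")) <= self.max_bytes:
            return full_nick
        alias = self._alias_by_full.get(full_nick)
        if alias is None:
            self._counter += 1
            alias = f"Long{self._counter}"
            self._alias_by_full[full_nick] = alias
            self._full_by_alias[alias] = full_nick
        return alias

    def full_nick(self, alias: str) -> str:
        """Look up the original nickname; unknown names are returned unchanged."""
        return self._full_by_alias.get(alias, alias)


table = NickAliasTable(max_bytes=10)
long_nick = "\u03b9\u0332\u0305" * 4    # 24 bytes in UTF-8, over the limit
alias = table.alias_for(long_nick)
print(alias, table.full_nick(alias) == long_nick)
```

A real script would also have to handle a genuine user whose nickname collides with a generated alias, which this sketch ignores.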
I don't believe this is a problem in mIRC's UTF-8 support, but in its maximum nicklength.
Other posters: This is still a common nickname length. The IRCX draft extended the nickname length somewhat. Now, with a bunch of MSN clones made to be like "MSN Chat" (RIP), the inexperienced developers do not think about things such as nick lengths.
Regards, JD
Edit: IRCX also allows most characters in nicknames...
Last edited by The_JD; 02/10/07 01:33 AM.
[02:16] * Titanic has quit IRC (Excess Flood)
Joined: Jan 2003
Posts: 20
Ameglian cow
OP
The_JD, the point I am trying to make is that mIRC is not taking into account that a UTF-8 nickname can be a valid length when counted in wide Unicode characters but still exceed the number of bytes that a normal ASCII name would have. I already know that Vincula and many other scripts used hash tables to back-reference channels and nicknames, but this causes performance hits and invalid display. I am trying to say that the byte limits for channel names and nicknames should be extended, as Chuck said in his post above.
P.S. Try reading the original RFC. It states: "Each client is distinguished from other clients by a unique nickname having a maximum length of nine (9) characters." I.e. the nickname length was extended long before the IRCX draft was published.
Joined: Oct 2003
Posts: 3,918
Hoopy frood
If it was stated in the original RFC then it was not "extended", it was defined, since the original RFC is the official IRC specification.
edit: I think the more pertinent thing here is to show what mIRC's *non utf* nick limit is before attacking what UTF should or shouldn't do.
- argv[0] on EFnet #mIRC - "Life is a pointer to an integer without a cast"
Joined: Mar 2006
Posts: 395
Pan-dimensional mouse
AFAIK, mIRC is not UTF-8 compliant, but stores the ASCII string and the UTF-8 string so that they reference each other... I'm sure Khaled mentioned it before.
My guess is that the nick length that mIRC allows for ASCII chars is measured in bytes, not characters.
I'm sure I understand what you want here; however, I can't picture Khaled modifying this any time soon.
I personally think mIRC should allow NICKLEN= to describe the size and go along with it (if NICKLEN is present).
(and channels = same prob)
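For anyone unfamiliar with NICKLEN: servers that support it advertise it in the numeric 005 (ISUPPORT) reply, and CHANNELLEN is the matching token for channel names. Below is a minimal Python sketch of reading those tokens. The server name, nickname and values in `EXAMPLE_LINE` are made up, the function names are hypothetical, and the sketch assumes the line starts with a `:server` prefix.

```python
# Hypothetical 005 line; the server name and the limits are invented.
EXAMPLE_LINE = (
    ":irc.example.net 005 Tewl NICKLEN=64 CHANNELLEN=200 CHANTYPES=# "
    ":are supported by this server"
)


def parse_isupport(line: str) -> dict:
    """Collect the KEY=VALUE tokens of a numeric 005 line into a dict."""
    parts = line.split(" ")
    tokens = {}
    # parts[0] is the server prefix, parts[1] the numeric, parts[2] our nick.
    for part in parts[3:]:
        if part.startswith(":"):
            break  # trailing human-readable text starts here
        key, _, value = part.partition("=")
        tokens[key] = value
    return tokens


def length_limit(tokens: dict, name: str, default: int) -> int:
    """Return the advertised limit, or the caller's default if absent or malformed."""
    value = tokens.get(name, "")
    return int(value) if value.isdigit() else default


tokens = parse_isupport(EXAMPLE_LINE)
nick_limit = length_limit(tokens, "NICKLEN", 9)          # 9 is the RFC maximum
channel_limit = length_limit(tokens, "CHANNELLEN", 50)   # default chosen for this example

print(nick_limit, channel_limit)
```

Whether the advertised number is counted in bytes or in characters is a separate question that this sketch does not answer.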
I think this should be more of a feature request than a bug report.
Edit: Missed a post (above)... same thing said anyhow.
Last edited by The_JD; 03/10/07 04:06 AM.
Joined: Oct 2003
Posts: 313
Fjord artisan
For reference, RFC 2812 (the update to RFC 1459) says:
1.2.1 Users
Each user is distinguished from other users by a unique nickname having a maximum length of nine (9) characters. See the protocol grammar rules (section 2.3.1) for what may and may not be used in a nickname.
While the maximum length is limited to nine characters, clients SHOULD accept longer strings as they may become used in future evolutions of the protocol.
Sais