Posted By: Tewl UTF-8 Nickname display truncated? - 27/09/07 05:01 PM
A user entered one of our rooms today with a very odd nickname.

In Unicode: ι̲̅и̲̅т̲̅э̲̅и̲̅ѕ̲̅ιfι̲̅э̲̅đ̲̅Ł̲̅о̲̅ģ̲̅ι̲̅ς̲̅

In UTF-8: ι̲̅и̲̅т̲̅э̲̅и̲̅ѕ̲̅ιfι̲̅э̲̅đ̲̅Ł̲̅о̲̅ģ̲̅ι̲̅ς̲̅

Yes, I know you most likely can't read it here, but it displays fine in the mIRC nicklist.

The problem I am noticing is that mIRC isn't giving the entire UTF-8 value as the $nick so none of the events are displaying correctly.

Value displayed: ι̲и̅т̲э̅и̲ѕ̅ιfι̅э̅đ̅Ł̅о̅ģÌ

The correct value has a length of 191 characters while the processed value has only 110. This prevents the name from being encoded and decoded properly. The only way I have found so far to display the data is to connect to the server via a socket connection and display events manually when received, instead of sending them back to the local server to be processed by mIRC. Is there another way of handling this so that I can still use the mIRC events?

* Note: This also causes issues when the user joins or leaves the channel. mIRC does not remove them from the nicklist, and if they rejoin, the short value is added to the nicklist while the original is still listed, since mIRC did not register the user leaving.
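
The stray trailing character in the displayed value is consistent with a cut that lands in the middle of a multi-byte UTF-8 sequence. The Python sketch below shows that effect on a short sample and one way to cut on a character boundary instead. It is only an illustration of the general mechanism, not a description of what mIRC does internally; `truncate_utf8` is a hypothetical helper and the limit of 3 bytes is arbitrary.

```python
def truncate_utf8(data: bytes, limit: int) -> bytes:
    """Cut `data` to at most `limit` bytes without splitting a UTF-8 sequence."""
    if len(data) <= limit:
        return data
    end = limit
    # Continuation bytes have the form 0b10xxxxxx. Back up until `end`
    # points at the first byte of a sequence, so no partial character is kept.
    while end > 0 and (data[end] & 0xC0) == 0x80:
        end -= 1
    return data[:end]


sample = "\u0123\u0332\u0305"      # "ģ" plus two combining marks
raw = sample.encode("utf-8")       # 2 + 2 + 2 = 6 bytes

naive = raw[:3]                    # byte-level cut: ends on a dangling lead byte
safe = truncate_utf8(raw, 3)       # backs up to the previous character boundary

# The dangling byte 0xCC is what a Latin-1 view renders as "Ì".
dangling = naive[-1:].decode("latin-1")
print(dangling, safe.decode("utf-8"))
```

The safe cut still drops the combining marks that no longer fit, so the name is shortened either way; it just never ends in half a character.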
Posted By: Riamus2 Re: UTF-8 Nickname display truncated? - 27/09/07 08:56 PM
That sounds like mIRC has a limit on nick lengths since most networks would never allow that many characters for a nick. You might want to test this with a non-UTF nick that is the same length and see if you have the same problem. If so, I'd put it in as a bug.
Posted By: Ghozer Re: UTF-8 Nickname display truncated? - 28/09/07 02:07 AM
Since when was

ι̲̅и̲̅т̲̅э̲̅и̲̅ѕ̲̅ιfι̲̅э̲̅đ̲̅Ł̲̅о̲̅ģ̲̅ι̲̅ς̲̅

a valid nickname? I thought only

a-Z, 0-9, {}[]()`^.-_

were valid.
Posted By: symphony Re: UTF-8 Nickname display truncated? - 28/09/07 02:19 AM
Some networks allow Chinese/Malaysian characters in nicknames, others allow Arabic. But I have never seen Unicode text like this before.
Posted By: Tewl Re: UTF-8 Nickname display truncated? - 28/09/07 02:40 AM
Here is what the name looks like.



The problem I am having isn't just with the limitation of the length of nicknames but channel names as well.

The server I am on is much like MSN was after they went to webchat, allowing multiple languages and long channel names. Before UTF-8 support was added to mIRC, we would use hash tables to create references to the long names (channels and nicknames), which was fine because the names didn't display correctly anyway, lol. Anyway, I would really like to display the nicknames, and maybe even the channel names, at full length if possible.
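
For readers unfamiliar with the hash-table workaround mentioned above, here is a rough Python sketch of the idea: each long name gets a short stand-in, and a reverse lookup recovers the original. The class name `AliasTable`, the `u1`, `u2`, ... alias scheme, and the sample nickname are all made up for illustration; the actual mIRC scripts are not shown in this thread.

```python
class AliasTable:
    """Map long names to short stand-ins and back (hypothetical helper)."""

    def __init__(self, prefix="u"):
        self._prefix = prefix
        self._by_full = {}    # full name -> alias
        self._by_alias = {}   # alias -> full name

    def alias_for(self, full_name):
        """Return the alias for `full_name`, creating one on first use."""
        alias = self._by_full.get(full_name)
        if alias is None:
            alias = f"{self._prefix}{len(self._by_full) + 1}"
            self._by_full[full_name] = alias
            self._by_alias[alias] = full_name
        return alias

    def full_name(self, alias):
        """Look up the original name; unknown aliases come back unchanged."""
        return self._by_alias.get(alias, alias)


# Sample decorated nickname (three base letters, each with two combining marks).
long_nick = "\u03b9\u0332\u0305\u0438\u0332\u0305\u0442\u0332\u0305"

nicks = AliasTable()
short = nicks.alias_for(long_nick)      # what the client would be shown
restored = nicks.full_name(short)       # what gets sent back to the server
```

Every event has to pass through both lookups, which is the extra work and the wrong on-screen name that this workaround costs.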
Posted By: Riamus2 Re: UTF-8 Nickname display truncated? - 28/09/07 10:21 AM
I would suggest putting the request in as either a feature suggestion or a bug. A network that allows such long nicks and/or channels is very rare, so it really isn't worth taking too much time to make it work in mIRC. Yet, I have a feeling the length limit won't be difficult to change, so it might be possible to fix without too much work.
Posted By: Tewl Re: UTF-8 Nickname display truncated? - 29/09/07 05:42 PM
The thing is, it does limit the number of characters in a nickname, but the length is based on the Unicode value, not the UTF-8 encoding, which can be much longer. I don't know what mIRC shortens the names though =\
Posted By: ChuckB Re: UTF-8 Nickname display truncated? - 30/09/07 12:17 PM
1 Unicode character can be up to 4 bytes in UTF-8 (i.e. up to 4 characters if you count them as ASCII).

XChat has already adjusted for this issue; it would be nice if mIRC did too.

Quote:

- Support receiving 2048 bytes per line from server and dcc-chat, so we can support 512 UTF-8 characters that some servers now send.

http://66.102.9.104/search?q=cache:0SwBOBN15BMJ:www.winfuture-forum.de/lofiversion/index.php%3Ft56045.html+xchat+unicode+2048&hl=en&ct=clnk&cd=3

The quote says 512 UTF-8, but it means 512 Unicode characters encoded as UTF-8.
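
The arithmetic behind the 2048-byte figure can be checked with a few lines of Python. This is a minimal sketch; the sample characters are arbitrary picks, one for each UTF-8 sequence length.

```python
# One sample per UTF-8 sequence length: f, Greek iota, euro sign, musical G clef.
samples = ["f", "\u03b9", "\u20ac", "\U0001d11e"]
byte_lengths = [len(ch.encode("utf-8")) for ch in samples]

# A base letter with two combining marks is 3 code points but 6 bytes.
decorated = "\u03b9\u0332\u0305"
code_points = len(decorated)
encoded_bytes = len(decorated.encode("utf-8"))

# Worst case for a 512-character line: every code point needs 4 bytes.
worst_case_bytes = len(("\U0001d11e" * 512).encode("utf-8"))

print(byte_lengths, code_points, encoded_bytes, worst_case_bytes)
# prints: [1, 2, 3, 4] 3 6 2048
```

A limit counted in bytes therefore allows far fewer decorated characters than the same limit counted in Unicode characters.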
Posted By: The_JD Re: UTF-8 Nickname display truncated? - 02/10/07 01:31 AM
Tewl,
The UTF-8 setup in mIRC does not fix the nick length issue.

AFAIK the only way to do this would be to create references to the full nickname, as seen in Vincula etc.

I don't believe this is a problem in mIRC's UTF-8 support, but in its maximum nicklength.

Other posters: This is still a common nickname length. The IRCX draft extended the nickname length somewhat.
Now, with a bunch of MSN clones made to be like "MSN Chat" (RIP), the inexperienced developers do not think about things such as nick lengths.

Regards,
JD

Edit: IRCX also allows most characters in nicknames...
Posted By: Tewl Re: UTF-8 Nickname display truncated? - 02/10/07 03:36 AM
The_JD
The point I am trying to establish is that mIRC is not taking into account that a UTF-8 nickname can be a valid length in wide Unicode characters yet exceed the number of bytes a normal ASCII name would have. I already know that Vincula and many other scripts used hash tables to back-reference channels and nicknames, but this causes performance hits and invalid display. I am trying to say that the byte limits for channel names and nicknames should be extended, as Chuck said in his post above.


P.S. Try reading the original RFC. It states: "Each client is distinguished from other clients by a unique nickname having a maximum length of nine (9) characters." I.e., that is the nickname length that was extended long before the IRCX draft was published.
Posted By: argv0 Re: UTF-8 Nickname display truncated? - 02/10/07 08:57 PM
If it was stated in the original RFC then it was not "extended", it was defined, since the original RFC is the official IRC specification.

edit: I think the more pertinent thing here is to show what mIRC's *non-UTF* nick limit is before attacking what UTF should or shouldn't do.
Posted By: The_JD Re: UTF-8 Nickname display truncated? - 03/10/07 03:59 AM
AFAIK, mIRC is not UTF-8 compliant, but stores the ASCII string and the UTF-8 string so that they reference each other...
I'm sure Khaled mentioned it before.

My guess is that the nick length that mIRC allows for ASCII chars is measured in bytes, not characters.

I'm sure I understand what you want here; however, I can't picture Khaled modifying this too soon.

I personally think mIRC should allow NICKLEN= to describe the size and go along with it (if NICKLEN is present).

(and channels = same prob)
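
As a rough illustration of the NICKLEN idea: servers that send the 005 (RPL_ISUPPORT) numeric can advertise `NICKLEN` and `CHANNELLEN` tokens, and a client could read its limits from there. The Python sketch below assumes that token format; `parse_isupport`, the server name, the target nick and the values in the sample line are invented for the example, and falling back to 9 simply mirrors the RFC limit quoted in this thread.

```python
def parse_isupport(line):
    """Extract KEY=VALUE tokens from an RPL_ISUPPORT (005) line."""
    # Drop the optional ":server" prefix.
    if line.startswith(":"):
        line = line.split(" ", 1)[1]
    # Everything after " :" is human-readable trailing text, not tokens.
    head = line.split(" :", 1)[0]
    parts = head.split()
    tokens = {}
    # parts[0] is the numeric and parts[1] the target nick; tokens follow.
    for part in parts[2:]:
        key, _, value = part.partition("=")
        tokens[key] = value
    return tokens


# Made-up sample line; real servers advertise their own values.
sample_line = (":irc.example.net 005 mynick NICKLEN=64 CHANNELLEN=200 "
               "CHANTYPES=#& :are supported by this server")

limits = parse_isupport(sample_line)
nick_limit = int(limits.get("NICKLEN", "9"))        # 9 if not advertised
channel_limit = int(limits.get("CHANNELLEN", "200"))  # arbitrary fallback
```

This sketch does not settle whether the advertised number counts bytes or characters, which is the question the rest of the thread is about.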

I think this should be more of a feature request than a bug report.

Edit: Missed a post (above)... same thing said anyhow.
Posted By: Sais Re: UTF-8 Nickname display truncated? - 03/10/07 10:59 AM
For reference, RFC 2812 (the update to RFC 1459) says:

Quote:

1.2.1 Users

Each user is distinguished from other users by a unique nickname having a maximum length of nine (9) characters. See the protocol grammar rules (section 2.3.1) for what may and may not be used in a nickname.

While the maximum length is limited to nine characters, clients SHOULD accept longer strings as they may become used in future evolutions of the protocol.
© mIRC Discussion Forums