UTF-8 Nickname display truncated? #186865 27/09/07 05:01 PM
Joined: Jan 2003
Posts: 20
Tewl Offline OP
Ameglian cow
A user entered one of our rooms today with a very odd nickname.

In Unicode: ι̲̅и̲̅т̲̅э̲̅и̲̅ѕ̲̅ιfι̲̅э̲̅đ̲̅Ł̲̅о̲̅ģ̲̅ι̲̅ς̲̅

In UTF-8: ι̲̅и̲̅т̲̅э̲̅и̲̅ѕ̲̅ιfι̲̅э̲̅đ̲̅Ł̲̅о̲̅ģ̲̅ι̲̅ς̲̅

Yeah, I know you most likely can't read it here, but it displays fine in the mIRC nicklist.

The problem I am noticing is that mIRC isn't giving the entire UTF-8 value as $nick, so none of the events display correctly.

Value displayed: ι̲и̅т̲э̅и̲ѕ̅ιfι̅э̅đ̅Ł̅о̅ģ

The correct value has a length of 191 characters while the processed value only has 110. This prevents the name from being encoded and decoded properly. The only way I have found so far to display the data is to connect to the server via a socket connection and display events manually when received, instead of sending them back to the local server to be processed by mIRC. Is there another way of handling this so that I can still use the mIRC events?

* Note: This causes issues when the user joins or leaves the channel. mIRC does not remove them from the nicklist, and if they rejoin, it adds the short value to the nicklist while the original is still listed, since it did not register the user leaving.
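
To illustrate why the same nickname can have two different "lengths", here is a minimal Python sketch. It assumes the first glyph of the nickname is a Greek iota followed by two combining marks (low line and overline); that guess and the `describe` helper are mine, not anything mIRC exposes.

```python
# Assumption: one visible glyph of the nickname is GREEK SMALL LETTER IOTA
# plus COMBINING LOW LINE and COMBINING OVERLINE.
glyph = "\u03b9\u0332\u0305"


def describe(text: str) -> dict:
    """Return the length of text under two different counting schemes."""
    return {
        "code_points": len(text),                 # Unicode code points
        "utf8_bytes": len(text.encode("utf-8")),  # bytes sent over the wire
    }


print(describe(glyph))  # {'code_points': 3, 'utf8_bytes': 6}
```

One glyph on screen is three code points and six bytes here, so a length limit gives very different results depending on which of these it counts.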

Last edited by Tewl; 27/09/07 05:27 PM.
Re: UTF-8 Nickname display truncated? [Re: Tewl] #186884 27/09/07 08:56 PM
Joined: Oct 2004
Posts: 8,330
Riamus2 Offline
Hoopy frood
That sounds like mIRC has a limit on nick lengths since most networks would never allow that many characters for a nick. You might want to test this with a non-UTF nick that is the same length and see if you have the same problem. If so, I'd put it in as a bug.


Invision Support
#Invision on irc.irchighway.net
Re: UTF-8 Nickname display truncated? [Re: Riamus2] #186905 28/09/07 02:07 AM
Joined: Oct 2006
Posts: 48
Ghozer Offline
Ameglian cow
since when was

ι̲̅и̲̅т̲̅э̲̅и̲̅ѕ̲̅ιfι̲̅э̲̅đ̲̅Ł̲̅о̲̅ģ̲̅ι̲̅ς̲̅

a valid nickname??? I thought only...

a-Z, 0-9, {}[]()`^.-_

Were valid??

Re: UTF-8 Nickname display truncated? [Re: Ghozer] #186906 28/09/07 02:19 AM
Joined: Jan 2006
Posts: 468
symphony Offline
Fjord artisan
Some networks allow Chinese/Malaysian characters in nicknames, others allow Arabic. But Unicode text like this I have never seen before.

Re: UTF-8 Nickname display truncated? [Re: symphony] #186908 28/09/07 02:40 AM
Joined: Jan 2003
Posts: 20
Tewl Offline OP
Ameglian cow
Here is what the name looks like.



The problem I am having isn't just with the limitation of the length of nicknames but channel names as well.

The server I am on is much like MSN was after they went to webchat, allowing multiple languages and long channel names. Before UTF-8 support was added to mIRC, we would use hash tables to create references to the long names (channels and nicknames), which was fine because the names didn't display correctly anyway. Anyway, I would really like to display the nicknames, and maybe even the channel names, at full length if possible.
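
For anyone unfamiliar with the hash table workaround, the idea can be sketched roughly as below. This is a Python illustration of the general technique, not the actual mIRC script code; the `NameTable` class and the `n1`, `n2`, ... alias format are hypothetical.

```python
class NameTable:
    """Two-way map between full names and short local aliases (hypothetical scheme)."""

    def __init__(self, prefix: str = "n"):
        self._prefix = prefix
        self._alias_by_full = {}
        self._full_by_alias = {}

    def alias_for(self, full_name: str) -> str:
        """Return the short alias for full_name, creating one on first use."""
        alias = self._alias_by_full.get(full_name)
        if alias is None:
            alias = f"{self._prefix}{len(self._alias_by_full) + 1}"
            self._alias_by_full[full_name] = alias
            self._full_by_alias[alias] = full_name
        return alias

    def full_name(self, alias: str) -> str:
        """Look up the original long name behind a short alias."""
        return self._full_by_alias[alias]


table = NameTable()
short = table.alias_for("\u03b9\u0332\u0305" * 20)  # a long combining-mark nick
print(short, len(table.full_name(short)))
```

The client only ever sees the short alias, and every event has to be translated in both directions, which is where the extra work and the wrong display come from.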

Re: UTF-8 Nickname display truncated? [Re: Tewl] #186921 28/09/07 10:21 AM
Joined: Oct 2004
Posts: 8,330
Riamus2 Offline
Hoopy frood
I would suggest putting the request in as either a feature suggestion or a bug. A network that allows such long nicks and/or channels is very rare, so it really isn't worth taking too much time to make it work in mIRC. Yet, I have a feeling the length limit won't be difficult to change, so it might be possible to fix without too much work.


Re: UTF-8 Nickname display truncated? [Re: Riamus2] #186979 29/09/07 05:42 PM
Joined: Jan 2003
Posts: 20
Tewl Offline OP
Ameglian cow
The thing is, it does limit the number of characters in a nickname, but the length is based on the Unicode value, not the UTF-8 value, which can be much longer. I don't know what length mIRC shortens the names to, though =\
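
A rough Python sketch of the difference between a limit counted in characters and one counted in bytes. The limit of 16 and the helper names are made up for illustration; nothing in this thread establishes what mIRC's real limit is or how it truncates.

```python
def fits_char_limit(nick: str, limit: int) -> bool:
    """Limit counted in Unicode code points."""
    return len(nick) <= limit


def fits_byte_limit(nick: str, limit: int) -> bool:
    """Limit counted in UTF-8 bytes."""
    return len(nick.encode("utf-8")) <= limit


def truncate_utf8(nick: str, max_bytes: int) -> str:
    """Cut to at most max_bytes of UTF-8 without leaving a partial sequence."""
    # errors="ignore" drops an incomplete multi-byte sequence at the cut point.
    return nick.encode("utf-8")[:max_bytes].decode("utf-8", errors="ignore")


nick = "\u03b9\u0332\u0305" * 5    # 15 code points, 30 UTF-8 bytes
LIMIT = 16                          # hypothetical limit, not mIRC's real value

print(fits_char_limit(nick, LIMIT))    # True
print(fits_byte_limit(nick, LIMIT))    # False
print(len(truncate_utf8(nick, 15)))    # 7
```

A nick that passes the character limit can still be far over a byte limit of the same size, and a client that cuts at the byte limit ends up with a different, shorter name than the one the server knows.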

Last edited by Tewl; 29/09/07 05:43 PM.
Re: UTF-8 Nickname display truncated? [Re: Tewl] #187091 30/09/07 12:17 PM
Joined: Sep 2007
Posts: 2
ChuckB Offline
Bowl of petunias
One Unicode character can take up to 4 bytes in UTF-8 (i.e. up to 4 characters when counted as ASCII).

XChat has already adjusted for this issue; it would be nice if mIRC adjusted for it too.

Quote:

- Support receiving 2048 bytes per line from server and dcc-chat, so we can support 512 UTF-8 characters that some servers now send.

http://66.102.9.104/search?q=cache:0SwBOBN15BMJ:www.winfuture-forum.de/lofiversion/index.php%3Ft56045.html+xchat+unicode+2048&hl=en&ct=clnk&cd=3

The quote says 512 UTF-8, but it means 512 Unicode characters in UTF-8 format.
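
The arithmetic behind the 2048 figure can be checked with a few lines of Python. The sample characters are my own picks to show the 1, 2, 3 and 4 byte cases; `worst_case_bytes` is just an illustrative helper.

```python
MAX_UTF8_BYTES_PER_CHAR = 4  # upper bound for one code point in UTF-8


def utf8_width(ch: str) -> int:
    """Number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def worst_case_bytes(char_count: int) -> int:
    """Buffer size needed if every character takes the maximum width."""
    return char_count * MAX_UTF8_BYTES_PER_CHAR


for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    print(f"U+{ord(ch):04X} -> {utf8_width(ch)} byte(s)")

print(worst_case_bytes(512))  # 2048
```

So 512 characters fit in 2048 bytes even in the worst case, which matches the buffer size in the quoted changelog entry.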

Re: UTF-8 Nickname display truncated? [Re: ChuckB] #187242 02/10/07 01:31 AM
Joined: Mar 2006
Posts: 393
The_JD Offline
Fjord artisan
Tewl,
The UTF-8 setup in mIRC does not fix the nick length issue.

AFAIK the only way to do this would be to create references to the full nickname, as seen in Vincula etc.

I don't believe this is a problem in mIRC's UTF-8 support, but in its maximum nicklength.

Other posters: This is still a common nickname length. The IRCX draft extended the nickname length somewhat.
Now, with a bunch of MSN clones made to be like "MSN Chat" (RIP), the inexperienced developers do not think about things such as nick lengths.

Regards,
JD

Edit: IRCX also allows most characters in nicknames...

Last edited by The_JD; 02/10/07 01:33 AM.

[02:16] * Titanic has quit IRC (Excess Flood)
Re: UTF-8 Nickname display truncated? [Re: The_JD] #187249 02/10/07 03:36 AM
Joined: Jan 2003
Posts: 20
Tewl Offline OP
Ameglian cow
The_JD
The point that I am trying to establish is that mIRC is not taking into account that UTF-8 nicknames would be a valid length in wide Unicode but exceed the number of bytes that a normal ASCII name would have. I already know that Vincula and many other scripts used hash tables to back-reference channels and nicknames, but this causes performance hits and invalid display. I am trying to say that the byte limit for channel names and nicknames should be extended, as Chuck said in his post above.


P.S. Try reading the original RFC. It states: "Each client is distinguished from other clients by a unique nickname having a maximum length of nine (9) characters." I.e. the nickname length was extended long before the IRCX draft was published.

Re: UTF-8 Nickname display truncated? [Re: Tewl] #187288 02/10/07 08:57 PM
Joined: Oct 2003
Posts: 3,918
argv0 Offline
Hoopy frood
If it was stated in the original RFC then it was not "extended", it was defined, since the original RFC is the official IRC specification.

edit: I think the more pertinent thing here is to show what mIRC's *non utf* nick limit is before attacking what UTF should or shouldn't do.


- argv[0] on EFnet #mIRC
- "Life is a pointer to an integer without a cast"
Re: UTF-8 Nickname display truncated? [Re: argv0] #187304 03/10/07 03:59 AM
Joined: Mar 2006
Posts: 393
The_JD Offline
Fjord artisan
AFAIK, mIRC is not UTF-8 compliant, but stores the ASCII string and the UTF-8 string to reference each other...
I'm sure Khaled mentioned it before.

My guess is that the nick length that mIRC allows for ASCII chars is measured in bytes, not characters.

I'm sure I understand what you want here; however, I can't picture Khaled modifying this too soon.

I personally think mIRC should allow NICKLEN= to describe the size and go along with it (if NICKLEN is present).
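
As a rough idea of what honouring NICKLEN= could look like, here is a Python sketch that pulls the token out of a 005 (RPL_ISUPPORT) line. The server name, the sample values and the helper names are invented for the example; the fallback of 9 is the RFC maximum. This is not how mIRC handles it internally.

```python
def parse_isupport(line: str) -> dict:
    """Parse the tokens of a 005 (RPL_ISUPPORT) line into a dict."""
    # Drop the trailing human-readable text (" :are supported by this server").
    head = line.split(" :", 1)[0]
    parts = head.split()
    # Drop the optional ":server" prefix.
    if parts and parts[0].startswith(":"):
        parts = parts[1:]
    tokens = {}
    # parts[0] is the numeric, parts[1] is our own nick, the rest are tokens.
    for item in parts[2:]:
        key, _, value = item.partition("=")
        tokens[key] = value
    return tokens


def nick_limit(tokens: dict, default: int = 9) -> int:
    """Use NICKLEN if the server advertised it, else fall back to default."""
    value = tokens.get("NICKLEN", "")
    return int(value) if value.isdigit() else default


# Hypothetical sample line; the server name and values are made up.
sample = ":irc.example.net 005 Tewl NICKLEN=64 CHANNELLEN=200 CHANTYPES=# :are supported by this server"
tokens = parse_isupport(sample)
print(nick_limit(tokens))  # 64
```

Whether the advertised number counts characters or bytes is up to the server, so a client would still have to decide how to interpret it for UTF-8 names.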

(and channels = same prob)

I think this should be more of a feature request than a bug report.

Edit: Missed a post (above)... same thing said anyhow.

Last edited by The_JD; 03/10/07 04:06 AM.

Re: UTF-8 Nickname display truncated? [Re: argv0] #187317 03/10/07 10:59 AM
Joined: Oct 2003
Posts: 313
Sais Offline
Fjord artisan
For reference, RFC 2812 (the update to RFC 1459) says:

Quote:

1.2.1 Users

Each user is distinguished from other users by a unique nickname having a maximum length of nine (9) characters. See the protocol grammar rules (section 2.3.1) for what may and may not be used in a nickname.

While the maximum length is limited to nine characters, clients SHOULD accept longer strings as they may become used in future evolutions of the protocol.


Sais