UTF-8 support in mIRC

Posted By: Krejt

UTF-8 support in mIRC - 17/02/06 10:58 PM

As you might have noticed Khaled added support for display of UTF-8 text as unicode to mIRC 6.17. This works in status, channel, query, and other windows, and in nickname listboxes, window titlebars, switchbar, and tooltips. It does not (yet) work for dialogs such as Options, Chat, Send, etc. since this would require making mIRC 100% unicode. The current changes already required recoding of significant parts of mIRC's display, text-wrapping, and mark/copy routines.

The UTF-8 support is new for us (and mIRC) and therefore still somewhat "experimental"... Your feedback on it would be appreciated!

The display of UTF-8 can be enabled by default for all windows in the Options/IRC/Messages dialog, or individually for any window you like via the Fonts dialog. Use the /font command to open the Fonts dialog. Make sure you select a font that contains the characters or script (Hebrew, Arabic, Greek, Cyrillic, ...) you want to see!

mIRC does not convert incoming UTF-8 into the local codepage. Server text is stored internally unchanged. This enables mIRC (scripts, etc.) to work fully with UTF-8 IRC servers that allow UTF-8 in channel and nick names. See irc.unilang.org (irc://irc.unilang.org/) for a fully enabled UTF-8 server.

The Fonts dialog also has an "Encode" option that encodes outgoing text in UTF-8 based on the script/codepage selected for that window. The Encode feature is selective, i.e. it only encodes the parts of an outgoing message that are not already in UTF-8 format. If it sees a server (such as irc.unilang.org) that advertises CHARSET=utf-8, it encodes the whole line, from start to end. If it sees a non-UTF-8 server, it encodes only the text after the : colon in the following outgoing messages: PRIVMSG, NOTICE, PART, TOPIC. You can also type Unicode text in the editbox. To limit effects on non-UTF-8 users, the Unicode editbox only works when the UTF-8 Encode option is enabled for a window via the Fonts dialog.
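The selective Encode rule described above can be sketched in Python. This is only an illustration of the described behaviour, not mIRC's actual code: the function names, the use of raw bytes for the outgoing line, and the decision to treat each span as "already UTF-8" when it decodes cleanly are all assumptions made for the example.

```python
# Commands whose trailing text (after " :") is encoded on non-UTF-8 servers,
# as listed in the post above.
TRAILING_COMMANDS = {"PRIVMSG", "NOTICE", "PART", "TOPIC"}


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_utf8(data: bytes, codepage: str) -> bytes:
    """Re-encode codepage bytes as UTF-8, leaving already-valid UTF-8 alone."""
    if is_valid_utf8(data):
        return data
    return data.decode(codepage, errors="replace").encode("utf-8")


def encode_outgoing(line: bytes, codepage: str, server_is_utf8: bool) -> bytes:
    """Hypothetical model of the Encode option for one outgoing IRC line."""
    if server_is_utf8:
        # CHARSET=utf-8 server: encode the whole line, start to end.
        return to_utf8(line, codepage)
    command = line.split(b" ", 1)[0].decode("ascii", errors="replace").upper()
    head, sep, trailing = line.partition(b" :")
    if command in TRAILING_COMMANDS and sep:
        # Non-UTF-8 server: only the text after the colon is encoded.
        return head + sep + to_utf8(trailing, codepage)
    return line
```

For example, a Cyrillic PRIVMSG typed in a window set to codepage 1251 would have only its message text re-encoded on a non-UTF-8 server, while a JOIN with a Cyrillic channel name would be sent unchanged unless the server advertises CHARSET=utf-8.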

Of course, some UTF-8 related identifiers such as $utfencode(), $utfdecode() and $isutf() have also been added for scripters to play with.

Have lots of fun playing with this new feature!

Posted By: xyz

Re: UTF-8 support in mIRC - 17/02/06 11:05 PM

Thanks once more for this feature. I've been waiting for this one for some time now.
Posted By: Mentality

Re: UTF-8 support in mIRC - 17/02/06 11:43 PM

Whoo! Here's a little newbie FAQ I've written to help people who are perhaps a little confused by it all. It's been put up at www.mirc.net/newbie/unicode.php but I figured I'd include a copy here :-)

What is Unicode?

Unicode is a standard that lets all kinds of text be displayed across the Internet. You will find that complex characters, such as those used in Chinese, Japanese and Korean writing, will not display with older character encodings such as ASCII. You'll get a bunch of black boxes or question marks in place of the actual characters. Unicode means that people can communicate on IRC in their own language, or (just for fun) use weird symbols they couldn't use before, both in their messages and in nicknames. Due to the extremely fast growth of Internet popularity, and indeed the diversity of those that use mIRC, Unicode is becoming an extremely popular feature in many programmes.

For these characters to appear in nicknames or channels, the IRC server you're connected to must support it. At this time, very few IRC servers do. This does NOT however affect your ability to send/receive text.

What is UTF-8?

UTF-8 is basically what makes these Unicode characters able to display on your screen. UTF stands for Unicode Transformation Format. The "8" shows that it is built from 8-bit units (bytes), and it uses 1-4 of them for each character. There are other character encodings for Unicode, such as UTF-7, UTF-16 and UTF-32, but they are rarely used on IRC, with UTF-8 being the usual chosen standard for transmitting Unicode. It is also backwards compatible with ASCII, which just means that any plain ASCII text is already valid UTF-8.
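To make the "1-4 bytes" point concrete, here is a small Python sketch. The helper name utf8_length and the sample characters are chosen purely for illustration.

```python
def utf8_length(ch: str) -> int:
    """Number of bytes a single character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


# One sample from each byte-length class: ASCII, Latin accent,
# Japanese hiragana, and an emoji outside the Basic Multilingual Plane.
for ch in ["A", "é", "に", "😀"]:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X} takes {len(encoded)} byte(s): {encoded.hex(' ')}")

# ASCII text produces exactly the same bytes in both encodings.
assert "plain ASCII".encode("ascii") == "plain ASCII".encode("utf-8")
```

The last line is what "backwards compatible with ASCII" means in practice: text that only uses ASCII characters is byte-for-byte identical in UTF-8.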

How does this affect my mIRC?

UTF-8 support was added in mIRC version 6.17. This version, or higher, must be used for these characters to display correctly. Support has been added to status, channel, query and DCC windows, as well as the nicklist, titlebar, switchbar and tool tips. Various other windows have also got support for display/encoding of UTF-8 text.

For the average English-speaking user (or anyone else who uses a language that displays fine with ASCII), there's not a huge change. You probably won't need to enable UTF-8 support. However, others may still try to speak to you with their own language and therefore use Unicode characters. If you want to view these characters correctly, rather than them appearing as black boxes, you can enable UTF-8 support. There is no harm in doing so whether you intend to use it or not. Assuming UTF-8 support becomes more widely implemented on IRC servers, this may become a more essential feature in the future.

How can I use this in mIRC?

This feature is enabled by default. However, should you need to enable it for some reason, go to Tools > Options > IRC > Messages and check the box saying 'UTF-8 display'. Individual window support is also available via the Fonts dialog. Right click on the channel name in the switchbar and click on 'Font...' and from the drop down list choose 'Display and encode' or 'Display only'.

Note, mIRC the application does not support Unicode, it can only display it or encode outgoing messages into it in IRC channel windows, or DCC chat windows, etc. Therefore, Unicode characters will not display correctly in various text boxes throughout mIRC, such as the ones found in the Address Book, /uwho or the Options dialog. For characters to display correctly in the editbox, you will need to enable the 'Multibyte editbox' option, again in Tools > Options > IRC > Messages.

It is worth noting that enabling 'UTF-8 Display' in the options dialog will not automatically make your messages encoded to UTF-8 - you need to enable this separately in the Font dialog, by choosing 'Display and encode' from the dropdown menu.

Will this affect my scripts?

Not really. Protection scripts which ban "$nick" will work fine as usual, as will any other identifiers or scripts that set variables containing nicknames, text or channel names, or that use such data in other ways.

Has extra support been added in scripting related to UTF-8?

Of course! Three identifiers have been added should you need them - $utfencode, $utfdecode and $isutf.

$utfencode and $utfdecode will encode/decode given text respectively. $isutf returns a numerical value based on the state of the provided text, where 0 means the text is not UTF-8, 1 means it is plaintext (e.g. just normal English characters), and 2 means it is UTF-8 text (e.g. Chinese/Japanese/Korean writing, but could be many other languages too).
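The three return values described for $isutf can be mimicked with a rough Python analogue. This is a sketch of the classification as described above, not mIRC's implementation; the function name classify_utf8 is made up, and how the real identifier treats edge cases such as an empty string is not covered by the description (here an empty input counts as plain text).

```python
def classify_utf8(data: bytes) -> int:
    """Classify bytes the way the FAQ describes $isutf:
    0 = not UTF-8, 1 = plain (ASCII-only) text, 2 = UTF-8 text.
    """
    if all(b < 0x80 for b in data):
        return 1  # only 7-bit characters: plain text (assumed to include empty input)
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return 0  # high bytes that do not form valid UTF-8 sequences
    return 2  # valid multi-byte UTF-8


print(classify_utf8(b"hello"))                    # plain English text
print(classify_utf8("日本語".encode("utf-8")))     # UTF-8 encoded Japanese
print(classify_utf8("café".encode("latin-1")))    # legacy codepage bytes
```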

Okay, so why does it not display right on the network I use?

Text should display fine, provided you have UTF-8 enabled and are using mIRC 6.17 or above. You cannot include Unicode chars in nicknames or channel names however, unless the server you use supports UTF-8. You will notice that a bunch of jargon text appears when you connect to IRC - part of this is the settings for the server, for example, the max number of channels you can join (MAXCHANS=15) or the max length of topics (TOPICLEN=400). If you see "CHARSET=utf-8", then nicknames and channels should show up fine. If it does not (and none of the major networks do support it), then you are limited to text only. For an example of a network which does support CHARSET=utf-8 though, try /server irc.unilang.org.
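For scripters curious what checking for "CHARSET=utf-8" looks like, here is a hedged Python sketch that parses one line of the server settings sent at connect time (numeric 005). The server name, nickname and exact line layout in the example are assumptions; real servers may split the settings across several 005 lines.

```python
def parse_isupport(line: str) -> dict:
    """Parse the TOKEN=VALUE settings out of a raw 005 line.

    Assumes the usual layout:
    ":server 005 nick TOKEN=VALUE ... :are supported by this server"
    """
    params = line.split(" :", 1)[0].split()
    tokens = {}
    for token in params[3:]:  # skip the server prefix, "005" and our nickname
        key, _, value = token.partition("=")
        tokens[key] = value
    return tokens


def server_supports_utf8(line: str) -> bool:
    """True if the settings line advertises CHARSET=utf-8."""
    return parse_isupport(line).get("CHARSET", "").lower() == "utf-8"


example = (":irc.example.net 005 mynick MAXCHANS=15 TOPICLEN=400 "
           "CHARSET=utf-8 :are supported by this server")
print(parse_isupport(example))
print(server_supports_utf8(example))
```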

Furthermore, the font you use within mIRC may not be able to display Unicode characters correctly. Fixedsys, mIRC's default font, is a poor choice for Unicode display. There is no specific "Unicode font", but there are a number of current fonts which are deemed acceptable (all of which should be available in your default Font dialog, View > Font):

Lucida Sans Unicode
Times New Roman
MS Gothic
Bitstream Cyberbit (not available by default)

Bitstream Cyberbit is available from http://ftp.netscape.com/pub/communicator/extras/fonts/windows/Cyberbit.ZIP, and documentation concerning it can be found at http://ftp.netscape.com/pub/communicator/extras/fonts/windows/ReadMe.htm

Fonts vary in their ability to encode/display Unicode, but the ones mentioned above are popular choices, and you should not run into too many problems.

Where can I get more information?

Plenty of information is available on the web, including these good links:

http://www.unicode.org/ - The Unicode homepage.
http://en.wikipedia.org/wiki/Unicode - Wikipedia article on Unicode.
http://en.wikipedia.org/wiki/UTF-8 - Wikipedia article on UTF-8.
http://www.microsoft.com/typography/default.mspx - Microsoft's Typography website.
http://www.google.com - A brilliant search engine!

If you have something not too advanced to add, a correction to any wrong information (I'm no guru!) or anything else to say, please send me a private message rather than fill up this thread. Cheers!

Posted By: Adrenalin

Re: UTF-8 support in mIRC - 18/02/06 10:51 AM

Great tut, Mentality!
I have found a version of the default mIRC font, Fixedsys, with Unicode support! It's called Fixedsys Excelsior and it looks exactly like the default one, but with Unicode support (of course it doesn't support every language like the 6 MB Bitstream Cyberbit does).
The latest version I was able to find is 2.0.

The official site seems to be dead/down (but there is still an archive copy here). In the author's last message on the board, dated 06-08-2004, he said that a new version would be released in a month, but it is 2006 now and there is no new version.

Unicode ranges:
On this site I found what Fixedsys Excelsior 1.3 supports (I was unable to find other sources for v. 2.0):
Basic Latin; Latin-1 Supplement; Latin Extended-A; Latin Extended-B; IPA Extensions; Spacing Modifier Letters; Combining Diacritical Marks; Greek; Cyrillic; Armenian; Hebrew; Arabic; Thai; Latin Extended Additional; Greek Extended; General Punctuation; Superscripts and Subscripts; Currency Symbols; Combining Diacritical Marks for Symbols; Letterlike Symbols; Number Forms; Arrows; Mathematical Operators; Miscellaneous Technical; Control Pictures; Optical Character Recognition; Enclosed Alphanumerics; Box Drawing; Block Elements; Geometric Shapes; Miscellaneous Symbols; Dingbats; CJK Symbols and Punctuation; Hiragana; Katakana; Alphabetic Presentation Forms; Arabic Presentation Forms-A; Small Form Variants; Arabic Presentation Forms-B; Halfwidth and Fullwidth Forms; Specials; Ethiopic; Cherokee; Ogham; Runic; Mongolian; Braille Patterns


I also made some mirrors, just in case.

Installation of the font:
To install the font, copy fsex2p00_public.ttf from the fixedsys.zip archive to the Fonts directory inside the Windows directory.
Click Start -> Run and type: fonts. The Fonts directory will open. Copy fsex2p00_public.ttf to this directory.
Now you can select the font in mIRC's Font dialog.

Flash video showing installation of the font and activation of UTF-8 for channel and private message windows:
http://irc.server.md/mirc_utf8.htm (533kb)
Posted By: raZOR

Re: UTF-8 support in mIRC - 18/02/06 04:13 PM

I still don't understand one thing.

On 6.17 I use the Verdana font with Western encoding, and in the font options I set Display and Encode.

Now I typed some letters from my language in mIRC 6.03 in my channel. It shows them weird (that's OK), but mIRC 6.17 also received those chars as weird...

BUT when I typed them in 6.17, then 6.17 showed them OK...

So does this mean that 6.17 can receive (display) such chars correctly ONLY if 6.17 communicates with 6.17?
Posted By: Adrenalin

Re: UTF-8 support in mIRC - 19/02/06 07:33 PM

So does this mean that 6.17 can receive (display) such chars correctly ONLY if 6.17 communicates with 6.17?

That's right. When the Encode option is activated, mIRC encodes all the special chars to UTF-8, so only a UTF-8 compatible client (like mIRC 6.17 or XChat) can decode and display them correctly.
Posted By: Rounin

Re: UTF-8 support in mIRC - 19/02/06 11:25 PM

Hm, yes, the only languages for which UTF-8 is compatible with the old encodings are those which only use the letters A-Z, primarily English.
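The compatibility point can be demonstrated with a short Python sketch. It simulates one client sending text in one encoding and another client reading the same bytes in a different one; the helper name view_as and the choice of Windows codepage 1252 as the "old encoding" are assumptions for the example.

```python
def view_as(text: str, sent_as: str, read_as: str) -> str:
    """Encode text the way the sender does, then decode it the way the reader does."""
    return text.encode(sent_as).decode(read_as, errors="replace")


# Plain A-Z text survives either way, because both encodings agree on ASCII.
print(view_as("hello", "utf-8", "cp1252"))

# UTF-8 sender, old-codepage reader: each accented letter becomes two odd characters.
print(view_as("é", "utf-8", "cp1252"))

# Old-codepage sender, UTF-8 reader: the byte is not valid UTF-8,
# so the reader can only show a replacement character.
print(view_as("é", "cp1252", "utf-8"))
```

This is why accented letters typed in an older client can look wrong in a window set to display UTF-8, and the other way round, while unaccented English text looks the same everywhere.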

For all other languages, everyone needs to be able to at least read the same character sets.

One way to make this happen is to ensure that everyone on Windows uses mIRC 6.17 and has UTF-8 on. The alternative is for everyone to use the old encodings, but this in turn creates compatibility problems between languages and between users of Windows and Linux/Mac OS X.

I can only say that I hope people will try Unicode out... It's quite handy once one gets used to it!
Posted By: Doqnach

Re: UTF-8 support in mIRC - 20/02/06 12:44 PM

strange thing is that I can't get UTF-8 to work at all between irssi v0.8.10 and mIRC...

the irssi client is set to full UTF-8 support (it's someone else's client, we were testing the UTF-8 of mIRC together).
between irssi clients it works fine but my mIRC still displays junk :-P

I have the channel set to display + encode, and have UTF-8 display turned on as well as multibyte display.
my font is Courier New (also tried with Courier and Bitstream Cyberbit)

I also have asian languages support installed (winxp sp2)

I can't seem to get it to work at all and can't find what I'm doing wrong :-/
Posted By: Adrenalin

Re: UTF-8 support in mIRC - 20/02/06 03:19 PM

Hm, it seems that XChat understands mIRC's Unicode well.
I think the problem is in irssi.

Posted By: Rounin

Re: UTF-8 support in mIRC - 20/02/06 06:35 PM

Probably. Actually, irssi users seem to have problems with Unicode quite often, either because of their terminal emulator, or because of screen, which they use to keep irssi online when they log off, or because of configuration problems. There basically seem to be a lot of potential sources of error with that particular combination of software.

One way of testing this would of course be to repeat the experiment with irssi and X-Chat... Most likely the same problem will appear.
Posted By: TheShadowRunner

Re: UTF-8 support in mIRC - 01/03/06 02:31 AM

Adrenalin, I installed Fixedsys Excelsior ver2.0
but unfortunately, while i can select almost any "script", the only one i'm interested in, Japanese, is missing.
Strange since Excelsior apparently supports Japanese!?
Does Japanese show up for you in the script list?

Also, I _am_ able to get Japanese with the original Fixedsys by using AppLocale. [ http://alcahest.club.fr/perso/apploc/ircafter.jpg ]
I wonder why it's not possible to do so with the new mIRC (without using AppLocale).

Thanks for your support.

you can check AppLocale here http://alcahest.club.fr/perso/apploc/applocale.html
Posted By: MeGaA

Re: UTF-8 support in mIRC - 01/03/06 09:20 AM

! mad
Posted By: Doqnach

Re: UTF-8 support in mIRC - 01/03/06 11:13 AM

I got it to work now!

you need to select the script type in the font window...

though I still can't see the UTF-8 Norwegian stuff that gets thrown at me :-P

but I can see and write japanese ;-]
Posted By: TheShadowRunner

Re: UTF-8 support in mIRC - 01/03/06 11:31 AM

what font are you using Doqnach?
Posted By: pouncer

Re: UTF-8 support in mIRC - 23/03/06 06:13 PM

hey adrenalin, do you know what font that is in the xchat client?
Posted By: Adrenalin

Re: UTF-8 support in mIRC - 24/03/06 01:48 PM

hey adrenalin, do you know what font that is in the xchat client?

The XChat standard one (Monospace, size 9). I think that font is built into XChat because I can't find it in mIRC's font list.

Maybe it is here http://savannah.nongnu.org/projects/freefont/ ?
Posted By: pouncer

Re: UTF-8 support in mIRC - 24/03/06 02:48 PM

Nope, can't seem to find it anywhere. If anyone has the Monospace font, could they upload it?

Thanks a lot!
Posted By: Grue

Re: UTF-8 support in mIRC - 20/04/06 12:19 AM

It seems like there's a bug in mIRC when you use the special character "·" in your timestamps. If you use the middle dot, it screws up any special UTF-8 characters like "".

I posted it as a bug in the bug forum.


You can get the Monospace font here.
Posted By: SysFixer

Re: UTF-8 support in mIRC - 11/06/06 02:29 AM

I am the creator of Fixedsys Excelsior. The last public release was 2.00, but since that time I added hundreds of new characters and greatly improved the Cyrillic and Armenian ranges. Unfortunately, I lost all my additions and changes a long time ago, and because mIRC with Unicode seemed so impossible (well, we waited a while!) I gave up. Now that I have learned mIRC has Unicode support, I will gladly resume the project. Please let me know if you have any specific requests for character inclusion or changes. You can contact me at gmail.com (name Valentinium). Considering how it went this time I won't make any promises, but I think it is reasonably possible that I could release an improved version in one month.

I am very happy people use it, thank you for that.

Posted By: Krejt

Re: UTF-8 support in mIRC - 14/06/06 07:54 PM

Ah, so you created Fixedsys Excelsior! Good job! Thanks!

Yup, an update of your font would be appreciated a lot, especially by me since I still use the Fixedsys a lot.

Posted By: SysFixer

Re: UTF-8 support in mIRC - 15/06/06 03:55 PM

You're welcome. It's exciting that this once nearly pointless project is suddenly useful. So far I have done much ... for one, the underline bug is fixed. I have gone through all the Latin-based ranges and improved many characters for European and African languages etc. I am trying to keep as much as possible fixedwidth (at least within a given script) this time even when it seems contrary to aesthetics, because I know people use the font for programming and ASCII (unicode?) art.

I have added three complete new scripts, also.




I will post more samples for any additional added scripts. I would like to add support for devanagari but I doubt I could implement it correctly.
Posted By: SysFixer

Re: UTF-8 support in mIRC - 15/06/06 09:14 PM

I just completed the update to Armenian. I think it is a considerable improvement, but no matter what, the spacing seems awkward and I don't know how far I can push the shapes without making it look strange (as I cannot read Armenian myself). If anyone who reads Armenian sees this, feel free to offer suggestions for improvements.

Posted By: donatzsky

Re: UTF-8 support in mIRC - 08/07/06 10:00 AM

Here's a number of useful unicode resources:


Fonts in particular:

Posted By: bwuser

Re: UTF-8 support in mIRC - 29/07/06 12:44 AM

I was unsure whether to post this here or in the "Bugs" section, but Tjerk asked for feedback and I couldn't pass up the opportunity to compliment the author of Fixedsys Excelsior on his fine and laudable work, so here it is - thank you Darien.

Anyway, the issue I have is this:
I was thrilled when 6.17 came out because I deal a lot in strange languages and weird scripts, and indeed - I am grateful to Khaled for upgrading mIRC to support UTF-8, something I had been hoping for for years. But there is an issue that I don't think I've seen mentioned anywhere else. Namely this:

I have no problem inputting basic unicode text, like simple Greek letters (&#949;&#955;&#955;&#951;&#957;&#953;&#954;&#951;) or Japanese (&#12395;&#12411;&#12435;&#12366;), but whenever I try to input more complex unicode scripts mIRC just gives me replacement characters (boxes or question marks). I think I've narrowed it down to those instances where the script in question uses glyph rearranging features, like with combining characters and the like. For instance if I try to input devanagari (&#2342;&#2375;&#2357;&#2344;&#2366;&#2327;&#2352;&#2368;), accented Greek (&#7952;&#955;&#955;&#951;&#957;&#953;&#954;&#8134;) or Latin script with more "exotic" diacriticals (&#7747;&#7681;&#347;&#257;&#365;&#491;&#7773;&#7729;&#7717;m&#778;q&#803;&#775;). I have no problems writing these letters in any other unicode-enabled context. The really weird thing is that I can easily paste these into mIRC and then they'll show up fine - I just can't input them directly into the mIRC channel editbox for some reason.

[edit: Ok, this obviously got garbled into HTML entities, but I'll attribute that to this web forum as I do not normally have any problems writing unicode.]
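The HTML entities above are just numeric character references, so the original text can be recovered. The Python sketch below decodes three of the samples from this post and lists which characters in each are Unicode "mark" characters (combining accents or dependent vowel signs). The helper names are made up for the example, and the sketch only inspects the text; it does not diagnose what mIRC's editbox does with it.

```python
import html
import unicodedata


def decode_entities(text: str) -> str:
    """Turn numeric character references such as &#949; back into characters."""
    return html.unescape(text)


def combining_marks(text: str) -> list:
    """List the code points whose Unicode general category is a mark (Mn, Mc, Me)."""
    return [f"U+{ord(ch):04X}" for ch in text
            if unicodedata.category(ch).startswith("M")]


samples = {
    "Greek": "&#949;&#955;&#955;&#951;&#957;&#953;&#954;&#951;",
    "Devanagari": "&#2342;&#2375;&#2357;&#2344;&#2366;&#2327;&#2352;&#2368;",
    "Latin with marks": "m&#778;q&#803;&#775;",
}
for name, raw in samples.items():
    text = decode_entities(raw)
    print(name, text, combining_marks(text))
```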

Another somewhat odd problem I've noticed concerns the Windows function of shifting between installed keyboards. One can of course switch by clicking the icon in the language band and choosing another keyboard, but Windows also allows one to switch by using keyboard shortcuts (in my case left CTRL+SHIFT), which is much preferable. This works fine, except in mIRC, where it displays some weird behaviour. I normally have five keyboards installed and by pressing this keyboard shortcut, it'll change through them one at a time. But in mIRC, it'll change only between two of them. Say I'm writing something in Danish, I press the shortcut to change to Greek, press it once more, it goes to Hebrew. Then when I press it again, it would normally go through Japanese, Devanagari and then back to Danish, but in mIRC it doesn't - it just keeps changing between Greek and Hebrew, and I have to use the mouse and the language band button to switch back to my standard Danish keyboard layout.

I know this is a bit of a mouthful, but I couldn't seem to find anyone else reporting these issues. I'm kind of fearing that if it isn't a problem specific to my setup or computer, it might be one of those things that'll require a more thorough rewriting of mIRC and cannot be easily fixed, but I'm very interested in hearing what you have to say on this matter.

(I am using mIRC 6.2 on a WinXP SP2.)

Posted By: Rounin

Re: UTF-8 support in mIRC - 03/08/06 07:40 PM

Have you enabled the multibyte editbox and whatnot? I remember having a few problems because I forgot that in the previous version.

And do you get the same problems if you use a font specifically designed for the language you're typing in, or is it only when relying on font substitution?

Meh, I've got nothing, really. I just had to ask.
Posted By: bwuser

Re: UTF-8 support in mIRC - 04/08/06 08:50 AM

Have you enabled the multibyte editbox and whatnot? I remember having a few problems because I forgot that in the previous version.

And do you get the same problems if you use a font specifically designed for the language you're typing in, or is it only when relying on font substitution?

Yes, to both questions, sadly. Still won't work. But thanks for the suggestions anyway.

Posted By: Scorpwanna

Re: UTF-8 support in mIRC - 21/09/06 06:18 AM

A font....
Posted By: bwuser

Re: UTF-8 support in mIRC - 24/09/06 05:20 PM

It is not a font issue - I've tried with several fonts, all of which work fine in any application other than mIRC. That is, I can type given Unicode text into any other application using a given font - but when I try to do so in mIRC, the problem is there.

Posted By: theriel

Re: UTF-8 support in mIRC (INPUT) - 27/09/06 05:06 PM

I've a similar problem, so I thought I'd post here rather than start a new thread. Do you have any suggestions for input? I've been trying to figure out how to input IPA into mIRC, since I sometimes have linguistic discussions. I use a program called Keyman, which provides Unicode-based keyboard input. I've got the correct fonts, and I could type things in IPA d&#658;&#601;st&#688; f&#945;in (just fine) here, for instance (although evidently they don't show up :P). I noticed I could read the characters when someone else typed them, but it came up as ɣ when I tried to copy/paste. What is mIRC doing? And is there anything I can do to fix that?

I'm using Doulos SIL font, which is perfectly utf8 compatible, and winxp with mirc 6.2. I've tried a variety of fonts with the same results.

Posted By: KoiDesiPagal

Re: UTF-8 support in mIRC (INPUT) - 09/01/07 04:44 AM

anyone know if sysfixer has made any progress with fixedsys excelsior? his old site seems to have gone down so ya... just wanted to know if there was any update on the font.

PS: i'm waiting for japanese script support to be added into it.
Posted By: globalplayer

Re: UTF-8 support in mIRC (INPUT) - 24/01/07 01:01 PM

Thank you SOOOOO much guys!!

Great feature! Finally I can now use a different charset for non-unicode applications (e. g. Hebrew) and true Unicode apps (foobar2000)!
This is important if you have French Windows but you want to search for music in Hebrew in Soulseek, which is non-unicode (unfortunately).
BUT ... the "Multibyte editbox" option is extremely confusing. It does not tell anything to me, so if I had not read this forum thread I would not have got it working, as UTF-8 enabling is NOT sufficient. I suggest that you do not bother the user with such things and remove it, enabling this internally when the user wants UTF-8. My 2c. mIRC is an application mainly for chats, and if you want to chat in foreign characters, this "Multibyte editbox" is mandatory! Otherwise you will get a lot of question marks in the channel, and users will complain they do not get the characters right.
Posted By: Kenshin_87

Re: UTF-8 support in mIRC (INPUT) - 02/02/07 04:49 PM

Hi... I am wondering how come I can't directly type Korean into mIRC? I have to use another program like Microsoft Word to type, then copy and paste it into mIRC, and then it will work...

When I'm using mIRC window, the language bar will only show "KO Korean"... But when I use the Microsoft Word window, it will show the full bar, "Han/Eng", "Hanja", "Pad"...

How do I configure my mIRC to type Korean words directly?

Posted By: SysFixer

Re: UTF-8 support in mIRC (Fixedsys 3) - 09/03/07 07:39 AM

I finished Fixedsys Excelsior 3.00 the other day. I hope you like it. Hopefully mIRC will have the complex-script input bugs worked out soon.


© 2022 mIRC Discussion Forums