Posted By: klez /write files - 12/10/16 08:43 AM
Hi!

I don't know if this is a bug or not, but I have a problem:

I created (in Windows) a test.txt file saved with the Unicode encoding, then ran /write test.txt message.
When I open the file afterwards, it is no longer Unicode as it was: it has become ANSI, or UTF-8 if I used symbols from other languages.

Can you change /write so that it keeps the file's existing encoding instead of converting the file to ANSI/UTF-8?

I understand that if I write in other languages using ANSI, not all symbols will be saved as they are. But please, even when I use such symbols, can the /write command use Unicode rather than UTF-8?

It would be even better if /write had a -u parameter to indicate how to write: in ANSI, UTF-8 or Unicode.
Posted By: Wims Re: /write files - 12/10/16 01:34 PM
This has been discussed in the past.
'Unicode' here is just UTF-16, while mIRC uses UTF-8.
Why do you want to use Unicode specifically? The text file's content is the same in UTF-8 or UTF-16; only the encoding differs.
That being said, you can't reliably decode a file if you don't know how it was encoded, so we should technically be able to specify the encoding when writing to or reading from a file.
Again, this was pointed out in the past, but currently mIRC simplifies things for us, and it's fine 99% of the time. So even though there is a lack of support, it shouldn't really be an issue for an IRC client.
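
To illustrate the point about not being able to decode reliably without knowing the encoding, here is a minimal sketch in Python (not mIRC script). The sample text and the helper name sniff_bom are hypothetical, chosen only for illustration. It shows that the same text produces different bytes in UTF-8 and UTF-16LE, that decoding with the wrong codec yields different text, and that a byte order mark (BOM), when present, is one hint a reader can use.

```python
import codecs
from typing import Optional

# Known BOMs. UTF-32 is checked first because the UTF-32LE BOM
# begins with the same two bytes as the UTF-16LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes) -> Optional[str]:
    """Return an encoding name if data starts with a known BOM, else None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


text = "mesaj în română"  # hypothetical sample text
as_utf8 = text.encode("utf-8")
as_utf16 = text.encode("utf-16-le")

# Same text, different bytes on disk.
print(len(as_utf8), len(as_utf16))

# Decoding with the wrong codec does not give the original text back.
mojibake = as_utf8.decode("utf-16-le", errors="replace")
print(mojibake == text)

# A BOM, when present, identifies the encoding; without one we can only guess.
print(sniff_bom(codecs.BOM_UTF16_LE + as_utf16))
print(sniff_bom(as_utf8))
```

A file without a BOM gives no such hint, which is why an explicit encoding switch on read/write commands would remove the guesswork.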
Posted By: klez Re: /write files - 14/10/16 10:18 AM
Many players like Winamp or AIMP use Unicode playlists. So when I try to write playlist files in Romanian, Winamp or AIMP can't read them; when I convert the playlist to Unicode, all is OK.
Posted By: sdamon Re: /write files - 16/10/16 10:03 PM
Is $utfencode/$utfdecode(text) insufficient?
Posted By: klez Re: /write files - 17/10/16 10:57 AM
It is insufficient for writing the files, not for reading! I can't make AIMP or Winamp read UTF-8 encoded files! These programs read only Unicode files!
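
One possible workaround outside mIRC is to re-encode the file after /write has produced it. The following is a minimal sketch in Python, assuming that the 'Unicode' the players expect is UTF-16LE with a BOM (this is an assumption, not something confirmed in this thread). The function name utf8_to_utf16le and the file names in the note below are hypothetical.

```python
import codecs
from pathlib import Path


def utf8_to_utf16le(src: str, dst: str) -> None:
    """Re-encode a UTF-8 text file as UTF-16LE with a leading BOM."""
    # "utf-8-sig" accepts UTF-8 with or without a BOM and strips it if present.
    text = Path(src).read_bytes().decode("utf-8-sig")
    # Work on raw bytes so that line endings pass through unchanged.
    Path(dst).write_bytes(codecs.BOM_UTF16_LE + text.encode("utf-16-le"))
```

For example, utf8_to_utf16le("playlist.m3u", "playlist_utf16.m3u") would leave the original file untouched and write the converted copy next to it.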
Posted By: sdamon Re: /write files - 21/10/16 08:20 AM
There is no such thing as a 'unicode' file format. Unicode is a standard that maps integers (code points, not bytes!) to characters. Those code points can be encoded to bytes as UCS-2, UCS-4, UTF-8, UTF-8 with BOM, UTF-16LE, UTF-16BE, UTF-32LE, or UTF-32BE.
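
The distinction between integers and bytes can be made concrete with a short Python sketch. It takes one character (chosen here only as an example), prints its code point, and then shows the different byte sequences that several of the encodings listed above produce for it. The codec names are Python's, where "utf-8-sig" means UTF-8 with BOM.

```python
ch = "ă"  # example character, LATIN SMALL LETTER A WITH BREVE
code_point = ord(ch)  # the Unicode integer, independent of any encoding
print(f"U+{code_point:04X}")

# Python codec names; "utf-8-sig" is UTF-8 with a leading BOM.
encodings = ["utf-8", "utf-8-sig", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be"]
encoded = {name: ch.encode(name) for name in encodings}

for name, raw in encoded.items():
    # One code point, a different byte sequence per encoding.
    print(f"{name:10} {raw.hex(' ')}")
```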

What Winamp calls 'unicode' is UTF-8 with BOM (since that's the Win32 default for UTF-8, theological discussion aside); however, it will read plain UTF-8 just fine (in my experiments).

If you are asking for comprehensive encoding support ... that's not easy. mIRC uses, by default, UTF-16LE because that's what wchars are in Win32. Doing something else would require passing all the strings you can access through mIRC through conversion software that either has to be written from scratch or is notoriously hard to link against on Windows (or even illegal to link into mIRC without drastically changing its license).
© mIRC Discussion Forums