Originally Posted By: argv0
There are plenty of UTF-16 files with no BOM, and the spec explains how to deal with such files:

From wikipedia:
Originally Posted By: wikipedia
If the BOM is missing, the standard says that big-endian encoding should be assumed.


Unfortunately it's not that clear-cut, as the next sentence of that Wikipedia article explains:

Originally Posted By: Wikipedia
If the BOM is missing, the standard says that big-endian encoding should be assumed. (In practice, due to Windows using little-endian order by default, many applications also assume little-endian encoding by default.)


For this reason, UTF-16 files without a BOM that originated from a Windows program are commonly LE, not BE.
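
To show why the byte order matters, here is a rough Python sketch (not mIRC code; the sample text and the helper name decode_both are made up for illustration). The same BOM-less bytes decode without any error under both byte orders, but only one result is the intended text:

```python
# Hypothetical sample: the text "Hi" saved as UTF-16 LE with no BOM,
# as a Windows program might write it.
data_le = "Hi".encode("utf-16-le")  # bytes 48 00 69 00


def decode_both(raw: bytes) -> dict:
    """Decode the same bytes under both byte orders (illustrative helper)."""
    return {
        "le": raw.decode("utf-16-le"),
        "be": raw.decode("utf-16-be"),
    }


result = decode_both(data_le)
# result["le"] is the original text.
# result["be"] is two unrelated CJK code points (U+4800, U+6900):
# valid UTF-16, so nothing fails, the text is just silently wrong.
```

Since neither decode raises an error, a reader that assumes BE as the standard says would quietly turn such a file into gibberish.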

Originally Posted By: argv0
Therefore mIRC should be able to handle these files; how to detect them is an interesting question, but not really our concern. In other words, whether mIRC performs auto-detection or we have to tell it manually isn't really the issue. There needs to be a way to access these files.

I would consider it a bug if mIRC is unable to read non-BOM UTF-16 files given that it is supposedly "unicode compatible"-- at worst, it should be a feature suggestion. Perhaps a new switch on $read and its counterparts to specify file encodings (which mIRC badly needs!) would help.


I generally don't like the concept of auto-detection, because a guess can be wrong. I'm sure you've seen the Notepad trick where you save some text, close and reopen the file, and the text becomes gibberish. UTF-8 is much more commonly used in text files anyway, so BOM-less UTF-16 should only be an issue in a small number of cases. However, I do think it would be a good idea to add a switch that forces mIRC to use a particular encoding when reading a text file.
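
As a rough illustration of the switch idea, here is a minimal Python sketch, not anything mIRC actually does. The function name decode_text, its parameters, and the UTF-8 fallback are all assumptions of mine. An explicit encoding always wins; otherwise only the BOM is checked, with no content guessing:

```python
import codecs
from typing import Optional

# BOMs this sketch recognises. UTF-32 is deliberately not handled.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_text(raw: bytes, encoding: Optional[str] = None,
                default: str = "utf-8") -> str:
    """Decode file bytes; a caller-supplied encoding overrides any detection."""
    if encoding is not None:
        # The "switch": the caller says what the file is, so no guessing.
        return raw.decode(encoding)
    for bom, name in _BOMS:
        if raw.startswith(bom):
            # Strip the BOM and decode with the byte order it announces.
            return raw[len(bom):].decode(name)
    # No BOM and no switch: fall back to the assumed default.
    return raw.decode(default)
```

For example, decode_text(raw, encoding="utf-16-le") would read a BOM-less file from a Windows program correctly, while decode_text(raw) keeps the current BOM-or-default behaviour.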