Joined: May 2018
Posts: 3
Self-satisified door
OP Offline
I am copying a password from a message my friend sent. (He thinks he's clever by including emoji as part of the password. ;-)

When I copy this string:
🍆d439c38e-219e-52d0-b04e-8822ca19e72e🍆
the final character (eggplant) gets corrupted, and instead I get:
🍆d439c38e-219e-52d0-b04e-8822ca19e72e🀀

Note that this last character is also the end of the line.

This means I need two steps to use the password: I have to copy it to something else (Notepad++ in my case), fix up the character, and copy it again from there.

Joined: Jan 2004
Posts: 2,127
Hoopy frood
Offline
You might need to give more info, as I'm not able to replicate what you're seeing. By "message", do you mean a query window? Which mIRC version? Which version of Windows? Which font are you using?

Your message in the forum appears to me in the ampersand-through-semicolon (HTML entity) format, so I pasted that string into Google search, and the result page showed the emoji symbol. I then copied it from the Google page into the clipboard and ran this in the status window: //echo -a $cb
From the resulting message displayed, I then pasted it into this command to see what's in the clipboard:

//bset -t &v 1 contents-of-clipboard-replaces-this | echo -a $regsubex($bvar(&v,1-),/(\d+)/g,$base(\1,10,16,2))

... and I get the hex values of the utf-8 encoded string in the clipboard:

F0 9F 8D 86 64 34 33 39 63 33 38 65 2D 32 31 39 65 2D 35 32 64 30 2D 62 30 34 65 2D 38 38 32 32 63 61 31 39 65 37 32 65 F0 9F 8D 86

The first 4 and last 4 bytes are the correct UTF-8 encoding of your 127814 character (U+1F346), and do not match the UTF-8 encoding of 126976 (U+1F000), which would be F0 9F 80 80.
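
For anyone who wants to double-check those byte values outside mIRC, here is a minimal Python sketch. It is not part of the test described above, and the helper name `utf8_hex` is made up for illustration. It prints the UTF-8 bytes of the original emoji and of the character seen after the bad copy:

```python
def utf8_hex(text):
    """Return the UTF-8 bytes of text as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in text.encode("utf-8"))


eggplant = chr(127814)   # U+1F346, the emoji in the original string
corrupted = chr(126976)  # U+1F000, the character reported after copying

print(utf8_hex(eggplant))   # F0 9F 8D 86
print(utf8_hex(corrupted))  # F0 9F 80 80
```

The two encodings share their first two bytes and differ in the last two, so a hex dump of the clipboard is enough to tell which character is really there.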

The only font I have which displays the emoji symbol is "Segoe UI Symbol", but the results of copy-pasting from the window into the bset command are the same with Consolas or other fonts, which just show the generic 'unknown' symbol.

Do you still see the corrupted emoji when it's *not* the last character of the line? Does the problem still happen when you paste what you echoed to a window? With the following code you can see the bytes of the incoming text before the clipboard gets involved. Change 'person' to your friend's nick, then see whether the emoji is already corrupted before it reaches the clipboard.

Code:
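On *:TEXT:*:*: {
  ; only inspect messages from this nick (replace 'person')
  if ($nick == person) {
    ; store the incoming text as bytes in binary variable &v
    bset -t &v 1 $1-
    echo -s DEBUG as txt: $1-
    ; $bvar returns decimal byte values; convert each one to 2-digit hex
    echo -s DEBUG as hex: $regsubex($bvar(&v,1-),/(\d+)/g,$base(\1,10,16,2))
  }
}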

Joined: May 2018
Posts: 3
Self-satisified door
OP Offline
Okay, my apologies for having to blank out so much for privacy, but here's a pic: https://snag.gy/qzltug.jpg

Let me know if that doesn't tell you all you need to know... I was just drag-selecting from the first emoji to the last, which auto-copies it to the clipboard. When I subsequently paste it, it has a corrupt final char.

[Edit: The font is fixed-sys 9pt regular.]

[Edit2: Windows 10. Problem was present on 1709 but also on the 1803 I am now running.]

[Edit3: When I edited the text in the forum, it showed the emojis. When I submitted, the forum software converted them to the HTML escapes.... the ampersand number semicolon thingies. They arrived as proper emojis and displayed correctly in mIRC as pictured. It's only the last one on the line that seems corrupted.]

[Edit4: had to edit pic to remove more private info]

Last edited by PaulHolder; 03/05/18 05:07 PM.
Joined: Jan 2004
Posts: 2,127
Hoopy frood
Offline
When the message begins and ends with that emoji, does the debug message in hex also begin and end with F0 9F 8D 86?
And if the line has some other text after the emoji, does the emoji, or whatever is at the end, still get corrupted? It might also help to put other characters in the 256-65535 range after the trailing emoji, such as the alt-10004 checkmark.

Joined: May 2018
Posts: 3
Self-satisified door
OP Offline
Okay, I edited out most of the middle chars, and here is the result after selecting the chars and then pasting them into mIRC... BUT the thing is, when I paste inside of mIRC the corruption does not occur... it looks correct on the screen and seems correct here:

F0 9F 8D 86 63 32 32 65 F0 9F 8D 86

Oh wait, now I can't get it to reproduce any more... what the heck?

The only change was your debugging statement... could that have changed the state of something?
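
As a cross-check of the hex line above, a small Python sketch (the helper name `hex_to_text` is made up for illustration, and this runs outside mIRC) can turn such a dump back into text to confirm which characters the bytes represent:

```python
def hex_to_text(hex_dump):
    """Decode a space-separated hex dump of UTF-8 bytes back into text."""
    return bytes.fromhex(hex_dump).decode("utf-8")


decoded = hex_to_text("F0 9F 8D 86 63 32 32 65 F0 9F 8D 86")
print(decoded)
# Code point of the final character, as a decimal number
print(ord(decoded[-1]))  # 127814
```

A final code point of 127814 means the trailing eggplant is intact in that dump.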

Last edited by PaulHolder; 03/05/18 07:59 PM.
Joined: Jan 2004
Posts: 2,127
Hoopy frood
Offline
I don't see how the debug statement could have fixed the 'whatever'. You should be able to eliminate any effects of the debug msg by disabling the ON TEXT event, then using /clear in the status window.

Joined: Feb 2003
Posts: 2,812
Hoopy frood
Offline
It looks like the characters you are working with are outside of Unicode Plane 0, the Basic Multilingual Plane (BMP) (0x0000-0xFFFF), and in Plane 1, the Supplementary Multilingual Plane (SMP) (0x10000-0x1FFFF).

mIRC can only manage BMP in many aspects of its string handling, so you should expect quirks here and there. If you can exactly reproduce a bug, give a step by step 1-2-3 series of commands, and maybe Khaled can address it. But maybe not. Don't expect wide reliable SMP-and-beyond support for a long while. mIRC is an old program and was written using older Windows APIs.
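
To illustrate the BMP/SMP distinction: in UTF-16, a BMP character fits in one 16-bit unit, while anything above 0xFFFF needs a surrogate pair of two units. The following is a minimal Python sketch, with a made-up helper name `utf16_units`; it says nothing about how mIRC handles strings internally and only shows what the code units look like for the characters in this thread:

```python
def utf16_units(ch):
    """Return the UTF-16 code units of ch as a list of 4-digit hex strings."""
    data = ch.encode("utf-16-be")
    return [data[i:i + 2].hex().upper() for i in range(0, len(data), 2)]


print(utf16_units(chr(0x1F346)))  # eggplant (SMP):       ['D83C', 'DF46']
print(utf16_units(chr(0x1F000)))  # corrupted char (SMP): ['D83C', 'DC00']
print(utf16_units(chr(10004)))    # checkmark (BMP):      ['2714']
```

One observation, offered as a hint rather than a diagnosis: the eggplant and the corrupted character share the same first unit and differ only in the second, which in the corrupted case is DC00, the lowest possible value for that half of the pair. That is at least consistent with the second half of the pair getting lost somewhere along the way.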


Well. At least I won lunch.
Good philosophy, see good in bad, I like!
