pball (OP)
Fjord artisan
Joined: Nov 2009
Posts: 295
I'm making a script that uses Google translation. I'm having a problem when translating from some language into Japanese (or any other language that doesn't use the Latin alphabet). In the sockread portion of the script I use a regex to find the translated text, then store it in a variable using $regml(). After everything is found I msg the translated text to the channel.

I can get こんにちは translated to hello, but doing the reverse returns ‚±‚ñ‚É‚¿‚Í instead of the Japanese characters. I've saved the sockread data to a file, and the file contains the Japanese, so somewhere between reading it, storing it in a variable, and messaging the channel the Japanese characters get messed up.

Does anyone have a clue how to fix this? I'm using mIRC 6.35, and I have it set to display UTF-8 characters. Thanks.
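
For reference, the garbled string ‚±‚ñ‚É‚¿‚Í is exactly what こんにちは looks like when its Shift_JIS bytes are displayed as Windows-1252 text. That is consistent with (though it does not prove) the reply arriving in Shift_JIS rather than UTF-8. A minimal Python sketch, run outside mIRC purely to illustrate the byte-level effect; the choice of the two codecs is an assumption inferred from the garbled output:

```python
# Assumption: the bytes on the wire were Shift_JIS, not UTF-8.
original = "こんにちは"
sjis_bytes = original.encode("shift_jis")

# Reading those bytes as Windows-1252 (a common Western default) gives mojibake.
garbled = sjis_bytes.decode("cp1252")
print(garbled == "‚±‚ñ‚É‚¿‚Í")  # True: same string as in the post

# Re-encoding as Windows-1252 and decoding with the right codec recovers the text.
recovered = garbled.encode("cp1252").decode("shift_jis")
print(recovered == original)  # True
```

If this is what happened, $utfdecode() or $utfencode() would not be expected to help, because the data was never UTF-8 in the first place.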


http://scripting.pball.win
My personal site with some scripts I've released.
Hoopy frood
Joined: Jun 2007
Posts: 933
At the moment this is still a problem in mIRC, yes.
All I can suggest at this point is to try the undocumented $utfencode() identifier on the $read result (or on the sockread data).

pball (OP)
Fjord artisan
Joined: Nov 2009
Posts: 295
Thanks for the undocumented identifier. I played with it some and it works nicely on its own, but I can't get it to fix the script.

http://pastebin.com/m5d588388

That's a link to the main part of my script if anyone wants to take a look. Just type /trans phrase, and the phrase will be translated to Japanese. Granted, what's returned is just gibberish until a fix for this problem is found.


Riamus2
Hoopy frood
Joined: Oct 2004
Posts: 8,330
Make sure your settings are such that you not only display characters in UTF-8, but also send them in UTF-8. Beyond that, someone who uses Unicode more can give a more specific answer. It is still somewhat limited because mIRC is not fully Unicode, but you should be able to handle the characters, as I've seen other translation scripts that display them fine.


Invision Support
#Invision on irc.irchighway.net
Khaled
Hoopy frood
Joined: Dec 2002
Posts: 5,411
I tested your script but it did not seem to do anything when I typed in "/trans phrase". It connected to Google, but on SOCKREAD was never triggered and nothing arrived. It looks like there is a missing on SOCKOPEN event that needs to send the request?

Apart from that, if the information you are receiving is in UTF-8 then I don't think there should be a problem with sending it to the channel. Perhaps the UTF-8 text is getting scrambled because of the encoding/codepage setting for that window. Also, you might want to check the encoding/codepage for the status window, since the socket routines may use that as the default because they are not generally associated with a particular window.
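
One way to check whether the incoming data really is UTF-8 is to look at the charset the server declares in its Content-Type header before decoding the body. The following is a rough Python sketch of that check, not mIRC code; the function names `declared_charset` and `decode_body`, the ISO-8859-1 fallback, and the sample response are all made up for illustration:

```python
def declared_charset(raw_headers: bytes, default: str = "iso-8859-1") -> str:
    """Return the charset named in a Content-Type header, or a default."""
    for line in raw_headers.decode("ascii", errors="replace").split("\r\n"):
        name, _, value = line.partition(":")
        if name.strip().lower() != "content-type":
            continue
        # Parameters follow the media type, separated by semicolons.
        for param in value.split(";")[1:]:
            key, _, val = param.partition("=")
            if key.strip().lower() == "charset":
                return val.strip().strip('"').lower()
    return default


def decode_body(response: bytes) -> str:
    """Split headers from body and decode the body with the declared charset."""
    head, _, body = response.partition(b"\r\n\r\n")
    return body.decode(declared_charset(head), errors="replace")


# Hypothetical response that declares Shift_JIS instead of UTF-8.
sample = (
    b"HTTP/1.0 200 OK\r\n"
    b"Content-Type: text/html; charset=Shift_JIS\r\n"
    b"\r\n" + "こんにちは".encode("shift_jis")
)
print(decode_body(sample) == "こんにちは")  # True
```

If the declared charset is not UTF-8, the window encoding settings are unlikely to be the cause of the scrambled text.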

XperTeeZ
Ameglian cow
Joined: Nov 2009
Posts: 20
Also, the SJIS/JIS conversion option in the mIRC Options dialog is useful ;)

Alt + O - IRC - Messages

Have a great day!

pball (OP)
Fjord artisan
Joined: Nov 2009
Posts: 295
Not sure why the code I posted isn't working, but I tested this by itself and it works for me: http://pastebin.com/m36083f0f

@Riamus2
I can send and display UTF-8 characters just fine in mIRC.

@Khaled
I didn't really test the snippet that I posted before, but the new link should work. I also checked that UTF-8 was on for all windows.

@XperTeeZ
Turned that on and it didn't seem to do anything.

@Everyone
Thanks for all the help.

Edit:
Well, my suspicions were correct. Google is at fault here, or at least they seem to use some weird characters. Babel Fish seems to send encoded UTF-8, judging by the source of a page with a translation on it, so doing $utfdecode() on the result from Babel Fish yields the Japanese text. This is really a bummer since Google has an auto-detect feature which is quite handy.

Edit2:
Well, it might not be completely Google. Saving the webpage from a browser and having the sockread write the data out to a file give different results, so maybe the sockread just doesn't handle UTF-8 that well.

Since I can't give up on something like this, I might try getting something else, like a command-line app, to save the webpage and then parse that.

Last edited by pball; 04/12/09 05:43 PM.

Riamus2
Hoopy frood
Joined: Oct 2004
Posts: 8,330
More likely, you just need to change your sockwrite commands to send what the site needs so that it knows how to respond. Changing HTTP/1.1 to HTTP/1.0 helps in some situations (or vice versa), as does changing your encoding or the types of output the site can send you (e.g. gzip). Basically, you can send exactly the same thing to the site with sockets as if you went there with a browser... you just need to know what to send.
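
To make "send the same thing a browser would" concrete, here is a rough Python sketch of the raw request text a socket script ends up writing, one header per line followed by a blank line. The host, path, and User-Agent string in the example call are placeholders chosen for illustration, not values taken from the script in this thread, and `build_request` is a made-up helper name:

```python
def build_request(host: str, path: str, user_agent: str) -> bytes:
    """Assemble a minimal browser-like HTTP/1.0 GET request as raw bytes."""
    lines = [
        f"GET {path} HTTP/1.0",
        f"Host: {host}",
        # Some sites change their output depending on this header.
        f"User-Agent: {user_agent}",
        "Accept-Language: en-us",
        "",  # blank line terminates the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


# Placeholder values; copy the real ones from what your browser sends.
request = build_request(
    "translate.google.com",
    "/example-path",
    "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1)",
)
print(request.decode("ascii"))
```

Each element of `lines` corresponds to one sockwrite -n call in an on sockopen event.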


pball (OP)
Fjord artisan
Joined: Nov 2009
Posts: 295
Well, I've tried HTTP/1.0 and HTTP/1.1 with no difference. There is also
sockwrite -nt $sockname Accept-Language: en-us
which also doesn't seem to change anything whether it's there or not.

Is there anything else that might change the return? I don't know all the options you can use with on sockopen.


Riamus2
Hoopy frood
Joined: Oct 2004
Posts: 8,330
I have had luck using the trial IE addon http://www.iewatch.com/ to view what the browser sends when loading a page. Sometimes, that's the easiest way to find out the correct sockwrite messages to send.


pball (OP)
Fjord artisan
Joined: Nov 2009
Posts: 295
Riamus2, I love you. I got the User-Agent info IE was using and stuck that in the script, and it works great now.


Riamus2
Hoopy frood
Joined: Oct 2004
Posts: 8,330
Glad it helped. :)

