#220920 02/05/10 07:54 PM
Joined: Apr 2003
Posts: 342
M
Fjord artisan
OP Offline
Sheesh, I gotta be somewhere in ten minutes, so I'll be quick.

Originally Posted By: Versions.txt
1. Added Unicode/UTF-8 BOM support to file routines. This applies to all files that are read or written as text, including ini files.


You will need to force binary reading/writing now with /fopen. Add switches for read-only mode too. /fopen -rwb (-read, -write, -binary)

/fseek -w <wildcard>
/fseek -r <regex>

You *need* to fix $fread so it can return the matchtext or matched expressions after one of these seeks. And $fread shouldn't be forced to read up to the next line when the result goes into a standard %var.

That's it. Gotta run.


Beware of MeStinkBAD! He knows more than he actually does!
Joined: Oct 2003
Posts: 3,918
A
Hoopy frood
Offline
Binary reading/writing has nothing to do with files that have a BOM. In fact, a BOM implies that the data is Unicode text, not binary data (Unicode is not binary data). Realize that by supporting the BOM, mIRC will be able to translate any UTF-16 into UTF-8 plaintext, so those files *used* to be seen as "binary" by mIRC, but not anymore thanks to this BOM support.

As for returning the matchtext, you should be able to use a subsequent call to $fread with $regex to match any text. I guess it could help to fill $regml when fseek -r is used, but I'm not sure how you'd return the matchtext of a wildcard.
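
A minimal sketch of the $fread-plus-$regex idea, assuming a handle named h, a file called test.txt and a made-up pattern (all three are placeholders for illustration). It sidesteps /fseek -r and simply reads line by line, so the captured text ends up in $regml:

```mirc
alias firstmatch {
  ; h, test.txt and the pattern below are placeholders, not taken from any real script
  var %re = /fee of (\d+) gil/
  .fopen h test.txt
  ; close the handle and stop if the file could not be opened
  if ($fopen(h).err) { .fclose h | return }
  while (!$fopen(h).eof) {
    ; with one argument, $fread returns the next line of the file
    var %line = $fread(h)
    if ($regex(%line,%re)) {
      ; $regml(1) holds the first captured group of the last match
      echo -a matched: $regml(1)
      break
    }
  }
  .fclose h
}
```

This only covers the regex case; it doesn't answer how a wildcard match would report its matchtext.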

Thirdly, $fread() isn't "forced" to read to the next line; this is simply the default behaviour, since it's the most common file operation. You can always use the switches to make it read N bytes, or read N bytes into a bvar and set a %var to $bvar(&bvar,1-).text
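
To illustrate the byte-count form, here is a rough sketch, assuming a handle named h, a file called test.txt and an arbitrary chunk size of 512 (all placeholders). Whether the text displays cleanly depends on what the bytes contain and on the usual line length limits:

```mirc
alias readchunk {
  ; h, test.txt and 512 are placeholders chosen for illustration
  .fopen h test.txt
  if ($fopen(h).err) { .fclose h | return }
  ; this form reads up to 512 bytes into &chunk and returns the number of bytes read
  var %n = $fread(h,512,&chunk)
  if (%n > 0) {
    ; convert the bytes to text, as in $bvar(&bvar,1-).text above
    echo -a read %n bytes: $bvar(&chunk,1-).text
  }
  .fclose h
}
```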


- argv[0] on EFnet #mIRC
- "Life is a pointer to an integer without a cast"
Joined: Apr 2003
Posts: 342
M
Fjord artisan
OP Offline
Originally Posted By: argv0
Binary reading/writing has nothing to do with files that have a BOM. In fact, a BOM implies that the data is Unicode text, not binary data (Unicode is not binary data). Realize that by supporting the BOM, mIRC will be able to translate any UTF-16 into UTF-8 plaintext, so those files *used* to be seen as "binary" by mIRC, but not anymore thanks to this BOM support.


mIRC still doesn't support UTF-16, and the byte order mark is optional. And mIRC shouldn't convert anything unless I want it to...

Quote:
As for returning the matchtext, you should be able to use a subsequent call to $fread with $regex to match any text. I guess it could help to fill $regml when fseek -r is used, but I'm not sure how you'd return the matchtext of a wildcard.

Thirdly, $fread() isn't "forced" to read to the next line; this is simply the default behaviour, since it's the most common file operation. You can always use the switches to make it read N bytes, or read N bytes into a bvar and set a %var to $bvar(&bvar,1-).text


Here... I'll get right to the point...

Download this XML Dump from my TiVo. 360K, all on one line.

Now, just the opposite... take a look at the following snippet...
Code:
<font color="#555555"><b><!--79,00,00,60555555,00000027,00000027,0000,00,01,00,00--> Merchandise placed on auction.If merchandise remains unsold after 9 weeks (Vana'diel time), it will be returned to your current residence.
If a successful bid is made, the proceeds from the sale will be delivered to your current residence.
Signed items will lose their signature after being purchased.</b></font><br> 
<!--TIME: 2010,1,1,5,23,44, --> 
<font color="#555555"><b><!--79,00,00,60555555,00000028,00000028,0000,00,01,00,00--> You have to pay a transaction fee of 51 gil.</b></font><br> 
<font color="#555555"><b><!--79,00,00,60555555,00000029,00000029,0000,00,01,00,00--> Merchandise placed on auction.If merchandise remains unsold after 9 weeks (Vana'diel time), it will be returned to your current residence.
If a successful bid is made, the proceeds from the sale will be delivered to your current residence.
Signed items will lose their signature after being purchased.</b></font><br> 
<font color="#555555"><b><!--79,00,00,60555555,0000002a,0000002a,0000,00,01,00,00--> You have to pay a transaction fee of 51 gil.</b></font><br> 
<font color="#555555"><b><!--79,00,00,60555555,0000002b,0000002b,0000,00,01,00,00--> Merchandise placed on auction.If merchandise remains unsold after 9 weeks (Vana'diel time), it will be returned to your current residence.
If a successful bid is made, the proceeds from the sale will be delivered to your current residence.
Signed items will lose their signature after being purchased.</b></font><br> 
<font color="#555555"><b><!--79,00,00,60555555,0000002c,0000002c,0000,00,01,00,00--> You have to pay a transaction fee of 51 gil.</b></font><br> 
<font color="#555555"><b><!--79,00,00,60555555,0000002d,0000002d,0000,00,01,00,00--> Merchandise placed on auction.If merchandise remains unsold after 9 weeks (Vana'diel time), it will be returned to your current residence.
If a successful bid is made, the proceeds from the sale will be delivered to your current residence.
Signed items will lose their signature after being purchased.</b></font><br> 


Text between tags is hard-wrapped. And /fseek <-r|-w> <handle> <string> appears limited to the end of the line. The solution would be to include a $fseek(<handle>, expression) function I suppose, which would return the length of the matched string. Then you could use $fread(<handle>,<length>,&bvar).






Beware of MeStinkBAD! He knows more than he actually does!
Joined: Dec 2002
Posts: 5,411
Hoopy frood
Offline
Note that in v7.02 there is a bug in /fseek -rw relating to end-of-line handling that has been fixed for the next version.

Joined: Apr 2003
Posts: 342
M
Fjord artisan
OP Offline
You know, I was thinking that using PCRE with /fseek doesn't make too much sense... just finding the next offset that matches an exact string in the file would work fine. But it must treat EOLs like any other character: unless an EOL is part of the search string, the seek carries on past it. Then load the data into a &bvar where it can be broken down further.

Problem is, you can't perform regular expressions directly with binary variables.
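
For the exact-string case, something close to this can be sketched with /bread and $bfind, which work on raw bytes and so don't stop at line endings. This assumes a placeholder file name dump.xml and that the whole file can be loaded into one binvar, which may not be practical for very large files:

```mirc
alias findtext {
  ; dump.xml is a placeholder file name; $1- is the exact text to look for
  var %file = dump.xml
  var %size = $file(%file).size
  ; stop if the file is missing or empty
  if (!%size) { echo -a nothing to read | return }
  ; read the whole file as bytes, so CR and LF are just bytes 13 and 10
  bread %file 0 %size &data
  ; .text forces a text search even if $1- happens to look like a number
  var %pos = $bfind(&data,1,$1-).text
  if (%pos) { echo -a found at byte %pos of $bvar(&data,0) }
  else { echo -a no match }
}
```

Called as /findtext followed by the text to find, it reports a 1-based byte position that $bvar(&data,N,M) could then slice from. It doesn't help with regular expressions, only exact strings.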


Beware of MeStinkBAD! He knows more than he actually does!
Joined: Oct 2003
Posts: 3,918
A
Hoopy frood
Offline
/f* commands do not just operate on binvars. For binvars, regex may not make [as much] sense, but for plaintext operations I see no problem.


- argv[0] on EFnet #mIRC
- "Life is a pointer to an integer without a cast"
Joined: Feb 2003
Posts: 307
T
Fjord artisan
Offline
Hi Khaled

It seems /fseek -l is also not working well.

It came to my attention because it breaks one of my scripts.
It seems to be counting one line less than it should.

Best regards

Joined: Oct 2003
Posts: 3,918
A
Hoopy frood
Offline
How are you reproducing this? I'm not getting any issues:

Code:
/write -c test.txt a
/write test.txt b
/write test.txt c

//echo -a $lines(test.txt)
; 3

/fopen a test.txt
/fseek -l a 1
; * fseek set 'a' to line 1
/fseek -l a 2
; * fseek set 'a' to line 2
/fseek -l a 3
; * fseek set 'a' to line 3
/fseek -l a 4
; * fseek failed on 'a'


- argv[0] on EFnet #mIRC
- "Life is a pointer to an integer without a cast"
Joined: Feb 2003
Posts: 307
T
Fjord artisan
Offline
Hi there, thanks for your reply

Try doing an $fread on top of that, and check whether it matches :)

* fseek set 'a' to line 1
- ---> //echo ai $fread(a)
ai b


Also, I couldn't access line 1 (the letter a) with it using your example.

regards

Joined: Oct 2003
Posts: 3,918
A
Hoopy frood
Offline
You're right, /fseek -l seems to seek to the *end* of the line rather than the beginning.

In the meantime you can use /fseek a 0 to seek to / read the first line, though Khaled will probably correct this issue.


- argv[0] on EFnet #mIRC
- "Life is a pointer to an integer without a cast"
Joined: Feb 2003
Posts: 307
T
Fjord artisan
Offline
Hi again,

I did try that.

I got the second line again :)

Joined: Oct 2003
Posts: 3,918
A
Hoopy frood
Offline
Are you sure you're on the latest beta?

Note that I did not use the -l switch there.


- argv[0] on EFnet #mIRC
- "Life is a pointer to an integer without a cast"
Joined: Feb 2003
Posts: 307
T
Fjord artisan
Offline
Yes, I am sure!

0 returns line 2 (but shows as line 1 in fseek)
1 returns line 2
2 returns line 3
....

Joined: Oct 2003
Posts: 3,918
A
Hoopy frood
Offline
Make sure you copy the following lines exactly:

Code:
/fopen a test.txt
/fseek a 0
//echo -a $fread(a)


The above returns line 1, not line 2.


- argv[0] on EFnet #mIRC
- "Life is a pointer to an integer without a cast"
Joined: Dec 2002
Posts: 5,411
Hoopy frood
Offline
Thanks, I was able to reproduce the /fseek -l issue; it should be fixed for the next version.

Joined: Apr 2003
Posts: 342
M
Fjord artisan
OP Offline
Originally Posted By: argv0
/f* commands do not just operate on binvars. For binvars, regex may not make [as much] sense, but for plaintext operations I see no problem.


Binvars maintain \n and \r characters. Binvars maintain whitespace characters. Binvars don't have a length limit of 4KB. You can deal with files that are a mix of text and encoded binary. Occasionally non-encoded binary.


Beware of MeStinkBAD! He knows more than he actually does!
