OP
Fjord artisan
Joined: Apr 2003
Posts: 342
Sheesh, I gotta be somewhere in ten minutes, so I'll be quick.

1. Added Unicode/UTF-8 BOM support to file routines. This applies to all files that are read or written as text, including ini files.

You will now need a way to force binary reading/writing with /fopen. Add switches for read-only mode too:

/fopen -rwb (-read, -write, -binary)
/fseek -w <wildcard>
/fseek -r <regex>

You *need* to fix $fread so it can return the matchtext or matched expressions after performing one of these seeks. And $fread should not be forced to read up to the next line when reading into a standard %var.

That's it. Gotta run.
Beware of MeStinkBAD! He knows more than he actually does!

Hoopy frood
Joined: Oct 2003
Posts: 3,918
Binary reading/writing has nothing to do with files that have a BOM. In fact, a BOM implies that the data is Unicode text, not binary data (Unicode is not binary data). Realize that by supporting the BOM, mIRC will be able to translate any UTF-16 into UTF-8 plaintext, so those files *used* to be seen as "binary" by mIRC, but not anymore thanks to this BOM support.
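
To make the BOM point concrete, here is a minimal sketch that checks whether a file starts with the UTF-8 byte order mark (bytes EF BB BF). The alias name hasbom is made up for illustration, and the sketch assumes the file exists and is at least 3 bytes long.

```mirc
; Hypothetical alias name. Usage: /hasbom test.txt
alias hasbom {
  ; Read the first 3 bytes of the file (offset 0) into &head.
  bread $qt($1-) 0 3 &head
  ; 239 187 191 (EF BB BF) is the UTF-8 byte order mark.
  if (($bvar(&head,1) == 239) && ($bvar(&head,2) == 187) && ($bvar(&head,3) == 191)) {
    echo -a $1- starts with a UTF-8 BOM
  }
  else {
    echo -a $1- has no UTF-8 BOM
  }
}
```

Because /bread works on raw bytes, this check does not depend on how the text routines interpret the file.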
As for returning the matchtext, you should be able to use a subsequent call to $fread with $regex to match any text. I guess it could help to fill $regml when fseek -r is used, but I'm not sure how you'd return the matchtext of a wildcard.
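
A rough sketch of that approach: seek with /fseek -r, read the line with $fread, then run $regex on it to get the matchtext from $regml. The alias name, the file name log.html and the pattern are placeholders, and the sketch assumes /fseek -r leaves the pointer at the start of the matching line.

```mirc
; Hypothetical alias; log.html and the pattern are placeholders.
alias seekmatch {
  fopen h log.html
  if ($fopen(h).err) { echo -a fopen failed | fclose h | return }
  ; Move the file pointer to the next line matching the regex.
  fseek -r h /(\d+) gil/
  ; Read that line, then re-run the regex to capture the matchtext.
  var %line = $fread(h)
  if ($regex(%line, /(\d+) gil/)) echo -a Matched: $regml(1)
  else echo -a No match
  fclose h
}
```

Re-checking the line with $regex also guards against the case where the seek did not find anything.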
Thirdly, $fread() isn't "forced" to read to the next line; this is simply the default behaviour, since it's the most common file operation. You can always use the switches to make it read N bytes, or read N bytes into a bvar and set a %var to $bvar(&bvar,1-).text.
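
For example, a minimal sketch of reading a fixed number of bytes into a binvar and copying them to a %var. The alias name readchunk and the 64-byte size are arbitrary choices for illustration, and test.txt is assumed to exist.

```mirc
; Hypothetical alias name; test.txt is assumed to exist.
alias readchunk {
  fopen h test.txt
  if ($fopen(h).err) { echo -a fopen failed | fclose h | return }
  ; Read up to 64 bytes into &chunk; $fread returns the number of bytes read.
  var %n = $fread(h, 64, &chunk)
  if (%n > 0) {
    ; Convert the bytes that were read into text.
    var %text = $bvar(&chunk, 1, %n).text
    echo -a Read %n bytes: %text
  }
  fclose h
}
```

Keep the chunk small enough to fit mIRC's line length limit if you copy it into a %var.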
- argv[0] on EFnet #mIRC - "Life is a pointer to an integer without a cast"

OP
Fjord artisan
Joined: Apr 2003
Posts: 342
> Binary reading/writing has nothing to do with files that have a BOM. In fact, a BOM implies that the data is Unicode text, not binary data (Unicode is not binary data). Realize that by supporting the BOM mIRC will be able to translate any UTF-16 into UTF-8 plaintext, so those files *used* to be seen as "binary" to mIRC, but not anymore thanks to this BOM support.

mIRC still doesn't support UTF-16, and the byte order mark is optional. And mIRC shouldn't convert anything unless I want it to...

> As for returning the matchtext, you should be able to use a subsequent call to $fread with $regex to match any text. I guess it could help to fill $regml when fseek -r is used, but I'm not sure how you'd return the matchtext of a wildcard.

> Thirdly, $fread() isn't "forced" to read to the next line, this is simply the default behaviour, since it's the most common file operation. You can always utilize the switches to make it read N bytes, or read N to a bvar and set a %var to $bvar(&bvar,1-).text

Here... I'll get right to the point... Download this XML dump from my TiVo. 360K, all on one line. Now, just the opposite... take a look at the following snippet:

<font color="#555555"><b><!--79,00,00,60555555,00000027,00000027,0000,00,01,00,00--> Merchandise placed on auction.If merchandise remains unsold after 9 weeks (Vana'diel time), it will be returned to your current residence.
If a successful bid is made, the proceeds from the sale will be delivered to your current residence.
Signed items will lose their signature after being purchased.</b></font><br>
<!--TIME: 2010,1,1,5,23,44, -->
<font color="#555555"><b><!--79,00,00,60555555,00000028,00000028,0000,00,01,00,00--> You have to pay a transaction fee of 51 gil.</b></font><br>
<font color="#555555"><b><!--79,00,00,60555555,00000029,00000029,0000,00,01,00,00--> Merchandise placed on auction.If merchandise remains unsold after 9 weeks (Vana'diel time), it will be returned to your current residence.
If a successful bid is made, the proceeds from the sale will be delivered to your current residence.
Signed items will lose their signature after being purchased.</b></font><br>
<font color="#555555"><b><!--79,00,00,60555555,0000002a,0000002a,0000,00,01,00,00--> You have to pay a transaction fee of 51 gil.</b></font><br>
<font color="#555555"><b><!--79,00,00,60555555,0000002b,0000002b,0000,00,01,00,00--> Merchandise placed on auction.If merchandise remains unsold after 9 weeks (Vana'diel time), it will be returned to your current residence.
If a successful bid is made, the proceeds from the sale will be delivered to your current residence.
Signed items will lose their signature after being purchased.</b></font><br>
<font color="#555555"><b><!--79,00,00,60555555,0000002c,0000002c,0000,00,01,00,00--> You have to pay a transaction fee of 51 gil.</b></font><br>
<font color="#555555"><b><!--79,00,00,60555555,0000002d,0000002d,0000,00,01,00,00--> Merchandise placed on auction.If merchandise remains unsold after 9 weeks (Vana'diel time), it will be returned to your current residence.
If a successful bid is made, the proceeds from the sale will be delivered to your current residence.
Signed items will lose their signature after being purchased.</b></font><br>

Text between tags is hard-wrapped, and /fseek <-r|-w> <handle> <string> appears to be limited to the end of the line. The solution, I suppose, would be to add a $fseek(<handle>, expression) function that returns the length of the matched string. Then you could use $fread(<handle>,<length>,&bvar).
Beware of MeStinkBAD! He knows more than he actually does!

Hoopy frood
Joined: Dec 2002
Posts: 5,524
Note that in v7.02 there is a bug in /fseek -rw relating to end-of-line handling that has been fixed for the next version.

OP
Fjord artisan
Joined: Apr 2003
Posts: 342
You know, I was thinking that using PCRE with /fseek doesn't make too much sense... just finding the next offset that matches an exact string in the file would work fine. But it must treat EOLs like any other character: unless an EOL is specified as part of the match string, it keeps seeking past it. Then load the result into a &bvar, where it can be broken down further.
Problem is, you can't run regular expressions directly on binary variables.
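
For the exact-string part of this idea, a rough sketch using /bread and $bfind, which work on raw bytes and so treat CR and LF like any other character. The alias name findbytes is hypothetical, the file name is assumed to contain no spaces, and the whole file is assumed to fit comfortably in a binvar.

```mirc
; Hypothetical alias name. Usage: /findbytes <filename> <text>
alias findbytes {
  ; Load the whole file into &data; CR and LF are kept as ordinary bytes.
  bread $1 0 $file($1).size &data
  ; $bfind returns the position of the first match, or 0 if there is none.
  var %pos = $bfind(&data, 1, $2-).text
  if (%pos > 0) echo -a Found at byte %pos
  else echo -a Not found
}
```

The returned position can then be used with $bvar(&data, N, M) to pull out the surrounding bytes for further processing.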
Beware of MeStinkBAD! He knows more than he actually does!

Hoopy frood
Joined: Oct 2003
Posts: 3,918
/f* commands do not just operate on binvars. For binvars, regex may not make [as much] sense, but for plaintext operations I see no problem.
- argv[0] on EFnet #mIRC - "Life is a pointer to an integer without a cast"

Fjord artisan
Joined: Feb 2003
Posts: 307
Hi Khaled,
It seems /fseek -l is also not working well.
It came to my attention because it breaks one of my scripts. It seems to be counting one line less than it should.
Best regards

Hoopy frood
Joined: Oct 2003
Posts: 3,918
How are you reproducing this? I'm not getting any issues:
/write -c test.txt a
/write test.txt b
/write test.txt c
//echo -a $lines(test.txt)
; 3
/fopen a test.txt
/fseek -l a 1
; * fseek set 'a' to line 1
/fseek -l a 2
; * fseek set 'a' to line 2
/fseek -l a 3
; * fseek set 'a' to line 3
/fseek -l a 4
; * fseek failed on 'a'
- argv[0] on EFnet #mIRC - "Life is a pointer to an integer without a cast"

Fjord artisan
Joined: Feb 2003
Posts: 307
Hi there, thanks for your reply.
Try doing an $fread on top of that and check whether it matches:
* fseek set 'a' to line 1
//echo ai $fread(a)
ai b
Also, I couldn't get to line 1 (the letter a) with it using your example.
Regards

Hoopy frood
Joined: Oct 2003
Posts: 3,918
You're right, /fseek -l seems to seek to the *end* of the line rather than the beginning.
In the meantime you can use /fseek a 0 to seek to / read the first line, though Khaled will probably correct this issue.
- argv[0] on EFnet #mIRC - "Life is a pointer to an integer without a cast"

Fjord artisan
Joined: Feb 2003
Posts: 307
Hi again, I did try that. I got the second line again.

Hoopy frood
Joined: Oct 2003
Posts: 3,918
Are you sure you're on the latest beta?
Note that I did not use the -l switch there.
- argv[0] on EFnet #mIRC - "Life is a pointer to an integer without a cast"

Fjord artisan
Joined: Feb 2003
Posts: 307
Yes, I am sure!
0 returns line 2 (but shows as line 1 in fseek)
1 returns line 2
2 returns line 3
....

Hoopy frood
Joined: Oct 2003
Posts: 3,918
Make sure you copy the following lines exactly:
/fopen a test.txt
/fseek a 0
//echo -a $fread(a)
The above returns line 1, not line 2.
- argv[0] on EFnet #mIRC - "Life is a pointer to an integer without a cast"

Hoopy frood
Joined: Dec 2002
Posts: 5,524
Thanks, I was able to reproduce the /fseek -l issue; it should be fixed for the next version.

OP
Fjord artisan
Joined: Apr 2003
Posts: 342
> /f* commands do not just operate on binvars. For binvars, regex may not make [as much] sense, but for plaintext operations I see no problem.

Binvars maintain \n and \r characters. Binvars maintain whitespace characters. Binvars don't have a length limit of 4KB. You can deal with files that are a mix of text and encoded binary, and occasionally non-encoded binary.
Beware of MeStinkBAD! He knows more than he actually does!