You can read the value of the Content-Length header and work out where the response ends from that.
For instance, you could sockmark the connection with the size of the headers plus the value of the Content-Length header; together they give the total number of bytes you will receive. Once ($sock($sockname).mark == $sock($sockname).rcvd) is true, you know you've reached the end of the data.
For each subsequent response on the same socket, add the new Content-Length and header size to the existing sockmark value, since .rcvd keeps counting for the whole connection. You can easily track the header size by adding $sockbr after each read during the headers phase. Then do the same comparison against .rcvd again.
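To make the byte accounting concrete, here is a rough sketch of the same bookkeeping. It's written in Python rather than mIRC script purely so the logic can be run and checked anywhere; `ResponseTracker`, `expected` and `received` are names I made up, standing in for the sockmark and .rcvd. It assumes the server always sends a Content-Length header (no chunked transfer encoding) and that no bytes of the next response arrive before you've asked for it.

```python
class ResponseTracker:
    """Hypothetical helper: compares a running 'expected' total with bytes received."""

    def __init__(self):
        self.expected = 0        # plays the role of the sockmark
        self.received = 0        # plays the role of $sock().rcvd (cumulative)
        self._buf = b""          # header bytes collected so far
        self._in_headers = True

    def feed(self, chunk: bytes) -> bool:
        """Feed raw bytes from the socket; return True when the response is complete."""
        self.received += len(chunk)
        if self._in_headers:
            self._buf += chunk
            end = self._buf.find(b"\r\n\r\n")
            if end == -1:
                return False                 # headers not finished yet
            header_bytes = end + 4           # include the blank line
            length = 0                       # assumption: header is present
            for line in self._buf[:end].split(b"\r\n")[1:]:
                name, _, value = line.partition(b":")
                if name.strip().lower() == b"content-length":
                    length = int(value.strip())
            # Increment rather than overwrite, so later responses on the
            # same connection line up with the cumulative received count.
            self.expected += header_bytes + length
            self._buf = b""
            self._in_headers = False
        done = self.received == self.expected
        if done:
            self._in_headers = True          # ready for the next response
        return done
```

You'd call `feed()` with whatever each read hands you; the first call that returns True marks the end of that response, and the same tracker can then be reused for the next request on the open connection.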
Personally, I'd probably go for the on sockclose event, but if you absolutely want to keep it open...
Looking for </html> could be a workable solution; however, if you're reading a text file, an XML document, or other data, it will be of no use. Although you did mention that it's to read a webpage, I do think it's an important point to raise for the general case.
Even if we are dealing with HTML here, it's not certain that you will actually receive the </html> in one piece; depending on how the site formats its output, the tag can end up split across two reads. Heck, some sites don't even have a closing html tag: if I look at the source of my Gmail inbox, there's no </html>, only an opening <html> at the beginning.
However, if you know the formatting is decent, and it looks like you'll be able to catch </html> just fine, then this looks like the best way when keeping the socket open.
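On the "in one piece" problem: if you do go with the </html> approach, one way around a split tag is to keep the last few bytes of the previous read and search across the join. Below is a minimal sketch of that idea, again in Python just to show the logic rather than as mIRC script; `MarkerScanner` is a made-up name, and it obviously can't help with pages that never send the closing tag at all.

```python
class MarkerScanner:
    """Hypothetical helper: finds an end marker even when it spans two reads."""

    def __init__(self, marker: bytes = b"</html>"):
        self.marker = marker.lower()
        self._tail = b""    # last len(marker) - 1 bytes of what came before

    def feed(self, chunk: bytes) -> bool:
        """Return True if the marker ends somewhere in this chunk."""
        # Join the saved tail to the new data so a tag split across
        # the boundary becomes visible; lowercase to match </HTML> too.
        window = (self._tail + chunk).lower()
        found = self.marker in window
        keep = len(self.marker) - 1
        self._tail = window[-keep:] if keep else b""
        return found
```

A plain per-read search would miss a tag that arrives as `</ht` followed by `ml>`, whereas the scanner reports it on the second read.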