Just an observation: the index wouldn't need to hold every line position for this to be effective. A couple dozen pointers per file (25, say) would be enough to divide the file into chunks that each cover 4% of the line count. From there, the file can be partitioned much like buckets partition a hashtable: fseek jumps straight to the start of the right chunk, and the search only has to scan the 400 lines in that chunk instead of the 10000 lines of the entire file.
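Here is a rough sketch of that idea in C, assuming the lookup is "go to line N" and the file is opened in binary mode so byte offsets are stable. The names `SparseIndex`, `sparse_index_build`, `sparse_index_seek_line` and the `MAX_CHUNKS` cap are all made up for illustration, not taken from any existing code.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_CHUNKS 64 /* hypothetical cap; "a couple dozen" fits easily */

/* Sparse line index: byte offset of the first line of each chunk. */
typedef struct {
    long offsets[MAX_CHUNKS];
    size_t chunk_count;
    size_t lines_per_chunk;
} SparseIndex;

/* One pass over the file, recording the offset of every
 * lines_per_chunk-th line. Returns 0 on success, -1 on error. */
int sparse_index_build(FILE *fp, size_t lines_per_chunk, SparseIndex *idx)
{
    assert(lines_per_chunk > 0);
    idx->chunk_count = 0;
    idx->lines_per_chunk = lines_per_chunk;
    if (fseek(fp, 0L, SEEK_SET) != 0)
        return -1;

    size_t line = 0;       /* 0-based number of the current line */
    long pos = 0;          /* byte offset of the next character */
    int at_line_start = 1;
    int c;
    while ((c = getc(fp)) != EOF) {
        if (at_line_start) {
            if (line % lines_per_chunk == 0) {
                if (idx->chunk_count == MAX_CHUNKS)
                    return -1; /* too many chunks for this sketch */
                idx->offsets[idx->chunk_count++] = pos;
            }
            at_line_start = 0;
        }
        pos++;
        if (c == '\n') {
            line++;
            at_line_start = 1;
        }
    }
    return ferror(fp) ? -1 : 0;
}

/* Position fp at the start of 0-based line `target`.
 * Returns 0 on success, -1 if the line does not exist or on I/O error. */
int sparse_index_seek_line(FILE *fp, const SparseIndex *idx, size_t target)
{
    size_t chunk = target / idx->lines_per_chunk;
    if (chunk >= idx->chunk_count)
        return -1;
    /* Jump straight to the chunk that contains the target line. */
    if (fseek(fp, idx->offsets[chunk], SEEK_SET) != 0)
        return -1;

    /* Scan forward inside the chunk: fewer than lines_per_chunk lines. */
    size_t to_skip = target % idx->lines_per_chunk;
    int c;
    while (to_skip > 0 && (c = getc(fp)) != EOF) {
        if (c == '\n')
            to_skip--;
    }
    if (to_skip > 0)
        return -1;

    /* Make sure a line really starts here rather than end of file. */
    c = getc(fp);
    if (c == EOF)
        return -1;
    ungetc(c, fp);
    return 0;
}
```

With a 10000-line file and `lines_per_chunk` set to 400, the index holds 25 offsets, and a lookup reads through at most 399 lines after the fseek. The chunk size is a trade-off: more pointers mean shorter scans but a bigger index.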