Knoeki's reply basically summarizes my point.

Yes, a well-designed multithreaded app will be more efficient than a single-threaded one. That's not the question. The question is whether that efficiency is necessary, and most of the time it's not. If you want to show that it is, benchmark it; theory doesn't really help.

Your example has a few problems. First, this is not how you implement caching. If a cache were "out of date" (unlikely, since you would update it on write and fill it on read), you still wouldn't reprocess the full "file"; you would just invalidate the cache and rebuild it on the next read. We agreed that we'd be talking about "well designed" systems here, so you would never process the full "file" while answering a request. Again, there's a good chance you're even using an indexed database here. In fact, 10,000 records are handled extremely easily by MySQL or SQLite; so easily that a dataset of this size is really considered "small". With proper indexes, these databases can answer lookups over millions of records in a few milliseconds.
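To make the "invalidate, then cache on the next read" idea concrete, here's a rough sketch in Python (not mIRC script, just for illustration). The ScoreStore class, its method names and the rebuilds counter are all made up for this example; the only point is that a write marks the cache stale and the next read rebuilds it once.

```python
class ScoreStore:
    """Hypothetical score store with a lazily rebuilt top-N cache."""

    def __init__(self, top_n=10):
        self.scores = {}         # nick -> points
        self.top_n = top_n
        self._top_cache = None   # None means "stale, rebuild on next read"
        self.rebuilds = 0        # counts rebuilds, for demonstration only

    def add_points(self, nick, points):
        # Write path: update the data and invalidate the cache.
        # No ranking work happens here.
        self.scores[nick] = self.scores.get(nick, 0) + points
        self._top_cache = None

    def top(self):
        # Read path: rebuild only if a write happened since the last read.
        if self._top_cache is None:
            ranked = sorted(self.scores.items(), key=lambda kv: (-kv[1], kv[0]))
            self._top_cache = ranked[:self.top_n]
            self.rebuilds += 1
        return self._top_cache
```

However many people ask for the list between two writes, the ranking is computed once; every other read just returns the cached result.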

More importantly, there are algorithms to keep track of the "top 10" users in real time with O(1) work per update, since the list never holds more than 10 entries. If we were specifically talking about your scenario of accessing the top 10, this would be as simple as updating the list of 10 users whenever a round ended (offloading the "heavy" data processing to after the round, so it never delays answering a request). It would be nothing more than /msg %nick %trivia.top10, and updating %trivia.top10 with 10 nicknames after the round.
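For what it's worth, here's a minimal sketch of that kind of bounded update, again in Python rather than mIRC script. The update_top function and TOP_SIZE are hypothetical names, and it assumes scores only ever go up (true for a typical trivia game); if scores could drop, someone outside the list might need to move back in and you'd need more bookkeeping.

```python
TOP_SIZE = 10  # assumed list size, matching the "top 10" in the discussion


def update_top(top, nick, score, size=TOP_SIZE):
    """Return the new top list after nick's score changed to score.

    top is a list of (nick, score) pairs sorted by score, highest first.
    Assumes scores never decrease. At most size + 1 entries are sorted,
    so the work per update does not grow with the total number of users.
    """
    # Drop the player's old entry, if they were already on the list.
    entries = [(n, s) for (n, s) in top if n != nick]
    entries.append((nick, score))
    # Highest score first; ties broken by nickname to keep the order stable.
    entries.sort(key=lambda e: (-e[1], e[0]))
    return entries[:size]
```

Call it once per scoring event (or once per round winner), then answering a request is just printing the list that's already there.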

But even if this were an arbitrary query over a dataset (which you could still optimize in the same manner as above), and even if you decided to design your system badly and reparse the index *during* a round (which you wouldn't do), we're still talking about an unlikely situation. It's easier to avoid it than to take on the multithreading complexity of handling all that locking. Writing multithreaded code is really hard; it adds a lot of weird edge cases and a whole set of new problems. For instance, let's say you had that multithreaded code and 2 people asked for the top 10 list at once. You'd have to make sure you put a mutex around your file reprocessing, otherwise you could end up corrupting your cache. Do you really need to add all that complexity instead of just updating your top 10 cache between rounds?
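Just to show what that extra machinery looks like, here's a small Python sketch of the mutex version. LockedTopCache and the load_scores callable are invented for illustration (load_scores stands in for whatever "reprocess the file" would be); threading.Lock is the standard library mutex.

```python
import threading


class LockedTopCache:
    """Hypothetical top-10 cache that is safe to read from several threads."""

    def __init__(self, load_scores):
        self._load_scores = load_scores  # assumed callable returning {nick: points}
        self._lock = threading.Lock()
        self._cache = None
        self.rebuilds = 0                # counts rebuilds, for demonstration only

    def invalidate(self):
        with self._lock:
            self._cache = None

    def top10(self):
        # Without the lock, two simultaneous requests could both see a stale
        # cache and both start rebuilding it at the same time.
        with self._lock:
            if self._cache is None:
                scores = self._load_scores()
                ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
                self._cache = ranked[:10]
                self.rebuilds += 1
            # Hand out a copy so callers cannot modify the shared list.
            return list(self._cache)
```

And that's the easy case: one lock, one cache. Every other piece of shared state needs the same care, which is exactly the complexity you skip by updating the list between rounds.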

Like I said, multithreaded code is not useless, but it should be judged on a case-by-case basis and should really be the last resort. And at the level of a scripting language, you rarely need the last resort.


- argv[0] on EFnet #mIRC
- "Life is a pointer to an integer without a cast"