All posts

There's No Free Lunch When it Comes to Performance (Or Anything Else for that Matter)

Originally posted on Tumblr, 6 October 2013.

Was just reading a new blog post by Brent Simmons about how he is reinvestigating Core Data for the Vesper app (link no longer available).

As someone who has used Core Data for years, and generally had good experiences, I was always a bit peeved when Brent would bag the framework because it was slower than raw SQL in a few corner cases. Brent is a big name in the development community, and frankly I think he has managed to turn quite a few developers off the framework, for no good reason.

Brent would probably say he never tried to dissuade anyone, and was just pointing out problems in exceptional cases, but I don’t know how many times I have heard developers ask: “Core Data, isn’t that slow? Didn’t Brent Simmons write a blog post about that?”

That he is reinvestigating the framework, and finding it is not as bad as he thought it would be, only rubs more salt into the wound. Even so, he has hedged his bets, and states in the post that if he finds Core Data incapable of importing 30,000 Vesper notes with performance comparable to SQL, he may still drop the framework. I find this a perplexing requirement, given the rest of the post, in which he describes how effective Core Data’s batch fetching is for the performance of browsing table views.

In my view, Brent has this all backwards. He is thinking up a mythical user with 30,000 Vesper notes, and wants to be sure that that 1/1000th of a percent of his customers doesn’t experience any lag. What he doesn’t seem to realize is that that choice has a big impact on the other 99.999% of his customers, who almost certainly will pay a performance penalty.

See, there’s no such thing as a free lunch in the performance game. It’s a mathematical fact. You can even read about it on Wikipedia. When you optimize a program for one scenario, you are making it less optimal for other scenarios.

Here’s a concrete example: Imagine Brent decides Core Data doesn’t cut the mustard when importing 30,000 notes. He drops the framework and returns to using SQL directly. He’s now hitting the database directly, which means reading from disk. That is very effective for a bulk import of 30,000 notes, but disk access is a lot slower than the in-memory access Core Data is optimized for.

Core Data limits trips to the database by caching data in memory, and by fetching data in batches. These techniques will not help with importing 30,000 notes, but they sure will help when you have 500 notes, like 95% of Brent’s customers.
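To make that concrete, here is a minimal sketch of batch fetching in modern Swift. It assumes an existing `NSManagedObjectContext` called `context`, and a hypothetical `Note` entity with a `creationDate` attribute standing in for Vesper’s note model; the names are illustrative, not Vesper’s actual schema.

```swift
import CoreData

// Sketch only: `context` is assumed to be a configured NSManagedObjectContext,
// and "Note" a hypothetical entity with a "creationDate" attribute.
let request = NSFetchRequest<NSManagedObject>(entityName: "Note")
request.sortDescriptors = [NSSortDescriptor(key: "creationDate", ascending: false)]

// Fault in rows 20 at a time instead of loading every note up front.
request.fetchBatchSize = 20

let notes = try context.fetch(request)
// `notes` behaves like a full array, but Core Data only materializes
// each batch of 20 objects as the table view actually scrolls to them.
```

One property, `fetchBatchSize`, is doing the work here: the typical user with a few hundred notes never pays for objects the UI hasn’t touched.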

So he is right to revisit Core Data, but he has his metrics all messed up. He should be optimizing for the 99% of users who have under 1,000 notes, not the 1% who have more. Make it a pleasure for standard use, and usable for non-standard use. If it takes a minute or two longer to import your 30,000 notes, fine. That’s an exceptional case.