I'm not sure how many ebooks there are, but for now at least, there's more than one story.
It has been widely reported, since late March, that the iBooks Store would open, and did open, with a total catalog of about 60,000 books, of which about half (30,000) are free listings from Project Gutenberg.
This has been the story from Apple's friends at Gizmodo, reported here:
The official Apple way to get ebooks for the iPad, the iBooks store has just 60,000 books in it for now, all in the ePub format.
That 60,000 number has been reported elsewhere by ZDnet, the Christian Science Monitor, USA Today, and the New York Times' David Pogue, to name a few. And of course, it's important. For some of us, the potential of the iBooks Store and other iPad reading apps like the Kindle, Stanza, AudioBooks, and Kobo Books has been a major influence in our decision to buy an iPad.
Then along comes an O'Reilly Radar blog post Thursday by Ben Lorica, a Senior Researcher in the Market Research Group at O'Reilly Media, Inc., to take away any confidence that I had in that 60,000 figure. Lorica provides a professional presentation with tables to break down the catalog in the iBooks Store by category or genre, publisher, and mean ebook price within top categories. It's an interesting and graphically impressive chart. In addition to being posted on O'Reilly's highly respected research website, it has been picked up by reputable sites like Teleread and discussed right here at iPad Nation Daily.
After its initial statements about the number of books downloaded from the iBooks store during the first weekend in April, Apple has been uncharacteristically quiet lately about relevant catalog or sales numbers in its iBooks store. Amazon is even quieter about such numbers with regard to the Kindle Store, but the overall transparency of the Kindle Store for search, sort, and browse makes it easy to mine and quantify Amazon's data, whereas the iBooks Store gives up very little metadata about itself. So it's great to have a professional come in and mine the data, and Lorica's web page makes it clear that he is no slouch when it comes to having the chops as a miner, stat guy, and market analyst:
I am currently I have applied Business Intelligence, Data Mining and Statistical Analysis in a variety of domains including Financial Engineering, Direct Marketing, Consumer and Market Research, Targeted Advertising, and Text Mining. My background includes stints with an investment management company, internet startups, and financial services. At O'Reilly, I work on custom research and consulting projects, open source data warehousing and analytics. I remain interested in Quantitative Finance and my musings can be found on my blog The Practical Quant. An ex-academic, I was an Assistant Professor at U.C. Davis and was the founding Department Chair for Statistics and Mathematics at C.S.U. Monterey Bay. I have been a visiting member of the Mathematical Sciences Research Institute in Berkeley and have taught at U.C. Santa Barbara and the University of the Philippines. I enjoy writing and have written and published on topics ranging from Applied Mathematics and Statistics, Finance, Marketing Research, and Technology.
Only one problem: when Lorica adds it all up, he refers to the total of "books available through the iBooks app" as "over 46,000 (paid and free)."
Excuse me? I'm no data miner, but I have some experience with communication. Ordinarily when someone says "over 46,000" it is shorthand for 46,326 or 46,785, or something along those lines. It is not a phrase that anyone would intentionally use if they meant "about 60,000." Nowhere in the presentation does Lorica use the word "estimate" or "about," or say words to the effect that "these are rough figures." When Teleread linked to the post, Teleread editor Paul K. Biba read the 46,000 figure as definitively as I did -- "O’Reilly Radar has some data on the 46,000 (paid and free) bnooks [sic] available through the iBooks app" -- and there are no comments from Lorica or O'Reilly protesting the specificity.
Are such numbers important? Of course they will all be forgotten in time if the iPad and, conceivably, the iBooks Store continue their dizzying ascent to nerd nirvana. But there's a big difference between 46,000 and 60,000, and someone ought to have egg on his or its face here, at the very least. When New York Times tech columnist David Pogue caught Barnes & Noble trying to downsize the weight of its Nook ebook reader by less than an ounce in January, his widely discussed rant entitled Bogus Tech Measurements added to the overall bad taste that rendered the initial launch of the Nook a disaster of Edselian proportions.
Like many other iPad Nation citizens, I care about my iPad and I care about such details. I've been trying to get either Lorica and O'Reilly, or Apple, to clean it up. I typed the following email on my Mac and sent it to four individuals in Apple's press office about 15 hours ago, when there were still three or four hours left in what I assume is the workday in Cupertino:
Subject: Question about size of iBooks catalog after O'Reilly Research report that it's only about 46,000
By way of introduction, I have a relatively new but popular blog called iPad Nation Daily, a sister blog called Kindle Nation Daily, and my book on the Kindle was the #1 seller, period, in the Kindle Store way back in 2008. I'm also a happy iPad owner who has bought iPod Touches for my son, daughter, and girlfriend.
Although we've all been reporting that the iBooks store launched with about 60,000 titles, a pretty reliable researcher and data guy named Ben Lorica had a post yesterday on O'Reilly Radar in which he claimed that the total number of titles in the iBooks store is only 46,000, with about 32% of them free.
Could you verify that the figure at launch was about 60,000 or otherwise correct the record on this? I don't want to play gotcha and I would be happy to learn that Lorica was in error or that there was some other basis for the discrepancy.
I look forward to hearing from you or a colleague as soon as possible, and you can feel free either to email me here or call my cell at 339-368-xxxx.
I also sent Ben a couple of comments:
Terrific work, Ben. I'm surprised that the aggregate figure is 46,000, since we were all hearing 60,000 as of the 4/3 release date and would have expected some growth since then. Is there any chance that the 46,000 figure covers titles for which there has been an actual sale, or am I barking up the wrong apple tree? I'm naturally resistant to the notion that Apple would have padded the numbers!
Ben, sorry to be a nudge about this, but I'm still confused about the 46,000 figure. Do you know of anything in your approach that could have led to an undercounting of the total size of the catalog? Reporting of 60,000+ books has been consistent across all other media, so if it is only 46,000 you have a real story.
It took him 26 hours to respond, but he did finally email and post the following:
Stephen Windwalker,
What I said was "... half of the over 46,000 ...".
Our numbers are estimates at best, thus my preference for showing percentages, instead of absolute numbers.
So, let's see, Ben. The original post said nothing about estimates and included no disclaimer to offset its presentation of numbers as definitive. Now, in the 12th comment on the post, you disclose that your original numbers are not only "estimates," but "estimates at best?" During your stints as an Assistant Professor at U.C. Davis or as the founding Department Chair for Statistics and Mathematics at C.S.U. Monterey Bay, or in your publications on topics ranging from Applied Mathematics and Statistics, Finance, Marketing Research, and Technology, did you ever advise students that they could throw numbers that were "estimates at best" around to create a definitive-looking public presentation without any disclaimer but, not to worry, they could always come back in an offhand comment a day or two later and explain that they were "estimates at best?"
I don't mean to harsh anyone's mellow out there in California. Maybe the total catalog is 60,000, maybe it's 46,000, and maybe it's somewhere in between. I'm a huge fan of O'Reilly Media and (until now) all it does, of Apple's products, and of the truth. I love the fact that Ben Lorica did the study, and I would like to be able to rely on presentations like this from O'Reilly.
But Lorica and O'Reilly should either clean the original post up and state the necessary disclaimers prominently from the top, or take it down with an apology that is not burdened by any aggression, active or passive, toward the messenger. A statement that they understand the importance of relative precision in such matters would also be appropriate.
If the iBooks catalog figure is indeed north of 60,000, as I hope and still expect it to be, Apple doesn't have to do a thing other than get on the case of growing it from such puny levels.
On the other hand, if the real figure is only 46,000 or thereabouts, Mr. Jobs, you got some splainin' to do....
Update, 3:30 pm ET 5.1.2010: Ben Lorica responded significantly with the following comment, and changed the wording of his original post accordingly:
I made the adjustments to reflect that the mix of titles by category and publisher are estimates based on the titles "... we detected as being offered through the iBooks app...".
This seems like a fair representation, and of course it puts the onus squarely on Apple to provide two pieces of information:
- First, what is the actual number of titles at this point in the iBooks Store?
- Second, if it is significantly higher than 46,000, why are some titles undetectable?