Another follow-up post, this one on the number of books sold. I’m putting this up as it does bring some clarity and actual numbers to the game…

One thing the PRH/SS merger trial revealed is that publishing has a lot of problems. This is very true! At the same time, many of the problems seem to have mutated into unbelievable chimeras as they made their way around the discourse. Today, for example, much of the literary internet was debating a claim that 50% of books published sell fewer than 12 books.

Full article HERE from Lincoln Michel’s Countercraft substack h/t Monalisa Foster

This one has gone back and forth for a few days now, with a number of interesting ‘articles/comments/commentary’…

But I do want to post one of the comments on this thread in full, since Kristen McLean from NPD BookScan puts a few REAL numbers on what she sees…

Hey y’all, it’s Kristen McLean, lead industry analyst from NPD BookScan. I thought I would chime in with some numbers here, since that statistic from the DOJ is super-misleading, and I’m not sure where it originally came from, since we did not provide it directly.

It is possible it came from our data, and was provided by one of the publisher parties, but based on the 58,000 figure, it’s not obvious what exactly it includes in terms of “publisher frontlist”. 58,000 titles is way too small a number for “all frontlist books published in a year by every publisher”–that’s more like 487,000 frontlist titles–so it’s clear it’s a slice but I’m not sure HOW it was sliced.

NPD BookScan (BookScan is owned by The NPD Group, not Nielsen, BTW), collects data on print book sales from 16,000 retail locations, including Amazon print book sales. Included in those numbers are any print book sales from self-publishing platforms where the author has opted for extended distribution and a print book was sold by Amazon or another retailer. So that 487K “new book” figure is all frontlist books in our data showing at least 1 unit sale over the last 52 weeks coming from publishers of all sizes, including individuals.

Lots of press outlets have been calling about it today, so I did a little digging to see if I could reverse-engineer the citation, and am happy to share our numbers here for clarity.

Because this is clearly a slice, and most likely provided by one of the parties to the suit, I decided to limit my data to the frontlist sales for the top 10 publishers by unit volume in the U.S. Trade market. My ISBN list is a little smaller than the one quoted in the DOJ, but the principles will be the same.

The data below includes frontlist titles from Penguin Random House, Simon & Schuster, Hachette Book Group, HarperCollins, Scholastic, Disney, Macmillan, Abrams, Sourcebooks, and John Wiley. The figures below only include books published by these publishers themselves, not publishers they distribute.

Here is what I found. Collectively, 45,571 unique ISBNs appear for these publishers in our frontlist sales data for the last 52 weeks (thru week ending 8-24-2022).

In this dataset:

>>>0.4% or 163 books sold 100,000 copies or more

>>>0.7% or 320 books sold between 50,000-99,999 copies

>>>2.2% or 1,015 books sold between 20,000-49,999 copies

>>>3.4% or 1,572 books sold between 10,000-19,999 copies

>>>5.5% or 2,518 books sold between 5,000-9,999 copies

>>>21.6% or 9,863 books sold between 1,000-4,999 copies

>>>51.4% or 23,419 books sold between 12-999 copies

>>>14.7% or 6,701 books sold under 12 copies

So, only about 15% of all of those publisher-produced frontlist books sold less than 12 copies. That’s not nothing, but nowhere as janky as what has been reported.

BUT, I think the real story is that roughly 66% of those books from the top 10 publishers sold less than 1,000 copies over 52 weeks. (Those last two points combined)

And less than 2% sold more than 50,000 copies. (The top two points)

Now data is a funny thing. It can be sliced and diced to create different types of views. For instance we could run the same analysis on ALL of those 487K new books published in the last 52 weeks, which includes many small press and independently published titles, and we would find that about 98% of them sold less than 5,000 copies in the “trade bookstore market” that NPD BookScan covers. (I know this IS a true statistic because that data was produced by us for The New York Times.)

But that data does not include direct sales from publishers. It does not include sales by authors at events, or through their websites. It does not include eBook sales which we track in a separate tool, and it doesn’t include any of the amazing reading going on through platforms like Substack, Wattpad, Webtoons, Kindle Direct, or library lending platforms like OverDrive or Hoopla.

BUT, it does represent the general reality of the ECONOMICS of the publishing market. In general, most of the revenue that keeps publishers in business comes from the very narrow band of publishing successes in the top 8-10% of new books, along with the 70% of overall sales that come from BACKLIST books in the current market. (Backlist books have gained about 4% in share from frontlist books since the pandemic began, but that is a whole other story.)

The long and short of it is publishing is very much a gambler’s game, and I think that has been clear from the testimony in the DOJ case. It is true that most people in publishing up to and including the CEOs cannot tell you for sure what books are going to make their year. The big advantage that publisher consolidation has brought to the top of the market is deeper pockets and more resources to roll those dice. More money to get a hot project. More money to influence outcomes through marketing, more access to sales and distribution mechanisms, and easier access to the gatekeepers who decide what books make it onto retailers’ shelves. And better ability to distribute risk across a bigger list of gambles.

It is largely a numbers game and I’m not just saying that because I’m a numbers gal. It’s a tough business.

Hope this is helpful.

I believe her numbers, and her analysis is dead on the money. And she’s right, the numbers can be sliced/diced/massaged to get pretty much anything you want out of them. When you add in that most authors are never allowed to SEE the books and see what their actual sales/sell through is, it makes sense.
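For anyone who wants to check the arithmetic in her breakdown, here is a quick Python sketch. The tier counts are copied straight from the comment above; the tier labels, the variable names, and the `shares` helper are my own shorthand and not anything from BookScan. It just adds up the eight tiers and recomputes each one's share of the total.

```python
# Tier counts copied from Kristen McLean's comment (top 10 publishers,
# frontlist, 52 weeks through week ending 8-24-2022).
# The labels and the helper below are illustrative assumptions, not BookScan's.
TIERS = [
    ("100,000+", 163),
    ("50,000-99,999", 320),
    ("20,000-49,999", 1015),
    ("10,000-19,999", 1572),
    ("5,000-9,999", 2518),
    ("1,000-4,999", 9863),
    ("12-999", 23419),
    ("under 12", 6701),
]


def shares(tiers):
    """Return the total ISBN count and each tier's percentage of that total."""
    total = sum(count for _, count in tiers)
    rows = [(label, count, 100 * count / total) for label, count in tiers]
    return total, rows


total, rows = shares(TIERS)
print(f"Total ISBNs: {total:,}")
for label, count, pct in rows:
    print(f"{label:>15}: {count:>6,} ({pct:.1f}%)")

# The two combined figures she calls out.
under_1000 = sum(pct for label, _, pct in rows if label in ("12-999", "under 12"))
over_50000 = sum(pct for label, _, pct in rows if label in ("100,000+", "50,000-99,999"))
print(f"Sold under 1,000 copies: {under_1000:.1f}%")
print(f"Sold 50,000 or more copies: {over_50000:.1f}%")
```

The tiers add up to the 45,571 ISBNs she quotes, and the two bottom tiers together come to roughly 66%, which matches her "real story" figure.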

If any of this makes sense…


Sigh… — 6 Comments

  1. The bookkeeping in the publishing business reminds me VERY much of the accounting in the film business, i.e. if you don’t have the money/clout to force your own accountant into the mix you ain’t gunna get real figures.

    • Truly, my thoughts exactly. Never take a cut of the Net, always take a cut of the Gross.

  2. Carlton/Beans- True. That is why I’ve stayed Indie. That way I get to see the actual numbers (at least according to Amazon)…

  3. Interesting that they include Scholastic (kids stuff) and Wiley (academic and specialty). Given the price of some academic books, I can believe the “very low number sold.” And Wiley’s not the worst. *coughElsevierandSpringercough*

  4. The data is also limited to just the last 52 weeks – so some of the books included could have been best sellers years earlier, and are now showing up in the sub 1,000 copies per year category. I’ve been doing the e-reading thing for just over a decade now, where I only buy printed books when they don’t exist on Kindle, or are highly technical – and since mom died last year, I’ve reread several series including Robert B. Parker’s Spenser novels, and John Sandford’s Lucas Davenport and Virgil Flowers series, as well as these Grey Man novels that are a lot of fun to reread. For Parker and Davenport that resulted in me buying earlier titles again in e-format; but I’d be surprised if their earlier titles are still selling as well as they did at the time of publication…