Wednesday June 4, 2014

Machines For Making Books

I’ve long been interested in automating the creation of books in multiple formats. I like to dream about a kind of black box, where you put your plain text book in one end, and out the other end comes a web version, a PDF, a Kindle or ePub version, or even a physical paperback edition — all from a single source text or data store.

A great example is Tom Armitage’s code for making books out of his Pinboard links. Now that he’s set it up, he can generate decent printed books out of his reading material with a minimum of effort.

I’d like to finish a similar system of my own, for use with normal prose work. I wouldn’t strictly need to build my own to get that kind of functionality; I could just use Leanpub, currently the best¹ Machine For Making Books that I know of. Markdown files go in; ebooks, web books, and print-ready PDFs come out, no code or scripting necessary. But Leanpub’s choice of fonts and page design, although certainly adequate, feels pretty generic. If you’re a true publishing Jedi nerd, at some point you’re going to want your work to be more expressive: you’re going to want to build your own lightsaber.

Pollen

I saw yesterday that the eminent Matthew Butterick has released his own toolkit for digital bookmaking called Pollen, a programming language that generates text documents instead of software. This could be a useful approach, if you fall into the very narrow edge case for which Butterick designed it. Pollen is for books that are (a) published directly on the web, (b) built without WordPress’s elephantine structure, and (c) made of content that can’t be expressed in Markdown. That last requirement is pretty rare for most authors. If you want (a) and (b) but don’t really need (c), then another tool like Jekyll or Kirby would probably work just as well (and ultimately be just as complicated) for creating web-only documents.

Pollen looks like it could be a great tool if you need the capabilities afforded by (or just like the idea of) writing your book as you would a program, with code, formatting and text content all bound together in a single system. And although it seems geared mainly toward creating HTML books that live on the web, it could prove to be the kind of one-to-many tool I’m looking for. Theoretically, for example, you could create Pollen templates that would output the book in other formats like LaTeX or ePub as well as in HTML.

Other Approaches

I have several half-finished attempts to build my own book-making machine. One approach that worked out very well was to use Pandoc as a kind of book compiler.² Pandoc converts between most document formats you can think of (HTML, Markdown, LaTeX, Word, ePub and so on) and can write PDF as well. Unlike Pollen, it doesn’t impose the need to learn a complicated set of templating rules or a new programming language. You can just create your templates in their respective formats, or even use Pandoc’s default templates. Once you have those templates, a bash script or a Makefile would then suffice to turn your Markdown text files into LaTeX, HTML and ePub files. I like this approach because if you take away the scaffolding (Pandoc) you still have a very usable source (plain text files). The same would not be true of Pollen.
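
As a rough illustration of the "bash script or makefile" idea, here is a minimal sketch in Python rather than bash. It is not the setup described above, only one way such a driver could look. The `chapters/` and `build/` directory names, the output file names, and the `pandoc_command` and `build` helpers are all assumptions made up for this example; the only Pandoc options used are `--standalone`, `--to`, `--output` and `--template`.

```python
from pathlib import Path
import subprocess

# Hypothetical output names, one per target format.
FORMATS = {
    "html": "book.html",
    "latex": "book.tex",
    "epub": "book.epub",
}


def pandoc_command(sources, fmt, output, template=None):
    """Build the pandoc argument list for a single output format."""
    cmd = ["pandoc", "--standalone", "--to", fmt, "--output", str(output)]
    if template is not None:
        # Omitting the template makes pandoc fall back to its default one.
        cmd.append(f"--template={template}")
    cmd.extend(str(s) for s in sources)
    return cmd


def build(chapter_dir="chapters", out_dir="build", templates=None,
          run=subprocess.run):
    """Convert every Markdown file in chapter_dir into each target format.

    Chapters are concatenated in filename order, so names such as
    01-intro.md and 02-pollen.md (hypothetical) fix the reading order.
    """
    templates = templates or {}
    sources = sorted(Path(chapter_dir).glob("*.md"))
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    commands = []
    for fmt, name in FORMATS.items():
        cmd = pandoc_command(sources, fmt, out / name, templates.get(fmt))
        run(cmd, check=True)  # raises if pandoc exits with an error
        commands.append(cmd)
    return commands
```

Calling `build()` from the book's folder would, assuming Pandoc is installed and on the path, leave three files in `build/`. Passing something like `templates={"latex": "mybook.latex"}` (a hypothetical file name) swaps in a custom template for one format while the others keep Pandoc's defaults. The Markdown sources are never modified, which is the point of the approach: remove the script and the plain text is still there.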

Another possible route is Scrivener, which can compile a book from Markdown source to LaTeX (from which you can then produce PDFs for print or digital reading) and to HTML. Scrivener doesn’t directly support converting Markdown sources into ePub files, but with some extra scaffolding you can make this work.

Finally, PrinceXML might be a good option for generating PDFs. It has the advantage of using HTML and CSS to drive the print design, which avoids the need to get messy with LaTeX in any way, shape, or form. This is what Tom Armitage (the Pinboard book guy above) uses in his system. I have avoided it so far because while it’s free for personal use, it’s quite pricey for commercial use.
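
To make the HTML-plus-CSS route concrete, here is a small sketch of driving Prince from a script. It is an assumption about how such a step might be wired up, not a description of Tom Armitage’s actual code. The helper name `prince_command` and the file names are hypothetical; the `-s` (stylesheet) and `-o` (output) options are Prince’s standard command-line flags.

```python
import subprocess


def prince_command(html_file, pdf_file, stylesheets=()):
    """Build the argument list that asks Prince to render HTML to PDF."""
    cmd = ["prince"]
    for css in stylesheets:
        # Each -s adds one CSS file that controls the print design.
        cmd += ["-s", str(css)]
    cmd += [str(html_file), "-o", str(pdf_file)]
    return cmd


def render_pdf(html_file, pdf_file, stylesheets=(), run=subprocess.run):
    """Run Prince and raise if it reports an error."""
    cmd = prince_command(html_file, pdf_file, stylesheets)
    run(cmd, check=True)
    return cmd
```

With Prince installed, `render_pdf("book.html", "book.pdf", ["print.css"])` would produce the PDF, with page size, margins and fonts all coming from the (hypothetical) `print.css`.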

For PDFs, especially print-ready PDFs³, I would hesitate to use a workflow that didn’t include LaTeX as the last step, because of its superior typesetting.

Covers

The ideal book machine would also include a mechanism for creating covers. Tom Armitage still just does his covers separately in Photoshop, without automating that part at all. But clearly it’s possible to do better. I love the work Karsten Schmidt did on the Faber Finds series, which generated a whole family of unique book covers algorithmically. I would like to create something like that, but using more of a Lapham-style design vocabulary (such as this lovely cover).
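
To give a sense of what "algorithmic" could mean at its very simplest, here is a toy sketch that derives an SVG cover deterministically from a book’s title and author. It is emphatically not Karsten Schmidt’s method and has nothing like a real design vocabulary; the function name `cover_svg`, the dimensions and the circle motif are all invented for illustration.

```python
import hashlib
import random
from xml.sax.saxutils import escape


def cover_svg(title, author, width=600, height=900, shapes=12):
    """Return an SVG cover whose artwork is seeded by the title and author."""
    # Hash the metadata so the same book always gets the same cover.
    digest = hashlib.sha256(f"{title}|{author}".encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    hue = rng.randrange(360)

    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">',
        f'<rect width="100%" height="100%" fill="hsl({hue}, 40%, 92%)"/>',
    ]
    # Scatter translucent circles over the top half of the cover.
    for _ in range(shapes):
        cx = rng.randrange(width)
        cy = rng.randrange(height // 2)
        r = rng.randrange(20, 120)
        shade = (hue + rng.randrange(-30, 31)) % 360
        parts.append(
            f'<circle cx="{cx}" cy="{cy}" r="{r}" '
            f'fill="hsl({shade}, 60%, 50%)" fill-opacity="0.35"/>'
        )
    # Title and author go in the lower half, escaped for XML safety.
    parts.append(
        f'<text x="{width // 2}" y="{int(height * 0.7)}" text-anchor="middle" '
        f'font-size="40">{escape(title)}</text>'
    )
    parts.append(
        f'<text x="{width // 2}" y="{int(height * 0.8)}" text-anchor="middle" '
        f'font-size="24">{escape(author)}</text>'
    )
    parts.append("</svg>")
    return "\n".join(parts)
```

Writing the returned string to a `.svg` file gives a cover that can be regenerated identically on every build, and every title in a series gets its own variation without anyone opening Photoshop.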

¹ Of course it’s all about trade-offs, right? As much as I like to tinker, I personally love it when I don’t have to memorize huge amounts of extra information just to get basic results. (Which is why I avoid any workflow that involves using git to publish prose.)
² This is what you see in the Vine video above. At one point I had a couple of very nice LaTeX templates set up, including one using the Tufte book & handout layouts which worked very well with Markdown, converting footnotes into sidenotes and so forth. Then I switched platforms and haven’t got around to reassembling it all on my MacBook.
³ You might be tempted to think the same PDF could be used for screen reading and printed books, but you really shouldn’t think that. Print-ready PDFs have their own oddities, such as different margins for odd and even pages, and ligatures (which can impair the text’s searchability), that make them unsuitable for screen-based reading.