How to create an eBook for free


Simon Brock reveals the open-source tools you’ll need to create an eBook for the Kindle or iPad.

Over the past year or so, I’ve been developing for handheld platforms such as the iPhone, iPad, Android and Kindle, so I’ve accumulated a great deal of knowledge about how these gizmos work and how to get the best out of them.

You might consider eBooks to be the duller end of handheld software, since “apps” get all the attention nowadays, while eBooks are, well, just books.

From my viewpoint, however, eBooks present two sets of interesting problems that need solving: how do I get content into the right format (and hence to the reader); and, once I have it in eBook form, what else can I do with it?

This last point is crucial, because many current eBooks are merely pale imitations of their paper versions – for example, a cookery book that I sampled by a well-known celebrity chef was actually devoid of pictures!

Images are the one thing you can add to an eBook for almost zero cost (with the exception of Amazon’s 3G distribution charges), compared with a paper book in which they enormously increase the printing and paper costs.

Most eBook readers can present far more than just text, and even ones with limited displays still offer various possibilities for interactivity (iBooks offer much, much more).

What is an eBook?

Regrettably, creating eBooks is currently nowhere near as easy as it should be. However, there are a few open-source tools that can help, and over the next year or so they’ll get far better.

An eBook is a book, formatted so that it can be read on some form of electronic reader device. The most widely accepted format is “ePub”, which is supported directly by most devices other than the Kindle – but even if you’re publishing for Kindle, you’ll normally first create an ePub file, then convert that into Kindle format.

At its simplest, an ePub file is nothing more than a zip file. If you take any ePub file and change its extension to ZIP, your favourite unzipping program will show what’s inside it.
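To see this without renaming anything, here is a minimal Python sketch that lists what an ePub contains. It relies only on the standard zipfile module; the function name and the file name in the usage comment are placeholders of mine, not part of any tool mentioned here.

```python
import zipfile


def list_epub_contents(epub_path):
    """Return the names of the files stored inside an ePub (zip) archive."""
    with zipfile.ZipFile(epub_path) as archive:
        return archive.namelist()


# Hypothetical usage, assuming a file called mybook.epub exists:
#     for name in list_epub_contents("mybook.epub"):
#         print(name)
```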

In most cases, you’ll discover a collection of HTML files, some media such as images, audio or video, and a bunch of other files that matter because they define the metadata for the book: its title and ISBN, its chapter structure and the list of files included in it. (This metadata must be present and correct for the book to be accepted by some publishers.)
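As a rough illustration of where that metadata lives, the following Python sketch reads the title, identifier and file list from an ePub. It assumes the usual layout, in which META-INF/container.xml points at a package (OPF) file; the function name and the returned dictionary keys are my own invention.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the container file, the package file and Dublin Core.
NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_epub_metadata(epub_path):
    """Return the title, identifier and manifest file list of an ePub."""
    with zipfile.ZipFile(epub_path) as archive:
        # META-INF/container.xml says where the package (OPF) file is.
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        opf_path = container.find("c:rootfiles/c:rootfile", NS).get("full-path")
        package = ET.fromstring(archive.read(opf_path))
    title = package.findtext("opf:metadata/dc:title", namespaces=NS)
    identifier = package.findtext("opf:metadata/dc:identifier", namespaces=NS)
    # The manifest lists every file that belongs to the book.
    files = [item.get("href") for item in package.findall("opf:manifest/opf:item", NS)]
    return {"title": title, "identifier": identifier, "files": files}
```

The sketch does no error handling: a file that lacks container.xml or a title will raise an exception or return None, which is enough for poking around but not for production use.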

Since an ePub book is essentially a collection of web pages, if you’re familiar with modern HTML and CSS techniques, you’re already halfway to being able to produce and improve on eBooks. However, don’t be fooled into thinking you can just take your website and convert it straight into an eBook – you need to understand how an eBook works once it’s installed on a reader.

Different eBook readers have different HTML rendering abilities: at one end of the spectrum are the Kindle and its converter program, which have limited support for current web standards, while at the other end, iBooks for iPad and iPhone employs the full WebKit rendering engine used by the Safari and Chrome web browsers. Even so, iBooks’ version of WebKit has its oddities, and to understand them you need to understand how an eBook reader displays content.

The trouble with basing eBooks on HTML is that their pages don’t behave like typical web pages. The latter are rendered in a window that can scroll up and down vertically without limit, while an eBook reader wants to render each page of a book as a static image. You can’t scroll down this – you can only flip to the next page. Typically, each chapter of an eBook is contained in a single HTML file that’s presented as a collection of pages, which, if opened in your web browser, would look like a single web page.

So the first thing you have to remember is that some program is going to take your content and chop it up into fixed-length pages. This introduces complexities. The pages may vary in size, as may the font they’re rendered in. For example, I can view an iBook on an iPad in portrait mode, one page at a time, or in landscape mode two pages at a time – which will require paginating the book in two totally different ways. Worse still, the reader may vary the font size and face, both of which again affect the pagination.
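A toy model makes the effect concrete. The Python sketch below is not how any real reader paginates; it simply wraps text into lines of a fixed character width and groups the lines into pages, with the two "font sizes" represented by made-up line widths and page heights.

```python
import textwrap


def paginate(text, chars_per_line, lines_per_page):
    """Split text into pages: wrap it into lines, then group lines into pages."""
    lines = textwrap.wrap(text, width=chars_per_line)
    return [lines[i:i + lines_per_page] for i in range(0, len(lines), lines_per_page)]


# The same "chapter" laid out at two hypothetical font sizes.
chapter = "word " * 100
small_font = paginate(chapter, chars_per_line=49, lines_per_page=10)
large_font = paginate(chapter, chars_per_line=24, lines_per_page=5)
print(len(small_font), "page(s) at the small font size")
print(len(large_font), "page(s) at the large font size")
```

The same content produces a different number of pages each time the line width or page height changes, which is why nothing in an eBook can depend on a particular page number or page boundary.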

Another major pagination problem arises when you have pictures in your text: when the reader is rendering HTML into pages, it must decide whether each image will fit on to the current page. Generally, the guidelines for creating eBooks tell you not to put fixed sizes on your images. When embedding an image in an HTML page for the web, you’d typically write something like: <img src="Images/MyPics.jpg" width="400" height="600">

This tells the browser to display the picture in a box 400 pixels wide by 600 pixels high. I’d supply the image in that size, and always make such image declarations as explicit as possible, so the browser knows how to present the page the way I want it. For an eBook, on the other hand, I’m encouraged to declare images like this: <img src="Images/MyPics.jpg">

I should also include an image file that’s as large as possible, rather than one tailored to the device’s screen.

The reason for this is that when the reader paginates my book, it will look at that image’s size, and how much space is left on the current page. The software has to decide how much to scale the image to fit into the available space, which means tricky arbitration between available space on the current page, the size of the page, the maximum size of the image, and how the image scales when its size is reduced.
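The core of that arithmetic can be sketched in a few lines of Python. This is only an illustration of aspect-preserving scaling under my own assumptions (the image is never enlarged, and sizes are rounded to whole pixels); real readers apply their own, undocumented rules.

```python
def fit_image(image_w, image_h, avail_w, avail_h):
    """Scale an image down (never up) to fit a box, keeping its aspect ratio."""
    # The tighter of the two dimensions decides the scale factor.
    scale = min(avail_w / image_w, avail_h / image_h, 1.0)
    return round(image_w * scale), round(image_h * scale)


# A 400x600 portrait image offered a 300x300 space at the bottom of a page.
print(fit_image(400, 600, 300, 300))
```

For a portrait image the available height is usually the limiting factor, so the picture ends up much narrower than the page, and the reader then has to decide whether to shrink it that far or push it to the next page.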

This can get quite complex, particularly for portrait format images that have a different aspect ratio to the page itself. At this point, some eBook readers get it wrong and split your image across two pages. This is a bug in the reader software, and you just have to work around it – either by inserting explicit page breaks into your HTML (using the page-break-before CSS property) or by fiddling with the image size. If your image has a caption, your life will become even more complicated, because you’ll want that image to stay with its caption.
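One way to apply this workaround consistently is to generate the markup. The Python sketch below wraps an image and its caption in a single block and optionally forces a page break before it. The helper name is hypothetical, and the use of page-break-inside: avoid to keep the caption with the image is my assumption rather than something every reader honours, so test the result on each target device.

```python
from html import escape


def figure_block(src, caption, break_before=False):
    """Return XHTML for an image and its caption wrapped in a single block."""
    # Assumption: asking the reader not to break inside the block keeps
    # the caption with the image; support for this varies between readers.
    style = "page-break-inside: avoid;"
    if break_before:
        # Start the block on a fresh page so the image has the most room.
        style += " page-break-before: always;"
    return (
        f'<div style="{style}">\n'
        f'  <img src="{escape(src, quote=True)}" alt="{escape(caption, quote=True)}"/>\n'
        f'  <p>{escape(caption)}</p>\n'
        f'</div>'
    )


print(figure_block("Images/MyPics.jpg", "Figure 1: my picture", break_before=True))
```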

This illustrates a big problem with eBook readers and their HTML rendering. If you think they should look the same as classic paperback novels – collections of chapters without any pictures – then everything is fine. However, HTML rendering dictates that all images be placed inline in the text, so that a paragraph ends before an image, and the next paragraph starts after it. HTML/CSS provides no way to break this order: if I want the text to continue unbroken, I can’t float an image that doesn’t fit to the top of the next page, or even put it on a page by itself.

Such facilities do exist in most word processors, although few of us actually use them. Turn back the clock to the 1980s, and the typesetting systems of that era, such as LaTeX, had the ability to say this with a statement something like:

\begin{figure}[t]
… figure contents …
\end{figure}

Here, [t] means: “Float this image to the top of a page.” That way, text fills up pages, and images stay close to their associated content.

Such rendering issues have other side-effects. Since iBooks is based on WebKit, you might expect to be able to use all the cool HTML5/CSS3 animations that are supported in Safari, but this mostly isn’t the case. For example, if you set an image as hidden, it will stay hidden even if you try to change its display property via JavaScript. Similarly, z-index doesn’t work. The reason is fairly obvious: iBooks needs to paginate the book, and can’t cope with the possibility that pagination will be changed afterwards.

Open-source tools

When creating an eBook, there are three types of tools you might need. First, many people will want a WYSIWYG eBook editor that inputs and outputs ePub files – such as Sigil, which I’ll look at below. Or, rather than editing natively, you might wish to create your eBook in a familiar editor such as Word, then convert it into ePub and other formats: the open-source tool for that is Calibre.

Finally, if you’re tweaking existing ePub format files directly, you’ll need to check that what you’ve done is correct, since it’s very easy to break the HTML and the special files. There are a couple of applications to help with that too.

Sigil is an open-source application that runs on most platforms, and lets you read and edit ePub files in an almost WYSIWYG fashion. You can set paragraph and character styles in the text, and there’s a split-window mode in which you can fiddle with the underlying HTML while seeing what the result looks like. I used Sigil when I first started to create eBooks, but eventually gave it up – it works very well if you only need to create a very simple eBook with text and a few images, but I needed to do more than that. On the other hand, it’s a great application for getting an eBook project off the ground.

At first sight, Calibre looks less useful than Sigil, but I find myself using it frequently. It’s really a conversion application for ePub files rather than an editor per se – it can read ANSI text, RTF, PDF and other file formats, which it will then convert into ePub, and also other formats such as Mobi for Kindle.
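Conversions like this can also be scripted. The sketch below assumes Calibre is installed and that its command-line converter, ebook-convert, is on the PATH; it takes an input file and an output file and picks the output format from the file extension. The Python function names and the file names in the usage comment are placeholders.

```python
import shutil
import subprocess


def build_convert_command(source, target):
    """Build the command line for Calibre's ebook-convert tool."""
    return ["ebook-convert", source, target]


def convert(source, target):
    """Run the conversion; the output format follows target's extension."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found; is Calibre installed?")
    subprocess.run(build_convert_command(source, target), check=True)


# Hypothetical usage, assuming mybook.epub exists:
#     convert("mybook.epub", "mybook.mobi")
```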

This really is a Swiss Army knife for ePub production, and it lets you perform complex tasks, including targeting particular types of reader devices. It can also read newsfeeds and create eBooks. A feature that I find particularly useful is its ability to tweak ePub files, then send them to a device: whenever you do this, Calibre expands the tweaked file into constituent folders so that you can edit the HTML directly. I’ve found this invaluable when inserting videos and animations into ePub files.

When you’ve made your changes, Calibre will rebuild the ePub file and you can send it off to a device – in my case an iPad, to which it transfers the file via iTunes fairly simply. There’s a great community around Calibre, and barely a week goes by without an update. If you’re producing eBooks, you should take a look at Calibre.

ePub checkers

As I mentioned earlier, an ePub file is a collection of files zipped up together, and for an eBook reader to understand those files they must be consistent, complete (they must all be listed), and structured properly. All eBook readers are terribly fussy about the HTML code they’ll render, which must be structured according to the strict rules of the standard. The main application used for checking ePub files is EpubCheck, which is a command-line Java application that checks the file for the properties mentioned, plus many others. (There’s also a separate checker called FlightCrew, which comes with a rudimentary GUI.) It’s very important that you use EpubCheck if you’re submitting a book to Apple – the corporation has embedded a version of EpubCheck into its submission service, so if a book can’t pass it, Apple won’t even see it, let alone accept it.
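To give a feel for what "consistent and complete" means, here is a small Python sketch that performs two of the simplest checks: that the archive starts with an uncompressed mimetype entry, and that the files in the archive match the files listed in the package manifest. It is in no way a substitute for EpubCheck, which tests far more; the function name and message wording are my own, and the sketch assumes META-INF/container.xml is present.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET
from urllib.parse import unquote

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def quick_check(epub_path):
    """Return a list of problems found; an empty list means these checks passed."""
    problems = []
    with zipfile.ZipFile(epub_path) as archive:
        entries = archive.infolist()
        # The first entry must be an uncompressed file named "mimetype".
        first = entries[0] if entries else None
        if first is None or first.filename != "mimetype":
            problems.append("mimetype is not the first entry")
        elif first.compress_type != zipfile.ZIP_STORED:
            problems.append("mimetype is compressed")
        elif archive.read("mimetype") != b"application/epub+zip":
            problems.append("mimetype has the wrong content")

        # Find the package (OPF) file and collect the files it lists.
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        opf_path = container.find("c:rootfiles/c:rootfile", CONTAINER_NS).get("full-path")
        package = ET.fromstring(archive.read(opf_path))
        base = posixpath.dirname(opf_path)
        listed = {
            posixpath.normpath(posixpath.join(base, unquote(item.get("href"))))
            for item in package.findall("opf:manifest/opf:item", OPF_NS)
        }
        # Files actually in the archive, ignoring the bookkeeping entries.
        present = {
            e.filename for e in entries
            if not e.filename.endswith("/")
            and e.filename != "mimetype"
            and not e.filename.startswith("META-INF/")
            and e.filename != opf_path
        }
    problems += [f"listed but missing: {name}" for name in sorted(listed - present)]
    problems += [f"present but not listed: {name}" for name in sorted(present - listed)]
    return problems
```

A quick pre-check like this is handy after hand-editing an unpacked ePub, since forgetting to list a newly added image or video in the manifest is one of the easiest mistakes to make.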

Having now produced books for both iBooks and Kindle, my overwhelming feeling is that the market for eBooks is really only just starting. Without doubt, many paper books will go into oblivion, in the way that celluloid film has. My personal metric to judge the adoption of a new technology is based on what I observe on public transport in London – and on my bus this morning, I saw one person reading a book on an iPad and someone else using a Kindle.

However, the real breakthrough won’t come until we learn to enable eBooks to exploit the fact that the device we read them on is more than just passive electronic paper. We’ve barely scratched the surface of applying audio, video and animation content, but much more is bound to follow as the market matures.
