Building an E-Book from HTML: Sample Code

Update 8/30/12: For those interested in simply writing an ePub file from scratch in a clean editor, check out the free project Sigil. While it creates a fantastic ePub file, the Table of Contents does not carry over when Amazon converts it. However, Sigil simply creates HTML, and this tutorial will give you some additional information on how to manually create a Kindle-ready version.

I recently wrote up a case study for the audience at NovelRank.com on converting blog posts into an ePub e-book, and the experience helped me immensely. It was a bit of a refresher course, as I had gone through the same process to create my second book, 50 Conversation Starters for the Modern Age. Once the book was built using the sample HTML code below, I used a fantastic and free program called Calibre, which is available for all operating systems, to convert the file to both ePub and Mobi (Kindle) formats. Finally, for editing the HTML, I used Notepad++, especially for its ability to do Find and Replace based on regular expressions.

Here is the sample HTML code:

<html>
<head>
<title>Title</title>
 
<style type="text/css">
body {font-family: Arial; letter-spacing: 0; line-height: 1.3em; font-size: 1em;}
img {display: block; margin: 5px auto; text-align:center;}
hr {page-break-after:always;}
h1 {line-height:1.6em;}
h1,h2,h3 {page-break-after:avoid; text-align:center;}
.large {font-size:200%; font-weight:bold;}
.medium {font-size:130%;}
.center {text-align:center;}
</style>
</head>
 
<body>
<br><br>
<img src='small-cover-image.jpg' alt='title' />
<p class='center large'>Title</p>
<br><br><br><br>
<p class='center medium'>Copyright 2011 &mdash; Author Name</p>
<hr>
 
<h1>Introduction</h1>
<p>Content</p>
<hr>
 
<br><h1>Category of Posts</h1>
<p>This section includes the following articles:</p>
<ul>
<li></li>
</ul>
<hr>
 
<h2>Title of Individual Post</h2>
<p>Content</p>
<hr>
 
<h2>Title of Next Individual Post</h2>
<p>Content</p>
<hr>
 
<br><br><br><br><p class='large center'>THE END</p>
</body>
</html>

One of the most important details in this sample code is the use of the horizontal rule (<hr>) as a page break. Note the following CSS code:

hr {page-break-after:always;}

This lets you control explicitly where each page break falls. While Calibre is capable of working with an external CSS file, there is a limit to the CSS that is supported by both the ePub and Mobi formats. For instance, the following CSS is supported by both formats, yet it does not center the image when viewed on a Nook:

img {display: block; margin: 5px auto; text-align:center;}

Since this sample explicitly defines page breaks only at horizontal rules, a few settings in Calibre need to be adjusted to match. Also, the use of heading tags (h1, h2, etc.) allows Calibre to build the Table of Contents automatically.
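To make the structure concrete, here is a minimal Python sketch that assembles the repeating part of such a file from a list of posts. The function name `build_book_body` and the sample data are hypothetical and not part of the original workflow; the point is only that every section ends with an `<hr>` (the page break), that categories use `h1`, and that individual posts use `h2`.

```python
from html import escape


def build_book_body(categories):
    """Return the repeating section markup for a book.

    `categories` is a list of (category_title, posts) pairs, where `posts`
    is a list of (post_title, post_html) pairs. These names are illustrative.
    """
    parts = []
    for category_title, posts in categories:
        # Category page: h1 heading plus a list of the posts it contains.
        parts.append(f"<h1>{escape(category_title)}</h1>")
        parts.append("<p>This section includes the following articles:</p>")
        parts.append("<ul>")
        for post_title, _ in posts:
            parts.append(f"<li>{escape(post_title)}</li>")
        parts.append("</ul>")
        parts.append("<hr>")  # page break after the category page
        for post_title, post_html in posts:
            # Each post: h2 heading, its content, then a page break.
            parts.append(f"<h2>{escape(post_title)}</h2>")
            parts.append(post_html)
            parts.append("<hr>")
    return "\n".join(parts)


sample = [
    ("Category of Posts", [
        ("Title of Individual Post", "<p>Content</p>"),
        ("Title of Next Individual Post", "<p>Content</p>"),
    ]),
]
body = build_book_body(sample)
print(body)
```

The output can be pasted between the introduction and the closing "THE END" paragraph of the sample above.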

Convert Books Settings in Calibre

Within Calibre, add the book by selecting the HTML file you have created. Make sure all images are saved in that same folder, as Calibre will build a .zip file of all of the contents. The next step is adjusting the Metadata and adding your cover image. That is all pretty straightforward, so let’s move on to the dialog called ‘Convert Books’. In the top-right of that window, you select the output format (ePub or Mobi). The settings below apply to both formats unless noted, and each section refers to one of the tabs on the left side of the window.
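A missing image is easy to overlook, so a quick check before importing can help. The following is a rough Python sketch, not a feature of Calibre, that collects the `src` of every `<img>` tag and reports any file that is not in the given folder. The names `ImageCollector` and `missing_images` are made up for this example, and the temporary folder only stands in for the folder holding your HTML file.

```python
import tempfile
from html.parser import HTMLParser
from pathlib import Path


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def missing_images(html_text, folder):
    """Return the image paths referenced in html_text but absent from folder."""
    collector = ImageCollector()
    collector.feed(html_text)
    folder = Path(folder)
    return [src for src in collector.sources if not (folder / src).is_file()]


# Demonstration with a throwaway folder containing only the cover image.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "small-cover-image.jpg").write_bytes(b"")
    sample = "<img src='small-cover-image.jpg' alt='title' /><img src='other.png'>"
    print(missing_images(sample, tmp))  # prints ['other.png']
```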

Page Setup
The Output Profile box can be left as ‘Default Output Profile’ for ePub files, and should be changed to ‘Kindle’ for Mobi files. Regarding the margins, I recommend reducing these to 2.0pt on all sides. Remember, readers can override this setting on their own device, so don’t stress about it too much.

Structure Detection
Since we are setting the page breaks manually, in the section called ‘Insert page breaks before (XPath expression):’, remove everything in the box. The default code in that box will put in page breaks before all h1 and h2 heading tags, and we don’t want that.

Table of Contents
We are adjusting both the ‘Level 1 TOC (XPath expression):’ and ‘Level 2 TOC (XPath expression):’ options. You can either use the Wizard button on the right side of each box, or manually enter in these options:

  • Level 1: //h:h1
  • Level 2: //h:h2

These settings define the top tier of your Table of Contents (TOC) as all H1 heading tags, with H2 heading tags as the sub-chapters beneath them.
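To see what those two expressions select, here is a small Python sketch using only the standard library. It assumes the `h:` prefix stands for the XHTML namespace, and the sample document and the helper name `headings` are invented for illustration. ElementTree supports only a subset of XPath, so `.//h:h1` is used here in place of `//h:h1`.

```python
import xml.etree.ElementTree as ET

# Assumed binding: the h: prefix maps to the XHTML namespace.
XHTML_NS = {"h": "http://www.w3.org/1999/xhtml"}

SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml"><body>
<h1>Introduction</h1><p>Content</p>
<h1>Category of Posts</h1>
<h2>Title of Individual Post</h2><p>Content</p>
<h2>Title of Next Individual Post</h2><p>Content</p>
</body></html>"""


def headings(xhtml_text, xpath):
    """Return the text of every element matched by a namespaced XPath."""
    root = ET.fromstring(xhtml_text)
    return [element.text for element in root.findall(xpath, XHTML_NS)]


level1 = headings(SAMPLE, ".//h:h1")
level2 = headings(SAMPLE, ".//h:h2")
print(level1)  # ['Introduction', 'Category of Posts']
print(level2)  # ['Title of Individual Post', 'Title of Next Individual Post']
```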

EPUB Output
The only option I recommend checking here is ‘Preserve cover aspect ratio’. Of course, if your image is 600×800, this shouldn’t be an issue, but remember that not everyone views an ePub on a mobile e-reader device.

MOBI Output
While I don’t recommend changing anything, you should be aware that a Table of Contents is inserted at the end of the file. If you do not wish to offer this (I recommend leaving it), you can check the option ‘Do not add Table of Contents to book’.
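The same settings can also be passed to Calibre's `ebook-convert` command-line tool instead of the dialog. The Python sketch below only builds the argument list and does not run anything. The file names and the helper `build_convert_command` are placeholders, and while the option names should match those documented for `ebook-convert`, it is worth confirming them with `ebook-convert --help` for your installed version.

```python
import shlex


def build_convert_command(html_path, output_path):
    """Build an ebook-convert argument list mirroring the dialog settings."""
    lower = output_path.lower()
    cmd = [
        "ebook-convert", html_path, output_path,
        # Page Setup: Kindle profile for Mobi, default otherwise; 2pt margins.
        "--output-profile", "kindle" if lower.endswith(".mobi") else "default",
        "--margin-top", "2", "--margin-bottom", "2",
        "--margin-left", "2", "--margin-right", "2",
        # Structure Detection: the expression "/" is assumed to be the
        # command-line equivalent of emptying the page-break box.
        "--page-breaks-before", "/",
        # Table of Contents: h1 is the top tier, h2 the sub-chapters.
        "--level1-toc", "//h:h1",
        "--level2-toc", "//h:h2",
    ]
    if lower.endswith(".epub"):
        # EPUB Output: keep the cover's proportions.
        cmd.append("--preserve-cover-aspect-ratio")
    return cmd


command = build_convert_command("book.html", "book.mobi")
print(shlex.join(command))
```

Passing the list to `subprocess.run(command, check=True)` would run the conversion, provided Calibre is installed and `ebook-convert` is on the PATH.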

Final Words

Congratulations, you’ve just built an e-book for 99% of e-reader devices. If you have any questions, tips, or additional comments, please feel free to leave a comment below.

10 thoughts on “Building an E-Book from HTML: Sample Code”

  1. Pingback: Converting Blog Posts into an ePub E-Book: Case Study | NovelRank Blog

  2. mathews

    good code, now i need some help i really need to find out the code for inserting books and past papers on the website in other words some sort of online library. in kind developing a website for a school as my school assignment.

  3. JEFFREY ATENCIO

    I saw this and have a question: I want to create readers for autistic and Down’s kids and their parents. The books must be customizable – the parents have to be able to add their own personal pictures in the book for the books to have any learning value. This is because the kids are visual learners. No picture, no meaning.

    Example: This is my mom (the picture above the caption is a picture of her mom, not a cartoon or a royalty-free image). Simple, right? I need an e-book setup that allows parents to change the pictures and then print out the books. Can this do that? If not, can it be done using HTML?
    Thanks

    1. milo

      sounds like a wonderful concept. if you haven’t figured anything out yet…

      you can create a web portal that will allow the family members to upload images/info and then dynamically create the necessary html markup. then you could manually create their epub/mobi file with the html the server created for you.

      I’m sure there must be a converter like calibre for web servers. this would allow you to have the entire thing automated and instantly output a downloadable book for the family.

      i don’t have a ton of time but would be willing to help you figure it out if you still need help. i’m a web developer.

  4. Pingback: Amazon eBook Publishing; a Guide | The Daily Gargle
