Building an E-Book from HTML: Sample Code

This entry was posted Monday, 18 July, 2011 at 9:00 am

I recently wrote up a case study for the audience at NovelRank.com on converting blog posts into an ePub e-book, and the experience helped me immensely. It was a bit of a refresher course, as I had gone through the same process to create my second book, 50 Conversation Starters for the Modern Age. Once the book was built using the sample HTML code below, I used a fantastic and free program called Calibre, which is available for all major operating systems, to convert the file to both ePub and Mobi (Kindle) formats. Finally, for editing the HTML, I used Notepad++, especially for its ability to do Find and Replace based on regular expressions.
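
As an illustration of the kind of regular-expression cleanup that helps when preparing exported blog posts, here is a minimal Python sketch. The two rules (stripping inline style attributes and dropping empty paragraphs) are assumptions chosen for the example, not the actual replacements used for the book.

```python
import re

# Hypothetical cleanup rules for exported blog HTML; adjust them to your own markup.
INLINE_STYLE = re.compile(r'\s+style=(["\']).*?\1', re.IGNORECASE | re.DOTALL)
EMPTY_PARAGRAPH = re.compile(r'<p>(?:\s|&nbsp;)*</p>', re.IGNORECASE)


def clean_post_html(html: str) -> str:
    """Strip inline style attributes, then remove paragraphs left empty."""
    html = INLINE_STYLE.sub('', html)
    return EMPTY_PARAGRAPH.sub('', html)


print(clean_post_html('<p style="color: red">Hello</p><p>&nbsp;</p>'))
```

The same patterns can be pasted into the Notepad++ Find and Replace dialog with the regular expression search mode enabled, though its regex dialect may differ slightly from Python's.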

Here is the sample HTML code:

<html>
<head>
<title>
</title>
 
<style type="text/css">
body {font-family: Arial; letter-spacing: 0; line-height: 1.3em; font-size: 1em;}
img {display: block; margin: 5px auto; text-align:center;}
hr {page-break-after:always;}
h1 {line-height:1.6em;}
h1,h2,h3 {page-break-after:avoid; text-align:center;}
.large {font-size:200%; font-weight:bold;}
.medium {font-size:130%;}
.center {text-align:center;}
</style>
</head>
 
<body>
<br><br>
<img src='small-cover-image.jpg' alt='title' />
<p class='center large'>Title</p>
<br><br><br><br>
<p class='center medium'>Copyright 2011 &mdash; Author Name</p>
<hr>
 
<h1>Introduction</h1>
<p>Content</p>
<hr>
 
<br><h1>Category of Posts</h1>
<p>This section includes the following articles:</p>
<ul>
<li></li>
</ul>
<hr>
 
<h2>Title of Individual Post</h2>
<p>Content</p>
<hr>
 
<h2>Title of Next Individual Post</h2>
<p>Content</p>
<hr>
 
<br><br><br><br><p class='large center'>THE END</p>
</body>
</html>

One of the most important details in this sample code is the use of the horizontal rule (<hr>) as a page break. Note the following CSS code:

hr {page-break-after:always;}

This allows you to apply formatting explicitly, here a page break after every horizontal rule. While Calibre is capable of working with an external CSS file, there is a limit to the CSS that is supported by both ePub and Mobi formats. For instance, the following CSS is accepted by both formats, but it does not center the image when viewed on a Nook:

img {display: block; margin: 5px auto; text-align:center;}

Since this sample defines page breaks explicitly, and only at horizontal rules, a few Calibre settings need to be adjusted to match. Also, using heading tags (h1, h2, etc.) lets Calibre build the Table of Contents automatically.
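
If you have many posts, the repeated part of the template can be generated rather than typed. The following is a minimal Python sketch of that idea. The input shape (a list of categories, each with a list of post titles and bodies) and the function name build_body are assumptions made for the example, and the post bodies are assumed to be clean HTML already.

```python
from html import escape
from typing import List, Tuple

# Hypothetical input shape: (category title, [(post title, post body HTML), ...]).
Category = Tuple[str, List[Tuple[str, str]]]


def build_body(categories: List[Category]) -> str:
    """Return the repeated sections: h1 per category, h2 per post, <hr> after each."""
    parts = []
    for category_title, posts in categories:
        parts.append(f"<h1>{escape(category_title)}</h1>")
        parts.append("<p>This section includes the following articles:</p>")
        parts.append("<ul>")
        parts.extend(f"<li>{escape(title)}</li>" for title, _ in posts)
        parts.append("</ul>")
        parts.append("<hr>")  # page break after the category overview
        for title, body_html in posts:
            parts.append(f"<h2>{escape(title)}</h2>")
            parts.append(body_html)  # assumed to be trusted, already-clean HTML
            parts.append("<hr>")  # page break after every post
    return "\n".join(parts)


print(build_body([("Category of Posts", [("Title of Individual Post", "<p>Content</p>")])]))
```

The output is meant to be pasted between the title page and the closing THE END paragraph of the sample; the head, style block, and cover section stay as written.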

Convert Books Settings in Calibre

Within Calibre, you will add the book by selecting the HTML file you have created. Make sure all images are saved in that same folder, as Calibre will build a .zip file of all of the contents. The next step is adjusting the Metadata and adding your cover image. That is all pretty straightforward, so let’s move on to the setting called ‘Convert Books’. In the top-right of that window, you select the output format (ePub or Mobi). The settings below apply to both formats, and each section refers to one of the tabs on the left side of the window.
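
Before adding the book, it can be worth confirming that every image the HTML refers to is actually in place. Here is a minimal Python sketch of such a check; the function name missing_images is hypothetical, and the script assumes the HTML file is saved as UTF-8.

```python
from html.parser import HTMLParser
from pathlib import Path


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def missing_images(html_path):
    """Return referenced image paths that cannot be found relative to the HTML file."""
    html_path = Path(html_path)
    collector = ImageCollector()
    collector.feed(html_path.read_text(encoding="utf-8"))
    folder = html_path.parent
    # Remote images (http/https) are skipped; only local files are checked.
    local = [src for src in collector.sources if "://" not in src]
    return [src for src in local if not (folder / src).is_file()]
```

Calling missing_images("book.html") returns an empty list when everything is in place, and otherwise the list of file names to track down.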

Page Setup
The Output Profile box can be left as ‘Default Output Profile’ for ePub files and should be changed to ‘Kindle’ for Mobi files. As for the margins, I recommend reducing them to 2.0pt on all sides. Remember, readers can override this setting on their own devices, so don’t stress about it too much.
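
Calibre also ships a command-line converter, ebook-convert. The following Python sketch builds the options that, as far as I know, correspond to this tab (--output-profile and the four --margin-* options); check ebook-convert --help on your version before relying on it. The function name page_setup_args is made up for the example.

```python
def page_setup_args(output_format, margin_pt=2.0):
    """Build ebook-convert options mirroring the Page Setup tab."""
    fmt = output_format.lower()
    if fmt not in ("epub", "mobi"):
        raise ValueError(f"unsupported format: {output_format}")
    # 'kindle' for Mobi files, the default profile for ePub files.
    profile = "kindle" if fmt == "mobi" else "default"
    args = ["--output-profile", profile]
    for side in ("top", "bottom", "left", "right"):
        args += [f"--margin-{side}", str(margin_pt)]  # margins are given in points
    return args


print(page_setup_args("mobi"))
```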

Structure Detection
Since we are setting the page breaks manually, in the section called ‘Insert page breaks before (XPath expression):’, remove everything in the box. The default expression in that box inserts a page break before every h1 and h2 heading tag, and we don’t want that.
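
On the command line, the matching ebook-convert option is --page-breaks-before, and Calibre's documentation gives the expression / as the way to disable it. A minimal Python sketch that assembles such a command (the file names are placeholders, and the command is only printed, not run):

```python
import shlex

# "/" is the expression Calibre's documentation gives for disabling this option.
NO_AUTOMATIC_BREAKS = ["--page-breaks-before", "/"]


def convert_command(source, target):
    """Return an ebook-convert command line that adds no heading-based page breaks."""
    return ["ebook-convert", source, target, *NO_AUTOMATIC_BREAKS]


print(shlex.join(convert_command("book.html", "book.epub")))
# prints: ebook-convert book.html book.epub --page-breaks-before /
```

The page breaks defined in the CSS for the horizontal rules are unaffected by this option.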

Table of Contents
We are adjusting both the ‘Level 1 TOC (XPath expression):’ and ‘Level 2 TOC (XPath expression):’ options. You can either use the Wizard button on the right side of each box or enter these values manually:

  • Level 1: //h:h1
  • Level 2: //h:h2

With these settings, every H1 heading becomes a top-level entry in your Table of Contents (TOC), and every H2 heading becomes a sub-chapter beneath it.
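
To preview what those two expressions will pick up before converting, you can list the headings yourself. This is a rough Python sketch using only the standard library; it approximates the result by collecting h1 and h2 text in document order and is not how Calibre evaluates XPath internally. The names OutlineParser and preview_toc are hypothetical.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Collect (level, text) for every h1 and h2, in document order."""

    LEVELS = {"h1": 1, "h2": 2}

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None  # level of the heading currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in self.LEVELS:
            self._level = self.LEVELS[tag]
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and self.LEVELS.get(tag) == self._level:
            self.outline.append((self._level, "".join(self._text).strip()))
            self._level = None


def preview_toc(html):
    """Return the heading outline as a list of (level, title) pairs."""
    parser = OutlineParser()
    parser.feed(html)
    return parser.outline


sample = "<h1>Introduction</h1><p>Content</p><h1>Category of Posts</h1><h2>Title of Individual Post</h2>"
for level, title in preview_toc(sample):
    print("  " * (level - 1) + title)
```

If you convert from the command line instead, the options --level1-toc and --level2-toc take the same two expressions.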

EPUB Output
The only option I recommend checking here is ‘Preserve cover aspect ratio’. Of course, if your image is 600×800, this shouldn’t be an issue, but remember that not everyone views an ePub on a mobile e-reader device.

MOBI Output
While I don’t recommend changing anything, you should be aware that a Table of Contents is inserted at the end of the file. If you do not want it there (I recommend leaving it in), you can check the option ‘Do not add Table of Contents to book’.
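
For command-line use, the two format-specific choices above map, as far as I can tell, to the ebook-convert options --preserve-cover-aspect-ratio (EPUB) and --no-inline-toc (MOBI). A minimal Python sketch under that assumption; the function name output_args and the keep_mobi_toc parameter are invented for the example.

```python
def output_args(output_format, keep_mobi_toc=True):
    """Build ebook-convert options mirroring the EPUB Output and MOBI Output tabs."""
    fmt = output_format.lower()
    if fmt == "epub":
        return ["--preserve-cover-aspect-ratio"]
    if fmt == "mobi":
        # Keeping the generated Table of Contents needs no option at all.
        return [] if keep_mobi_toc else ["--no-inline-toc"]
    raise ValueError(f"unsupported format: {output_format}")


print(output_args("epub"))
print(output_args("mobi", keep_mobi_toc=False))
```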

Final Words

Congratulations, you’ve just built an e-book for 99% of e-reader devices. If you have any questions, tips, or additional comments, please feel free to leave them below.


1 Comment to Building an E-Book from HTML: Sample Code

  1. Peazê says:

    January 21st, 2012 at 2:25 pm

    Hey Mario,
    Thanks, more clear than that not even crystal.
    Best
    Peazê, from Rio.
