Killing clunky PDF with mobile EPUB


A couple of weeks ago, a former colleague asked me for advice about embedding a video into a PDF file. She had tried MOV and MP4 video formats, so I suggested FLV format. But in each case, only the audio played, not the video. I’m not sure if she eventually resolved the issue, but I suspect it was related to her Adobe product versions. This episode led me to wonder about PDF files.

Specifically, why are we still stuck using the clunky PDF? First, in terms of being “trapped by this traditional paper-based expectation”, the PDF is looking more like a locked cage. Second, in terms of the “rapid evolution of social-media and socio-mobile technology”, the PDF is behaving more like an aging dinosaur. Not even a video-embedding PDF feature can slow the mobile EPUB momentum.

EPUB document on Moon+ Reader 2.4.1 mobile app

Hi, my name is Jay, and I’m an IBM TRIRIGA information developer. While there are probably countless comparisons between PDF and EPUB documents on the web, I’d like to highlight several articles that caught my eye. From there, I’ll add my own two cents on the troubling PDF-related issues that IBM might be facing, if not now, then sooner than we think.

What is EPUB?

First, to set the stage for those who are less familiar with the EPUB format, here are excerpts from several Wikipedia articles. The key point is the idea of “reflowable” content versus “fixed-layout” content.

EPUB (short for electronic publication) is a free and open e-book standard by the International Digital Publishing Forum (IDPF). Files have the extension .epub.

EPUB is designed for reflowable content, meaning that an EPUB reader can optimize text for a particular display device. EPUB also supports fixed-layout content. The format is intended as a single format that publishers and conversion houses can use in-house, as well as for distribution and sale. It supersedes the Open eBook standard.

A reflowable document is a type of electronic document that can adapt its presentation to the output device. Typical prepress or fixed page size output formats like PostScript or PDF are not reflowable during the actual printing process because the page is not resized. For end users, the world wide web standard, HTML is a reflowable format as is the case with any resizable electronic page format…

Besides HTML, commercially available systems include… ePUB is a simple reflowable format that allows a single column with inline images, in many ways similar to a stripped-down HTML.

The EPUB format is the most widely supported vendor-independent XML-based (as opposed to PDF) e-book format; that is, it is supported by the largest number of e-Readers. The popularity of Amazon.com’s Kindle devices in America has led also to the prominence of KF8 and AZW formats; Kindle does not support EPUB.
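
As a rough sketch of what “reflowable” means in practice, the following Python snippet re-wraps the same paragraph for two hypothetical screen widths. It is only an analogy: real EPUB readers lay out XHTML and CSS rather than plain text, and the sample text, the function name `reflow`, and the two widths are assumptions chosen for illustration.

```python
import textwrap

# Sample paragraph (an assumption for this sketch, not taken from any real e-book).
TEXT = (
    "A reflowable document can adapt its presentation to the output device, "
    "so the same paragraph is laid out again whenever the screen width changes."
)


def reflow(text: str, width: int) -> list[str]:
    """Re-wrap one paragraph for a display that fits `width` characters per line."""
    return textwrap.wrap(text, width=width)


if __name__ == "__main__":
    # Hypothetical device widths, measured in characters per line.
    for label, width in (("phone", 30), ("tablet", 60)):
        print(f"--- {label} ({width} characters wide) ---")
        print("\n".join(reflow(TEXT, width)))
```

The words never change; only the line breaks do. A fixed-layout format such as PDF corresponds to choosing one width once and keeping it, whatever the size of the screen.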

To illustrate the visual flexibility of reflowable EPUB content, I chose the Moon+ Reader mobile app.

EPUB document on Moon+ Reader 2.4.1 mobile app

What is the difference between PDF and EPUB?

If you’re already familiar with fixed-layout PDF content on desktop, laptop, or mobile devices, and this is the first time you’ve seen reflowable EPUB content, then you can probably see the difference in visual flexibility right away. If you’re already familiar with both formats, then I probably don’t have to convince you of the advantages and disadvantages of one format over the other.

However, if you’re curious about how others have compared the two, here are a handful of opinions.

Undoubtedly, there’s a mashup occurring between the traditional markets for PDFs and EPUBs. PDFs, which were often relegated to the business and technical domain, are now being used for e-books… Additionally, current PDF readers are being improved so that they can more easily reflow text and do all the good things that e-readers can do.

EPUBs, on the other hand, will certainly be spreading into the technical and business environments, once problems related to complex formatting are solved…

As these two technologies invade each other’s spaces, it’s likely that they will borrow from each other’s feature set. In fact, it’s already happening, since the latest Adobe Reader can now reflow text. So stay tuned to see what happens as these two technologies morph and compete in the rapidly expanding, multifaceted world of e-books.

These are only a few examples of why many ebook makers prefer to manually go through a text that is being converted from PDF to ebook: there are too many little details for a converting program to catch, and so the human eye is still vital in the process.

Think about how far we are from Gutenberg’s moveable type: a PDF may remind us of the beautiful, static pages Gutenberg offered the Western world, but the ebook redefines the notion of “moveable” type — with a well-made ebook (and often “handmade” ebook), each reader can move the type to create his own version of the text.

This client read somewhere that all eBooks can read PDFs, so she’s thinking we can just convert the print version to a PDF and call it a day. I think that’s not the route to take. The beauty of ePubs is that they are fluid; PDFs are not. I’m no expert when it comes to PDFs on eBooks, but I don’t think PDFs can come loaded with metadata (name of book, author, publish date, built-in table of contents) nor can I create hyperlinks within (such as creating my own TOC that will appear at the beginning of the ePub).

What are the PDF issues that IBM might be facing?

Several months ago, I conducted a couple of informal experiments — the first one broke away from the “aging notion that PDF documents must be printed on paper” and the second one broke away from the “disadvantages of topic-based XML output” such as static XHTML and PDF documentation. Coincidentally, both approaches point toward a “more-flexible ‘social documentation’ platform”.

If I really think about it, would I sit down at my immobile computer to print out the user guide for my mobile smartphone? Or instead, if I have a PDF-reading app or browser, would I simply open the PDF user guide on my smartphone? If I don’t need to print anything, why should the PDF still look like a standard piece of paper? Not only Apple or IBM, but probably most companies that produce user guides are still trapped by this traditional paper-based expectation…

In terms of PDF user guides and manuals, this mobile convenience becomes even more valuable when the text involves a lengthy procedure or collections of lengthy procedures. We shouldn’t be forced to repeatedly zoom in and out or drag back and forth through the text more than necessary just because companies are still confined to the aging notion that PDF documents must be printed on paper.

Restyled PDF experiment: Jay's workout

When IBM chose to migrate and consolidate hundreds of its information centers into a single Knowledge Center, it aimed to address the issues of searchability, unity, and consistency. While searchability might be resolved, I expect that the Knowledge Center will still face challenges with information unity and consistency among its thousands of products. Moreover, by experimenting with blog-based and wiki-based approaches, I can more easily see its glaring lack of socio-mobile flexibility and community potential.

The IBM Knowledge Center is a monumental first step, but it’s far from the final step. I’ll go so far as to predict that depending on the rapid evolution of social-media and socio-mobile technology within the next 5-to-10 years, IBM will be forced to migrate its product information once again, to a more-flexible “social documentation” platform.

TRIRIGAPEDIA experiment: Mobile viewing and editing

Because the IBM Knowledge Center (KC) is still in its crawling stages as a socio-mobile platform, the primary PDF-related issue that IBM might be facing is the unsustainable multiplication of PDF-delivery methods. On the DITA-source side, the server-based build process transforms the DITA to PDF output. Meanwhile, on the XHTML-output side, users can customize their own PDF collections.

Ideally, both of these PDF processes, plus any other internal or external PDF-related production, should follow the same PDF-delivery method. Unfortunately, the source side and the output side each present their own set of challenges that haven’t yet been completely covered by a single PDF solution. This uncertainty makes me wonder about the feasibility of a single EPUB solution.

With a simple search for “EPUB converters”, you can find a long list of PDF-to-EPUB, DOC-to-EPUB, and HTML-to-EPUB converters. If you add a few extra keywords, you can even find XML-to-EPUB and DITA-to-EPUB converters. So, conceivably, IBM might be able to find the ideal converter that can transform both DITA source and XHTML output into reflowable EPUB content.
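
To picture what the simplest possible XHTML-to-EPUB conversion involves: an EPUB file is a ZIP archive that holds a `mimetype` entry, a `META-INF/container.xml` pointer, a package document, and the XHTML content itself. The sketch below wraps one XHTML fragment into such an archive using only Python’s standard library. The function name `build_epub`, the file names under `OEBPS/`, and the placeholder identifier and timestamp are all assumptions for illustration. It does not represent IBM’s build process or any particular converter, and the output has not been checked with a validator such as EPUBCheck.

```python
import io
import zipfile
from xml.sax.saxutils import escape

# Tells the reader where the package document lives inside the archive.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

# Package document: metadata, manifest (all files), spine (reading order).
# The identifier and modification date are placeholders, not real values.
PACKAGE_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="book-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="book-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>{title}</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">2000-01-01T00:00:00Z</meta>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="topic" href="topic.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="topic"/>
  </spine>
</package>
"""

# Navigation document: the built-in table of contents.
NAV_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
  <head><title>Contents</title></head>
  <body>
    <nav epub:type="toc">
      <ol><li><a href="topic.xhtml">{title}</a></li></ol>
    </nav>
  </body>
</html>
"""

# Content page that wraps the caller's XHTML fragment.
TOPIC_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>{title}</title></head>
  <body>
    <h1>{title}</h1>
    {body}
  </body>
</html>
"""


def build_epub(target, title, body_xhtml):
    """Write a one-topic EPUB to `target` (a file path or a binary file object).

    `body_xhtml` must already be well-formed XHTML; it is inserted unchanged.
    """
    safe_title = escape(title)
    with zipfile.ZipFile(target, "w") as epub:
        # The mimetype entry must come first and must be stored uncompressed.
        epub.writestr("mimetype", "application/epub+zip",
                      compress_type=zipfile.ZIP_STORED)
        deflated = zipfile.ZIP_DEFLATED
        epub.writestr("META-INF/container.xml", CONTAINER_XML, compress_type=deflated)
        epub.writestr("OEBPS/content.opf",
                      PACKAGE_OPF.format(title=safe_title), compress_type=deflated)
        epub.writestr("OEBPS/nav.xhtml",
                      NAV_XHTML.format(title=safe_title), compress_type=deflated)
        epub.writestr("OEBPS/topic.xhtml",
                      TOPIC_XHTML.format(title=safe_title, body=body_xhtml),
                      compress_type=deflated)


if __name__ == "__main__":
    # Build the archive in memory and list its entries.
    buffer = io.BytesIO()
    build_epub(buffer, "Sample topic", "<p>Step 1. Open the form.</p>")
    with zipfile.ZipFile(buffer) as epub:
        print(epub.namelist())
```

Passing a file name such as `topic.epub` instead of the in-memory buffer writes the archive to disk. A real DITA-to-EPUB or XHTML-to-EPUB converter also has to handle images, stylesheets, multiple topics, and cross-references, which is where most of the difficulty lies.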

Whether IBM finds that ideal EPUB converter, or eventually develops its own mobile EPUB solution, or ultimately kills its own clunky PDF delivery, remains to be seen. Personally, I’d love to see the day when I can customize and generate my own reflowable EPUB collection from the DITA source or XHTML output. If I could do so with my own IBM TRIRIGA content, that would be awesome. :)

Do I have an update?

Five months after killing PDF with EPUB, I wondered about killing XML authoring with IBM apples!

Apple + IBM
