Killing XML authoring with IBM apples

Apple and IBM! The recent joint announcement by Apple and IBM of their “global partnership to transform enterprise mobility” just shows that given enough time, anything can happen, especially in an ever-evolving and occasionally unpredictable socio-mobile industry that touches everything from bitcoins and wearables to 3D printers and biometric sensors. Yes, even structured XML authoring.

But can XML authoring survive? As I noted in a previous post, “the DITA Open Toolkit was released only 9 years ago”. By comparison, MadCap Flare was launched only 8 years ago, and Syncro Soft Oxygen introduced visual XML editing only 7 years ago. But today, I can see the signs that XML authoring, like CDs and DVDs, is already past its prime and approaching the end of its useful life.

Apple + IBM

Hi, my name is Jay, and I’m an IBM TRIRIGA information developer. While I’ve previously tackled the subject of XML authoring several different times from several different angles, I wanted to take another look in the new light of this “anything can happen” Apple and IBM announcement. Would it be poetic if IBM, which introduced DITA-XML authoring in 2001, helped to kill it by 2021?

What do I mean by killing XML authoring?

Last week, I stumbled across a tweet quoting an off-target yet eye-catching prediction by Aaron Koenig: “In twenty years, we will use bitcoin as naturally as we use the internet today.” Why is it off-target? Here’s why. Although I understand its intended meaning, the statement as written assumes that both bitcoins and the internet as we know them today will still be recognizable in 20 years.

The statement would be similar to making the following off-target prediction in 1984: “In thirty years, we will use compact discs as naturally as we use cassette players today.” Unfortunately for compact discs, the 2001 release of the Apple iPod and the 2007 release of the Apple iPhone popularized the portability of digital music without disc storage and contributed to the decline of CDs since 2000.

@michaelterpin @aaron_koenig @CoinTelegraph In 20 yrs we may transact #bitcoins via #wearable e-skin or #3Dprint them as #biometric cash :)— Jay Manaloto (@Jay_IBM) August 5, 2014

To highlight its technological assumptions, I sent a playful reply: “In twenty years, we may transact bitcoins via wearable e-skin or 3D print them as biometric cash.” After all, if audio data can be retrieved from vibrations through a bag of potato chips today, then in 20 years, isn’t it possible that coin data might be stored in water bottles, or oceans, or even our own bodies in a global hydronet?

Sounds crazy? Then again, XML authoring and iOS mobile apps would have sounded pretty crazy in 1984. In terms of “killing XML authoring”, the rise of IBM-iOS apps might mean the fall of DITA-XML output. How? To set the context for my projections, let’s take a look at what structured XML authoring is and isn’t, what my previous posts predicted, and how IBM-iOS apps might undermine DITA-XML output.

What is structured XML authoring?

For those of you who are new to the idea of structured XML authoring, here’s a brief overview.

Dissecting DITA relationship tables (08 Nov 2013)

What is DITA-XML? If you’re not familiar with DITA-XML topic files or topic-based authoring, then try to think of individual topic files like individual scenes in a movie screenplay. Like movie scenes, topic files are the basic building blocks that can be individually added, edited, deleted, or rearranged to create a different or more-effective experience.

Wikipedia: Darwin Information Typing Architecture (23 Apr 2014)

DITA content is created as topics, each an individual XML file. Typically, each topic covers a specific subject with a singular intent, for example, a conceptual topic that provides an overview, or a procedural topic that explains how to accomplish a task. Content should be structured to resemble the file structure in which it is contained.

Oxygen XML: Structured Authoring (24 Feb 2011)

Structured authoring can mean many things, but in the context of this document, structured authoring means a standardised, methodological approach to content creation incorporating systematic labelling, modular, topic-based architecture, constrained writing environments, and the separation of content and form.

TerraXML: The 3 Main Characteristics of XML Authoring (19 Dec 2013)

Data Re-use

By leveraging XML, companies can significantly reduce duplicate information. By writing content once and using it anywhere, XML saves businesses time and money, while simultaneously ensuring that information remains accurate and up-to-date.

Data Granularity

In an information overloaded world, we need to have the right information at the right time. XML facilitates tailoring information according to specific user needs.

Separation of Appearance and Content

Historically, the need for different output formats often dictated that authors work with very different creation tools and formats. By using XML, authors can generate many different output formats from the same base data, thus ensuring data security and message consistency for every user.
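
To make the "write once, publish anywhere" idea a bit more concrete, here is a minimal sketch in Python. It assumes a tiny, heavily simplified DITA-style task topic. The topic text and the `parse_topic`, `to_html`, and `to_text` helpers are hypothetical names invented for illustration. This is not how the DITA Open Toolkit actually works, since real DITA publishing relies on much richer transforms. It only shows one XML source feeding two different output formats.

```python
import xml.etree.ElementTree as ET
from html import escape

# A hypothetical, heavily simplified DITA-style task topic (illustration only).
TOPIC = """
<task id="fix_link">
  <title>Fixing a broken link</title>
  <taskbody>
    <steps>
      <step><cmd>Open the topic file.</cmd></step>
      <step><cmd>Correct the link target.</cmd></step>
      <step><cmd>Republish the output.</cmd></step>
    </steps>
  </taskbody>
</task>
"""

def parse_topic(source):
    """Extract the title and the ordered step commands from the topic."""
    root = ET.fromstring(source)
    title = root.findtext("title")
    steps = [cmd.text for cmd in root.iter("cmd")]
    return title, steps

def to_html(title, steps):
    """Render the content as a small HTML fragment (one possible 'form')."""
    items = "".join(f"<li>{escape(s)}</li>" for s in steps)
    return f"<h1>{escape(title)}</h1><ol>{items}</ol>"

def to_text(title, steps):
    """Render the same content as plain text (another possible 'form')."""
    lines = [title.upper()]
    lines += [f"{n}. {s}" for n, s in enumerate(steps, start=1)]
    return "\n".join(lines)

title, steps = parse_topic(TOPIC)
html_output = to_html(title, steps)
text_output = to_text(title, steps)
```

The source topic never mentions fonts, headings, or numbering. Those decisions live entirely in the two rendering functions, which is the separation of content and form that the quotes above describe.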

What is structured XML authoring “not”?

Despite its topic-based structure, XML authoring does not generate structured data that is typically associated with relational databases. Although the DITA-XML source might be categorized as semi-structured data at best, its topic-based output — whether it is published in XHTML, PDF, or EPUB format — is still categorized as unstructured data. Big data consists primarily of unstructured data.

Webopedia: Unstructured Data (23 Jun 2014)

Unstructured Data and Big Data

[U]nstructured data is the opposite of structured data. Structured data generally resides in a relational database, and as a result, it is sometimes called “relational data.” This type of data can be easily mapped into pre-designed fields. For example, a database designer may set up fields for phone numbers, zip codes and credit card numbers that accept a certain number of digits. Structured data has been or can be placed in fields like these. By contrast, unstructured data is not relational and doesn’t fit into these sorts of pre-defined data models.

In addition to structured and unstructured data, there’s also a third category: semi-structured data. Semi-structured data is information that doesn’t reside in a relational database but that does have some organizational properties that make it easier to analyze. Examples of semi-structured data might include XML documents and NoSQL databases.

The term “big data” is closely associated with unstructured data. Big data refers to extremely large datasets that are difficult to analyze with traditional tools. Big data can include both structured and unstructured data, but IDC estimates that 90 percent of big data is unstructured data. Many of the tools designed to analyze big data can handle unstructured data.
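
As a rough illustration of the three categories in the quote above, here is a small Python sketch that stores the same fact (a zip code) in three ways: a relational row, an XML fragment, and a sentence of prose. The contact name, the table layout, and the sample values are all made up for this example, and the regular expression is only one naive way to pull a value out of free text.

```python
import re
import sqlite3
import xml.etree.ElementTree as ET

# Structured: a relational table with pre-designed fields (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT, zip TEXT)")
conn.execute("INSERT INTO contacts VALUES (?, ?)", ("Ana", "89101"))
structured_zip = conn.execute(
    "SELECT zip FROM contacts WHERE name = ?", ("Ana",)
).fetchone()[0]
conn.close()

# Semi-structured: XML labels its elements but has no fixed relational schema.
doc = ET.fromstring("<contact><name>Ana</name><zip>89101</zip></contact>")
semi_structured_zip = doc.findtext("zip")

# Unstructured: free text, so the value has to be guessed with a pattern.
prose = "Ana moved last year; her new zip code is 89101."
match = re.search(r"\b\d{5}\b", prose)
unstructured_zip = match.group(0) if match else None
```

All three lookups return the same value here, but only the first two know that the value is a zip code. The third simply matches any five-digit number, which hints at why published topic output is harder to analyze than its XML source.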

IBM: What is big data? (16 Oct 2013)

“Every day, we create 2.5 quintillion bytes of data — so much that 90% of the data in the world today has been created in the last two years alone. This data comes from everywhere: sensors used to gather climate information, posts to social media sites, digital pictures and videos, purchase transaction records, and cell phone GPS signals to name a few. This data is big data.”

What are my predictions for structured XML authoring?

Looking back at my string of observations and expectations, my past predictions about structured XML authoring seem to fall into three different flavors: (1) its aging XML-based PDF output, (2) its aging XML-oriented CMS platforms, and (3) its glaring limitations when compared to easy-to-use, SaaS-friendly, and app-friendly socio-mobile CMS platforms like WordPress and MediaWiki.

XML-based PDF output

Restyling PDF manuals for mobile (23 Nov 2013)

We shouldn’t be forced to repeatedly zoom in and out or drag back and forth through the text more than necessary just because companies are still confined to the aging notion that PDF documents must be printed on paper… I’d love to restyle our IBM TRIRIGA PDF user guides into a more mobile-friendly user experience. But under the weight of paper-based traditions and expectations, I doubt it’ll happen any time soon.

Killing clunky PDF with mobile EPUB (30 Mar 2014)

By initiating a simple search for “EPUB converters”, you can find an endless list of PDF-to-EPUB, DOC-to-EPUB, and HTML-to-EPUB converters. If you add a few extra keywords, you can even find XML-to-EPUB and DITA-to-EPUB converters. So, conceivably, IBM might be able to find the ideal converter that can transform both DITA source and XHTML output into reflowable EPUB content. Whether IBM finds that ideal EPUB converter, or eventually develops its own mobile EPUB solution, or ultimately kills its own clunky PDF delivery, remains to be seen.

XML-oriented CMS platforms

Shifting DITA-XML editors into SaaS (20 Jan 2014)

Because a social-media CMS platform like WordPress is already being delivered as a free-hosted SaaS through WordPress.com, it’s easy for organizations to try it, test it, and when they’re ready, invest in it as a full-featured CMS solution, whether it’s an enterprise-level SaaS like WordPress.com VIP or self-hosted implementation like WordPress.org.

Similarly, if an XML-oriented CMS platform like Astoria and Componize can be delivered as a free-hosted SaaS like WordPress.com and Alfresco.com (“Alfresco in the cloud”), then it’ll be easier for organizations to try it, test it, and invest in it as an enterprise-level SaaS solution, whether it’s a CMS-only SaaS or full CMS-plus-oXygen SaaS…

Whether or not its PHP-based platform contributes to WordPress’s popularity, I can’t say for certain. But it certainly hasn’t hurt it. However, if Java-based platforms are required to run XML-oriented CMSs like Astoria and Componize, then XML-oriented integrations like oXygen Author Component and Componize for Alfresco will forever be incompatible with PHP-based CMSs like WordPress. As the WordPress community continues to expand and evolve, this XML incompatibility poses another question: Will the next evolutionary step for XML-oriented CMSs be extinction? Time will tell.

Burning XHTML bridges to WordPress (09 Feb 2014)

Looking beyond this aging XHTML import tool, these dual-CMS issues would still arise with any similar topic-based XML-to-WordPress tools or plugins in the future. In fact, on Mike Little’s aforementioned blog post, one of his comments struck me. When a reader asked Mike about “any new developments or new insights” in using his import tool, Mike crystallized the central point.

[I] haven’t worked on it for years. I don’t work with DITA any more, so the need is not there.

Socio-mobile challenges

Killing XML and IA with social media (23 Dec 2013)

You might be asking, “What are you talking about? Why would I want to connect to the cloud to edit my already published output?” If so, then I’d answer, “So you could edit your content at any time or place without being tied to a single computer. Just take a look at the blogs hosted by WordPress.com and the MediaWiki-powered wikis hosted by Wikia.com. Or you can set up your own self-hosted versions from WordPress.org and MediaWiki.org.”

In the context of social and mobile media, you can connect to the cloud to log into your published blog posts or wiki pages and directly edit the source from your computer, tablet, or smartphone. If you need to simply fix a link in a particular page, then you can revise that single page without republishing a hundred other pages in the entire blog or wiki project. By being freed from the constraints of “compiled” project-level or version-level output, you are given much more flexibility as an author, editor, or architect to react to community feedback and revise your “published” output whenever or wherever you choose…

Moreover, by experimenting with blog-based and wiki-based approaches, I can more easily see its glaring lack of socio-mobile flexibility and community potential… The IBM Knowledge Center is a monumental first step, but it’s far from the final step. I’ll go so far as to predict that depending on the rapid evolution of social-media and socio-mobile technology within the next 5-to-10 years, IBM will be forced to migrate its product information once again — to a more-flexible “social documentation” platform.

Comment by Jan Benedictus of FontoXML (07 Jan 2014)

Without going so far as to predict the end for structured authoring, I fully agree that information-consumption is changing fundamentally, also in the field of specialized documentation… However, user communities and documentation are often ‘separate worlds’. That means that there is no mechanism to govern the contributions from users (mostly holding a strong expertise) other than having authors reproducing these and incorporating it into the formal documentation.

Splitting IBM TRIRIGA into mobile bits (13 Jul 2014)

Will standalone browsers really be extinct in another 7 years? Why not? If you own a smartphone, tablet, wearable, or other mobile device, think about the dozens or hundreds of mobile apps that you’ve installed until today. Since all mobile apps are expected to be internet-ready by their very nature, how often do you open your mobile browser? Maybe 5% of the time? Maybe less than 1%?

The modern smartphone has existed for only 7 years. But today, are you still buying physical calculators, calendars, and compact discs as often as you used to? If typical desktops and laptops, especially non-gaming ones, are giving way to mobile devices, while websites and web applications are giving way to mobile apps, do you see a future for standalone desktop and mobile browsers?

Editing a Wikipedia.org article with the Wikipedia mobile app

Editing a WordPress.com post with the WordPress mobile app

How might IBM-iOS enterprise mobility kill XML authoring?

Not surprisingly, the frame of reference and natural target of my predictions is the IBM Knowledge Center. So, to put it bluntly, these 3 different flavors expose or highlight at least 3 different sets of challenges that the IBM Knowledge Center must inevitably face. With regard to the Apple and IBM announcement, the IBM MobileFirst for iOS strategy weakens the role of the website experience.

How? To be clear, although IBM is doing a masterful job of implementing resizable and reflowable responsive web designs (RWD) for its IBM Service Engage, IBM Watson, and IBM MobileFirst sites, the IBM MobileFirst for iOS strategy obviously focuses on the enterprise mobile-app experience, not the enterprise web-application experience. This undercuts the topic-based IBM Knowledge Center.

To illustrate the looming challenges even further, let’s ask the sharper questions. Can I edit an IBM Knowledge Center topic on-the-fly after it’s published in XHTML format? No, I can’t. Can I view a topic in XHTML format from a mobile app instead of a mobile browser? No, not yet. If the onslaught of mobile apps forces the retirement of topic-based websites in 7 years, will IBM be ready in time?

As an IBM information developer, I could try to convince myself that structured XML authoring and topic-based technical information at IBM are fundamentally isolated and immune from the “rapid evolution of social-media and socio-mobile technology”. But it would be a dangerous conceit. After all, how valuable is published information that isn’t easily editable and sharable? Not valuable at all.

Instead, since “all mobile apps are expected to be internet-ready by their very nature”, I’m beginning to convince myself that from a certain imaginative perspective, web-ready smartphones, tablets, and other mobile devices can be considered “physical browsers” in which mobile apps behave like browser bookmarks. From this view, the role of the website experience is undermined even further.

What are my suggestions for future IBM authoring?

Earlier, I asked: “Would it be poetic if IBM, which introduced DITA-XML authoring in 2001, helped to kill it by 2021?” Although the demise of DITA-XML or topic-based authoring might be treated as a tragedy by career-long advocates, the unavoidable socio-mobile challenges also offer unparalleled authoring opportunities. Why not explore enterprise-class WordPress.com VIP or WordPress.org?

For example, after searching for only a few minutes, I was able to confirm WordPress plugins for enterprise-class requirements like workflow and translation. Specific WordPress.com VIP plugins include Edit Flow (workflow) and Transfluent (translation). Meanwhile, specific WordPress.org plugins include Oasis Workflow and Transposh WordPress Translation. The potential is limitless!

As an acquired IBMer who is expected to embody the IBM values such as “innovation that matters”, I’m surprised that there aren’t any counter-balancing exploratory teams investigating other non-XML, PHP-based, or social-media authoring strategies in the event that the IBM Knowledge Center ages faster than anticipated. If such teams exist, I’d certainly love to see their WordPress strategies!

Of course, it’s entirely possible that IBM has invested so many dollars, hours, and resources in building, populating, and optimizing the ever-growing IBM Knowledge Center that any criticisms in favor of a more-flexible alternative technology could be interpreted as ingratitude or reciprocated with rejection. Understandable. But knee-jerk interpretations wouldn’t be responsible or innovative.

In a world where “anything can happen” between Apple and IBM, we must assume that in 7 years, standalone browsers might be forced into extinction, topic-based websites might be forced into retirement, and IBM might “be forced to migrate its product information once again”. Even if none of these events actually occur, we should innovate as if they will. Not just IBM developers. All IBMers.

IBM: Our Values at Work (19 Feb 2014)

Innovation that matters – for our company and for the world.

IBMers…

are forward thinkers. We believe that the application of intelligence, reason and science can improve business, society and the human condition.

love grand challenges, as well as everyday improvements. Whatever the problem or the context, every IBMer seeks ways to tackle it creatively — to be an innovator.

strive to be first — in technology, in business, in responsible policy.

take informed risks and champion new (sometimes unpopular) ideas.

Do I have an update?

Two months after killing XML authoring with IBM apples, I bounced into responsive IBM design!

IBM Design

Later, 9 months after killing XML authoring with IBM apples, I poured Polymer onto TRIRIGA docs!

Polymer Topeka Demo

11 thoughts on “Killing XML authoring with IBM apples”

Tom Johnson says:

August 18, 2014 at 7:53 pm

Very insightful and interesting post. I’m actually working on a WordPress publishing strategy for my DITA content right now, testing it out in a pilot implementation. I agree 100% with all your points here, and am particularly in agreement that editing on the fly, sharing information, collaborating, etc., are all key aspects of information exchange that structured authoring is well-suited for. If you keep an eye on my site, in the next few days/weeks I’ll be exploring how to publish DITA content into WordPress.

- jaymanalotoibm says:
  
  August 18, 2014 at 8:52 pm
  
  Hi Tom! Haha, you beat me to the punch. Through an STC-related tweet, I just caught your recent posts on tool diversity earlier today, and in addition to my quick pingback, I planned a more formal reply with some of my darker DITA conclusions from my own recent post. But I’m glad to hear that my thoughts rang true to someone with your experience and influence! In fact, about 6 months ago, I also discussed your fascinating reassessment of Mike Little’s aging DITA-to-WordPress import tool. So I’m definitely looking forward to your next DITA-to-WordPress exploration. Eyes peeled!
  
  - Tom Johnson says:
    
    August 18, 2014 at 9:31 pm
    
    The main difficulty in converting from DITA to WP is figuring out page order. Little’s tool only manages hierarchy, not order within hierarchy.
    
    No doubt page order can be added in as an enhancement to the tool, but the larger problem is that the massive TOC isn’t the same paradigm on the web. I plan to use the Ubermenu Megamenu plugin to provide a better navigation experience, but I am not sure how to map this programmatically since the Ubermenu plugin creates code specific to page IDs. Anyway, I’ll flesh out more details in the coming weeks.
  - jaymanalotoibm says:
    
    August 19, 2014 at 12:47 am
    
    [Thanks for the tip on controlling the number of nested replies!]
    
    While I won’t pretend to understand the full context, I like how juicy this problem sounds. If I had a two-cent thought to offer, I was going to ask about inserting numeric prefixes into the DITA topic IDs as a possible way to order pages. But if the UberMenu plugin only looks at the page IDs, then even if you converted numeric topic IDs into numeric page slugs, UberMenu wouldn’t recognize the slugs anyway. Hmm. Either way, you’ve now inspired me to look into the plugin and UberMenu in general. So I’ll be better prepared for your discoveries. Good luck! :)
markgiffin807085920 says:

September 3, 2014 at 2:04 am

Thanks for the interesting post. This is a big area of interest for me.

I have some questions on the graphical layout of the post, which confuses me. There are a lot of orange links followed by gray boxes. Do these gray boxes have quotes taken from the orange links? Are some of them your own words quoted from your own posts?

Also, there are many graphics of Apple logos and IBM logos repeated, with no explanation that I could find. I clicked on a few of them and they just went to larger versions. Are these repeated for graphic effect? Or is there some other reason?

Regards,
Mark

- jaymanalotoibm says:
  
  September 3, 2014 at 3:33 am
  
  Hi Mark, you’re welcome! Oh, no problem. The orange links point to the source of the quotes in the gray boxes. These sources might be an external site or one of my own posts. If it’s an external link, I typically add the owner at the beginning of the link. For example, for the link “IBM: What is big data?”, the external source is IBM. Meanwhile, if there’s no owner in the link, it’s probably one of my posts. On other posts, I might explicitly mention the sources too.
  
  The primary reason for this layout is for flow. Personally, I like to show my cards on the table, instead of forcing readers including myself to jump or dodge hoops. Then I can present my thoughts based on the context and flow of these cards. Secondly, if a site disappears, I still have a record of its text, so the flow remains intact. Now regarding the graphics, I had a lot of fun creating them, but I couldn’t decide on which to cut out. So I guess it was a bit of selfish indulgence. Haha, plus they look cool too! Thanks for your interest and curiosity!
  
Jay Manaloto says:

October 22, 2014 at 2:07 am

Update from AppleInsider (21 Oct 2014): “IBM’s first enterprise apps for iPad to launch next month as iPad reaches 90 percent tablet share in U.S. education…”
