The Promise of XML
Achieving the promise of XML may be a 'bumpy ride.'
XML (Extensible Markup Language) gets your content from print to Web to mobile to syndication: write once, use anywhere. It’s the language that allows any platform to recognize a particular piece of content, and it provides the ultimate in content flexibility. Yet converting content into XML is not, at least initially, an automatic, invisible process. It’s another step in the production workflow, and publishers planning to share their content across multiple platforms need to understand that a new process will need to be established and new software purchased. XML conversion can be automated, but getting there does require some manual labor; think of XML as the last step before sharing content with your CMS. Fully exploiting the technology requires three things inherent in building out a publishing technology platform: capital expenditures in hardware and software, bridges and patches to integrate yet another system, and a substantial workflow reconfiguration.
The Print-to-Web Pathway
For many publishers, content still originates in a print-centric world, meaning it starts with the magazine and then makes its way to the Web. “We still produce magazines and newsletters that originate on the print side and we need to get that to other outlets as quickly as possible,” says Rob Paciorek, chief technical officer at b-to-b publisher Access Intelligence. “What we’ve done is create a process to get the content from its print format into XML, and from that we take the XML and either import it into a Web CMS or transform it into other types of media.”
This general process that Paciorek describes is the crux of any XML content strategy, but it’s not an automated process that launches at the push of a button. For Paciorek, the ability to take content and reuse it across platforms was an undeniable benefit. Two years ago he purchased the K4 system, which has an XML export feature, but that purchase was only the start. Paciorek would describe the total investment only as “sizable,” given the extra hardware, software and training that went along with K4. “It required hardware, software and implementation. It was a pretty sizeable investment. But it was important for us because all of the old processes we were using to convert some of the print content to Web-friendly or third-party content were on the verge of breaking,” he says.
As at most publishers, there’s one content management system for print and another for the Web, and patches need to be created to bridge the two. Further, for content to be converted into XML, stylesheets need to be developed that identify which elements of a piece of content must be tagged, or styled, as XML.
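To make the stylesheet idea concrete, here is a minimal sketch in Python of mapping print paragraph styles to XML tags. The style names, the tag names, and the function itself are hypothetical illustrations; they are not taken from K4, InDesign, Quark, or any publisher's actual setup.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from print paragraph-style names to XML element names.
# A real stylesheet would be agreed between the print and Web teams.
STYLE_TO_TAG = {
    "Headline": "headline",
    "Byline": "author",
    "Subhead": "subhead",
    "Body Text": "p",
}


def styled_paragraphs_to_xml(paragraphs):
    """Build an <article> element from (style_name, text) pairs.

    Paragraphs whose style has no mapping are returned separately so a
    person can review them instead of having them silently dropped.
    """
    article = ET.Element("article")
    unmapped = []
    for style, text in paragraphs:
        tag = STYLE_TO_TAG.get(style)
        if tag is None:
            unmapped.append((style, text))
            continue
        ET.SubElement(article, tag).text = text
    return article, unmapped


if __name__ == "__main__":
    # Illustrative input only: one paragraph uses a style nobody mapped.
    story = [
        ("Headline", "The Promise of XML"),
        ("Byline", "Staff Writer"),
        ("Pull Quote", "A style with no mapping"),
        ("Body Text", "Converting content into XML is another workflow step."),
    ]
    article, unmapped = styled_paragraphs_to_xml(story)
    print(ET.tostring(article, encoding="unicode"))
    print("Needs review:", unmapped)
```

The sketch only works if designers apply the same style names every time, which is why consistent styling on the print side matters so much to the Web side.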
A Big Bullet to Bite
But for many smaller publishers, investment in a robust print-side CMS and the subsequent process of producing well-formed XML is a big bullet to bite. “I think if you’re at a certain threshold of publications it might make sense to invest in this. Otherwise, the software is not off-the-shelf enough that I know of—you can’t just hit a button in InDesign and have it magically go into XML and then have your Web site slurp it in,” says David Newcorn, vice president, e-media at Summit Publishing Company. “So there are two struggles, getting the data out of Quark or InDesign and then getting it into the Web site.”
So even after an investment in a commercial system that facilitates the transformation of content from print to Web, publishers face the challenge of the conversion process from both a technology and workflow standpoint. “It affects your print designers—they have to style everything consistently, it affects their jobs. You have to tell your art director content has to be styled a certain way because it makes the Web team’s job easier,” says Newcorn, who revisits the XML issue every few years and has so far resorted to paying an intern $15 per hour to copy and paste Quark and InDesign files into the Web content management system. “At the end of the day, when we do the math and calculate the opportunity cost and scarce Web development dollars, it’s always cheaper to pay someone $15 per hour.”
Bridging the Gap
And even if the art director can absorb the conversion process, there’s still the technological hurdle of bridging the print system to the Web system. “Now we have to go to our Web department, which is tasked with 5,000 other projects that, hopefully, are oriented toward making money rather than reducing costs, and ask them for this open-ended project to get the XML to work,” says Newcorn.
Even Paciorek admits to a “bumpy” road to well-formed XML. “There have been bumps because of the difficulty of transforming the XML correctly. When we export stories from K4 into XML, unless you have everything tagged properly, like author name, a subject name, a subhead, company names, and so on, it gets complicated and you might miss a few things. It’s the complexity of tagging and trying to automate something that’s not always the same.”
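One way to catch the tagging gaps Paciorek describes is to check each exported story for the elements the Web side depends on before it is imported. The Python sketch below assumes hypothetical tag names (headline, author, subject); an actual K4 export will use whatever structure the publisher has configured.

```python
import xml.etree.ElementTree as ET

# Hypothetical list of elements the Web CMS import is assumed to need.
REQUIRED_TAGS = ("headline", "author", "subject")


def missing_tags(xml_text, required=REQUIRED_TAGS):
    """Return the required tags that are absent or empty in one story.

    ET.fromstring raises ParseError if the export is not well-formed,
    so malformed files surface before any tag checks run.
    """
    root = ET.fromstring(xml_text)
    missing = []
    for tag in required:
        element = root.find(f".//{tag}")
        # Treat a tag with no text (or only whitespace) as missing too.
        if element is None or not (element.text or "").strip():
            missing.append(tag)
    return missing


if __name__ == "__main__":
    story = "<article><headline>Bridging the Gap</headline><author> </author></article>"
    print(missing_tags(story))
```

Stories that come back with a non-empty list can be routed to a person for retagging rather than imported with holes in them.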
As mentioned, K4 does have the ability to export content in XML, but since publishers have a dizzying array of Web CMS platforms, bridges often need to be built to ease the transport of XML out of K4 and into the Web CMS. “It was a very complicated thing to do,” says Paul Maiorana, director of technology at Mansueto Ventures, publisher of Fast Company and Inc. “The hard part is getting XML out of K4. There is a lot of custom code that has to be written.”
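A bridge of this kind is, at its simplest, a translation from the exported XML into whatever the Web CMS accepts. The following Python sketch is illustrative only: the element names and the JSON field names are assumptions, not the K4 export schema or the import format of any particular CMS, and real bridges are far more involved.

```python
import json
import xml.etree.ElementTree as ET


def article_xml_to_cms_payload(xml_text):
    """Convert one exported article into a JSON string for a CMS import.

    The element names (headline, author, p) and the payload keys are
    hypothetical; substitute the ones your systems actually use.
    """
    root = ET.fromstring(xml_text)
    payload = {
        "title": (root.findtext("headline") or "").strip(),
        "byline": (root.findtext("author") or "").strip(),
        # Only <p> elements directly under the root are treated as body text.
        "body": [(p.text or "").strip() for p in root.findall("p")],
    }
    return json.dumps(payload, indent=2)


if __name__ == "__main__":
    exported = (
        "<article><headline>Bridging the Gap</headline>"
        "<author>Staff Writer</author>"
        "<p>First paragraph.</p><p>Second paragraph.</p></article>"
    )
    print(article_xml_to_cms_payload(exported))
```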
Yet once that code is written and the ability to pull XML out of the CMS is created, there’s still the process of getting print production staff to implement the tagging properly. “There are other challenges besides technical ones—getting our print production staff to format things in appropriate ways so that we can derive value out of what they’re tagging,” says Maiorana. “They just have to use a system that will map well to the XML that we need so we can import everything without having to go through it and manually retag it.”
Currently, about 50 magazine articles per month are uploaded onto the Fast Company and Inc. Web sites. The Web team keeps a production staffer on hand to oversee the process, correcting faulty XML tagging when necessary. Overall, says Maiorana, the process is still a bit rough, in part because development resources had previously been devoted to relaunching Fastcompany.com. Now that that’s complete, he says they can turn their attention back to smoothing out the print-to-Web transition. “It’s a little bumpy still. So far, it’s just the technical parts of it, but it’s a priority for us. We’ve put a lot of our development resources into the work that we’ve been doing on our Web site so we haven’t spent a lot of time with debugging that recently, but it’s something we’re working on finalizing.”