The Truth about XSL-FO ‘Open Standards’

We at ISIS Papyrus can say that we support ‘open standards’ such as C++, HTTP, Linux, AFP, PDF and so on. We even support many XML-based standards despite my scepticism about their widely promoted benefits. SOA is one of them and there are many others.

All vendors in the enterprise document composition market who use XSL-FO claim that, because it is an ‘open standard’, their solution is superior, more modern and more customer-oriented than proprietary ones. The use of an XML-like structure does not prove any elevated intelligence or future-oriented technology. It means that the vendor took the easy route and used an existing format. Yes, we did the same thing with AFP many years ago. The main difference is that we truly implemented IBM’s AFP format all the way. Vendors who use XML formats mostly don’t do that. The reason is that you can’t innovate based on open standards. Here is my take:

‘Open’ means that all defining structures of the complete application are published formats. ‘Standard’ means that the file formats and program functions are FULLY compatible with a large percentage of solutions in the market. The current vendors using XSL-FO do not fall into these categories! I found only one vendor who fully publishes on his website which version of which XML specifications are supported, and which functions in each specification are implemented and which are not. But even this one vendor does NOT specify how much proprietary code creates how many additional functions that make the product incompatible with the ‘open standard’. That vendor claims three PATENTS in relation to XSL-FO. How in the world can someone have the chutzpah to call that OPEN? All other vendors choose a simpler route from the outset by saying that their product IS BASED on the ‘open standard,’ which already implies that the standard is ONLY a foundation on which any number of proprietary extensions were implemented.

I propose that NONE of the formatting applications based on XSL-FO are fully portable between the vendors. Obviously those vendors do not want to be THAT OPEN, because it would mean that you can take the application from one vendor to the next at any time. Consider that a complete correspondence solution is not just the formatted XSL-FO file but includes the metadata entities, the resource assets, the external data interface, the user interface definition with data mapping, and the process and post-processing definitions. If portability is a key requirement for you, ask for a written guarantee that the complete application as described above can be imported into other products. You will not get that. If it is not a key requirement, why would you bother with XSL-FO?

Many years ago, when ISIS was the first and only one to propose that a standard ought to be used, our competitors claimed that using AFP was a drawback and limitation. The only published document standard at the time was IBM’s AFP and we used it and still do. Then we also supported IBM’s OGL and PPFA. Today we support the import and use of XML formatted elements (CSS, XSL, and can output XSL-FO) but we would not claim that to be a standard software function.

Here is a list of problems with XSL-FO functionality, performance and compatibility:

  1. All non-XML data formats have to be parsed, validated and converted to XML.
  2. Multiple XML input files require specific XSLT definitions for each combination.
  3. The design of XSL to XSL-FO to print format IS NOT at all WYSIWYG.
  4. Different XSL-FO processors for identical output formats are not fully compatible.
  5. XSL-FO page regions are positioned by dummy-tables and can overlap.
  6. XSL-FO processors for different targets produce VERY DIFFERENT results.
  7. XSL-FO processors cannot handle relative positioning of items.
  8. No access to current formatting position for white-space management.
  9. XSL to XSL-FO to PRINT format cannot guarantee a certain number of pages.
  10. XSL-FO is not able to provide the true number of pages of a complete document.
  11. Most XSL-FO formatters need therefore to predefine rigid page masters.
  12. Web pages or emails need special XSLT’s that bypass XSL-FO formats.
  13. XML needs to use URI as substitute for embedded resources or binaries.
  14. Relative external URI references reduce the portability of an XML file.
  15. NON-standard functions produce barcodes and charts embedded as SVG.
  16. SVG conversion to target formats for printing is VERY inaccurate.
  17. XSL-FO products do not use any of the XML business rule ‘standards’.
  18. Many custom Java, .NET or scripting functions are used to generate XSL-FO.
  19. FIVE to SEVEN XML parsing and writing steps are needed to the printed page.
  20. XSL-FO is thus not suitable for high-speed production (millions of pages).
  21. Numerous codepage issues on input, handling and mapping to print fonts.
  22. XSL-FO does not support print-time mapping of layouts into pages.
  23. All post-processing, print management and resource functions are proprietary.
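
Points 13 and 14 can be made concrete with a minimal sketch. It assumes a relative resource reference of the kind an XSL-FO file might carry for an external graphic; the host names, paths and the `resolve` helper are hypothetical and only serve the illustration. The same unchanged reference resolves to a different location as soon as the file is moved, so the resource has to travel with the file or the link breaks.

```python
from urllib.parse import urljoin

# Hypothetical relative reference, as it might appear in the src of an
# external graphic inside an XSL-FO file.
RELATIVE_REF = "resources/logo.svg"


def resolve(base_uri: str, ref: str = RELATIVE_REF) -> str:
    """Resolve a relative reference against the base URI of the document."""
    return urljoin(base_uri, ref)


# The same XSL-FO file stored in two hypothetical locations.
original = resolve("http://vendor-a.example/letters/invoice.fo")
moved = resolve("http://vendor-b.example/import/invoice.fo")

print(original)
print(moved)
print("same resource:", original == moved)
```

The file content is identical in both cases; only its location changed, and with it the place where the processor will look for the logo.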

If a vendor claims that his implementation of XSL-FO does not have the above problems, that may actually be true, but it means that the product is no longer compatible with XSL, XSL-FO and the ‘standard’ XSL-FO drivers.

An XSL-FO product could only be standard if built from Open Source, but vendors have to use non-standard components such as GUI, rule, barcode or chart functions. Open Source continues to evolve rapidly while an enterprise solution needs stable and tested functionality. That means that in a very short time, your solution is far away from the ‘standard’ that everyone else uses. When you then run into a problem, the vendor cannot easily apply the Open Source fixes. We use a few licensed Open Source libraries but would never claim that they represent a standard, and we only use them if we are in absolute control of bug-fixing ourselves.

Finally, I recommend that potential users consider not only the above issues with products based on XSL-FO but also whether the lack of original vendor know-how and the inherent problems of XML structures and open source make the solution acceptable for the enterprise.

I am the founder and Chief Technology Officer of Papyrus Software, a medium-sized software company offering solutions in communications and process management around the globe. I am also the owner and CEO of MJP Racing, a motorsports company focused on Rallycross or RX, a form of circuit racing on mixed surfaces that has been around for 40 years. I hold 8 national and international championship titles in RX. My team participates in the World Championship alongside Petter Solberg, Sébastien Loeb and Ken Block.

Posted in Business Architecture, IT Concepts
8 comments on “The Truth about XSL-FO ‘Open Standards’”
  1. Max, your most recent posting ‘jives’ (if you will pardon the Americanism) very well with what we at acadami have been teaching in our ‘Document Production Fundamentals’ and ‘Document Production Best Practices’ courses. In short, since the use of XSL-FO cannot guarantee 100% fidelity at print time, even companies that want to send electronic documents to end users in lieu of paper must be very wary, because they cannot guarantee that when the user prints the document at home, the resulting document will match what the company has on file.

    PDF is obviously one solution, given that PDF is inherently a print stream…but PDF has its own drawbacks, as you well know.

    In the situation in which a company wants to distribute electronic copies of documents to end users (i.e., B2C), and in which the company is very concerned about the quality and fidelity of the print at the remote location, what other suggestions might you have other than PDF and PDF/A? Might the day come when we have ‘AFP/PD’ – a ‘portable document’ format for AFP, that can be easily digested and printed in remote and B2C locations on Windows and other platforms?

    Bill

    • Max Pucher says:

      Bill, thank you for your comment. I would even do one better. I propose that AFP is already a much better archive and portable document format than PDF/A. It is much better in handling and referencing print resources for example. I do not think that we need a special format, but a few more extensions like compression would already do the trick. What AFP is missing mostly is a free viewer like Acrobat. We have proposed one to IBM many years ago, but maybe with the AFPC that might now happen.

  2. sevenkids says:

    Technology, open standards, AFP, PDF, Pascal, Cobol, Web 2.0, all so very nice. Standards, with the emphasis on the last s, already means there are multiple. Every technology vendor likes to have its development as a standard, whether low-level technology or high-level applications.

    In the end the decision by the customer is made on a diversity of reasons. What is the knowledge of the IT manager, what are friends saying about it, references, costs, and many other criteria.

    I don’t say that pure technology is not important in the deployment. The industry is full of examples where organisations have chosen a certain hyped technology and are now confronted with massive costs in maintenance and a need to buy additional tools.

    The other side is that technology discussions only become relevant when the full implications are becoming clear. For some organisations a deployment in an XSL-FO based document composition engine could be very beneficial, for another it will be a disaster. For one organisation a deployment in a cheap variable data printing package could be best, for another there is a need for an enterprise deployment, with fault tolerance, version management and workflow.

    Let's just keep Open Standards, with the emphasis on the last S.

  3. Max Pucher says:

    Thanks for the comment. I am not against standards, rather the opposite. I am against people claiming that something provides the benefit of an open standard when it doesn’t, and that something is better just because it is a standard.

  4. William L Broddy says:

    In my mind, the next generation of computer science grads have just discovered the value of tagging information to optimize document layout: XSL-FO. Unfortunately, they are the 8th generation of programmers to do so.

    I recently put together a ‘history of tagged composition’, which goes back to the mid-1960s (and I bet there are even earlier examples that might be found).

    Before there was XSL-FO there was:
    7. XML
    6. HTML
    5. SGML
    4. DCF / ISIL
    3. ATMS / XICS
    2. ATS (an early newspaper composition language)
    1. Watch Tower formatter (used for creating bible tracts in the late 1960s)

    My first experience with tagging was DCF in 1978, and I believed that it was a radical new way to build documents with reusable content. But that’s because I knew nothing about the previous 3 generations.

    And about every 8 years, a new generation of programmers ‘invents’ tagging all over again. The problem is that we old-timers see the same limitations and impending disasters that have plagued predecessor implementations.

    The problem with XSL-FO, in my opinion, is that it's so generic that it takes significantly more processing to create documents that are faithfully rendered. To me it's kindergarten composition.

    Eventually, the standards body controlling it will start to enrich its functions. My guess is that they will end up inventing something that looks almost identical to DCF :-).

  5. Max Pucher says:

    Bill, you may be right but it is worse than that. XSL-FO is just an intermediary format that really does not produce that much benefit. It is neither source nor final. It just adds problems, but because it is XML … IT MUST BE GOOD!

  6. […] The Truth about XSL-FO by Max J. Pucher […]

  7. Shoaib Anwar says:

    Reading both Max’s article and Bill’s comments, I am reminded that the eternal quest for a common document standard to solve all problems has been like searching for the holy grail. I think in the diverse solution and business market a common standard in anything will suffocate creativity and innovation. I think having different solutions for different problems is okay, and as the solutions become more capable they will eventually merge and the strongest solutions will evolve as winners, thus reducing the need for devising a one-size-fits-all standard. Max, your company and its solution is an example, and Bill, your work speaks for itself too. Hope to meet you guys again one day soon.

© 2007-19 by Max J. Pucher. All rights reserved.
