The Truth about XSL-FO ‘Open Standards’
We at ISIS Papyrus can say that we support ‘open standards’ such as C++, HTTP, Linux, AFP, PDF and so on. We even support many XML-based standards despite my scepticism about their widely promoted benefits. SOA is one of them and there are many others.
All vendors in the enterprise document composition market who use XSL-FO claim that, because it is an ‘open standard’, their solution is superior, more modern and more customer oriented than proprietary ones. The use of an XML-like structure proves neither superior intelligence nor a future-oriented technology. It means that the vendor took the easy route and used an existing format. Yes, we did the same thing with AFP many years ago. The main difference is that we truly implemented IBM’s AFP format all the way. Vendors who use XML formats mostly don’t do that. The reason is that you can’t innovate based on open standards. Here is my take:
‘Open’ means that all defining structures of the complete application are published formats. ‘Standard’ means that the file formats and program functions are FULLY compatible with a large percentage of solutions in the market. The current vendors using XSL-FO do not fall into these categories! I found only one vendor who fully publishes on his website which version of which XML specifications are supported, and which functions in each specification are implemented and which are not. But even this one vendor does NOT specify how much proprietary code creates how many additional functions that make the product incompatible with the ‘open standard’. That vendor claims three PATENTS in relation to XSL-FO. How in the world can someone have the chutzpah to call that OPEN? All other vendors choose a simpler route from the outset by saying that their product IS BASED on the ‘open standard,’ which already implies that the standard is ONLY a foundation on which any number of proprietary extensions were implemented.
I contend that NONE of the formatting applications based on XSL-FO are fully portable between vendors. Obviously those vendors do not want to be THAT OPEN, because it would mean that you can take the application from one vendor to the next at any time. Consider that a complete correspondence solution is not just the formatted XSL-FO file but includes the metadata entities, the resource assets, the external data interface, the user interface definition with data mapping, and the process and post-processing definitions. If portability is a key requirement for you, ask for a written guarantee that the complete application as described above can be imported into other products. You will not get that. If it is not a key requirement, why would you bother with XSL-FO?
Many years ago, when ISIS was the first and only one to propose that a standard ought to be used, our competitors claimed that using AFP was a drawback and a limitation. The only published document standard at the time was IBM’s AFP and we used it and still do. Then we also supported IBM’s OGL and PPFA. Today we support the import and use of XML formatted elements (CSS and XSL) and can output XSL-FO, but we would not claim that to be a standard software function.
Here is a list of problems with XSL-FO functionality, performance and compatibility:
- All non-XML data formats have to be parsed, validated and converted to XML.
- Multiple XML input files require specific XSLT definitions for each combination.
- The design of XSL to XSL-FO to print format IS NOT at all WYSIWYG.
- Different XSL-FO processors for identical output formats are not fully compatible.
- XSL-FO page regions are positioned by dummy-tables and can overlap.
- XSL-FO processors for different targets produce VERY DIFFERENT results.
- XSL-FO processors cannot handle relative positioning of items.
- No access to current formatting position for white-space management.
- XSL to XSL-FO to PRINT format cannot guarantee a certain number of pages.
- XSL-FO is not able to provide the true number of pages of a complete document.
- Most XSL-FO formatters need therefore to predefine rigid page masters.
- Web pages or emails need special XSLT’s that bypass XSL-FO formats.
- XML needs to use URI as substitute for embedded resources or binaries.
- Relative external URI references reduce the portability of an XML file.
- NON-standard functions produce barcodes and charts embedded as SVG.
- SVG conversion to target formats for printing is VERY inaccurate.
- XSL-FO products do not use any of the XML business rule ‘standards’.
- Many custom Java, .NET or scripting functions are used to generate XSL-FO.
- FIVE to SEVEN XML parsing and writing steps are needed to the printed page.
- XSL-FO is thus not suitable for high-speed production (millions of pages).
- Numerous codepage issues on input, handling and mapping to print fonts.
- XSL-FO does not support print-time mapping of layouts into pages.
- All post-processing, print management and resource functions are proprietary.
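To make the first item in the list concrete, here is a minimal sketch of the parse, validate and convert step that non-XML input has to pass through before any XSLT can touch it. It uses only the Python standard library. The function name `csv_to_xml`, the tag names and the sample data are assumptions made for illustration and do not describe any vendor's product. The final re-parse stands in for the repeated parsing that each later XML stage would perform.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text, root_tag="records", row_tag="record"):
    """Parse CSV text, validate every row, and return it as an XML string.

    Assumes the CSV header names are valid XML element names.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    root = ET.Element(root_tag)
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        # Validation: reject rows with extra columns or empty/missing values.
        if None in row or any(not value for value in row.values()):
            raise ValueError(f"line {line_no}: incomplete or malformed row")
        record = ET.SubElement(root, row_tag)
        for name, value in row.items():
            ET.SubElement(record, name).text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample input.
sample = "name,city\nSmith,Vienna\nJones,Dallas\n"
xml_text = csv_to_xml(sample)

# Every downstream XML tool has to parse the serialized text again.
reparsed = ET.fromstring(xml_text)
print(xml_text)
```

Even in this toy case the data is read once as CSV, written once as XML and parsed again as XML before any formatting has started.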
If a vendor claims that his implementation of XSL-FO does not have the above problems, that may actually be true, but it means that the product is no longer compatible with XSL, XSL-FO and the ‘standard’ XSL-FO drivers.
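The portability item in the list about relative external URI references can be illustrated with a short sketch using Python's standard `urllib.parse.urljoin`. The host names, paths and file names below are hypothetical. The point is only that the same relative reference resolves to a different resource location as soon as the file that contains it is moved.

```python
from urllib.parse import urljoin

# The same relative reference, as it might appear inside an XSL-FO file.
RELATIVE_REF = "images/logo.svg"

# Hypothetical locations of the same file in two different installations.
base_a = "http://vendor-a.example/templates/letter.fo"
base_b = "http://vendor-b.example/import/2024/letter.fo"

resolved_a = urljoin(base_a, RELATIVE_REF)
resolved_b = urljoin(base_b, RELATIVE_REF)

print(resolved_a)  # http://vendor-a.example/templates/images/logo.svg
print(resolved_b)  # http://vendor-b.example/import/2024/images/logo.svg
```

Unless the referenced resources are moved along with the file into the same relative layout, the second reference points at nothing.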
An XSL-FO product could only be standard if built from Open Source, but vendors have to use non-standard components such as GUI, rule, barcode or chart functions. Open Source continues to evolve rapidly while an enterprise solution needs stable and tested functionality. That means that in a very short time, your solution is far away from the ‘standard’ that everyone else uses. When you then run into a problem, the vendor cannot easily apply the Open Source fixes. We use a few licensed Open Source libraries but would never claim that they represent a standard, and we only use them if we are in absolute control of bug-fixing ourselves.
Finally, I recommend that potential users consider not only the above issues with products based on XSL-FO but also whether the lack of original vendor know-how and the inherent problems of XML structures and open source make the solution acceptable for the enterprise.
- XML - Extensible Markup Language (XML) 1.0. W3C. See http://www.w3.org/TR/1998/REC-xml-19980210
- XML Names - Namespaces in XML. W3C Recommendation. See http://www.w3.org/TR/REC-xml-names
- XPath - XML Path Language. W3C Recommendation. See http://www.w3.org/TR/xpath
- CSS2 - Cascading Style Sheets, level 2 (CSS2). See http://www.w3.org/TR/1998/REC-CSS2-19980512
- DSSSL - ISO/IEC 10179:1996. Document Style Semantics and Specification Language (DSSSL).
- HTML - HTML 4.0 specification. W3C Recommendation. See http://www.w3.org/TR/REC-html40
- IANA - Character Sets. See ftp://ftp.isi.edu/in-notes/iana/assignments/character-sets.
- RFC2278 - N. Freed, J. Postel. IANA Charset Registration Procedures. IETF RFC 2278. See http://www.ietf.org/rfc/rfc2278.txt.
- RFC2376 - XML Media Types. IETF RFC 2376. See http://www.ietf.org/rfc/rfc2376.txt.
- RFC2396 - Uniform Resource Identifiers (URI). IETF RFC 2396. See http://www.ietf.org/rfc/rfc2396.txt.
- UNICODE TR10 - Unicode Consortium. See http://www.unicode.org/unicode/reports/tr10/index.html.
- XHTML - XHTML 1.0: The Extensible HyperText Markup Language. W3C. See http://www.w3.org/TR/xhtml1
- XPointer - XML Pointer Language (XPointer). W3C Working Draft. See http://www.w3.org/TR/xptr
- XML Stylesheet - W3C. See http://www.w3.org/TR/xml-stylesheet
- XSL - Extensible Stylesheet Language (XSL). W3C Working Draft. See http://www.w3.org/TR/WD-xsl
- XSL-FO - See http://en.wikipedia.org/wiki/XSL_Formatting_Objects