LibDocBundle – Pentaho Reporting meets ODF

One of the main problems we encountered with report definitions during the last year was the ugly and error-prone process of deploying (or, in Pentaho speak, publishing) reports to a server or application.

Many report-definitions consist of several files – the main XML-file, images, sub-report and data-source definitions and resource-bundles for the report localization. Keeping all these files in sync and dealing with them without running into conflicts was more art than science.

Our release plans include some heavy refactoring of the file format so that the report-designer, report-wizard and engine can share a single file format. Going forward by just creating a bunch of new XML schemas would have worsened the problem to the point where we, as developers, would have to work around the bugs caused by that file-system chaos instead of spending time on useful development.

One major truth in software engineering is: No matter what your problem is, someone else has probably solved it already. So instead of thinking up a new and overly fancy way to create a container file format, we just have to look around to see how other systems address these problems.

Looking at Microsoft, Business Objects and most other “old” vendors yielded no usable results. Using obscure binary formats like OLE containers might have been a good idea in the last century, but as we don't intend to build yet another island solution, we quickly abandoned that path.

But if you followed the news, then the solution is rather obvious: The OpenDocument file format (ODF) used by OpenOffice and others provides a good framework for storing document content. And ODF is more than just storing some files in a ZIP file.

ODF archives contain a standard way of describing document-wide meta-data, the structure of the archive is well defined, and the standard even provides a means to encrypt the contents of the file.
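To make that structure concrete, here is a minimal sketch of an ODF-style package built with nothing but `java.util.zip`. It is not LibDocBundle code. It only shows the two conventions of the ODF package format that matter here: a `mimetype` entry stored uncompressed as the first entry, and a `META-INF/manifest.xml` that lists the files in the archive. The class name, the media type and the entry name `content.xml` are assumptions chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class OdfPackageSketch {

    // Hypothetical media type, used for illustration only.
    static final String MIME_TYPE = "application/vnd.example.report-bundle";

    /** Builds an in-memory ODF-style package holding one XML document. */
    static byte[] writePackage(String mainXml) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            // The mimetype entry comes first and is stored without compression,
            // so a reader can identify the package from the first bytes of the file.
            byte[] mime = MIME_TYPE.getBytes(StandardCharsets.US_ASCII);
            CRC32 crc = new CRC32();
            crc.update(mime);
            ZipEntry mimeEntry = new ZipEntry("mimetype");
            mimeEntry.setMethod(ZipEntry.STORED);
            mimeEntry.setSize(mime.length);
            mimeEntry.setCompressedSize(mime.length);
            mimeEntry.setCrc(crc.getValue());
            zip.putNextEntry(mimeEntry);
            zip.write(mime);
            zip.closeEntry();

            writeEntry(zip, "content.xml", mainXml);

            // The manifest describes every file in the package and its media type.
            String manifest =
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<manifest:manifest xmlns:manifest="
                + "\"urn:oasis:names:tc:opendocument:xmlns:manifest:1.0\">\n"
                + " <manifest:file-entry manifest:media-type=\"" + MIME_TYPE
                + "\" manifest:full-path=\"/\"/>\n"
                + " <manifest:file-entry manifest:media-type=\"text/xml\""
                + " manifest:full-path=\"content.xml\"/>\n"
                + "</manifest:manifest>\n";
            writeEntry(zip, "META-INF/manifest.xml", manifest);
        }
        return buffer.toByteArray();
    }

    /** Writes one compressed UTF-8 text entry. */
    static void writeEntry(ZipOutputStream zip, String name, String text) throws IOException {
        zip.putNextEntry(new ZipEntry(name));
        zip.write(text.getBytes(StandardCharsets.UTF_8));
        zip.closeEntry();
    }

    /** Returns the entry names in the order they appear in the archive. */
    static List<String> listEntries(byte[] packageBytes) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(packageBytes))) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                names.add(e.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        byte[] bundle = writePackage("<report/>");
        for (String name : listEntries(bundle)) {
            System.out.println(name);
        }
    }
}
```

Because the conventions are that simple, any ZIP tool can open such a bundle for inspection, which is a large part of the appeal compared to an opaque binary container.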

Although we do not intend to implement (or use) the core XML schema defined for the actual document contents, the supporting framework around that format is too good to miss.

After two weeks of studying, planning and coding, we are now able to show the first result on our long way to version 1.0.

It Just Works(tm): Plug-and-play bundle reading

LibDocBundle hooks seamlessly into the LibLoader and LibRepository libraries and provides transparent access to the documents inside the ZIP file. An application that used to work with XML files directly can now work with the ZIP archives in exactly the same way. No code changes are needed: just point your “File-Open” dialog to the zip-document-bundle instead of the raw XML file and it “just works(tm)”.
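The idea of transparent access can be illustrated with the JDK alone. The sketch below does not use LibDocBundle, LibLoader or LibRepository, and their APIs are not shown here. Instead it relies on the standard ZIP file system provider in `java.nio.file`, which lets the same `Files` calls read a resource from a plain directory or from inside a ZIP archive. The class name, method names and the entry name `content.xml` are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

class BundleReadSketch {

    /** Reads a text resource from a directory or from a ZIP bundle with the same calls. */
    static String readResource(Path container, String name) throws IOException {
        if (Files.isDirectory(container)) {
            // Loose files on disk: resolve the name against the directory.
            return new String(Files.readAllBytes(container.resolve(name)), StandardCharsets.UTF_8);
        }
        // ZIP bundle: mount it as a file system and resolve the name inside it.
        // The cast selects the (Path, ClassLoader) overload on newer JDKs.
        try (FileSystem zipFs = FileSystems.newFileSystem(container, (ClassLoader) null)) {
            return new String(Files.readAllBytes(zipFs.getPath(name)), StandardCharsets.UTF_8);
        }
    }

    /** Creates a small demo bundle in a temporary file (illustration only). */
    static Path createDemoBundle(String xml) throws IOException {
        Path bundle = Files.createTempFile("demo-bundle", ".zip");
        try (ZipOutputStream zip = new ZipOutputStream(Files.newOutputStream(bundle))) {
            zip.putNextEntry(new ZipEntry("content.xml"));
            zip.write(xml.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bundle;
    }

    public static void main(String[] args) throws IOException {
        Path bundle = createDemoBundle("<report/>");
        System.out.println(readResource(bundle, "content.xml"));
        Files.delete(bundle);
    }
}
```

The calling code never needs to know which kind of container it was given, which is the property the bundle support is meant to provide to applications.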

Of what use is a good standard if it is too complicated to be implemented easily?

For bundle-editing applications like the report-designer or report-wizard, LibDocBundle also provides bundle-implementations and utility classes which make it easy to create standard-conforming document bundles.

And now that we have a clean storage container, let's fill it with life. Next stop: The Unified XML file format.

This entry was posted in Development by Thomas.

About Thomas

After working as all-hands guy and lead developer on Pentaho Reporting for over a decade, I have learned a thing or two about report generation, layout and general BI practices. I have witnessed the remarkable growth of Pentaho Reporting from a small niche product to an enterprise-class Business Intelligence product. This blog documents my own perspective on Pentaho Reporting's development process and our steps towards upcoming releases.