Tidying up our HTML exporter

When it comes to flexibility, Pentaho Reporting has always had a knack for erring on the obsessive side. With calculation formulas and scripting everywhere, an OEM or implementation partner has plenty of options to get the report just right.

Our HTML export is no exception. Last year I talked a bit about the ability to inject custom HTML or JavaScript into the output to produce a richer web experience, like Fancy Tooltips.

Injecting scripts happens via two special attributes (“html::append-body” and “html::append-body-footer”), which insert the raw content either before or after the generated content. So far nothing new.

Writing proper JavaScript is an art of its own, and it is a lot easier when the resulting HTML document has a clean and digestible structure. The output of the Pentaho Reporting Engine was usually filled with numerous spans and divs, making it hard to wade through the elements generated by the report.

With the latest bug-fix for Pentaho Report Designer 3.9 the report generator produces clean and minimalistic HTML. Over the last two days, I implemented a filter to check for inherited CSS styles.

CSS (Cascading Style Sheets) defines two classes of style attributes: local attributes and inherited attributes. Local attributes apply only to the element they are defined on. Attributes like “border” or “margin” only make sense for the current element and would cause visual disturbances if passed on to the child elements. Inherited attributes, like the text color and the font properties (family or size), are passed down the tree: if a child element does not define its own settings for these properties, it uses the parent element’s style as its own.

By encoding this logic into the HTML report output, we can now omit an inherited style whenever the same value has already been defined on a parent element. At the same time, we can omit all local styles that are empty (no border, 0-pt padding, etc.). After applying these optimizations, most elements have no style definitions of their own anymore. This alone makes the report more readable, but we can do better.
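The filtering described above can be sketched in a few lines. This is only an illustration in Python, not the engine's actual code: the `INHERITED` and `EMPTY_LOCAL` tables and the function names are assumptions made up for this example, and a real implementation would cover many more properties.

```python
# Hypothetical, heavily simplified property tables (not the engine's real lists).
INHERITED = {"color", "font-family", "font-size", "font-style", "font-weight"}
EMPTY_LOCAL = {
    "border": {"none", "0"},
    "margin": {"0", "0pt"},
    "padding": {"0", "0pt"},
}


def filter_style(style, inherited_from_parent):
    """Return only the declarations that still have to be written for an element."""
    result = {}
    for prop, value in style.items():
        if prop in INHERITED:
            # Inherited property: skip it if a parent already supplies the same value.
            if inherited_from_parent.get(prop) == value:
                continue
        elif value in EMPTY_LOCAL.get(prop, ()):
            # Local property with an "empty" value: nothing to write.
            continue
        result[prop] = value
    return result


def effective_inherited(style, inherited_from_parent):
    """Compute the inherited values that the element's children will see."""
    merged = dict(inherited_from_parent)
    merged.update({p: v for p, v in style.items() if p in INHERITED})
    return merged


parent = {"font-family": "Arial", "font-size": "10pt", "color": "black"}
child = {
    "font-family": "Arial",
    "font-size": "12pt",
    "color": "black",
    "padding": "0pt",
    "border": "none",
}
print(filter_style(child, parent))  # {'font-size': '12pt'}
```

Only the font size survives, because it is the one value that differs from what the parent already provides; the empty padding and border declarations are dropped entirely.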

As long as the style is empty and the element does not define local HTML attributes (“id”, “class” or any of the “html::on” attributes), we can now safely omit writing the element’s tag. In most cases, this greatly reduces the complexity of the HTML DOM, and navigating the DOM becomes a lot easier.
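To illustrate the tag omission, here is a minimal sketch of a serializer that skips an element's tag when there is nothing left to say about it. The node model (dictionaries with `tag`, `style`, `attrs` and `children` keys) and the `render` function are hypothetical and exist only for this example. The sketch assumes the styles have already been filtered, and it ignores the layout differences between block and inline elements.

```python
from html import escape


def render(node):
    """Serialize a node; a node is either a text string or a dict (hypothetical model)."""
    if isinstance(node, str):
        return escape(node)
    inner = "".join(render(child) for child in node.get("children", []))
    style = node.get("style", {})
    attrs = dict(node.get("attrs", {}))
    if not style and not attrs:
        # No style and no attributes: write the children without the wrapping tag.
        return inner
    if style:
        attrs["style"] = "; ".join(f"{prop}: {value}" for prop, value in style.items())
    attr_text = "".join(f' {name}="{escape(value)}"' for name, value in attrs.items())
    return f"<{node['tag']}{attr_text}>{inner}</{node['tag']}>"


tree = {
    "tag": "div",
    "attrs": {"id": "row-1"},
    "children": [
        {"tag": "span", "children": ["Total: "]},
        {"tag": "span", "style": {"font-weight": "bold"}, "children": ["42"]},
    ],
}
print(render(tree))
# <div id="row-1">Total: <span style="font-weight: bold">42</span></div>
```

The first span carries neither style nor attributes, so only its text is written. The div keeps its tag because of the “id” attribute, and the second span keeps its tag because it still has a style.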

This entry was posted in Development by Thomas.

About Thomas

After working as all-hands guy and lead developer on Pentaho Reporting for over a decade, I have learned a thing or two about report generation, layout and general BI practices. I have witnessed the remarkable growth of Pentaho Reporting from a small niche product to an enterprise-class Business Intelligence product. This blog documents my own perspective on Pentaho Reporting's development process and our steps towards upcoming releases.