Introducing the Pentaho Reporting compatibility mode

Every time I worked on the heart of Pentaho Reporting, the layout system, in the past, I wondered how the heck am I going to ensure that I do not break our customers existing reports – again.

Before we started work on the crosstab mode, we put a tiny layer of safety onto the engine by creating a set of ‘golden sample’ reports. A golden sample is the pre-rendered output of a report. Each time we make a change, our automated tests generate the output again and compare it with the known good output we have stored.

Over the last four long weeks (where I expected only two to spend) I rewrote large parts of the layout system. Crosstabs are a more dynamic structure than banded list-reports. While banded reports only grow downwards to fill an endless stream of pages, crosstabs grow both horizontally and vertically. The crosstab expands to the right for each new value of the column dimensions it finds, and it expands vertically, when the engine prints more row-dimension values.

The newly introduced table-layout system that powers the crosstabbing requires stricter rules for the layout elements to arrange them in a sensible fashion. Ordinarily, we want the resulting layout to be minimal (use as little space as possible, within the constraints set by the designer), stable (produce the same layout every time) and performant (don’t make me wait).

The old layout rules, however, were historically grown. They evolved around bugs, misunderstandings and the desperate need to not break reports already created ages ago. Breaking reports is fun – if fun includes loosing customers or getting angry calls. I value my sleep, so no more breaking reports for me, if I can avoid it.

From now on, Pentaho Reporting contains a brand new compatibility layer. This layer emulates all the old and buggy behavior to get a report output that is as close to the original release as possible. Our main concern with the compatibility is not necessarily to emulate show-stopper bugs, but to avoid those subtle changes where your report elements start slightly shifting around. When that happens, you can end up with either more pages than before, overlapping elements (and thus lost data in Excel and HTML exports) or anything in between.

How does it work?

Since Pentaho Reporting 3.9.0-GA, each report file contains a version marker in the “meta.xml” file contained in each PRPT-file. When we parse a report, we also read that version number and store it as the default compatibility setting. The report-designer preserves this setting over subsequent load/save cycles, so editing an old report in PRD-4.0 does not automatically erase or replace that marker.

We consider reports without a marker to be old reports that must have been created with PRD-3.8.3 or an even earlier version. Of course, the reporting engine treats any of the ancient xml-formats and the PRD-3.0 “.report” files as ancient and gives them the version number “3.8.3″.

When a report is executed, the report processor checks the compatibility marker. If it is an pre-4.0 marker, we enable our first compatibility mode. The mode changes how elements are produced and how styles are interpreted.

The most important part change is, that a defined ‘min-width’ or ‘min-height’ automatically serves as a definition for a ‘max-width’ and ‘max-height’. There are additional rules, for instance, we ignore layout settings on structural sections, like groups or the group-bodies.

The most important rule, however, is: If you have a legacy report, it cannot contain tables, and thus cannot contain any crosstabs. Tables require a proper interpretation of the layout rules. The old rules tend to contradict each other from time to time, which causes great distress to the table-calculations.A distressed table-layout calculation may commit suicide or may throw away your data, so we better do not allow that to happen.

So before you can start to use newer features in the reporting system, you have to

Migrate your reports

Report migration is the process of rewriting a report’s layout definition to match the new layout rules. During that rewrite, we try to keep the layout as closely as possible to the original. While we are at it, we remove some invalid properties (like layout styles on groups) and migrate the sizing to the updated width/height system (not using the min-max height hack).

You can initiate the migration by entering the migration dialog via “Extra->Migration”. The dialog will list what will happen to your report, and will prompt you to save your report before the migration starts. The migration cannot be undone with the “Edit->Undo” function, so this saved report is your security blanket for the migration.

If you are sure that your report will be fine without the rewrite, you can manually force a report to a different compatibility level via the “compatibility-level” attribute on the master-report object. Be aware that this voids your warranty – your report may run just fine, or it may blow up completely. All bets are open.

Once the migration is done, your report should work as before, but within the corset of the new, and stricter, layout rules.

And to be sure: Let me repeat it – you only need to migrate reports when you want to use new features on them. Your old and already published reports will continue to work just fine without any manual intervention.

Bonus Content: Min/Max and Preferred width and height

Until recently, the layout system was not able to handle the layout constraints for minimum, maximum and preferred sizes correctly. The safe and default option was to rely on the minimum sizes only. The system magically treated all minimum sizes as maximum sizes for most cases, unless the element had a dynamic-height flag set or had an explicit maximum width or height.

With PRD-4.0, the layout system uses better rules with less contradictions. Therefore it is safe now to rely on the preferred size for most cases.

In reports in PRD-4.0 compatibility, a minimum size defines an absolute minimum, and the element will not shink below that size. The maximum size defines the absolute maximum, if defined, then the element will never grow larger than that. The preferred size defines a flexible recommended size. In most cases this is the size your box will use.

But if your element has content that requires more space, it will get it (up to the limit imposed by the maximum size). Each element computes a ‘minimum chunk size’ – think of it as the largest word in text – and uses the maximum of the chunk-size and the defined preferred-width as effective size.

Try our new compatibility mode. See whether it preserves your reports, and if not, please, please, file me a bug-report!

One thought on “Introducing the Pentaho Reporting compatibility mode

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>