Category Archives: Development

A quick introduction to crosstabs in Pentaho Reporting 4.0

Crosstabs have been on the horizon for several years now. They lived a happy, undisturbed life along with the unicorns and gnomes guarding the pot of gold at the end of the rainbow.

With an endless recession and central banks selling off their gold reserves, the unicorn has been sold to a meat factory and the gnomes now assemble luxury cell-phones in a chinese factory.

So the day had to come that crosstabs have to work for a living. That day is now.

The added crosstab capabilities are the head feature of Pentaho Reporting 4.0 (now to be released in April 2013). To make crosstabs possible, the reporting engine had to move away from the strict banded approach of top-down layouting. The engine now has a true table layout that ensures that even misbehaving crosstab definitions will look and behave good.

Some sort of crosstab abilities were in experimental mode for very much of the last 4 years. The good news: The current implementation fixes all the problems we had with that experimental stuff. The bad news: Your old crosstab definitions will most-likely not work in 4.0 – you will have to recreate them properly with the new version.

To spin off crosstabs quickly, the Pentaho Report Designer ships with a quick-and-dirty crosstab creation dialog. This dialog pops up whenever you add a crosstab group.

Until we update the report designer with a proper user interface, you will have to edit all crosstab properties and elements in the structure tree.

Be aware: The code is pretty much happy-path – stray too far away from it and the big bad wolf will eat you, your children and your data. Wolves are a protected species, so bear your share and help to feed them properly. Before you are fully digested – file bug reports and feature requests in our JIRA system.

Enough prose: Let me just list some of the notable current behaviours and general ramblings.

  • normalization works now, but it is not all-knowing. It will only normalize on existing data items. If you whole dataset has Q1, Q2 and Q4 but fails to mention Q3 at least once, you will not see a Q3 anywhere.
  • The engine wants its data raw and bloody. Denormalize your data! If you use OLAP datasources, use Denormalized version of these datasources.
  • Sort your data to match your group structure.
  • column-headers currently do not repeat on the subsequent pages.
  • Setting your cell-contents (or any other row-layout child) to a percentage width will yield a zero-width in the final layout. (PRD-3970)
  • the report wizard really does not know how to handle anything crosstab related. (PRD-3860)
  • you can add charts into a crosstab cell.
  • if there is more than one row of data for a single crosstab cell, you have the choice to print either the first item, the last item or all items in a large list.

Features that will come, but are not implemented yet:

  • butterfly headers: Move row-headers into the middle of the data. (PRD-4005)
  • Subreports on all crosstab cells (PRD-4006)
  • style- and attribute-expressions for all crosstab elements (PRD-4007)
  • The no-data-band on the crosstab-group has no effect. (PRD-4008)
  • Summary-rows and columns will be positioned either at the start or the end of the respective group. (PRD-4009)
  • Details header will be printed. (API is defined, but not used yet). Printing of them will be optional. (PRD-3949)
  • Header cells can either be spanned cells, or can repeat its content for each child. (PRD-3386)
  • There will be a switch to make row- and column-headers disappear. You can then use the crosstab purely as a layouting utility. (PRD-4010)
  • sorting of data. This will be optional and will blow up if you expect us to do big-data sorting. Get a real database with SQL support 😉

Now go and download a CI build of PRD-trunk, will you!?

Linking to a report with the same export type

When you create reports connected with each other by links, you want to stay in the same output mode as your source report. When viewing a PDF report, you want the linked report to be PDF too, for instance.

So how would you do that in Pentaho Reporting?

(1) You need to know your current export type.

When you export a report, each output type has a unique identifier similar to this one “pagable/pdf”. The identifier consists of

(a) the basic export type:
  * pagable for paginated reports or
  * table for reports exported as layout-tables
and
(b) the content type
  * pdf for PDF export
  * html for HTML
  * .. and so on.

The BI-Server uses the same identifiers in the reporting plugin to select the correct output target for your reports. The parameter for this is called “output-target” and is documented in the Pentaho Wiki.

You probably know about the “ISEXPORTTYPE” function. This formula function allows you to test for a specific output target when the report runs. To get the export type you now could write a ugly long formula with many nested IF functions.

Or you can use this small BSH-Function instead:

Object getValue()
{
  return runtime.getExportDescriptor();
}

to get the export descriptor string directly.

(2) You need to feed this export identifier into your links.

Use the Drill-Linking functionality to add a manual parameter to your link. Name this parameter “output-target” and link this to your BSH-Function. If your function is named “ExportType”, then you would write “=[ExportType]” into the value field.

With that connection made, your reports will now be linked with the same output-target as the source report. Set the “Hide Parameter UI” option if you want to link against the target file without having to manually submit parameter.

Tidying up our HTML exporter

When it comes to flexibility, Pentaho Reporting always had a knack for erring on the obsessive side. With calculation formulas and scripting everywhere, a OEM or implementation partner has plenty of options to get the report just right. 

Our HTML export makes no exception here. Last year I talked a bit about the ability to inject custom HTML or JavaScript into the output to produce a richer web-experience, like Fancy Tooltips.

Injecting Scripts happens via two special attributes (“html::append-body” and “html::apend-body-footer”) which insert the raw content either before or after the generated content. So far nothing new.

Writing proper JavaScript is a art on its own, and is a lot easier when the resulting HTML document has a clean and digestible structure. The output of the Pentaho Reporting Engine was usually filled with numerous spans and divs, making it hard to wade through the elements generated by the report.

With the latest bug-fix for Pentaho Report Designer 3.9 the report generator produces clean and minimalistic HTML. Over the last two days, I implemented a filter to check for inherited CSS styles.

CSS (Cascading Style Sheets) defines two classes of style attributes: Local attributes and inherited attributes. The normal attributes are only defined for the current element. Attributes like “border” or “margin” only make sense for current element and would cause visual disturbances if passed on to the child elements. Inherited styles, like all font properties (color, family or size) get inherited. If a child element does not define its own settings for these properties, the child uses the parent element’s style as its own.

By encoding this logic into the HTML report output we can now omit all inherited styles if the same style has been defined on a parent element already. At the same time we can omit all local styles if the style is empty (no border, 0-pt padding etc.). After applying all these optimizations, most elements actually have no own style definitions anymore. This alone makes the report more readable, but we can do better.

As long as the style is empty and the element does not define local HTML attributes (“id”, “class” or any of the “html::on” attributes, we now can safely omit writing the element’s tag. For most cases, this now reduces the complexity of the HTML-DOM greatly and navigating the DOM becomes a lot easier.

Killing XActions for Reporting – slowly and with pleasure

The new Pentaho-Metadata data-source scripting-extension I produced recently seems to be well-received. We now have a great opportunity to fully cut back on XActions for plain reporting uses.

Many users still have to stick with their old XAction driven reports. I assume that they do not do that because they enjoy the pain, or love programming in XML. No, the majority needs to drive parameter queries where the query itself is computed. Reasons for that are many:

  • If everything is placed in one query, the query gets insanely complex and unmaintainable.
  • The query access legacy systems with no sane data models or weird partitioned tables.
  • The data-source needs to be configured based on some other parameter before it is used.

 Now users who have been locked in by these cases can now free themselves from the slavery of weird XML-programming. I happily announce that

Pentaho Reporting now ships with sane scripting support for JDBC, Pentaho Metadata, all Mondrian and all OLAP4J data-sources.

The data-source scripting system of the old days is now a dark memory of the past. The new scripting is integrated into the data-source itself, so scripts become more reusable. Editing large scripts on the report’s “query::name” attribute was never really fun anyway.

The scripting extension allows you to configure the data-source before it is used for the first time. You can now configure all aspects of the data-source via your script, including but not limited to any supported connection-parameter like the JNDI name or additional JDBC-connection properties:

var jndiConnectionProvider = dataFactory.getConnectionProvider();
jndiConnectionProvider.setConnectionPath("java://myjndiconnection");

You cannot get away with not configuring the data-sources at all, as the report designer and all other tools love to have something static to work with. But you can always reconfigure them to your liking before the report is run. The scripts are always called before the a query is executed on the data-source, even inside the Pentaho Report Designer or the data-source editors.

I don’t believe that we will need similar scripting support for the Kettle, Java-Method-Invocation, Table or the Scriptable data-sources. These data-sources either do their own scripting anyway or would not profit from any scripting abilities.

Doing the performance dance (again)

I just changed another bit of the table-export while integrating a patch for PRD-3631. Although the patch itself did take a few illegal shortcuts, it showed me a easier way of calculating the cell-backgrounds for the HTML, Excel and RTF exports of Pentaho Reporting.

After a bit more digging, I also fixed some redundant calls in the HTML and Excel exports for merged cells and row-styles. Both resulted in repeated calls to the cell-background calculator and were responsible for slowing down the reporting engine more than necessary.

The performance of my test reports improved a bit with those changes. But if any, then this case has shown me that clean report design is the major driver of a fast export.

The performance for the reports went up by 15 to 30 percent, with the larger changes on the reports with larger row-counts. However, the reports I test are surely non-representive, as there all elements are perfectly aligned and the report is designed to avoid merged cells.

The patch specifically claims to address performance problems in the cell-style calculation. Agreed, there were problems, and the patch addressed them. But there was no way I could see a 100% improvement on normal reports. Well, not reports that are well-designed and use the powerful little helpers that the Pentaho Report Designer offers to make reports well-aligned.

When I receive production reports for debugging, the picture is usually more bleak. Fields are place rather randomly, and usually misaligned by a few points. They start and end on rather random positions, and usually elements are not aligned to element boundaries across different sections.

Let’s take this visually fairly report as an example:

The many fine grey lines you can see mark element boundaries. The more lines you see, the more cells your report will have. Each cell not only means larger HTML files, it also means more processing time spent on computing cell-styles and cell-contents. Thick grey lines spanning across the whole section usually indicate elements that are off by less than one pixel.

These lines are produced by the Report Designer’s “View->Element Alignment Hints” feature. When this menu item is selected, you will get a better idea on how your report will look when exported into a table-export. If you cannot see the details clearly, zoom in. The Report designer happily magnifies the working area for you.

When exported to HTML, this report here created a whopping 35 columns
with another 35 rows. That is potentially 1225 cells. The resulting
HTML file has a size of 21,438 bytes. For a report with just a few items of text, this is a lot.

In general you’ll want to avoid having to many of these boundaries. In the basic design courses, teachers tell fairly early on that layout where element edges are aligned look cleaner and more pleasing for the eye. When you look at adverts or magazines, you can see this on how articles and images seem to sit along visual boundaries or dividing lines. For a well-designed report this is no different.

To help you design your reports in a well-designed fashion, the report designer comes with the “View->Snap to Elements” feature.

To clean up a report, I usually start by aligning elements that are sitting close together. Visually, it makes no difference whether a element starts at x=24 or x=24.548. For the reporting engine, this makes a difference, as a dumb little engine cannot decide whether the user was just lazy or had a very good reason to have a cell at exactly these positions (or whether some visual design would break by attempting to blindly fix it). 

With the “Snap To Elements” enabled, just select one element and drag the mis-aligned edge until it snaps off its current position. Then move it back into position. This time it will snap to one of the other elements. If your edges are very close, I drag the current edge towards the top (for the y-axis) or the left (x-axis) until it leaves the crowded area. When I return with it, it will snap to the first (top-most or left-most) edge in the group of elements by default. Repeat that with all elements that sit near the edge and very soon you should only see one thin line indicating a perfect alignment.

Additionally, you can also change the element’s position and size to a integer number in the “style” table on the right-hand side of the Report Designer. When you do that for all elements, your alignment task will become a lot easier. Now elements are either aligned or at least one point apart (and the mis-alignment is easier to spot).

The quickly cleaned up version of my sample report now has only 24 columns and 16 rows, but visually, you cannot tell the difference between the two of them. Theoretically, the resulting table can have 384 cells, compared to the mis-aligned report a reduction to a quarter of the original 1225 cells. And finally, the generated HTML file shrunk to a size of mere 8,853 bytes, one third of the original size. In my experience and with those numbers in mind the computing time for this optimized report should be roughly 10% to 15% better than the optimized version. In addition to that slight boost, your report will download faster and rendering the report in the browser will be a lot quicker as well.  

So remember, performance optimization starts in your report designer: When you optimize your report designs it instantly pays off with quicker rendering and smaller downloads.

Further optimization

That report uses line-elements to draw borders around the statistic sections. By using sub-bands with border definitions, that report could be simplified further, but I’ll leave that as an exercise for the class.

Pentaho Reporting’s Metadata DataSources now with more scripting power

Pentaho Reporting is rather flexible. No, astonishingly flexible. Calculations for styles, attributes and even queries make sure that there is always a way to tweak a report exactly the way you need it. But until now, there were a few things that were not exactly easy. Custom Data-sources and their method of calculated queries were one of them.

With today’s code drop, this story has fundamentally changed. The Pentaho Metadata data-source is the first data-source that combines static queries with optional scripting features. There is no need to have a “Custom-Metadata” datasource anymore. No need to cram your query calculation logic into a tiny field on the report’s query-attribute.

Here it is: The new and greatly improved Pentaho Metadata Data-Source:

The scripting extensions hide beneath the two new tabs on the top and on the query-text area. Those who don’t need scripting or who are scared of having to program can safely continue with their life – just ignore the scripting-tabs and you will be safe in your static world. But if you feel you are born to greatness and therefore you cannot be limited to predefined queries, then follow me on a exciting journey into the land of power-scripting.

The scripting extensions are split into a global script and a per-query script. The global script can be used to define shared functions or global variables that can be seen by all query-scripts. The init()function can also be used to change the configuration of the data-source itself. The per-query-scripts allow you to (a) customize the query string, (b) calculate the “additional fields”  information for the query-caching and (c) allows you to post-process the returned table-model.

The original “Custom-*-Datasources” always had the problem that you can only add one of these datasources per report or subreport. Now that we combine scripts with named queries, we can have a independent script for each of the queries and of course as many datasources as necessary.

The scripting backend uses the JSR-223 (javax.script) scripting system. By default we ship with both JavaScript and Groovy support. There are quite a few JSR-223 enabled languages out there, so if you have a favourite language, just drop it into the classpath on both the Pentaho Platform server and the Pentaho Report Designer and you are ready to code.

The picture above shows you the global script pane. For the two languages (JavaScript and Groovy) we ship with, we have a prepared template script that contains some documentation as well as the empty function declarations of the functions we expect to be able to call.

It is absolutely OK to delete any function you dont need – in that case Pentaho Reporting simply ignores that part. This will help you to keep your scripts small and simple. You can load scripts from external sources as well, but be aware that it is your responsibility to make these external scripts available to the report at runtime. So if you load a script from a local directory, you have to ensure that the same works on the BI-Server (or your production environment) too.

For the upcoming Pentaho Reporting 3.9.0 release, the Pentaho Metadata datasource is the only datasource with this sort of scripting. Once that design has proven itself and I have more feedback on how well it works, we will add this scripting system to all major data-sources in Pentaho Reporting 4.0.

PRD-3639: On Adding Version checks into Pentaho Reporting

From time to time I get support requests from users who try to create reports in the latest Pentaho Report Designer and then try to run it in a ancient installation of the Pentaho BI-Server. I already wrote about the limitations Einstein placed on us software developers, so lets all agree – it’s not possible and there will be no fix.

Along with a Voodoo-security change that obscures all passwords stored in a PRPT file, I now prevent the horse from ever drinking petrol again. I added a strict check while a report is parsed so that newer versions of a PRPT file cannot be opened in older servers. Whenever you try such a thing, you will now get a very clear error message telling you:

The report file you are trying to load was created with Pentaho Reporting 12.0 but you are trying to run it with Pentaho Reporting 3.9.0. Please update your reporting installation to match the report designer that was used to create this file.

and if you just missed a patch release, you will be able to continue, but get a warning urging you to upgrade.

The report file you are trying to load was created with Pentaho Reporting 3.9.5 but you are trying to run it with Pentaho Reporting 3.9.0. Your reporting engine version may not have all features or bug-fixes required to display this report properly.

Older releases that did not store version information in a usable format inside the PRPT files will not be checked. This check only applies to released versions. All development versions do not store any version information and do not validate version information they may find in the files. If you are using development versions I assume you know what you are doing.

For the upcoming 3.9.0 release, you won’t see any impact. The first time this check will be hit will be Pentaho Reporting 4.0 later next year and from that moment on I will never have to talk about poisoning my poor old horse with petrol again (it prefers bio-diesel anyway).

A Message from the Trenches ..

Over the last 6 weeks I finally found the time to dive into the crosstab related development. Crosstabbing as a data manipulation exercise is a rather easy and straight forward as an algorithm. Printing simple crosstabs without regard for user defined calculations is not hard either – if you are willing to stick to the simple model for eternity. But integrating the crosstabbing code so that the layouting uses our existing capabilities of style- and attribute-expressions, flexible layouts and decent scalability even when processing massive amounts of data – that takes more than a two-weeks prototype hacking.

Step zero on my quest happened earlier this year when I wrote a testing framework to validate the layouter output on a very low level. The idea of this testing framework is based on “golden samples” – known good data-dumps of the layout results. For every change I make, I now can validate if and how the final layout is affected. Sometimes I want to see changes (especially when fixing bugs), but for most parts I want to add new functions without breaking existing reports. The testing framework helps me to detect changes in the layout that I did not foresee.

The framework shields me from the system components, like the local font renderer (a real bugger that changes with every update of the operating system) and from changes in the local font files (Arial on Mac is not the same as Arial on Windows, for instance).

Step one of implementing crosstabs then involved getting a true table layout into the reporting engine. Our reporting engine does is a banded engine in the tradition of the old COBOL and RPG/400 reporting tools. Each band is fairly separate from other bands. On paper you get the illusion of dealing with tables, as normally row after row of data is printed at the same horizontal position. But our primitive model cannot adjust for changes in the column sizes, as once a band is printed it cannot be altered any more.

This is similar to trying to create a table in a word processor by only using tabs. Once you try to put a overly long item into your ‘cell’, the layout goes all wrong, as previous and following rows do not alter their column sizes to match the large item.

But if we use a proper table, with real columns and rows, the problem goes away almost immediately. A table allows us to place elements relative to other elements in other rows and to maintain that relationship as long as we want. For small tables, this may be for the whole data-set, but similar to the “table-layout: fixed” CSS style attribute, we can also define a cut-off point after a certain number of rows and thus balancing the need for keeping the table flexible and the need of not buffering too much layout data.

During a long interlude during the summer time I was busy working on two rather large bug-fix releases (3.8.1 and 3.8.3) with no time whatsoever to do any new development. (I just managed to sneak in a week or two of table-layout related work.). And now, finally, since Mid-September I am back into the layout system. By eliminating all distractions – including writing articles here – I managed to get some private cuddle-up time with the rendering system.

The reporting engine comes from a banded background, and thus our existing layout system was built around the assumption that the world is easily consumable in banded chunks. With the Citrus-rewrite two years ago I opened the layouter to a more flexible world view by recreating the layout system as a CSS/DOM oriented layout system. But without a need for cross-band layouts, I still ended up introducing many assumptions that only work in a banded world.

In the old (3.x) layout system, global structures like groups, subreports and root-level bands are produced by the “Renderer” class. The renderer is the central point of the layout calculations and manages both the creation of the layout nodes and the calculation of the final layout. The contents of the bands themselves are then computed by a class named “DefaultLayoutProducer”. This model fails horribly if the banded layout is just a subset of a larger structure, like a table, for instance. The “DefaultLayoutProducer” is not aware of the outside model, and the “Renderer” does not care what the “DefaultLayoutProducer” does within his own band. I created the model of a completely dysfunctional family here.

With the new system in place, there is only one point that produces layout nodes. The model is no longer a collection of local sub-models but one big model with a global state. That not only simplifies the code, it also opens up a new set of capabilities.

So far, the layout system rewrite is nearly complete. The “golden sample” tests of the engine-core project are running fine on my box now, but some of the integration tests in the “testcases” project still fail. Once they work, I can rewire the layouter to accept global table definitions across groups. I can also finally open up the group layouts to support more layout options, and thus allow to print groups horizontally instead of vertically, or even print the header, group-body and footer side-by-side.

For the next few weeks, I will be back to bug-fixing for the upcoming 3.9.0 release. This release will contain mostly bug-fixes and will be shipped with the BI-Server 4.5.0.

A CDF based parameter viewer

At our community conference in Frascati yesterday I gave a talk on how to replace the old GWT report viewer with a slim CDF based report viewer.

Giving Granny a Face-Lifting

This is the slightly edited full-text version of this talk.

Are you tired of our trusted GWT report viewer

When we introduced Pentaho Reporting 3.5, one of the major new features we added was the ability to run Pentaho Reports directly in the BI-Server without the need for writing or generating XActions. This feature instantly removed the number one headache our users had with reports on the server – the need for an additional runtime file, the XAction. The file contained the same information they already specified in the report designer. But to edit the file later, they would need to go into a totally different editor to do some sort of magical programming. Ordinary business users could and would not do that.

We created the report viewer with Google’s Webtoolkit, which promised an easy way of creating rich JavaScript UIs without having to resort to homegrown libraries.

Where has all the love gone?

There are no simple solutions. GWT turned out to create monolithic code that lived on its own island. GWT applications were nearly impossible to extend for normal (non-GWT-using) web-developers. The code was hard to debug (as normal debugger like Firebug cannot help much to make sense out of the autogenerated code). And GWT was slow. Slow to compile and slow to run (compared to other JavaScript alternatives).

Our partners had no particular love for GWT, and bit by bit we grew tired of it as well. Over time we realized that GWT would not be the silver bullet.

Our report viewer implementation also suffered from a few deficiencies. There is no easy way to create alternative layouts for the parameter UI, the date parameter input is simple and limited and long parameter texts can cause problems.

In the mean time, our consultants and partners worked around these limitations by using  CDF to build custom parameter pages for reports.

And then replace the GWT Viewer
… with a normal CDF-Dashboard?

CDF instantly solved several of their problems. In CDF you are free to design your parameter page the way you like it. CDF is by far more flexible than the GWT viewer (as you have simple code with loads of extension points available). CDF is made to create interactive dashboards with a rich user experience.

And CDF comes with a PRPT component, so it already knows how to drive a report.

… but you know you pay the price

But once again: There is no magic silver bullet. Writing an extra CDF file suffers from the same problem that XActions have. Suddenly you have to duplicate the parameter information from the report designer into the dashboard. You have to replicate all parameter dependencies.

CDF requires some technical skill to create a dashboard. Similar to XActions, the report designer cannot read a CDF file for editing (and given that CDF is JavaScript that you can program in any way you want, there is no way we could ever hope to build such an editor). CDFs duplicate the information that is already in the report, and any change to the report parameters must also be applied to the dashboard.

Creating CDF files is not free – your IT department or an external consultant has to do it for the ordinary business user. So from a business point of view, that sort of “premium parameter viewer” would only be feasible for critical reports where you can justify the high development and maintenance costs.

To make CDF work for all reports, we need to solve the “duplication of parameter information” problem.

A simple solution:
Let PRPT’s information drive CDF

Mike D’Amour architected the GWT reporting plugin as a RESTful service. We do intentionally avoided all of the GWT server side libraries to communicate with the server. The report viewer uses standard HTTP-GET or PUT calls to query the server, which responds either with content or XML files.

The report viewer only uses the server’s public URLs to get information about the report’s current parameters. In fact, anyone can call these URLs to get the same information. We do not have any limits on what kind of client you use to interact with the reporting plugin.

Flow, Report Viewer, Flow

The GWT report viewer uses a very simple algorithm to communicate with the server.

  1. First we query the parameter XML by passing all known parameter to the server.
  2. We parse the XML and render the UI
  3. We check whether the server found any problems with the parameter we given. If everything is OK, we ask the server for the report content.
  4. We wait for input from the user.
  5. On any new input or if the user hits submit, we go back to the start and query the parameter-XML again.

You can find more information about this cycle in one of my previous postings.

Action time

Click here to see a simple form-based parameter page. This demonstrates how to communicate with a Pentaho BI-server and shows the basic steps to parametrize an existing report. If you see a login window in the lower frame, then login and restart the demo.

The form itself is simple:

<html>
<head>
  <title>Report Viewer

<body>
<div style="border: 1px solid black; margin-bottom: 20px">
  <h1>Parameter Input
  <hr />
  <form action="http://demo.pentaho.com/pentaho/content/reporting" method="GET" target="viewer">
    <div>
      Report to load:
      <input name="solution" value="steel-wheels"/>
      <input name="path" value="reports"/>
      <input name="name" value="Invoice Statements.prpt"/>
    </div> 
    <h2>System parameter</h2>
  <label for="renderMode">Render Mode</label> 
  <select id="renderMode" name="renderMode" size="1"> 
       <option value="REPORT">REPORT</option> 
       <option value="XML">XML</option> 
  </select>
  <label for="output-target">Output Target</label> 
  <select id="output-target" name="output-target" size="1"> 
     <option value="table/html;page-mode=stream">Single page HTML (table/html;page-mode=stream)</option> 
     <option value="table/html;page-mode=page">Paginated page HTML (table/html;page-mode=stream)</option>
   </select>
  <h2>User parameter</h2>
  <label for="Customer">Customer</label>
  <input type="text" id="Customer" name="CustomerNo" value="242"/> 
  <label for="ReportStamp">Report Stamp</label> 
  <input type="text" id="ReportStamp" name="Report Stamp" value="Review"/>
  <div/> 
  <input type="submit"value="Go!"/>
</form>
  <h1>Report</h1>
  <iframe name="viewer" width="100%" height="50%"/>
</div>
</body>
</html>

In the first section we setup a few system level parameter. The form contains the path to the report we want to render (expressed as Pentaho Standard Triple – solution, path, name), the renderMode (that defines whether we query parameter information (XML) or whether we render the report (REPORT)) and finally the output target that defines what output the server should generate.

This form is already a valid method to supply parameter to the reporting plugin and shows that there is no magic involved.

Now, lets do the same again … in CDF

Jordan Ganoff wrote a prototype of a CDF dashboard reads the parameter information from the Pentaho reporting plugin to construct a dashboard.

Due to some security restriction in the JavaScript execution in browsers (same origin rule) I cannot provide a one-click example. Download the zip file and copy the contents into your BI-Server’s solution directory.

You can then switch to the new parameter viewer by replacing the “reportviewer/report.html” part with “web/reportviewer.html”.

So the original URL for your GWT report viewer

http://localhost:8080/pentaho/content/reporting/reportviewer/report.html?solution=steel-wheels&path=%2Freports&name=Invoice+Statements.prpt&locale=en_US

becomes this URL

http://localhost:8080/pentaho/content/reporting/resources/web/reportviewer.html?solution=steel-wheels&path=%2Freports&name=Invoice+Statements.prpt&locale=en_US

(Download the CDF based report viewer)

Lets hand over the microphone to Jordan to explain the architecture of this implementation.

Here’s a quick introduction:

The new report viewer is a collection of CDF components. You can follow
the logic starting in reportviewer.html’s load() function. We set up a
div to inject the prompt panel into and then call:

pentaho.common.prompting.createPromptPanel({
          destinationId: "promptPanel", 
          paramDefn: reportViewerParameterLookup(),
          refreshParamDefnCallback: reportViewerParameterLookup,
          extraComponents: [{type: "SubmitReportComponent", htmlObject: 'report-div'}]
});
    • destinationId: the element Id where the prompt panel will be injected into.

 

  • paramDefn: this is the parameter definition (parsed Parameter XML into an object)

 

 

  • refreshParamDefnCallback: function called whenever a parameter has changed its value. For now we will hit the ParameterXmlContentGenerator for a new parameter xml and parse it to a parameter definition every time a parameter value has changed.

 

 

  • extraComponents: Any additional cdf components you’d like initialized. I have a quick prototype component defined in reportviewer.html called “SubmitReportComponent” that will listen for the parameter “submit” to change (which is fired by the submit button on the prompt panel). When this parameter changes the update() method of the SubmitReportComponent is called. We build a valid reporting url and set the iframe’s src to that url to load the report. Pretty straight forward and is exactly how the existing report viewer works today.

 

Core architecture: the parameter panel itself is a CDF component which defines a layout for all widgets provided. The submit button widget is configured to listen to all CDF components created from any non-hidden parameter. Any time the submit component receives a parameter change event its update() function is called and we check if all parameters are valid. If they are all valid we fire a change event for the parameter “submit” which someone else can listen to and do what they need to do.

Once all components are created we pass them to CDF to initialize them. This will register them CDF and eventually calls update() on all components. It’s this update() that will inject the CDF components into the page.

That being said, the thinking here is that in the future anyone can create a prompting panel from a parameter xml and provide their own callback when the submit button is clicked (or any parameter is changed for that matter).

Extension points:

    • The mapping for widget types to parameter types is done in: parameter-prompting-builders.js in the object: pentaho.common.prompting.builders.ParameterWidgetBuilder.paramTypeToWidgetMapping. That’s the internal widget mapping that we use when looking up a widget type in pentaho.common.prompting.builders.ParameterWidgetBuilder.lookupCDFWidgetBuilder.

 

  • Most of the javascript is structured so it can be extended at any point. Hopefully I provided enough functions that can be overridden to make it easy for a hacker to tear it apart!

 

One of the items I’m expecting to change is the delegation of widget creation to layout panels. I’d like to pass the parameter definition to the layout panel and let it create any widgets it needs instead of creating them ahead of time for each non-hidden parameter. This should be a bit more extensible.

Thank you Jordan for creating this amazing marriage of dashboards and reporting!

Even though this is a prototype build, it proves the point that we can tweak CDF to become the new report viewer.

At this point, Gretchen asked me whether this will have any impact on any existing integration with the reporting plugin. Changing server side URLs is always a bad thing and Gretchen voiced the concerns of all our OEMs and partners who built an existing solution for the reporting plugin.

The changes we propose are purely client side changes. There is no need nor any plan to change server side APIs or change URLs or returned formats or any server side behaviour. You are safe.

As far as I know, this new report viewer will be part of the upcoming Pentaho BI-Server 4.5.

A world of new possibilities

This new report viewer adds a bunch of new possibilities to our reporting system. We can easily extend it, we can add new components and new ways of parametrizing reports. Parameters can look sexy and can be visually rich.

With CDF’s flexibility and ability to style the dashboard in any way you want, we can produce more flexible layouts that match existing corporate style guidelines of our customers. With the ability to quickly integrate new components we solve more business cases.

On top of my head, I can imagine a Google-Maps widget to select locations, clickable charts to select customers or product lines. Our Drill-linking can be used in new ways. Why not use a report or analyzer view to present your selection?

With CDF and the power of JavaScript in our hands we can also easily show or hide parameter as needed, or even produce selective parameter input paths based on the user’s selection.

With better parametrization, our reporting system will look more sexy, which means more people will be willing to use it. More customers is always a good thing.

But to make this bird fly, we need all the help that we can get.

How can you help to get it right from the start?

First and foremost: Give us your requirements. For us from engineering it is hard to know what obstacles you hit in the fields or what your clients ask you to solve. So instead of producing a system that is based on our limited “Steel-wheels” world, I would see an open discussion that comes up with a set of true requirements that match what you see from your customers every day.

If you already use CDF to drive parametrized reports, tell us all about it? What is the problem you are solving here? What extra work did you have to do to make it work for your use case? If you wrote some parameter input that might be useful for other: Would you share it with us?

Help us to expand the parameter definition dialog in the report designer so that we can easily add additional attributes to the report parameters. This way we can prototype faster and you can use this to pass additional configuration settings to the CDF parameter viewer.

And if you cannot give anything, then at least test what we write with your data. The earlier you test, the better we will be able to react to the results. And please, please: Test the early builds as well. If the product is already in RC (Release-Candidate) state then it is very hard to make major architectural changes. Your tests of the early builds help us to know whether we move in the right direction and allow us to correct our course when we are not.

So lets start the discussion here and now

What would you need from a parameter viewer? What requirements did you meet that forced you to implement your own dashboard-parameter-viewer?

Creating your own Parameter UI for the Pentaho BI-Server

Our BI-Server ships with a default GWT parameter UI for the parameter defined on a report. If you had been around for a while, then you will remember the sigh of relieve when we freed everyone from the tyranny of XActions for running simple reports. Since then the parameter capabilities of the reporting module grew and grew with every release making these parameters easier to use than the XActions’s original design.

GWT is nice for a old and grumpy Java developer like me, as I do not have to worry about JavaScript (untyped, for heaven’s sake, untyped!). But the hardcore nerds like Pedro “JavaScript is my life” Alves do not like the monolithic garbage the GWT compiler spits out. To slow, to heavy, and foremost: Not really extensible unless you recompile the beast for each change. And worst of all: I agree to these complaints.

However, there is a silver lining on the horizon. Our architecture is open, so you are able to replace the GWT code with your own magic with no problems at all. And here is how you would do it:

Basics: The parameter UI architecture

The parameter UI works as a REST service. The server receives calls and sends responses based on the parameters given, without holding any server side state elsewhere.

(1) The browser loads the report.html page and initializes the GWT parameter application (GWT from now on).

(2) GWT calls the server’s report handler with “renderMode=PARAMETER” or “renderMode=XML”. If there are values for any of the parameters known, then these values are given on the URL using ordinary HTTP-GET or POST requests.

The URL that is called is something like this:

http://demo.pentaho.com/pentaho/content/reporting?renderMode=XML&solution=steel-wheels&path=%2Freports&name=Product+Sales.prpt

(3) The server responds with the parameter XML document. If the renderMode is XML, this document also contains the number of pages in the report. The server only returns page numbers if the parameter validate correctly and if the pagination does not cause any other errors.

(If you are logged into the demo server, call the URL from step 2 in your browser to see the XML document the server returns.)

(4) GWT creates a UI for all parameters based on the Parameter XML document. All information is given as attributes on the parameter. The parameter’s possible values and current value are given in that document as well. These sets of values can change if a other parameter changes.

(5) If all parameters validated correctly (according to the Parameter XML document), it now sends a request to retrieve the rendered report. Again, this is a ordinary HTTP-GET call with all parameters attached onto the URL.

http://demo.pentaho.com/pentaho/content/reporting?renderMode=REPORT&solution=steel-wheels&path=%2Freports&name=Product+Sales.prpt

(6) The Browser displays the report content in the IFrame below the GWT parameter UI.

(7) Paging through the report jumps back to step (6) and updates the report frame.

(8) Changing a parameter value jumps back to step (2) and updates the parameter information.

The mystical Parameter XML

The parameter XML document is a description of all known parameters that the reporting plugin understands. The same format is also used by the Analyzer component and you can even get parameter information in this format out of XActions.

<?xml version="1.0" encoding="UTF-8"?>
  <parameters accepted-page="-1" autoSubmitUI="true" is-prompt-needed="false" layout="vertical" page-count="1" paginate="true" subscribe="false">
    <parameter is-list="true" is-mandatory="false" is-multi-select="false" is-strict="true" name="PROD_LINE" type="java.lang.String">
      <attribute name="role" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="user"/>
      <attribute name="parameter-layout" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="horizontal"/>
      <attribute name="parameter-render-type" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="togglebutton"/>
      <attribute name="label" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="Line"/>
      <attribute name="mandatory" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="true"/>
      <values>
        <value label="Classic Cars" null="false" selected="true" type="java.lang.String" value="Classic Cars"/>
        <value label="Motorcycles" null="false" selected="false" type="java.lang.String" value="Motorcycles"/>
        <value label="Ships" null="false" selected="false" type="java.lang.String" value="Ships"/>
        <value label="Planes" null="false" selected="false" type="java.lang.String" value="Planes"/>
      </values>
    </parameter>
    <parameter is-list="true" is-mandatory="false" is-multi-select="false" is-strict="true" name="PROD_CODE" type="java.lang.String">
      <attribute name="role" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="user"/>
      <attribute name="parameter-visible-items" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="6"/>
      <attribute name="parameter-render-type" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="list"/>
      <attribute name="label" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="Product"/>
      <attribute name="mandatory" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="true"/>
      <values>
        <value label="1952 Alpine Renault 1300" null="false" selected="true" type="java.lang.String" value="S10_1949"/>
        <value label="1972 Alfa Romeo GTA" null="false" selected="false" type="java.lang.String" value="S10_4757"/>
        <value label="1962 LanciaA Delta 16V" null="false" selected="false" type="java.lang.String" value="S10_4962"/>
        <!-- loads of parameter values removed -->
        <value label="1961 Chevrolet Impala" null="false" selected="false" type="java.lang.String" value="S24_4620"/>
        <value label="1982 Camaro Z28" null="false" selected="false" type="java.lang.String" value="S700_2824"/>
      </values>
    </parameter>
    <parameter is-list="true" is-mandatory="true" is-multi-select="false" is-strict="true" name="output-target" type="java.lang.String">
      <attribute name="role" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="system"/>
      <attribute name="preferred" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="true"/>
      <attribute name="parameter-group" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="parameters"/>
      <attribute name="parameter-group-label" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="Report Parameters"/>
      <attribute name="label" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="Output Type"/>
      <attribute name="parameter-render-type" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="dropdown"/>
      <attribute name="hidden" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="false"/>
      <values>
       <value label="HTML (Paginated)" null="false" selected="true" type="java.lang.String" value="table/html;page-mode=page"/>
       <value label="HTML (Single Page)" null="false" selected="false" type="java.lang.String" value="table/html;page-mode=stream"/>
       <value label="PDF" null="false" selected="false" type="java.lang.String" value="pageable/pdf"/>
       <value label="Excel" null="false" selected="false" type="java.lang.String" value="table/excel;page-mode=flow"/>
       <value label="Excel 2007" null="false" selected="false" type="java.lang.String" value="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet;page-mode=flow"/> 
       <value label="Comma Separated Value" null="false" selected="false" type="java.lang.String" value="table/csv;page-mode=stream"/> 
       <value label="Rich-Text-Format" null="false" selected="false" type="java.lang.String" value="table/rtf;page-mode=flow"/>
       <value label="Text" null="false" selected="false" type="java.lang.String" value="pageable/text"/>
     </values>
   </parameter>
   <!-- many many more parameter -->
 </parameters>

To create a usable UI, you need to look at some critical information in the parameter UI.

The parameter element itself contain the most critical information for a parameter.

<parameter is-list="true" is-mandatory="true" is-multi-select="false" is-strict="true" name="output-target" type="java.lang.String">

The parameter name gives the parameters internal name. Usually each parameter also has a “label” defined as attribute. The “label” is shown to the user, the “name” is sent to the server. The parameter element also indicates whether a parameter is a list or plain parameter. List parameter have a predefined list of values from which the user can choose from. Some list parameter allow the user to select multiple values (“is-multi-select”) or control whether the user can specify additional values (“is-strict”) or whether the parameter can be empty (“is-mandatory”). You can use this to perform client-side validity checks if you want to. But no matter what you do, the server checks the input anyway, as client side input is untrusted input. Not all parameter that are listed in the parameter document are shown to the user. Some are system level parameter that are only useful in certain situations (like all subscription parameter). A parameter is hidden if the hidden attribute is set to true.

 <attribute name="hidden" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="false"/>

How a parameter is presented to the user is defined by the “parameter-render-type”. Right now, this attribute can have the following values:

  • dropdown
  • list
  • radio
  • checkbox
  • togglebutton
  • textbox
  • datepicker
  • multi-line
 <attribute name="parameter-render-type" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="dropdown"/>

Data formats for sending and receiving data

Sending parameter information over the internet can be a funny exercise. HTTP only allows to send text, so objects like dates and numbers need to be encoded properly. The Pentaho Reporting plugin expects all parameters in a locale-independent standard format. Numbers must be encoded as decimal numbers in plain english format (point as decimal separator, no thousands separators) using the format string “#0.#########”. Dates and Times must be given in the standard ISO format using the format string “yyyy-MM-dd’T’HH:mm:ss.SSS” or (if encoded with timezone information “yyyy-MM-dd’T’HH:mm:ss.SSSZZZZ”). (Also see: “Its about time – better Date parameter handling in Pentaho Report Designer“) Multi-selection parameter must be given by repeating the parameter name for each value. The order of multi-selection values is taken into account when processing the parameters on the server side. Example:

../report?multi-select=Value1&multi-select=Value2

Now all we need is a bit of JavaScript magic on the client side to replace the grumpy old GWT parameter application with a lightweight and smooth pure JavaScript/AJAX solution. Stay tuned ..

Additional Reading Material