Category Archives: Parameter

Linking to a report with the same export type

When you create reports connected with each other by links, you want to stay in the same output mode as your source report. When viewing a PDF report, you want the linked report to be PDF too, for instance.

So how would you do that in Pentaho Reporting?

(1) You need to know your current export type.

When you export a report, each output type has a unique identifier similar to this one “pagable/pdf”. The identifier consists of

(a) the basic export type:
  * pagable for paginated reports or
  * table for reports exported as layout-tables
and
(b) the content type
  * pdf for PDF export
  * html for HTML
  * .. and so on.

The BI-Server uses the same identifiers in the reporting plugin to select the correct output target for your reports. The parameter for this is called “output-target” and is documented in the Pentaho Wiki.

You probably know about the “ISEXPORTTYPE” function. This formula function allows you to test for a specific output target when the report runs. To get the export type you now could write a ugly long formula with many nested IF functions.

Or you can use this small BSH-Function instead:

Object getValue()
{
  return runtime.getExportDescriptor();
}

to get the export descriptor string directly.

(2) You need to feed this export identifier into your links.

Use the Drill-Linking functionality to add a manual parameter to your link. Name this parameter “output-target” and link this to your BSH-Function. If your function is named “ExportType”, then you would write “=[ExportType]” into the value field.

 
With that connection made, your reports will now be linked with the same output-target as the source report. Set the “Hide Parameter UI” option if you want to link against the target file without having to manually submit parameter.

A CDF based parameter viewer

At our community conference in Frascati yesterday I gave a talk on how to replace the old GWT report viewer with a slim CDF based report viewer.

Giving Granny a Face-Lifting

This is the slightly edited full-text version of this talk.

Are you tired of our trusted GWT report viewer

When we introduced Pentaho Reporting 3.5, one of the major new features we added was the ability to run Pentaho Reports directly in the BI-Server without the need for writing or generating XActions. This feature instantly removed the number one headache our users had with reports on the server – the need for an additional runtime file, the XAction. The file contained the same information they already specified in the report designer. But to edit the file later, they would need to go into a totally different editor to do some sort of magical programming. Ordinary business users could and would not do that.

We created the report viewer with Google’s Webtoolkit, which promised an easy way of creating rich JavaScript UIs without having to resort to homegrown libraries.

Where has all the love gone?

There are no simple solutions. GWT turned out to create monolithic code that lived on its own island. GWT applications were nearly impossible to extend for normal (non-GWT-using) web-developers. The code was hard to debug (as normal debugger like Firebug cannot help much to make sense out of the autogenerated code). And GWT was slow. Slow to compile and slow to run (compared to other JavaScript alternatives).

Our partners had no particular love for GWT, and bit by bit we grew tired of it as well. Over time we realized that GWT would not be the silver bullet.

Our report viewer implementation also suffered from a few deficiencies. There is no easy way to create alternative layouts for the parameter UI, the date parameter input is simple and limited and long parameter texts can cause problems.

In the mean time, our consultants and partners worked around these limitations by using  CDF to build custom parameter pages for reports.

And then replace the GWT Viewer
… with a normal CDF-Dashboard?

CDF instantly solved several of their problems. In CDF you are free to design your parameter page the way you like it. CDF is by far more flexible than the GWT viewer (as you have simple code with loads of extension points available). CDF is made to create interactive dashboards with a rich user experience.

And CDF comes with a PRPT component, so it already knows how to drive a report.

… but you know you pay the price

But once again: There is no magic silver bullet. Writing an extra CDF file suffers from the same problem that XActions have. Suddenly you have to duplicate the parameter information from the report designer into the dashboard. You have to replicate all parameter dependencies.

CDF requires some technical skill to create a dashboard. Similar to XActions, the report designer cannot read a CDF file for editing (and given that CDF is JavaScript that you can program in any way you want, there is no way we could ever hope to build such an editor). CDFs duplicate the information that is already in the report, and any change to the report parameters must also be applied to the dashboard.

Creating CDF files is not free – your IT department or an external consultant has to do it for the ordinary business user. So from a business point of view, that sort of “premium parameter viewer” would only be feasible for critical reports where you can justify the high development and maintenance costs.

To make CDF work for all reports, we need to solve the “duplication of parameter information” problem.

A simple solution:
Let PRPT’s information drive CDF

Mike D’Amour architected the GWT reporting plugin as a RESTful service. We do intentionally avoided all of the GWT server side libraries to communicate with the server. The report viewer uses standard HTTP-GET or PUT calls to query the server, which responds either with content or XML files.

The report viewer only uses the server’s public URLs to get information about the report’s current parameters. In fact, anyone can call these URLs to get the same information. We do not have any limits on what kind of client you use to interact with the reporting plugin.

Flow, Report Viewer, Flow

The GWT report viewer uses a very simple algorithm to communicate with the server.

  1. First we query the parameter XML by passing all known parameter to the server.
  2. We parse the XML and render the UI
  3. We check whether the server found any problems with the parameter we given. If everything is OK, we ask the server for the report content.
  4. We wait for input from the user.
  5. On any new input or if the user hits submit, we go back to the start and query the parameter-XML again.

You can find more information about this cycle in one of my previous postings.

Action time

Click here to see a simple form-based parameter page. This demonstrates how to communicate with a Pentaho BI-server and shows the basic steps to parametrize an existing report. If you see a login window in the lower frame, then login and restart the demo.

The form itself is simple:

<html>
<head>
  <title>Report Viewer

<body>
<div style="border: 1px solid black; margin-bottom: 20px">
  <h1>Parameter Input
  <hr />
  <form action="http://demo.pentaho.com/pentaho/content/reporting" method="GET" target="viewer">
    <div>
      Report to load:
      <input name="solution" value="steel-wheels"/>
      <input name="path" value="reports"/>
      <input name="name" value="Invoice Statements.prpt"/>
    </div> 
    <h2>System parameter</h2>
  <label for="renderMode">Render Mode</label> 
  <select id="renderMode" name="renderMode" size="1"> 
       <option value="REPORT">REPORT</option> 
       <option value="XML">XML</option> 
  </select>
  <label for="output-target">Output Target</label> 
  <select id="output-target" name="output-target" size="1"> 
     <option value="table/html;page-mode=stream">Single page HTML (table/html;page-mode=stream)</option> 
     <option value="table/html;page-mode=page">Paginated page HTML (table/html;page-mode=stream)</option>
   </select>
  <h2>User parameter</h2>
  <label for="Customer">Customer</label>
  <input type="text" id="Customer" name="CustomerNo" value="242"/> 
  <label for="ReportStamp">Report Stamp</label> 
  <input type="text" id="ReportStamp" name="Report Stamp" value="Review"/>
  <div/> 
  <input type="submit"value="Go!"/>
</form>
  <h1>Report</h1>
  <iframe name="viewer" width="100%" height="50%"/>
</div>
</body>
</html>

In the first section we setup a few system level parameter. The form contains the path to the report we want to render (expressed as Pentaho Standard Triple – solution, path, name), the renderMode (that defines whether we query parameter information (XML) or whether we render the report (REPORT)) and finally the output target that defines what output the server should generate.

This form is already a valid method to supply parameter to the reporting plugin and shows that there is no magic involved.

Now, lets do the same again … in CDF

Jordan Ganoff wrote a prototype of a CDF dashboard reads the parameter information from the Pentaho reporting plugin to construct a dashboard.

Due to some security restriction in the JavaScript execution in browsers (same origin rule) I cannot provide a one-click example. Download the zip file and copy the contents into your BI-Server’s solution directory.

You can then switch to the new parameter viewer by replacing the “reportviewer/report.html” part with “web/reportviewer.html”.

So the original URL for your GWT report viewer

http://localhost:8080/pentaho/content/reporting/reportviewer/report.html?solution=steel-wheels&path=%2Freports&name=Invoice+Statements.prpt&locale=en_US

becomes this URL

http://localhost:8080/pentaho/content/reporting/resources/web/reportviewer.html?solution=steel-wheels&path=%2Freports&name=Invoice+Statements.prpt&locale=en_US

(Download the CDF based report viewer)

Lets hand over the microphone to Jordan to explain the architecture of this implementation.

Here’s a quick introduction:

The new report viewer is a collection of CDF components. You can follow
the logic starting in reportviewer.html’s load() function. We set up a
div to inject the prompt panel into and then call:

pentaho.common.prompting.createPromptPanel({
          destinationId: "promptPanel", 
          paramDefn: reportViewerParameterLookup(),
          refreshParamDefnCallback: reportViewerParameterLookup,
          extraComponents: [{type: "SubmitReportComponent", htmlObject: 'report-div'}]
});
    • destinationId: the element Id where the prompt panel will be injected into.

 

  • paramDefn: this is the parameter definition (parsed Parameter XML into an object)

 

 

  • refreshParamDefnCallback: function called whenever a parameter has changed its value. For now we will hit the ParameterXmlContentGenerator for a new parameter xml and parse it to a parameter definition every time a parameter value has changed.

 

 

  • extraComponents: Any additional cdf components you’d like initialized. I have a quick prototype component defined in reportviewer.html called “SubmitReportComponent” that will listen for the parameter “submit” to change (which is fired by the submit button on the prompt panel). When this parameter changes the update() method of the SubmitReportComponent is called. We build a valid reporting url and set the iframe’s src to that url to load the report. Pretty straight forward and is exactly how the existing report viewer works today.

 

Core architecture: the parameter panel itself is a CDF component which defines a layout for all widgets provided. The submit button widget is configured to listen to all CDF components created from any non-hidden parameter. Any time the submit component receives a parameter change event its update() function is called and we check if all parameters are valid. If they are all valid we fire a change event for the parameter “submit” which someone else can listen to and do what they need to do.

Once all components are created we pass them to CDF to initialize them. This will register them CDF and eventually calls update() on all components. It’s this update() that will inject the CDF components into the page.

That being said, the thinking here is that in the future anyone can create a prompting panel from a parameter xml and provide their own callback when the submit button is clicked (or any parameter is changed for that matter).

Extension points:

    • The mapping for widget types to parameter types is done in: parameter-prompting-builders.js in the object: pentaho.common.prompting.builders.ParameterWidgetBuilder.paramTypeToWidgetMapping. That’s the internal widget mapping that we use when looking up a widget type in pentaho.common.prompting.builders.ParameterWidgetBuilder.lookupCDFWidgetBuilder.

 

  • Most of the javascript is structured so it can be extended at any point. Hopefully I provided enough functions that can be overridden to make it easy for a hacker to tear it apart!

 

One of the items I’m expecting to change is the delegation of widget creation to layout panels. I’d like to pass the parameter definition to the layout panel and let it create any widgets it needs instead of creating them ahead of time for each non-hidden parameter. This should be a bit more extensible.

Thank you Jordan for creating this amazing marriage of dashboards and reporting!

Even though this is a prototype build, it proves the point that we can tweak CDF to become the new report viewer.

At this point, Gretchen asked me whether this will have any impact on any existing integration with the reporting plugin. Changing server side URLs is always a bad thing and Gretchen voiced the concerns of all our OEMs and partners who built an existing solution for the reporting plugin.

The changes we propose are purely client side changes. There is no need nor any plan to change server side APIs or change URLs or returned formats or any server side behaviour. You are safe.

As far as I know, this new report viewer will be part of the upcoming Pentaho BI-Server 4.5.

A world of new possibilities

This new report viewer adds a bunch of new possibilities to our reporting system. We can easily extend it, we can add new components and new ways of parametrizing reports. Parameters can look sexy and can be visually rich.

With CDF’s flexibility and ability to style the dashboard in any way you want, we can produce more flexible layouts that match existing corporate style guidelines of our customers. With the ability to quickly integrate new components we solve more business cases.

On top of my head, I can imagine a Google-Maps widget to select locations, clickable charts to select customers or product lines. Our Drill-linking can be used in new ways. Why not use a report or analyzer view to present your selection?

With CDF and the power of JavaScript in our hands we can also easily show or hide parameter as needed, or even produce selective parameter input paths based on the user’s selection.

With better parametrization, our reporting system will look more sexy, which means more people will be willing to use it. More customers is always a good thing.

But to make this bird fly, we need all the help that we can get.

How can you help to get it right from the start?

First and foremost: Give us your requirements. For us from engineering it is hard to know what obstacles you hit in the fields or what your clients ask you to solve. So instead of producing a system that is based on our limited “Steel-wheels” world, I would see an open discussion that comes up with a set of true requirements that match what you see from your customers every day.

If you already use CDF to drive parametrized reports, tell us all about it? What is the problem you are solving here? What extra work did you have to do to make it work for your use case? If you wrote some parameter input that might be useful for other: Would you share it with us?

Help us to expand the parameter definition dialog in the report designer so that we can easily add additional attributes to the report parameters. This way we can prototype faster and you can use this to pass additional configuration settings to the CDF parameter viewer.

And if you cannot give anything, then at least test what we write with your data. The earlier you test, the better we will be able to react to the results. And please, please: Test the early builds as well. If the product is already in RC (Release-Candidate) state then it is very hard to make major architectural changes. Your tests of the early builds help us to know whether we move in the right direction and allow us to correct our course when we are not.

So lets start the discussion here and now

What would you need from a parameter viewer? What requirements did you meet that forced you to implement your own dashboard-parameter-viewer?

Creating your own Parameter UI for the Pentaho BI-Server

Our BI-Server ships with a default GWT parameter UI for the parameter defined on a report. If you had been around for a while, then you will remember the sigh of relieve when we freed everyone from the tyranny of XActions for running simple reports. Since then the parameter capabilities of the reporting module grew and grew with every release making these parameters easier to use than the XActions’s original design.

GWT is nice for a old and grumpy Java developer like me, as I do not have to worry about JavaScript (untyped, for heaven’s sake, untyped!). But the hardcore nerds like Pedro “JavaScript is my life” Alves do not like the monolithic garbage the GWT compiler spits out. To slow, to heavy, and foremost: Not really extensible unless you recompile the beast for each change. And worst of all: I agree to these complaints.

However, there is a silver lining on the horizon. Our architecture is open, so you are able to replace the GWT code with your own magic with no problems at all. And here is how you would do it:

Basics: The parameter UI architecture

The parameter UI works as a REST service. The server receives calls and sends responses based on the parameters given, without holding any server side state elsewhere.

(1) The browser loads the report.html page and initializes the GWT parameter application (GWT from now on).

(2) GWT calls the server’s report handler with “renderMode=PARAMETER” or “renderMode=XML”. If there are values for any of the parameters known, then these values are given on the URL using ordinary HTTP-GET or POST requests.

The URL that is called is something like this:

http://demo.pentaho.com/pentaho/content/reporting?renderMode=XML&solution=steel-wheels&path=%2Freports&name=Product+Sales.prpt

(3) The server responds with the parameter XML document. If the renderMode is XML, this document also contains the number of pages in the report. The server only returns page numbers if the parameter validate correctly and if the pagination does not cause any other errors.

(If you are logged into the demo server, call the URL from step 2 in your browser to see the XML document the server returns.)

(4) GWT creates a UI for all parameters based on the Parameter XML document. All information is given as attributes on the parameter. The parameter’s possible values and current value are given in that document as well. These sets of values can change if a other parameter changes.

(5) If all parameters validated correctly (according to the Parameter XML document), it now sends a request to retrieve the rendered report. Again, this is a ordinary HTTP-GET call with all parameters attached onto the URL.

http://demo.pentaho.com/pentaho/content/reporting?renderMode=REPORT&solution=steel-wheels&path=%2Freports&name=Product+Sales.prpt

(6) The Browser displays the report content in the IFrame below the GWT parameter UI.

(7) Paging through the report jumps back to step (6) and updates the report frame.

(8) Changing a parameter value jumps back to step (2) and updates the parameter information.

The mystical Parameter XML

The parameter XML document is a description of all known parameters that the reporting plugin understands. The same format is also used by the Analyzer component and you can even get parameter information in this format out of XActions.

<?xml version="1.0" encoding="UTF-8"?>
  <parameters accepted-page="-1" autoSubmitUI="true" is-prompt-needed="false" layout="vertical" page-count="1" paginate="true" subscribe="false">
    <parameter is-list="true" is-mandatory="false" is-multi-select="false" is-strict="true" name="PROD_LINE" type="java.lang.String">
      <attribute name="role" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="user"/>
      <attribute name="parameter-layout" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="horizontal"/>
      <attribute name="parameter-render-type" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="togglebutton"/>
      <attribute name="label" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="Line"/>
      <attribute name="mandatory" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="true"/>
      <values>
        <value label="Classic Cars" null="false" selected="true" type="java.lang.String" value="Classic Cars"/>
        <value label="Motorcycles" null="false" selected="false" type="java.lang.String" value="Motorcycles"/>
        <value label="Ships" null="false" selected="false" type="java.lang.String" value="Ships"/>
        <value label="Planes" null="false" selected="false" type="java.lang.String" value="Planes"/>
      </values>
    </parameter>
    <parameter is-list="true" is-mandatory="false" is-multi-select="false" is-strict="true" name="PROD_CODE" type="java.lang.String">
      <attribute name="role" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="user"/>
      <attribute name="parameter-visible-items" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="6"/>
      <attribute name="parameter-render-type" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="list"/>
      <attribute name="label" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="Product"/>
      <attribute name="mandatory" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="true"/>
      <values>
        <value label="1952 Alpine Renault 1300" null="false" selected="true" type="java.lang.String" value="S10_1949"/>
        <value label="1972 Alfa Romeo GTA" null="false" selected="false" type="java.lang.String" value="S10_4757"/>
        <value label="1962 LanciaA Delta 16V" null="false" selected="false" type="java.lang.String" value="S10_4962"/>
        <!-- loads of parameter values removed -->
        <value label="1961 Chevrolet Impala" null="false" selected="false" type="java.lang.String" value="S24_4620"/>
        <value label="1982 Camaro Z28" null="false" selected="false" type="java.lang.String" value="S700_2824"/>
      </values>
    </parameter>
    <parameter is-list="true" is-mandatory="true" is-multi-select="false" is-strict="true" name="output-target" type="java.lang.String">
      <attribute name="role" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="system"/>
      <attribute name="preferred" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="true"/>
      <attribute name="parameter-group" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="parameters"/>
      <attribute name="parameter-group-label" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="Report Parameters"/>
      <attribute name="label" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="Output Type"/>
      <attribute name="parameter-render-type" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="dropdown"/>
      <attribute name="hidden" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="false"/>
      <values>
       <value label="HTML (Paginated)" null="false" selected="true" type="java.lang.String" value="table/html;page-mode=page"/>
       <value label="HTML (Single Page)" null="false" selected="false" type="java.lang.String" value="table/html;page-mode=stream"/>
       <value label="PDF" null="false" selected="false" type="java.lang.String" value="pageable/pdf"/>
       <value label="Excel" null="false" selected="false" type="java.lang.String" value="table/excel;page-mode=flow"/>
       <value label="Excel 2007" null="false" selected="false" type="java.lang.String" value="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet;page-mode=flow"/> 
       <value label="Comma Separated Value" null="false" selected="false" type="java.lang.String" value="table/csv;page-mode=stream"/> 
       <value label="Rich-Text-Format" null="false" selected="false" type="java.lang.String" value="table/rtf;page-mode=flow"/>
       <value label="Text" null="false" selected="false" type="java.lang.String" value="pageable/text"/>
     </values>
   </parameter>
   <!-- many many more parameter -->
 </parameters>

To create a usable UI, you need to look at some critical information in the parameter UI.

The parameter element itself contain the most critical information for a parameter.

<parameter is-list="true" is-mandatory="true" is-multi-select="false" is-strict="true" name="output-target" type="java.lang.String">

The parameter name gives the parameters internal name. Usually each parameter also has a “label” defined as attribute. The “label” is shown to the user, the “name” is sent to the server. The parameter element also indicates whether a parameter is a list or plain parameter. List parameter have a predefined list of values from which the user can choose from. Some list parameter allow the user to select multiple values (“is-multi-select”) or control whether the user can specify additional values (“is-strict”) or whether the parameter can be empty (“is-mandatory”). You can use this to perform client-side validity checks if you want to. But no matter what you do, the server checks the input anyway, as client side input is untrusted input. Not all parameter that are listed in the parameter document are shown to the user. Some are system level parameter that are only useful in certain situations (like all subscription parameter). A parameter is hidden if the hidden attribute is set to true.

 <attribute name="hidden" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="false"/>

How a parameter is presented to the user is defined by the “parameter-render-type”. Right now, this attribute can have the following values:

  • dropdown
  • list
  • radio
  • checkbox
  • togglebutton
  • textbox
  • datepicker
  • multi-line
 <attribute name="parameter-render-type" namespace="http://reporting.pentaho.org/namespaces/engine/parameter-attributes/core" value="dropdown"/>

Data formats for sending and receiving data

Sending parameter information over the internet can be a funny exercise. HTTP only allows to send text, so objects like dates and numbers need to be encoded properly. The Pentaho Reporting plugin expects all parameters in a locale-independent standard format. Numbers must be encoded as decimal numbers in plain english format (point as decimal separator, no thousands separators) using the format string “#0.#########”. Dates and Times must be given in the standard ISO format using the format string “yyyy-MM-dd’T’HH:mm:ss.SSS” or (if encoded with timezone information “yyyy-MM-dd’T’HH:mm:ss.SSSZZZZ”). (Also see: “Its about time – better Date parameter handling in Pentaho Report Designer“) Multi-selection parameter must be given by repeating the parameter name for each value. The order of multi-selection values is taken into account when processing the parameters on the server side. Example:

../report?multi-select=Value1&multi-select=Value2

Now all we need is a bit of JavaScript magic on the client side to replace the grumpy old GWT parameter application with a lightweight and smooth pure JavaScript/AJAX solution. Stay tuned ..

Additional Reading Material

Integrating PHP/ASP/Ruby-Web-Applications with Pentaho Reporting

From time to time we get the same question on the Pentaho Reporting Forum:
“How do I integrate the Pentaho Bi-Server and reporting capabilities with my own (PHP/ASP/JSP/Ruby/whatever-) web-application?” Thrilled by the BI-Server’s blazing features, many users want to enrich their applications with rich reporting and analysis capabilities, and unlike the dinosaurs of the BI-Market, our Open-Source solution makes this easy and cost effective.

The techniques I am going to describe in this blog-post are no magic and can be applied to other web-application (like our Data-Integration server) as well.

First, you need to get a Pentaho system up and running. For now, and for the sake of easiness, lets stick with the standard Pre-Configured Installation (PCI). The Pentaho PCI is a Apache Tomcat server that runs the Pentaho BI-Server as its only web-application.

Installing the Pentaho BI-Server

  1. Download Pentaho
  2. Unpack the ZIP or tar.gz archive into a directory. On Linux or Unix systems, I would recommend something like “/opt” or “/usr/local/” as target. On Windows .. well, in that case you are just experimenting and not planing for production, I assume. So any folder will do.
  3. Setup a publish password

    You need that later for publishing reports or other files to the server. This is the major obstacle for most new users when trying to get their own content up and running.

  4. Edit the file “bi-server-ce/tomcat/webapps/pentaho/WEB-INF/web.xml”

    Change the parameter “fully-qualified-server-url” to the public URL of your server.

    
    
      fully-qualified-server-url
      http://localhost:8080/pentaho/
    
    
  5. Optional: If some other tomcat installation runs on your system, your port 8080 may be in use already.

    You can change the port in the file “bi-server-ce/tomcat/conf/server.xml”:

  6. Start the Hypersonic database
  7. Start the Pentaho BI-server
  8. Optional: Install the BI-Server as a system service.

You should now be able to direct your web-browser to the BI-Server and should see the login screen when you type http://localhost:8080/ into your browser’s address bar.

Integration by Direct Linking

Integrating a BI-Server into a existing PHP application is easy now. All you need is a running BI-Server instance that is accessible by your users. From now on, I will simply use the Pentaho Demo Server for our little integration project.

http://demo.pentaho.com

The Pentaho Demo Server is a slightly tuned down installation of the Pentaho BI-Server. Anything I am going to show will work in the same way with the standard community edition.

When refering to the two applications, I will use “application” for the PHP, ASP or Ruby application that wants to utilize the Pentaho Platform. I will use the term “Pentaho” for the Pentaho BI-Server.

The most primitive way of connection two web applications together is to make the application include Pentaho in an frame or by opening up the specific report or analysis view in a new window.


  
    My Web Application
  
  
    

    This file could have been generated by any server side program. Use your phantasy here.    

   

    Reports can be parametrized by adding the parameters for the report to the URL. The parameters must be properly URL encoded. Use UTF-8 as character encoding.    

   

    This shows how to open up a report in a new Window:          A report    

     

Inside the Pentaho Platform, each report has its own page that can be accessed independently of the Pentaho User Console. The sample above links to a report on the system, using the predefined username and password. Note that the username and password is given in cleartext. To make this more secure you will need to set up a Single-Sign-On solution.

Pentaho Enterprise Edition comes with support for setting up Single-Sign-On for CAS rather easily via some setup scripts that take care of the ugly work. If you are using community edition and are willing to spend the time, you can always set it up yourself.

Alternatively you can set up Pentaho to work with Active Directory or LDAP instead.

Here are some sites with documentation to get you started.

And don’t forget: The Pentaho Wiki also contains a load of information.

Internally, Pentaho uses Spring Security, so with enough time and patience you should be able to connect Pentaho to anything on this planet for authentication purposes.

Using Queries in Formulas in Pentaho Report Designer

One of the most powerful features of Pentaho Reporting is our ability to combine data from a multitude of datasources into a single report.

A report can be driven by a a query that reads an XML file. The results then parametrize a Web-Service. The web-service response then gets fed into a standard SQL query. And that finally drives the queries to a OLAP cube.

Sure, that example is a bit extreme, but possible out-of-the-box with Pentaho Reporting downloaded from SourceForge.

But we can do more: In Pentaho Reporting 3.8, you can also use two formula functions to query report datasources to incorporate the query-result into your report.

The SINGLEVALUEQUERY function retrieves a single values from the resultset. It always reads it from the first row of the result-set, either from the given column or simply taking the first column it finds.

SINGLEVALUEQUERY()
SINGLEVALUEQUERY([query:string]; [column:string])
SINGLEVALUEQUERY([query:string]; [column:string]; [querytimeout:integer])

As you can see, both column and query-timeout are optional parameter. The “query” given must be a valid report query – but more on that later.

The MULTIVALUEQUERY function works similar to the SINGLEVALUEQUERY formula function. But instead of retrieving just one value, it returns all values as an list.

In addition to the parameters of the SINGLEVALUEQUERY formula function, it also accepts a “query-limit” Integer-parameter, that restricts the number of values returned.

MULTIVALUEQUERY([query:string])
MULTIVALUEQUERY([query:string]; [column:string])
MULTIVALUEQUERY([query:string]; [column:string]; [querytimeout:integer])
MULTIVALUEQUERY([query:string]; [column:string]; [querytimeout:integer]; [limit:integer] )

So what is that ominous “query” parameter about?

In Pentaho Reporting, most data sources use logical names (or handles) to reference queries. Queries can be quite complex. SQL and MDX queries can be huge, and seeing queries printed out that cover several pages is rather normal. They are complex beasts that were difficult to write and even more difficult to keep free of errors. You don’t want to seem them being spread over throughout the reports.

All except for the “Advanced/Custom” data-sources hide these monsters behind easy to remember names. This allows us to encapsulate the complexities of the actual queries while at the same time documenting the purpose of the query itself. It is just easier to use “Sales-Persons-By-Region” in a report than to repeat the SQL statement everywhere.

Each report holds the logical name in its query attribute. When the query needs to be executed the reporting engine checks all declared data-sources to find one that recognizes the handle and that then returns a result-set for the assigned query.

For the formula system, I followed the same approach. It is easier to maintain formulas that use

SINGLEVALUEQUERY("sales-top-performer")

than to use

SINGLEVALUEQUERY("SELECT name, Sum(Sales) AS 'Sales' FROM SalesData ORDER BY 'Sales' LIMIT 1")

especially if that formula is going to be used several times in the report.

As a side note: As with calculated queries in the report itself, you can add a Advanced/Custom data source to compute a query at runtime. At that point, the second query would be valid.

These are the steps needed to use a SINGLEVALUEQUERY or MULTIVALUEQUERY formula function in a report for a given query.

(1) Define the data-source
(2) Add a query, test that it gets your expected result back.
(3) Reference the query in the formula

How do I parametrize the SINGLEVALUEQUERY or MULTIVALUEQUERY formula function?

For moth formula functions, parametrization happens implicitly. At runtime the reporting engine passes the current data-row, including all report parameter, the values from the report’s current result-set and the calculated function/expression values to the data-factory that executes the query.

Queries need to get parametrized as usual. For instance for SQL use the ${param} syntax. The underlying data-source itself specifies the names of the data-row columns it wants to use as parameters.

Example:

Lets assume that your report contains a field called “CUSTOMERNUMBER” and you want to retrieve the accumulated sales for the user via a SINGLEVALUEQUERY function.

(1) Define the SQL datasource on your report (or reuse one you already have defined).
(2) Add the query, name it “Sales-For-Customer” and provide the following SQL:

SELECT
     SUM("ORDERDETAILS"."QUANTITYORDERED" * 
     "ORDERDETAILS"."PRICEEACH") AS "Sales"
FROM
     "ORDERS" INNER JOIN "ORDERDETAILS" ON "ORDERS"."ORDERNUMBER" = "ORDERDETAILS"."ORDERNUMBER"
     INNER JOIN "CUSTOMERS" ON "ORDERS"."CUSTOMERNUMBER" = "CUSTOMERS"."CUSTOMERNUMBER"
WHERE 
     "CUSTOMERS"."CUSTOMERNUMBER" = ${CUSTOMERNUMBER}
GROUP BY 
     "CUSTOMERS"."CUSTOMERNUMBER"     

(3) Add a label, and edit the “value” attribute-expression. Set the following formula

=SINGLEVALUEQUERY("Sales-For-Customer")

Download the sample report

As you can see, there is not much magic in using the SINGLEVALUEQUERY and MULTIVALUEQUERY formula functions, but the effect they can have is powerful.

PS: Multiple queries with the same parameter values will be cached by the Report Data Cache

Using Reporting Parameter for fun and profit

The new Pentaho Reporting 3.8 is the fourth release in a row where parameter played a important role. After ironing out all the easy problems, we are now in a state where we can think about creatively abusing the system.

There is one questions in the support centres that get repeated over and over again.

How do I limit the visible output types for a particular report?

Up until now, there was no sane answer. You either take it all, or you lock the report down to a single output type. But selecting just HTML and PDF, that was impossible.

The desired output for a report is controlled by the “output-target” parameter. The values for this parameter are defined somewhere deep inside the reporting engine. Every output target has its unique identifier, that tells both the type of the export (pageable or table) as well as the actual content type that is generated (text, html, pdf, and so on).

Output-target Description
table/html;page-mode=stream HTML as a single page, all report pagebreaks are ignored
table/html;page-mode=page HTML as a sequence of physical pages, manual and automatic pagebreaks are active
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet;page-mode=flow Excel 2007 XLSX Workbook
table/excel;page-mode=flow Excel 97 Workbook
table/csv;page-mode=stream CSV output
table/rtf;page-mode=flow Rich-Text-Format
pageable/pdf PDF output
pageable/text Plain text
pageable/xml Pageable layouted XML
table/xml Table-XML output
pageable/X-AWT-Graphics;image-type=png A single report page as PNG
mime-message/text/html Mime-Email with HTML as body text and all style and images as inline attachments

Since Pentaho Reporting 3.8, we check whether the current report defines one of the known system parameter. If you define your own representation of such a parameter, we use your definition instead of the built-in ones.

Using that knowledge, it is easy to create a list parameter that defines a subset of the available output types. Drop your selection into a table-datasource, and define the parameter as one of the single-selection list-parameter types.

Server Side Printing

You can also invoke server side printing directly from the parameter UI without having to go through an XAction. Define a boolean parameter by creating a table with the following entries:

ID Value
false Do Not Print
true Print

Make sure you set the default value of your parameter to “false” or make false the first selection in your list, or you are likely to trigger printing involuntarily. Save the trees and so on.

If you want to print to a specific printer, you can do so by defining the “printer-name” parameter as well. And with the magic of the scriptable datasource, you can populate it with all known printers:

import javax.print.DocFlavor;
import javax.print.PrintService;
import javax.print.PrintServiceLookup;
import javax.print.attribute.standard.PrinterName;

import org.pentaho.reporting.engine.classic.core.modules.misc.tablemodel.TableModelInfo;
import org.pentaho.reporting.engine.classic.core.util.TypedTableModel; 


    PrintService[] services = PrintServiceLookup.lookupPrintServices(
        DocFlavor.SERVICE_FORMATTED.PAGEABLE, null);
    TypedTableModel tt = new TypedTableModel();
    tt.addColumn("ID", String.class);
    tt.addColumn("Value", String.class);
    for (int i = 0; i < services.length; i++)
    {
      PrintService service = services[i];
      PrinterName displayName = service.getAttribute(PrinterName.class);
      if (displayName != null)
      {
        tt.addRow(new Object[]{service.getName(), displayName.getValue()});
      }
      else
      {
        tt.addRow(new Object[]{service.getName(), service.getName()});
      }
    }
    return tt;
 

ENV-Fields: Why, When and How to use them

The reporting-environment came to life as a simple way to inject runtime information from the platform into a report. Ideally, a report should not know about details from the server it runs on. Certain parts of the server’s configuration, like the server’s base-url, easily change when the report travels from development to the test and then to the production environment.

Earlier versions relied on magical string replacements during the parse process to inject that kind of information. Parse-time modifications are generally a bad thing. They are insecure,
as every String manipulation of XML files is. If you manage to get a valid XML-Fragment into your replacement string, you can do everything. There is no way a GUI tool could be written for it. The injection magically happens everywhere without giving the user a say on that matter. At design time, or outside of the legacy code of the Pentaho Platform’s JFreeReportComponent, this replacement does not happen. I don’t want to explain that mad system to anyone, and surely neither do you!

And finally: It really makes it hard to do caching inside the reporting engine. For the reporting engine, these replaced strings look exactly like static strings and all of the sudden the cache is poisoned with a invalid copy.

Moving that information from parse-time to runtime solves that problem for me.

The report-environment is a special interface that allows third party systems like the Pentaho BI-Server to inject configuration settings in a safe and pre-defined way. That configuration information is then available to all expressions and functions. It completely eliminates all magic from the report processing.

A report-environment entry is not a parameter. A parameter is a external value that has been provided by the user or that has been calculated from end-user or report-designer input. A
report-environment property comes solely from the local runtime environment.

Inside the reporting engine, a report-environment setting is a single string. The setting is either one of the well-known environment-keys provided by the runtime implementation, a session-entry or a user-defined environment setting.

Well-Known Keys

  • serverBaseURL:
    The server’s base URL without any servlet context. Use this one to link to other resources on the same server.

    http://127.0.0.1:8080/
    
  • pentahoBaseURL:
    The server’s base url including the local web-application context. Use this one to link to other Pentaho Services.

    http://127.0.0.1:8080/pentaho
    
  • solutionRoot:
    The solution-repository root directory.
  • hostColorPort:
    The host name and port, without the protocol prefix or any path information.

    localhost:8080
    
  • requestContextPath:
    The context path of the web-application

    /pentaho
    
  • username:
    The username of the currently logged in user.
  • roles:
    The set of roles the user is assigned to.

Session keys

A environment setting starting with the prefix “session:” will be treated as a local session lookup. A session lookup will only work inside the Pentaho BI-Server. The BI-Server will compute the session attribute name by removing the “session:” prefix and will lookup the value in the users HTTP-Session instance.

If the value found there is a string it will be returned as valid entry. If the value is a string array, it will be converted into a CSV text using the excel quoting syntax (double-quote-character
are quote characters, and text containing either the comma, a quote character or a line-break will be quoted.

The value Joe, the horse, says "Ho" will become "Joe, the horse, says ""Ho""".

You can use the CSVARRAY formula function to parse that text back into a String-array inside the reporting engine.

User-Defined Environment setting

If a key does not match either one of the well-known keys or if the engine cannot identify it as a session-entry, it will be treated as user-defined setting. The reporting engine will consult the global configuration and will try to find the value by appending the environment key to the configuration prefix “org.pentaho.reporting.engine.classic.core.environment.“.

Therefore a environment lookup for the key “Foo” will query the global configuration for a configuration entry called “org.pentaho.reporting.engine.classic.core.environment.Foo“.

How to use them

The report-environment can be accessed via the OpenFormula function
ENV“. Example:

=ENV("pentahoBaseURL")

The result of the lookup can be used in the same formula:

=ENV("pentahoBaseURL") & "/content/reporting/report.html?solution=steel-wheels&path=reports&name=BuyerReport.prpt"

You can also make the value accessible as a separate field. Place the ENV lookup into the post-processing formula of a hidden parameter or into the “formula” property of a OpenFormula-Expression and the value appears as field.

But there is a easier way to convert your environment lookups into fields!

Auto-mapping

The auto-mapping definitions allow the Pentaho Reporting Engine to automatically lookup a set of report-environment keys and to publish those values as “env::” fields. Which fields are auto-mapped is controlled by a set of configuration entries in the reporting engine’s global configuration.

A mapping rule follows the following syntax:

org.pentaho.reporting.engine.classic.core.env-mapping.=

If the environment string is a CSV string, you can tell the reporting engine about it and it will produce a field that contains a String-array.

org.pentaho.reporting.engine.classic.core.env-mapping.-array=

It is generally a good idea to keep your environment fields in a separate namespace by using the “env::” prefix for your fields. This way you will not run into conflicts with fields from your database or expressions later on. I also heavily recommend to try to keep the reference to the environment key in the name, this provides a sensible self-documenting entry.

Where to use Report-Environment fields

The predefined keys can be split into two groups. The first group, all path and URL related entries are used to calculate relative resource URLs, like logo images or HTML and JavaScript includes.

The second group ,”username” and “roles” is used to enforce security rules in datasources or on calculations.

The third use-case would be to provide system dependent parametrization, like printing static text. It is a easy way to include the department name or the name of the staging system (test, dev, production) in every report.

And last but not least: The environment is accessible to scripts and formulas, so it can be used as a system wide parametrization avenue.

Setting up session values

You can define the values for the session-keys of the report-environment inside the BI-Server with the help of a on-logon-XAction. Within that XAction, use a JavaScript-Rule to access the HttpSession object and to store the string-value on it.

Setting up global user-defined report-environment values

The global user-defined report-environment values are defined in the reporting engine’s global “classic-engine.properties” file. You will find that file in the Pentaho-Report-Designer’s “resources” directory. The BI-Server’s global reporting configuration file can be found in the WEB-INF/classses directory.

The report-environment is a simple and easy to setup way to make your reports more reusable. Whenever you find yourself changing reports manually  while transitioning them from development to testing or production, you now have the tools at hand to elimenate these manual steps.

BI-Server parameter input: An authoritative voice

Over the last few release cycles, we have spent some tremendous
time in making the parameter input easy to use. Along with the
community and user feedback the GWT client received new capabilities
on when and how to pass parameters around.

We strive for a release it quick and release it often approach,
as we rather solve pain in a suboptimal way today than to wait for
a perfect solution that comes next year, if ever. And thus it is a
sad (and rather natural) truth that not all attempts are hitting
the spot.

Especially the auto-submit feature is a creepy little bugger and
has seen numerous hacks, tweaks and finally a full make-over in
the upcoming 3.7 release.

Architecture: the inner workings of the reporting plugin in 3.7

The reporting plugin is a GWT application, and thus when first
invoked, it will initialize itself, inspect all parameters it
received from the URL and fires a non-paginating parameter-request
back to the server.

On the server, the Parameter-content-handler handles the request.
It read all values from the URL. URLs only transmit strings, so
in a second step, those raw values get parsed into Java-Objects.
The parsed values now get passed down into the reporting engine
for further validation. The engine validates the parameters and
either accepts the values as fully valid or spits out a bunch of
error messages.

If all values are valid and the request asked for pagination, we
also paginate the report to return the total number of pages for
the report.

The validation and pagination result, along with a description of
all available parameters and their accepted values are sent back
to the GWT UI.

The first parameter-response now initializes the GWT application
and sets the initial state of the auto-submit feature. If the
GWT UI detects that all parameters have been accepted as
valid, and if auto-submit is enabled, it now fires a full parameter
request that paginates the report.

Subsequent updates will then follow a slightly different protocol:

If auto-submit is disabled, any user-input triggers a lightweight
parameter validation request. This ensures that list-parameters
always get selections based on the actual user input. The response
only updates the parameter definitions and values and at no point
the report gets executed.

When auto-submit is enabled, we always fire heavyweight parameter
requests, which trigger a pagination run on the server. When such
a request returns without error, we do show the report content.
This finally invokes the report-content-generator and creates
the actual report content that is shown below the parameter input
pane.

The BI-Server 3.6.0 Situation

In this version all parameter validation requests were
heavy weight requests. So every parameter validation triggered
a pagination.

A pagination run is only triggered if all parameter were validated
successfully. To turn the heavy weight parameter requests into a
3.7-style lightweight request, you need to add a parameter that
always validates to false, until the user explicitly selects it.
One simple way of doing that is to add a mandatory check-box
parameter with the values “Wait” and “Run” and the post-processing
formula

=IF([p] = "WAIT"; NA(); [p])

that translates all wait into values and thus making the
parameter unvalidatable until the user selected “run”.

Fixed rules: More on Parameter processing

In the famous “lets do it right this time” release of Pentaho Reporting 3.5,
we introduced the ability to have parameter on a report. Well, it wasn’t quite
right, with parameters you need to pre and post process the data to make it
sound. That was release 3.6. At that point, so our theory, you, dear user,
should be happy. But apparently, the beast we created wasn’t all pretty.

Well, its PRD-3.7 now and guess what we are improving: Parameters.

So far, the date parameter processing was not quite right. I still wonder
why after 5 years of XAction no one complained about that. Sure, XActions
only have strings, and any processing or parsing is up to you – and so is the
blame if it does not work.

The various system level options on the parameter UIs was .. sub-optimal.
(I’m getting better at phrasing it more positively, don’t you think?)
The list of supported parameter (aka “the list of parameters you shall not
use in your report”) grew with every release. The Swing UI and the server UI
never quite agreed on what setting to accept and how to behave in border
cases. Thus creating formulas that worked in both settings was a chore.

And last but not least: Validating parameters and getting them run in a
consistent way was difficult. Give a Integer where a Long was expected
and you are screwed. Without a error message. Thus even for me working with
the parameters was more a easter-egg search than sane designing.

And last but not least: Even the parameter processing order was a bit
funny. It works for simple cases, but behaves rather funny for the
not so simple ones.

How Pentaho Reporting Processes Parameters

Each parameter in Pentaho Reporting carries at least two formulas that
eventually need to be evaluated.

The default-value-formula is used to produce a valid value if the user
provided no value.

The post-processing-formula is used to transform the user’s input into
something more usable or simply to validate arbitrary business rules
(a deposit cannot be negative, for instance).

And last but not least, if you reference an other parameter, you expect
it to contain the proper post-processed value.

In PRD-3.6, the order of the validation was largely out of sync with
those expectations. In fact, post processing was done in blocks, so that
parameter were not able to use other post-processed parameter values in
their queries. Now that’s bad, and I guess Gretchen will be able to share
a few unhappy tales in Lisbon about that.

In PRD-3.7, each parameter is now fully processed on its own before
the next defined parameter in the chain is processed.

Lets be more formal for a while:

For each parameter defined:

If the value for the parameter is <null>, we compute the parameter’s
default value and use that as untrusted parameter value. The default-value
formula only sees the previously validated parameters.

In a second step, we post process the parameter to get a trusted value.
The post processing formula sees the previously validated parameters
and the untrusted value. So be careful how you use the untrusted value
here, as you cannot trust users and SQL-injections or cross-site-scripting
troubles are never to far away.

If the post-processing formula fails with an error the
trusted value of the parameter will be <null>, a warning message will be
issued and last but not least we refuse to execute the report. The
parameter processing continues with this value set to <null>.

And finally we check the type of the parameter and compare the parameter
value against the list of valid key-values. If the value passes this test
it becomes a trusted value and will be used in the further parameter
processing and ultimately it will be used in the report.

If the parameter fails the test, we report an error, prevent any report
processing and continue to validate the remaining parameters using the
parameter’s default value.

Beginning with this version, the parameter validation also creates the
set of validated values after the validation is complete. For a report without
any parameter values set, this will yield the default values for all parameters.

So what does this mean for you?

The new schema brings a couple of changes to the way the system behaves.
Default values are now context sensitive and can change when the
selection for the previously declared parameters changes. Our parameter
UIs do not directly use that feature for usability reasons.Automatically changing
the user’s input is not very nice and confuses and/or upsets people. A lot.

The post processing formulas are now executed in a timely manor and before
the default-value or selection for a parameter is computed. This way, you
are now able to compute the mondrian-role array in a hidden parameter’s
post-processing formula and be sure that your datasource sees it.

And last but not least, your formulas wont be able to use values that have
not been validated, nor would the report ever include them. Especially with
the SINGLEVALUEQUERY and MULTIVALUEQUERY formula functions, this is mandatory.
Your database is yours and we all want to keep it that way.

New Formula Functions – power to the parameters!

You may have heard rumours that the next release is all about parameter and drill-down. Rest assured, its true. During the last months I started reading a bit about the inquisition and their reliable methods of customer surveys (“Are you a happy catholic? – Yes, ouch!, burn! – choose one”) I did keep an eye on what problems come up most.

Guess what. Parameters range pretty high on the scale. And more specifically, dealing with multi-selection parameters is problem zone number one. But hey, we all know that there is no problem that cannot be solved by simply adding more code.

So lets start with a cook-book:

0. Multi-select parameter return their values as arrays and the engine expects arrays when we get parameter values from the outside.

1. Count the number of selected values:

=COUNTA([mparam])

Based on that, check whether multi-selection parameters have a selection:

=IF(COUNTA([mparam]) == 0; "do something if empty"; "do something if not empty")

2. Array Manipulation:

Merge two arrays

=ARRAYCONCATENATE([mparam]; [mparam2])

or merge with static array values:

=ARRAYCONCATENATE([mparam]; {"value 1" | "value2"})

Extract four values from a array: (Leading, Trailing, in between)

=ARRAYLEFT([mparam]; 4);
=ARRAYRIGHT([mparam]; 4);
=ARRAYMID([mparam]; 2; 4);

and the highlight of the show:

3. Query a single value from a database

=SINGLEVALUEQUERY("query"; "result-column"; [query-timeout]) 

4. Query multiple values from a database

=MULTIVALUEQUERY("query"; "result-column"; [query-timeout]; [query-limit]) 

In combination with the hidden parameters with a post-processing function, this can be as powerful as a laser weapon in a medieval village.

On a side note: Post-processing and formula validation is a bit quirky in the current versions (PRD-3.5 to PRD-3.6). Stay tune for the next article to see how a well-defined version (like the one shipping in PRD-3.7) will clear up the fog of confusion.