
How to upgrade BI-Server 3.5.2 to the latest reporting release

One of the big fat questions lingering in the room with the latest release is (as usual):

Can I upgrade my existing BI-Server installation with the new reporting release?

Short answer: Yes.

With the latest bugfix release of Pentaho Reporting, we also had to upgrade both Kettle and Mondrian to their latest versions to make it run in the BI-Server 3.6.0 release. Because of the massive amount of work that went into Kettle 4.0, many of its APIs changed, which made it impossible to maintain backward compatibility.

Although we would love to see everyone migrate to BI-Server 3.6 immediately, the chances of that happening are fairly slim. Businesses seem to be a bit reluctant to change once they have everything up and running. Heck, some people still run a 1.x release, which was released three days after the last dinosaurs died.

So how can I upgrade then? What will be the impact of this upgrade?

The upgrade is straightforward and can be done by deleting the following JAR files from the pentaho/WEB-INF/lib directory:

libbase-1.1.5.jar
libdocbundle-1.1.6.jar
libfonts-1.1.5.jar
libformat-1.1.5.jar
libformula-1.1.5.jar
libloader-1.1.5.jar
libpixie-1.1.5.jar
librepository-1.1.5.jar
libserializer-1.1.5.jar
libswing-1.1.5.jar
libxml-1.1.5.jar
pentaho-reporting-engine-classic-core-3.6.0-GA.jar
pentaho-reporting-engine-classic-extensions-3.6.0-GA.jar
pentaho-reporting-engine-classic-extensions-hibernate-3.6.0-GA.jar
pentaho-reporting-engine-classic-extensions-mondrian-3.6.0-GA.jar
pentaho-reporting-engine-classic-extensions-olap4j-3.6.0-GA.jar
pentaho-reporting-engine-classic-extensions-pmd-3.6.0-GA.jar
pentaho-reporting-engine-classic-extensions-reportdesigner-parser-3.6.0-GA.jar
pentaho-reporting-engine-classic-extensions-sampledata-3.6.0-GA.jar
pentaho-reporting-engine-classic-extensions-scripting-3.6.0-GA.jar
pentaho-reporting-engine-classic-extensions-xpath-3.6.0-GA.jar
pentaho-reporting-engine-legacy-charts-3.6.0-GA.jar
pentaho-reporting-engine-legacy-functions-3.6.0-GA.jar
pentaho-reporting-engine-wizard-core-3.6.0-GA.jar

and replacing them with

libbase-1.1.6.jar
libdocbundle-1.1.8.jar
libfonts-1.1.6.jar
libformat-1.1.6.jar
libformula-1.1.7.jar
libformula-ui-1.1.7.jar
libloader-1.1.6.jar
libpixie-1.1.6.jar
librepository-1.1.6.jar
libserializer-1.1.6.jar
libswing-1.1.7.jar
libxml-1.1.7.jar
pentaho-reporting-engine-classic-core-3.6.1-GA.jar
pentaho-reporting-engine-classic-extensions-3.6.1-GA.jar
pentaho-reporting-engine-classic-extensions-hibernate-3.6.1-GA.jar
pentaho-reporting-engine-classic-extensions-mondrian-3.6.1-GA.jar
pentaho-reporting-engine-classic-extensions-olap4j-3.6.1-GA.jar
pentaho-reporting-engine-classic-extensions-pmd-3.6.1-GA.jar
pentaho-reporting-engine-classic-extensions-reportdesigner-parser-3.6.1-GA.jar
pentaho-reporting-engine-classic-extensions-sampledata-3.6.1-GA.jar
pentaho-reporting-engine-classic-extensions-scripting-3.6.1-GA.jar
pentaho-reporting-engine-classic-extensions-xpath-3.6.1-GA.jar
pentaho-reporting-engine-legacy-charts-3.6.1-GA.jar
pentaho-reporting-engine-legacy-functions-3.6.1-GA.jar
pentaho-reporting-engine-wizard-core-3.6.1-GA.jar

Note that the “pentaho-reporting-engine-classic-extensions-kettle” jar remains at version 3.6.0-GA. This ensures that the older Kettle 3.2 is used when running reports with a Kettle datasource.
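If you would rather not shuffle a couple of dozen files by hand, the swap can be scripted. The following is only a minimal sketch, not an official upgrade tool: the class name ReportingJarUpgrade and its helper methods are made up for this example, and it assumes that you have unpacked the new reporting JARs into a directory of their own. It matches JARs by their name without the version number, and it never touches the Kettle extension, so that one stays at 3.6.0-GA. Stop the server and back up WEB-INF/lib before trying anything like this.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, written for this post only.
class ReportingJarUpgrade {
  // The Kettle extension must stay at its old version, so it is never touched.
  static final String PINNED = "pentaho-reporting-engine-classic-extensions-kettle";

  // Strips ".jar" and the trailing version: "libbase-1.1.5.jar" -> "libbase".
  static String artifactName(String jarFileName) {
    String base = jarFileName.endsWith(".jar")
        ? jarFileName.substring(0, jarFileName.length() - 4) : jarFileName;
    return base.replaceFirst("-\\d[\\w.-]*$", "");
  }

  // All new JARs except the pinned Kettle extension.
  static List<String> jarsToCopy(Collection<String> incoming) {
    List<String> result = new ArrayList<String>();
    for (String jar : incoming) {
      if (!PINNED.equals(artifactName(jar))) {
        result.add(jar);
      }
    }
    return result;
  }

  // Installed JARs that have a replacement among the new JARs.
  static List<String> jarsToDelete(Collection<String> installed,
                                   Collection<String> incoming) {
    Set<String> replaced = new HashSet<String>();
    for (String jar : jarsToCopy(incoming)) {
      replaced.add(artifactName(jar));
    }
    List<String> result = new ArrayList<String>();
    for (String jar : installed) {
      if (replaced.contains(artifactName(jar))) {
        result.add(jar);
      }
    }
    return result;
  }

  static List<String> listJars(Path dir) throws IOException {
    List<String> names = new ArrayList<String>();
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.jar")) {
      for (Path p : stream) {
        names.add(p.getFileName().toString());
      }
    }
    return names;
  }

  // Deletes the outdated JARs in libDir, then copies the new ones over.
  static void upgrade(Path libDir, Path newJarsDir) throws IOException {
    List<String> installed = listJars(libDir);
    List<String> incoming = listJars(newJarsDir);
    for (String jar : jarsToDelete(installed, incoming)) {
      Files.delete(libDir.resolve(jar));
    }
    for (String jar : jarsToCopy(incoming)) {
      Files.copy(newJarsDir.resolve(jar), libDir.resolve(jar),
          StandardCopyOption.REPLACE_EXISTING);
    }
  }

  public static void main(String[] args) throws IOException {
    if (args.length != 2) {
      System.err.println("usage: ReportingJarUpgrade <WEB-INF/lib> <dir with new jars>");
      return;
    }
    upgrade(Paths.get(args[0]), Paths.get(args[1]));
  }
}
```

Once compiled, it takes the lib directory and the directory holding the new JARs as its two arguments. Because only JARs with a replacement are deleted, everything else in WEB-INF/lib is left alone.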

To make the most of this upgrade, I also recommend adding a few new settings to the “pentaho/WEB-INF/classes/classic-engine.properties” file:

#
# These settings control how pagination states are retained in the reporting
# engine. For the server, it is safe to scale down the number of states to a
# bare minimum. This reduces the memory footprint of reports considerably. 
org.pentaho.reporting.engine.classic.core.performance.pagestates.PrimaryPoolSize=1
org.pentaho.reporting.engine.classic.core.performance.pagestates.SecondaryPoolFrequency=4
org.pentaho.reporting.engine.classic.core.performance.pagestates.SecondaryPoolSize=1
org.pentaho.reporting.engine.classic.core.performance.pagestates.TertiaryPoolFrequency=1000

#
# Disable several assertions and debug messages, which are cool for testing reports
# but slow down the report processing. You may want to enable them in your test system,
# but make sure that they are disabled in your production environment.
org.pentaho.reporting.engine.classic.core.layout.ParanoidChecks=false
org.pentaho.reporting.engine.classic.core.modules.output.table.base.DebugReportLayout=false
org.pentaho.reporting.engine.classic.core.modules.output.table.base.ReportCellConflicts=false
org.pentaho.reporting.engine.classic.core.modules.output.table.base.VerboseCellMarkers=false

# Performance monitoring entries. Can generate quite a lot of text in the logs, so 
# keep them disabled for production environments 
org.pentaho.reporting.engine.classic.core.ProfileReportProcessing=false
org.pentaho.reporting.engine.classic.core.performance.LogPageProgress=false
org.pentaho.reporting.engine.classic.core.performance.LogLevelProgress=false
org.pentaho.reporting.engine.classic.core.performance.LogRowProgress=false

With both the upgrade of the libraries and the new configuration settings, you should see a good performance boost.
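To double-check that an existing installation really carries these settings, a few lines of plain Java are enough. This is a sketch under assumptions: EngineSettingsCheck is a hypothetical helper that uses nothing but java.util.Properties, and it only reads the file and prints what is missing or set differently. It does not modify anything, and it knows only about the keys listed above.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper, written for this post only.
class EngineSettingsCheck {
  static final String PREFIX = "org.pentaho.reporting.engine.classic.core.";

  // The settings recommended in this post, in the order they are listed.
  static Map<String, String> recommended() {
    Map<String, String> m = new LinkedHashMap<String, String>();
    m.put(PREFIX + "performance.pagestates.PrimaryPoolSize", "1");
    m.put(PREFIX + "performance.pagestates.SecondaryPoolFrequency", "4");
    m.put(PREFIX + "performance.pagestates.SecondaryPoolSize", "1");
    m.put(PREFIX + "performance.pagestates.TertiaryPoolFrequency", "1000");
    m.put(PREFIX + "layout.ParanoidChecks", "false");
    m.put(PREFIX + "modules.output.table.base.DebugReportLayout", "false");
    m.put(PREFIX + "modules.output.table.base.ReportCellConflicts", "false");
    m.put(PREFIX + "modules.output.table.base.VerboseCellMarkers", "false");
    m.put(PREFIX + "ProfileReportProcessing", "false");
    m.put(PREFIX + "performance.LogPageProgress", "false");
    m.put(PREFIX + "performance.LogLevelProgress", "false");
    m.put(PREFIX + "performance.LogRowProgress", "false");
    return m;
  }

  // Returns every recommended setting that is missing or has a different value.
  static Map<String, String> pending(Properties current) {
    Map<String, String> result = new LinkedHashMap<String, String>();
    for (Map.Entry<String, String> e : recommended().entrySet()) {
      String actual = current.getProperty(e.getKey());
      if (actual == null || !e.getValue().equals(actual.trim())) {
        result.put(e.getKey(), e.getValue());
      }
    }
    return result;
  }

  public static void main(String[] args) throws IOException {
    if (args.length != 1) {
      System.err.println("usage: EngineSettingsCheck <path to classic-engine.properties>");
      return;
    }
    Properties current = new Properties();
    try (InputStream in = new FileInputStream(args[0])) {
      current.load(in);
    }
    // Print the lines that still need to be added or changed.
    for (Map.Entry<String, String> e : pending(current).entrySet()) {
      System.out.println(e.getKey() + "=" + e.getValue());
    }
  }
}
```

The output is in properties syntax, so the printed lines can be pasted straight into the file. An empty output means the file already matches the recommendations.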

Pentaho, the Platform: How about not being a server anymore?

Out there in the wild, when it comes to talking about Pentaho, the first impression people have is: It’s large, it’s big, it’s heavyweight, it’s THE SERVER. It’s a beast that eats CPUs at night, and during the next day administrators make barbecue on the remaining CPUs.

You can really stun people by revealing the well-hidden secret that Pentaho the Platform is not heavyweight at all. Quite the contrary, if used correctly.

My journey on that started a couple of years ago, when I had to debug the reporting integration in the platform. This was the first time in my life that I actually contemplated retiring as a farmer somewhere in the North American desert (not many plants, true, but one or two interesting ones exist). With every compile, redeploy and restart of the heavyweight JBoss server, I wondered: Have I sinned that much to deserve that hell? I’m sure Dante Alighieri had a 10th circle in his Divine Comedy, which involved lots and lots of J2EE and JBoss debugging. But obviously he considered that too cruel to be believable, and therefore cut it out. Even Satan would not be so low …

Obviously, as I’m sitting here and writing this article, I found salvation beyond Peyote or eternal torture. And salvation is simply not to use JBoss* (or any other J2EE system).

Revelation One: The Pentaho Preconfigured Installation is Not the Pentaho Platform

The official documentation and whitepapers make a fine (and really not obvious) distinction between the “Pentaho Platform” and the “Pentaho Server”. The server is the big and heavy-weight thing, and actually relies on a working J2EE infrastructure to run. So from now on, we will ignore this big pink elephant. The interesting pearl is the Pentaho Platform buried in the Server.

The Platform is a couple of JARs, which are almost entirely made up of infrastructure and glue code. The Platform has only one purpose: to orchestrate the various components into a BI-related symphony. If I had to describe the platform in one single sentence, it would be: “The platform is a runtime environment for an XML-based process language that additionally provides auditing, logging, configuration and other infrastructure needed to run BI jobs.”

Revelation Two: The Platform does not require J2EE at all.

Now we are leaving the heavyweight area and diving deep into the sacred land of resource efficiency! Although the platform provides several implementations of its interfaces to integrate seamlessly into J2EE environments, it also ships with implementations that are not tainted by any J2EE-related code. These clean implementations make it possible to integrate the platform into all kinds of Java applications.

For me, running the platform outside of a J2EE server allows me to debug the components I write from inside my IDE. I do not have to deal with a heavyweight server that starts up and shuts down in 5 or more minutes. I do not have to dive through layers upon layers of application server code before I come to the parts of the application that interest me. I do not have to deal with HTTP requests. I do not have to deal with configuring a server before I can work. I can start my work immediately.

When I have to deal with XActions and have to find out why the $%&&$ the thing is not working, I also tend to be faster if I simply attach a debugger and see what’s going on under the covers instead of performing a pen-and-paper analysis of the XAction file itself. Run, listen for the crash, jump to the crash, and search the burning ruins for hints on what happened. Fast and simple. And since I started using the platform as an embedded tool, I have never had to deal with setting up JNDI datasources in JBoss or any other J2EE system, and I have never had to write a single XML deployment descriptor again. This is what heaven must be like.

But having the platform as an embedded toolkit opens a whole new world of opportunities. Maybe you have to provide bursting capabilities (that is: generating lots of reports and sending them out to a predefined list of recipients, much like what spammers do daily, but clean and family friendly); then the platform can do this for you with minimal effort. Maybe you need to integrate reporting into your application and at the same time you have to ensure (and prove later) that the reports have been generated and distributed correctly. Or, in an extreme case, you need to query a web service to provide parameters for a query on an OLAP server, which feeds a Kettle transformation, which runs a sequence of reports that are distributed via email; then the embedded platform allows you to run that XAction as easily as a simple report.

Revelation Three: Code!

Up to Platform 1.2.0, there was a sub-project called the Pentaho-SDK, which contained a couple of examples on how to execute XActions in standalone mode. An SDK for an open-source project (where the full sources are always available) was some sort of strange beast, so this project ceased to exist and only the SVN server knows where its spirit went. However, the death of the SDK cut off the audience that just wanted to run the platform and did not want to deal with all the code of the platform.

So here we start again.

(1) Set up the project

Grab the latest sources and copy all JARs from “thirdparty/lib” and all its subdirectories into your project’s lib directory.

Add all the JARs to your project’s CLASSPATH.

Build the platform and add the generated JARs to your classpath as well.

Grab a configured copy of the solution directory.

Configure JNDI so that the components know how to access the database(s).

(Remove BIRT from the system-listeners, as it does not seem to initialize in standalone mode.)

Download the preconfigured standalone environment 🙂 (scroll down)

(2) Java: Initialize the platform.

Initializing the platform is easy: all you have to do is provide a standalone context and point it to your solution directory.

  public static boolean initialize()
  {
    try
    {
      // We need to be able to locate the solution files.
      // This example uses absolute paths; adjust them to your own setup.
      final File solutionRoot =
          new File("/home/src/pentaho/pentaho-demo/pentaho-solutions/");
      final File applicationRoot = new File("/home/src/pentaho/pentaho-demo/");
      final StandaloneApplicationContext context =
          new StandaloneApplicationContext(solutionRoot.getAbsolutePath(),
              applicationRoot.getAbsolutePath());

      // Initialize the Pentaho system
      return PentahoSystem.init(context);
    }
    catch (Throwable t)
    {
      // of course, you should have some better
      // error handling than I have ;)
      t.printStackTrace();
      return false;
    }
  }

(3) Execute your XAction. The XAction path should be relative to the solution repository’s root directory. The parameters must be given in the HashMap and must match the declared parameters of the XAction. By adding more code, it is possible to provide a UI on top of this process that queries the parameters in the same way as the Pentaho Server’s HTML UI does.

    final String xactionPath =
        "samples/steel-wheels/reports/Income Statement.xaction";
    final HashMap parameters = new HashMap();
    parameters.put ("output-type", "pdf");

    final FileOutputStream out = new FileOutputStream ("/tmp/report.pdf");
    try
    {
      ISolutionEngine engine = SolutionHelper.execute
          ("Just a description used for logging ", "User (only for logging)",
              xactionPath, parameters, out);
      // Messages collected during execution (handy for diagnostics).
      List messages = engine.getExecutionContext().getMessages();
      engine.getExecutionContext().dispose();

      // out contains whatever the XAction produced.
    }
    catch (Exception e)
    {
      e.printStackTrace();
    }
    finally
    {
      out.close();
    }

(4) Clean up. Always shut down the platform before you exit the application. You want to be sure that all data is written into the databases and that all buffers are flushed.

    PentahoSystem.shutdown();
    System.exit(0);
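One way to make sure the shutdown really happens, even when the XAction blows up halfway through, is to wrap the work in a try/finally. The following is a minimal sketch of that pattern in plain Java. The Platform interface and the PlatformLifecycle class are made up for this example and merely stand in for the initialize() method from step (2) and for PentahoSystem.shutdown(); nothing here is part of the platform's own API.

```java
// Hypothetical helper, written for this post only.
class PlatformLifecycle {
  // Stand-in for the real platform calls shown in steps (2) and (4).
  interface Platform {
    boolean initialize();
    void shutdown();
  }

  // Runs the job between initialize() and shutdown(). The shutdown is
  // executed even if the job throws; it is skipped only when the
  // initialization itself failed.
  static boolean runWithPlatform(Platform platform, Runnable job) {
    if (!platform.initialize()) {
      return false;
    }
    try {
      job.run();
      return true;
    } finally {
      platform.shutdown();
    }
  }

  public static void main(String[] args) {
    // A printing fake, so that the sketch runs without the platform JARs.
    Platform fake = new Platform() {
      public boolean initialize() {
        System.out.println("platform initialized");
        return true;
      }

      public void shutdown() {
        System.out.println("platform shut down");
      }
    };
    boolean ok = runWithPlatform(fake, new Runnable() {
      public void run() {
        System.out.println("executing XAction");
      }
    });
    System.out.println("finished: " + ok);
  }
}
```

In a real application, the two Platform methods would delegate to the initialize() method and to PentahoSystem.shutdown(), and the job would contain the SolutionHelper.execute call from step (3).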

So go ahead, download the package and start walking the lightweight path.

Pentaho-Standalone (ZIP-Package)

Pentaho-Standalone (TAR.GZ-Package)

* Nitpicker’s corner: JBoss as used in this article actually represents all the evilness found in all J2EE servers. No matter whether you choose JBoss, WebSphere, BEA or whichever J2EE server you prefer, they are heavyweight machinery and not meant to be used for developing applications. Once you are finished developing, they surely form a superior runtime environment for your J2EE code, but everything that makes them good in production makes them horrible for development. Slow startups, heavy footprint and lots and lots of XML descriptors – efficient development should look different than that.

The total brain-dump

Extract the brains of all reporting and charting developers, put them in a large blender, turn it on, wait, turn the blender off, and fill the brains back into their original containers to extract the new ideas. Repeat this process as often as needed.

The last two weeks were by all accounts insane. It all started very innocently, with a call to fly over to Orlando to bring the features of version 0.8.10 to the people of Pentaho.

The scenery was well set up. The charting team finished the first major step of the new charting engine, so that we now have a solid and well-laid-out foundation for generating charts. Mike created Mantle, a new UI for the Pentaho Platform. Finally, the solid infrastructural backend the platform provides gets a face that is appropriate for the 21st century. Using Mantle instead of the old UI feels like driving a Ferrari vs. a horse-drawn chariot. You have to see it, feel it! In the meantime, the Reporting-Ease-Of-Use sprints were still under way, changing the Report-Designer from a developer-centric tool into a far more business-user-friendly designer.
The two weeks were filled with dialogs like this: ‘Look, that’s one of the new things we put into [project]!’ – ‘Hey, I could use that in [Project] to do [mind boggling feature].’ – ‘Wait, when you do that, I could use that to do [feature] over here.’.

So what did we create?

Mantle no longer needs an explicit XAction to run reports stored in the new unified file format. As long as your report does not require complex preprocessing or use bursting, you no longer need to write and maintain a separate XAction.

Mantle now provides a Pageable HTML preview for the reports of the platform.

The engine now has a clean and controlled way of defining parameters. The report definition contains all information needed to build the most marvelous UIs on top of it. Which brings us to:

Mantle can parametrize reports and generates a sensible, fully and easily customizable (without using XSLT! I can’t bear that stuff!) parameter UI.

The new Charting System now becomes integrated into the Classic-Engine. (For now, this support will be called experimental, as we need the freedom to twist the code and XMLs whenever we feel the need for it.)

The Interactivity-Extensions blew away the innocent (and also the not so innocent, of course) bystanders. The ability to inject any HTML/Script code into the resulting HTML files now allows a whole class of new reports. The Swing-UI allows similar features.

The Report-Designer now comes with a fully featured formula editor that makes Office (Open and MS alike) users feel at home.

We all agreed (with full heart) that SWT is a big WTF and should not have been born in the first place. SWT brings the insanity of low-level APIs combined with the inability to provide sensible platform independence (the main argument for running Java!). Swing is our future here!

After these two weeks, I really have to wonder: If we, with just a few developers, are able to drive development so fast, why do monster companies like Business Objects, Cognos and so on (with thousands of employees) nowadays appear to be nothing more than sitting ducks for target practice? Not that I would complain about it: 65 million years ago, swift and agile critters already outpaced slow and huge adversaries. Let’s repeat that game ...
