When we started with Pentaho Reporting 3.5.0, one of our goals was to kick out as many hard-coded dependencies as possible. Coding against private APIs is ugly, every upgrade breaks something and you always chase your own tail. In short: It is not fun.
The OLAP4J datasources are among the more promising new APIs that emerge. Originally, I had the hope to completely kick out every Java source file that contained the words “import mondrian.”. So far I am still waiting to pull the plug from the Mondrian datasource to make OLAP4J the primary OLAP datasource. Hurry up, sticking with 0.x versions for years and years is soooooo lame. Where do you guys pick up such bad behaviour?
Right now, the only viable use case for using OLAP4J is to access XML/A Cubes. Sounds not that exciting, but with it the Pentaho Reporting System can talk to SAP BW, Palo and Microsoft’s Analysis server.
How do you do connect Pentaho Reporting to a XML/A Datasource?
- Create a OLAP4J datasource in PRD
- Create a new connection definition. The Database dialog that pops up is full of JDBC databases. Ignore all of that useless gibberish and select the “Generic database”.
Use the following entries in the fields:
- Custom Connection URL: jdbc:xmla:Server=http://localhost:4242;Catalog=FoodMart;Cube=Budget
- Custom Driver Class Name: org.olap4j.driver.xmla.XmlaOlap4jDriver
- User: yourUser
- Password: yourPassword
Test the connection and once everything is fine, hit OK.
You are now set up to run MDX queries and to use the result in a report.
But Thomas, there are so many OLAP4J datasources? Which one shall I use?
Pentaho Reporting comes with three flavours of OLAP datasources.
The Banded Datasource is the one that most reports should use these days. This datasource provides a flattened view over the cube much like the results you see in Analyzer and JPivot. You can easily print that resulting dataset using the classical banded reporting techniques, like relational groups and detail bands.
The Denormalized Datasource provides a denormalized stream of data. It will be used for the crosstabbing in 4.0, but right now in the majority of cases, it is of not much use. It has its merrits if you need to do your own aggregation on that data.
The Legacy-Datasource behaves mostly like the Banded datasource. It exists to support PRD-3.0 and older reports. Ignore that, you are fresh and modern, after all. The Legacy-datasource names columns differently and cannot handle ragged hierarchies. Some ancient reports require that kind of weird behaviour to continue to work as before.
So use the Banded Datasource and you cannot go wrong.
Can I use OLAP4J datasources in the BI-Server?
Sure. Just do it. But stay clear from declaring them in JNDI, as connection pooling can easily fail. OLAP4J drivers cloak as JDBC drivers to try to make use of connection pools and JNDI systems. For code that was written for JDK 1.5 or earlier, “unwrapping” the driver does not work and then the pooled JDBC objects cannot be used to make a OLAP connection.
I read I can use Mondrian via OLAP4J as well. How?
Although it is technically possible to use Mondrian this way, I heavily recommend against it.
Mondrian encodes a system specific file name into its URL. When Mondrian tries to establish a connection, it tries to read this file. If the file name is a absolute file name, then you will run into troubles when you move your report onto the BI-Server. The referenced file will usually not be available there. If the file name is relative, then Mondrian will resolve the file using the application’s current working directory (for the BI-Server usually Tomcat’s ‘bin’ directory, for PRD either the PRD installation directory or your home directory).
So far I have not seen a safe way to specify the schema file so that it works both locally and when the report is published on a server.