I got a e-mail earlier this week, asking for some help in migrating reports.
Hey Thomas,
I have a poop load of reports with hard coded
ip addresses (yeah I know – but they didn't
have DNS entries until now) that I'd like to
globally change. I'm thinking of something
along the lines of "zcat | sed | gzip" but
something like this would only work on
compressed files, not an archive of files.
So – I was wondering if you had any ideas,
maybe even a utility based on code we already
have to create and open these bad boys.
We could even have some known locations to
modify
Like> prpt-replace –dbConnectionHost "192.168.1.100" "mydb.mycorporation.com"
Or maybe use file and xpath to denote the
attribute to change.
(The names involved were changed to protect the guilty.)
Well, hard-coding database connection information in every report is never a good idea in no environment. If your database host or credentials changes you are truly and well .. in trouble.
So how can you avoid this trouble in the first place?
Use JDNI to hold database connections in a external place
JDNI is a technique to hold database connections outside of the report definition. By holding this information in a central spot, you only have to change one place and all your reports get updated. The JNDI stored connection information is stored on the computer where your report runs. This also allows you to have separate definitions based on where the report runs. Your report designer can access a local copy of your production database, while your production server uses the real database server.
As a side note: On the BI-Server, this system of abstracting database information is called “Pentaho Datasources”, not JNDI. In this posting, I will forgo the branding exercise and simply continue to call it JNDI.
You can use JNDI in the following datasources:
- SQL
- Pentaho Analytics (Mondrian)
- OLAP4J
- Pentaho Metadata
For all of your production reports, you should always choose JNDI defined datasources. Manually defined datasources are great for testing out ideas, but really a nightmare to maintain and by storing the database credentials they potentially pose a security risk if you choose to share report definitions with others.
JDBC Connection definition basics
A database connection in Java requires four mandatory pieces of information:
-
driver: The name of the driver JDBC class.
-
url: The JDBC URL. This URL contains all information needed to connect to the database like the host name and port. The format of this URL is specific to a particular database. Consult your database manual for details on your particular database.
-
user: The username for the database.
-
password: The password for the accessing the database.
In addition to this, each driver allows you to specify additional properties. Again, these properties are specific to the database and (hopefully) documented in the manual. Most databases alternatively allow you to encode the properties onto the URL.
Defining data-sources in the Pentaho Report Designer
Within the report designer, there is no user interface to define the JNDI connections. The connections are defined in a file called "$HOME/.pentaho/simple-jndi/default.properties"
.
The file looks somewhat like this:
# Copyright 2008 Pentaho Corporation. All rights reserved.
# This software was developed by Pentaho Corporation and is provided under the terms
# of the Mozilla Public License, Version 1.1, or any later version. You may not use
# this file except in compliance with the license. If you need a copy of the license,
# please go to http://www.mozilla.org/MPL/MPL-1.1.txt. The Original Code is the Pentaho
# BI Platform. The Initial Developer is Pentaho Corporation.
#
# Software distributed under the Mozilla Public License is distributed on an "AS IS"
# basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. Please refer to
# the license for the specific language governing your rights and limitations.
SampleData/type=javax.sql.DataSource
SampleData/driver=org.hsqldb.jdbcDriver
SampleData/user=pentaho_user
SampleData/password=password
SampleData/url=jdbc:hsqldb:mem:SampleData
# NOTE: If you would like to access the running HSQLDB server, comment
# out the above line and uncomment the following line
# SampleData/url=jdbc:hsqldb:hsql://localhost/sampledata
SampleDataAdmin/type=javax.sql.DataSource
SampleDataAdmin/driver=org.hsqldb.jdbcDriver
SampleDataAdmin/user=pentaho_admin
SampleDataAdmin/password=password
SampleDataAdmin/url=jdbc:hsqldb:mem:SampleData
# NOTE: If you would like to access the running HSQLDB server, comment
# out the above line and uncomment the following line
# SampleDataAdmin/url=jdbc:hsqldb:hsql://localhost/sampledata
The JDNI definition file holds all connections known to the local JNDI system. This system has no connection to the BI-Server and does not access any connections defined in the Pentaho Administration Console.
A JNDI database definition in the report designer is defined by a set of properties. Each property name has two parts: The name of the data-source (‘SampleData’ and ‘SampleDataAdmin’ in the example above) and the JDNI property name itself. Each datasource declared in the “simple-jndi” definition file requires the following properties:
- Mandatory “type” entry set to the value “javax.sql.DataSource”
- Driver class name
- Connection URL
- A username and password
- .. and any additional property you want to pass in
The simplest way of creating a new definition entry is to copy an existing connection entry and to alter the name of the entry (the first part of the properties) and then to replace the connection information with your own details.
All JNDI connection names are case sensitive, so inside the report you have to use the same spelling for the name as in this properties-file.
And last but not least: Defining your connection information in the JNDI system does not relieve you from providing suitable database drivers. You can install new drivers by copying the jar files (and if needed any native libraries) into the “lib/jndi” directory of the Pentaho Report Designer.
Defining a JDNI connection in the Pentaho Administration Console
For the Pentaho BI-Server, the Pentaho architects chose not to use the JDNI subsystem that ships with all J2EE application servers. Sadly on the application servers JNDI has a few implementation flaws:
- Redefining a JNDI connection requires you to redeploy or restart the application. This is a nightmare for a mission critical server.
- Each server system comes with its own way to define JNDI definitions. Each of these ways is inherently incompatible with any other vendor.
- You are bound to the implementations provided by the server vendor. There is little hope to customize the database access for additional capabilities like security checks or multi-tenancy.
Before you can setup a connection, you have to install the correct JDBC driver into both the BI-Server and the Pentaho Administration Console. For that copy the jar files into the following directories:
-
Pentaho BI Server: pentaho/tomcat/webapps/pentaho/WEB-INF/lib
-
Pentaho Administration Console: administration-console/jdbc
To define the connection, start the Pentaho Administration Console. The console starts up on port 8099, and can be reached via the URL “http://localhost:8099/”. The user and password are “admin” and “password” (sadly not documented anywhere in the download). The console can only be run successfully if the BI-Server is up and running as well.
To define a datasource, activate the “Administration” mode on the left hand side of the window and switch to the “DataSources” tab. Hit the small white “+” to add a connection.
The Pentaho Administration Console is smart enough to detect all installed drivers. If your driver is not listed here, you probably have forgotten to add it to the “lib” directories outlined above.
The data-source names you define here must match the datasource names defined for the Pentaho Report Designer. I also heavily recommend that you use the same type of database system for designing reports and for running them in production. SQL is a widely interpretable standard and therefore SQL written for one database is not necessarily compatible with the SQL dialect understood by the others.
Let me close with: Death to all hard-coded settings, long lives the host provided setting!