Better Builds (2/3): Set up a private CI server to automatically test your work.

When you have finished writing a feature or a bug-fix, the big moment of truth arrives: Did I break something?

Of course, the obvious errors should be caught by the local unit tests. And all programmers run their unit and integration tests every time they make a change. Yeah!

I know I don't. I should, but it's just a quick fix, and what bad could ever happen? So I need a system that keeps me honest: a CI server that automatically picks up my changes, so that I don't have to worry about skipping tests any more. I want it automated, I want it to work in the background, and I want it fast.

Pentaho Reporting is a library that is embedded in many places, most importantly for readers of this blog, in the Pentaho Report Designer and the Pentaho BI-Server. Sometimes changes have far-reaching consequences, and not good ones! Checking method parameters more strictly can cause errors to pop up everywhere. Removing a seemingly unused library or renaming an internal method can break the downstream compile process in rather surprising ways.

This is the second part of a three-part series on how to manage your local builds better. Today, we will set up a CI server to continuously validate your local commits. To work best, this server requires a running Artifactory repository, so that you get results faster.

Contents of this series

In the first part of this series, we created a faster build by setting up an Artifactory server as a strong cache. This cut the build time from over 60 minutes to roughly 15 minutes. Although this is not exactly lightning fast, it is fast enough to give you feedback on your work before you have moved on to the next task.

This post will automate the testing, so that every commit you make is validated. I will also introduce you to an advanced set of override properties for subfloor that will speed up your build process even more.

And in the third part, we will look at how to set up a proper end-to-end integration test for the Pentaho BI-Server assembly, so that any change you make can be validated against the full stack.

Continuous Integration explained

Continuous integration is a method of automatically building the software whenever the code or its environment changes. CI acts as a second line of defence, informing you of any errors you make as early as possible.

A CI system is only as good as your suite of automated tests. If your code does not use any automated tests, you gain only the bare minimum from a CI system: You will know when code that you publish does not compile.

With respect to tests, I fall somewhere between the test-driven-development, 100%-code-coverage crowd and the 'pragmatic' no-tests, if-there-are-errors-a-customer-will-scream crowd. I subscribe to the design-driven-testing camp, which tests whatever is needed to validate the use cases, but does not worry too much about code at the extreme ends of the spectrum. Likewise, for me code coverage is a nice tool, but not a yardstick to beat up code with. I write tests the test-driven way for each bug I fix (failing test first, fix second, passing test as a result), and for features I test the user story via unit tests.

Pentaho Reporting comes with two test suites: a fast suite that runs in a few minutes, and a long-running suite that validates report rendering and thus falls into the integration-test category.

A CI server only automates stuff that you should be able to do equally well on the command line.

Defining a build process for automation

When you build any modern software, you usually use one of the many build scripting systems out there. For old folks from the C camp this may be make, for .NET it is MSBuild, and for Java it is Ant, Maven or any of the more modern alternatives.

Pentaho Reporting uses Apache Ant, for historical reasons. To run the build process, you will need an Ant installation; download and install Ant by following the instructions here.
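Before wiring anything into Jenkins, it is worth checking on the command line that both Ant and a JDK are picked up correctly:

ant -version
java -version

If both commands print a version number, the command-line build described below should work.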

Our project is already pre-configured to make it as easy as possible to set up a CI server. To build the full reporting project with all libraries, just invoke

ant continuous-local

Once this has finished without errors, invoke

ant longrun-test

to run all tests. This target will resolve all libraries again, and thus cannot be run without first running continuous-local to publish all artefacts.

(You can find a list of callable targets in the first posting of this series.)

If you get an "OutOfMemoryError", you will have to increase the memory available to your Ant installation by setting the environment variable "ANT_OPTS" to "-Xmx1024m -XX:MaxPermSize=512m".
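On the command line this boils down to setting the variable before you start the build; a minimal example (use "set" instead of "export" on Windows, without the quotes):

export ANT_OPTS="-Xmx1024m -XX:MaxPermSize=512m"
ant continuous-local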

Our CI server will invoke the same two commands, whenever it detects changes in the repository it monitors.

Installing Jenkins as CI-Server

I will use Jenkins as our CI server of choice. Jenkins is easy to install and has barely any requirements on the host system: as long as you can run Java, you will be fine.

When downloading Jenkins, do make sure you download the latest stable release, as the development versions can be outright broken from time to time. Download the latest stable native package for your operating system and install it.

Jenkins on Windows

If you install Jenkins on Windows, the installer sets up the server as the "Local System" user, with permission to do any evil operation it likes. For a CI server that is a bit dangerous, so please follow this guide to run Jenkins as a normal user instead. This will also make it easier for you to configure build tools without having to fight against the Windows permission scheme.

Jenkins runs on port 8080 by default, which may conflict with other servers you either have running or will run during development. I strongly recommend changing the port to an unused one, like port 28080. Last but not least, the Jenkins installer happily mixes the installation files with your job configuration, which is bad. Ideally, the configuration data sits in a separate directory that you can reach easily.

On Windows, I tend to keep all build files close to the root of the disk, as Windows does not like long path names. Create a "build/jenkins" directory on drive C: (or any other drive). We will use this as the Jenkins home directory. On Unix or a Mac, I tend to use "/opt/ci-data" for the same purpose.
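Creating that directory is a one-liner. On Windows:

mkdir C:\build\jenkins

On Unix or a Mac, you will probably also want to hand the directory over to the user that runs the Jenkins service (the user name "jenkins" is an assumption and depends on how you installed the package):

sudo mkdir -p /opt/ci-data
sudo chown jenkins /opt/ci-data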

Edit the "jenkins.xml" file (usually in C:\Program Files (x86)\Jenkins) and point it to the new directory. The file should look similar to this when you are done:

<!--
Windows service definition for Jenkins

To uninstall, run "jenkins.exe stop" to stop the service, then "jenkins.exe uninstall" to uninstall the service.
Both commands don't produce any output if the execution is successful.
-->
<service>
  <id>jenkins</id>
  <name>Jenkins</name>
  <description>This service runs Jenkins continuous integration system.</description>
  <env name="JENKINS_HOME" value="C:/build/jenkins"/>
  <!--
  if you'd like to run Jenkins with a specific version of Java, specify a full path to java.exe.
  The following value assumes that you have java in your PATH.
  -->
  <executable>%BASE%\jre\bin\java</executable>
  <arguments>-Xrs -Xmx256m -Dhudson.lifecycle=hudson.lifecycle.WindowsServiceLifecycle -jar "%BASE%\jenkins.war" --httpPort=28080</arguments>
  <!--
  interactive flag causes the empty black Java window to be displayed.
  I'm still debugging this.
  <interactive />
  -->
  <logmode>rotate</logmode>

  <onfailure action="restart" />
</service>

Basic Setup

Once Jenkins is installed, start (or restart if already running) the Jenkins service and point your browser to http://localhost:28080 to configure Jenkins.

First, we must make sure that Jenkins only uses stable plugins. Plugins are updated regularly, and not all updates are free of bugs. To ensure you have a working system, we must change Jenkins’ update feed to the stable feed.

Go to Manage Jenkins->Manage Plugins->Advanced and change Update Site to http://updates.jenkins-ci.org/stable/update-center.json

This way you ensure that you get the proper update notifications for LTS and LTS-compatible plugins instead of the latest and greatest. After you do this you may need to remove the contents of ${JENKINS_HOME}/updates to ensure that Jenkins shows the correct updates for the LTS stream. (If you followed the instructions above, JENKINS_HOME is either "C:\build\jenkins" or "/opt/ci-data".)
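If you want to clear that directory from the command line, something along these lines should do; the service commands assume the default service names from the native packages:

On Unix or a Mac:

sudo rm -rf /opt/ci-data/updates
sudo service jenkins restart

On Windows (from an administrator command prompt):

rmdir /s /q C:\build\jenkins\updates
net stop jenkins
net start jenkins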

We now have to install some additional plugins to teach Jenkins how to handle Git repositories. Go to Manage Jenkins->Manage Plugins->Available Plugins, select the "Git plugin", and ask Jenkins to install it on the next restart.

jenkins-plugin-install

Now do the same to install the following plugins:

  • Config File Provider Plugin – provides configuration files (e.g. settings.xml for Maven, XML, Groovy, custom files, …) through the UI; they are copied into the job workspace. We will use this plugin to provide our ivysettings and common build properties to the Ant process.
  • Parameterized Trigger Plugin – lets you trigger new builds when your build has completed, with various ways of specifying parameters for the new build. We will need that to safely trigger the integration tests on the correct Git revision.

After Jenkins is back up, let's configure a few tools. To access Git repositories, Jenkins needs a valid Git installation. And to build a Java project with Ant, Jenkins needs to know where to find Ant and a valid JDK.

I assume that these tools are already installed on your system (otherwise you would have had a hard time getting the sources and building the project so far), so let's tell Jenkins where to find them.

If you use a Unix system, Jenkins should be able to pick up these tools automatically. On Windows, you will have to point Jenkins in the right direction.

First, specify the installation location for your JDK. For Pentaho Reporting, this must be a JDK 1.7 installation. Click the “Add JDK” button and you will see input fields for Name and JAVA_HOME.

jenkins-ant

Next, point Jenkins to your Git command. Click "Add Git" to see the input fields. On Unix and sufficiently recent MacOS versions, the defaults should be fine. On Windows, point the "Path to Git executable" to the git.exe in your Git installation.

jenkins-git
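If you are not sure where Git lives on a Windows machine, the shell can tell you (the resulting path will of course differ from machine to machine):

where git

Copy the path it prints, for example C:\Program Files\Git\bin\git.exe, into the "Path to Git executable" field.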

And last but not least, let Jenkins install its own Ant distribution automatically.

jenkins-jdk

Now that we are ready, let's start by setting up the first project.

Prepare shared configuration files

Next, we set up some central configuration files that we will need later for the build process. As I explained in the first post of this series, Ivy is rather slow when resolving artefacts, and a strong cache helps.

For a safe CI build, it is important to resolve artefacts freshly from a known-to-be-safe source. The Ivy cache can easily be tainted, both by parallel resolve processes and by Ivy being interrupted in its work.

Therefore we will rely on our local Artifactory server to keep build times down.

On the Jenkins front page, select "Manage Jenkins" to access the system configuration. Now that you have the "Config File Provider" plugin installed, you should see a new entry called "Managed files" in the list of configuration options.

jenkins-manage-files

Click on it to set up the shared configuration files. In the menu on the left-hand side, choose the "Add a new Config" option and select the "Simple XML file" type.

jenkins-manage-files-add

In the next dialog, ignore the "ID" field; it is an internal identifier that probably should never have been visible here.

Fill in the Name as "standard-ivy-settings.xml" and paste in the contents of the ivysettings.xml file from your home directory. This file is also available here on GitHub.

jenkins-manage-files-ivysettings

Hit Submit, then select the "Add a new Config" option again, and this time select the "Custom file" type. Once again ignore the "ID" field, and fill out the Name field as "standard-reporting-build-properties.properties".

You can find the contents of this file in the first post of this series as well, or you can download them from the GitHub Gist here.

jenkins-manage-files-build

Now you have all shared files set up and are ready to configure the actual project.

Setting up a Pentaho Reporting Project

Create a new project by clicking "New Item" and selecting "Build a free-style software project". Give it the name "pentaho-reporting". This project will build the reporting engine and publish all artefacts into a private local repository.

jenkins-project

The next screen that comes up is the project configuration screen. We will go through the configuration from top to bottom.

In short, this is what we are going to do:

We set up the build to poll your local working copy of the Pentaho Reporting project for changes every 2 minutes and to build the project automatically if needed. To make sure the build does not interrupt your work (and that your work does not interrupt the build), we will set up a local Ivy repository cache within the Jenkins workspace. And as Git's cleanup and checkout routines can be a bit temperamental when other tools put data into the checkout, we keep all sources in a separate sub-directory.

1. Discard old builds to preserve disk space

Tick the checkbox in front of "Discard Old Builds" to delete old project artefacts. When building, we are mostly interested in the JUnit test results, and we keep only the latest copy around for occasional manual testing. Click the "Advanced..." button to see all configuration options, and select "Max # of builds to keep with artifacts".

jenkins-discard

 

2. Configure the Source Code Management.

Select "Git" as your source code management tool. If you don't see Git here, check that you installed the correct plugin and that you restarted your Jenkins server.

Enter the local path of your current working directory as the Repository URL. This is the checkout you normally work in. We let the CI server monitor your local commits and build the software in the background.

As you access your sources locally, you don't need any credentials. Next, add some additional behaviours to clean out the workspace and to check out the code into a sub-directory within the workspace: add the "Checkout to a sub-directory" behaviour and set the "Local subdirectory for repo" setting to "code".

jenkins-sourcecode-management

You can set up a specific branch for this CI server to monitor. If you leave "Branches to build" at its default value of "*/master", it will monitor commits to your "master" branch. I usually set this to the current feature branch I am working on. Alternatively, you could point this to a "ci" branch and push or rebase your latest changes to that branch whenever you want to trigger a build.
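To sum up, the source code management section ends up looking roughly like this; the repository path is only an example, use the directory of your own working copy:

Repository URL:               /home/thomas/src/pentaho-reporting   (or e.g. C:/source/pentaho-reporting on Windows)
Branches to build:            */master
Additional behaviours:        Checkout to a sub-directory
Local subdirectory for repo:  code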

3. Configure the build triggers.

This is simple: Tick the “Poll SCM” check-box and set the schedule to “*/2 * * * *”. This tells Jenkins to look for new changes every 2 minutes.

jenkins-trigger
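The schedule field uses the usual cron notation with five space-separated fields:

# minute   hour   day-of-month   month   day-of-week
  */2      *      *              *       *

"*/2" in the minute field means "every second minute", and the asterisks mean "every hour, on every day, in every month, on every weekday".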

4. Prepare the Build Environment

To ensure that builds are valid, I take extra care to clean out the workspace, so that leftovers from previous builds do not affect the current build.

Check the “Delete workspace before build starts” check-box. To see the exclude options, hit the “Advanced..” button.

We delete everything except the "code" directory (Git's cleanup step takes care of that one) and the provided configuration directory (we can guarantee the contents of that one).

jenkins-clean-workspace

Next, we provide some configuration files for the build. We prepared these files earlier on. Set up the ivy-settings-file and set its “Target” to “bin/conf/ivysettings.xml”.

Second, set up the reporting-build-properties file and set its "Target" to "bin/conf/.pentaho-reporting-build-settings.properties".

jenkins-config-files
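With the checkout and the provided files in place, the Jenkins workspace will look roughly like this (the bin/ivy directory is created by the build itself, see the Ivy property below):

${WORKSPACE}/
  code/                                                   <- your Git checkout
  bin/conf/ivysettings.xml                                <- provided config file
  bin/conf/.pentaho-reporting-build-settings.properties   <- provided config file
  bin/ivy/                                                <- Ivy cache and local repository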

We will configure the Ant build to pick up these files instead of the files from your home directory. This again shields you and your build from unwanted interactions.

5. Finally, configure the actual build

Click the "Add build step" button and choose the "Invoke Ant" step. As usual, next select "Advanced..." to see all options.

jenkins-build-ant

Set the "Ant version" to the Ant installation you configured earlier on. Similar to what we did on the command line, we will execute the "continuous-local-junit" target. As we checked out the source code into a sub-directory, we have to tell Ant how to find its entry point by entering "code/build.xml" into the "Build file" text box.

Now set some properties for the build. These tell Ant how to find its configuration and configure Ivy to store all downloaded and published artefacts within the Jenkins workspace.

# Keep Ivy's cache and local repository inside the Jenkins workspace, so that
# the build resolves against the artefacts published by the last successful build
ivy.default.ivy.user.dir=${WORKSPACE}/bin/ivy

# Point the build towards our configuration files.
user.home=${WORKSPACE}/bin/conf

And finally, we have to give Ant a bit more memory than usual. The Pentaho Reporting build process is complex and involves a lot of work, so set the "Java Options" to "-XX:MaxPermSize=256M -Xmx1024m" to avoid "OutOfMemory" errors.
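Put together, the Ant step Jenkins executes is roughly equivalent to the following command line; Jenkins passes the properties as -D options and the "Java Options" via ANT_OPTS:

ANT_OPTS="-XX:MaxPermSize=256M -Xmx1024m" \
ant -f code/build.xml \
    -Divy.default.ivy.user.dir=$WORKSPACE/bin/ivy \
    -Duser.home=$WORKSPACE/bin/conf \
    continuous-local-junit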

6. Collect artefacts and test results

As a last step for now, we keep the final Report Designer ZIP file as an artefact, and we collect and aggregate the test results from all modules into one nice-looking report on the Jenkins project page.

jenkins-archive-artefacts

jenkins-archive-testresults

Now hit the "Save" button at the bottom and start your first build by choosing the "Build Now" link in the top left menu.

Congratulations, you now have a local CI server watching your commits and building and validating your project for you.


About Thomas

After working as all-hands guy and lead developer on Pentaho Reporting for over a decade, I have learned a thing or two about report generation, layout and general BI practices. I have witnessed the remarkable growth of Pentaho Reporting from a small niche product to an enterprise-class Business Intelligence product. This blog documents my own perspective on Pentaho Reporting's development process and our steps towards upcoming releases.