A quick introduction to crosstabs in Pentaho Reporting 4.0

Crosstabs have been on the horizon for several years now. They lived a happy, undisturbed life along with the unicorns and gnomes guarding the pot of gold at the end of the rainbow.

With an endless recession and central banks selling off their gold reserves, the unicorn has been sold to a meat factory and the gnomes now assemble luxury cell-phones in a chinese factory.

So the day had to come that crosstabs have to work for a living. That day is now.

The added crosstab capabilities are the head feature of Pentaho Reporting 4.0 (now to be released in April 2013). To make crosstabs possible, the reporting engine had to move away from the strict banded approach of top-down layouting. The engine now has a true table layout that ensures that even misbehaving crosstab definitions will look and behave good.

Some sort of crosstab abilities were in experimental mode for very much of the last 4 years. The good news: The current implementation fixes all the problems we had with that experimental stuff. The bad news: Your old crosstab definitions will most-likely not work in 4.0 – you will have to recreate them properly with the new version.

To spin off crosstabs quickly, the Pentaho Report Designer ships with a quick-and-dirty crosstab creation dialog. This dialog pops up whenever you add a crosstab group.

Until we update the report designer with a proper user interface, you will have to edit all crosstab properties and elements in the structure tree.

Be aware: The code is pretty much happy-path – stray too far away from it and the big bad wolf will eat you, your children and your data. Wolves are a protected species, so bear your share and help to feed them properly. Before you are fully digested – file bug reports and feature requests in our JIRA system.

Enough prose: Let me just list some of the notable current behaviours and general ramblings.

  • normalization works now, but it is not all-knowing. It will only normalize on existing data items. If you whole dataset has Q1, Q2 and Q4 but fails to mention Q3 at least once, you will not see a Q3 anywhere.
  • The engine wants its data raw and bloody. Denormalize your data! If you use OLAP datasources, use Denormalized version of these datasources.
  • Sort your data to match your group structure.
  • column-headers currently do not repeat on the subsequent pages.
  • Setting your cell-contents (or any other row-layout child) to a percentage width will yield a zero-width in the final layout. (PRD-3970)
  • the report wizard really does not know how to handle anything crosstab related. (PRD-3860)
  • you can add charts into a crosstab cell.
  • if there is more than one row of data for a single crosstab cell, you have the choice to print either the first item, the last item or all items in a large list.

Features that will come, but are not implemented yet:

  • butterfly headers: Move row-headers into the middle of the data. (PRD-4005)
  • Subreports on all crosstab cells (PRD-4006)
  • style- and attribute-expressions for all crosstab elements (PRD-4007)
  • The no-data-band on the crosstab-group has no effect. (PRD-4008)
  • Summary-rows and columns will be positioned either at the start or the end of the respective group. (PRD-4009)
  • Details header will be printed. (API is defined, but not used yet). Printing of them will be optional. (PRD-3949)
  • Header cells can either be spanned cells, or can repeat its content for each child. (PRD-3386)
  • There will be a switch to make row- and column-headers disappear. You can then use the crosstab purely as a layouting utility. (PRD-4010)
  • sorting of data. This will be optional and will blow up if you expect us to do big-data sorting. Get a real database with SQL support 😉

Now go and download a CI build of PRD-trunk, will you!?

This entry was posted in Development, Report Designer & Engine on by .
Thomas

About Thomas

After working as all-hands guy and lead developer on Pentaho Reporting for over an decade, I have learned a thing or two about report generation, layouting and general BI practices. I have witnessed the remarkable growth of Pentaho Reporting from a small niche product to a enterprise class Business Intelligence product. This blog documents my own perspective on Pentaho Reporting's development process and our our steps towards upcoming releases.

13 thoughts on “A quick introduction to crosstabs in Pentaho Reporting 4.0

  1. Madhu

    Goodbye to my mondrian days for creating simple crosstabs!! Maybe I can re-design few prpts which has dirty MDX to create normal cross tabs! Bring-em on!

  2. Vineet Bansal

    IN 3.9.1 PRD, is there a way to add sparkline line chart as the last cell entry in the crosstab report based on the grouping? If not is it going to be available in 4.0?

  3. Thomas Morgner

    The 3.9-codeline’s crosstabs is a dead branch. We have rewritten the majority of the code for 4.0.

    In 4.0, you can add a column-summary and in that you can add your sparkline. Assuming that the sparkline shall display the values from the details, you will also need to create a collector function to get the values you want to display. But overall: Possible.

  4. jakub

    hello,
    I was trying to make cross tab in 4.0, as U said.
    A have data in postgres (cos of building a house):

    room, element, quantity, cost, year
    “kitchen”;”chair”;2.00;300.00;2012
    “kitchen”;”chair”;2.00;300.00;2012
    “livingroom”;”chair”;4.00;400.00;2012
    “livingroom”;”chair”;1.00;500.00;2013
    “kitchen”;”fridge”;1.00;1500.00;2013
    “livingroom”;”fridge”;1.00;500.00;2013
    “livingroom”;”fridge”;1.00;700.00;2013
    “kitchen”;”ferniture”;1.00;10000.00;2013
    “livingroom”;”table”;1.00;1000.00;2013

    After making the crosstab I got duplicate values in rows (I dont’t why but it’s incorrect)

                             chair,      ferniture,     fridge,      table
    kitchen              600            0                 0                  0
    livingroom         900           0                 0                  0
    kitchen                  0            0              1500              0 
    livingroom            0            0              1200             0
    kitchen                  0         10000            0                0 
    livingroom            0            0                  0             1000
    

    Could You help me?

    1. ThomasThomas Post author

      Simple: Make sure your data is sorted correctly. Sort by the order of your declared groups, from the outer-most to the inner most.
      For crosstabs, sort by the row-groups first, then by column-groups.

      .. ORDER BY room, element

      and you will be safe.

      PS: Adding automatic sorting from within PRD is on our to-do list for this (4.0) release.

  5. Shail

    I am trying to create Crosstab report using Hierarchy structure,have achieved most of the parts using Crosstab other group(Crosstab Page Group) Functionality,It is showing correct values at first time,but once I close the report i won’t be open it again and it is showing “CASTING” error and the error because of Crosstab Other Group functionality.

    Please find the below sample example for more details.

    Crosstab January February March
    APAC
    America 500 700 765
    Australia 555 453 654
    EUROPE
    Germany 300 600 700
    France 543 800 500
    ASIA
    India 500 300 200
    China 700 800 900

    Is their any other way to achive this functionality in PRD or any Other way to eliminate CASTING error so that,reopen my report again without any problem?

    Any help on this will be much appreciated.

    Thanks,
    Shail

Comments are closed.