Anonymized reports – Report bugs without exposing your business

When you report a tricky bug, we need a sample report that reproduces the bad behaviour. The worst bugs are those that show up consistently on the customer’s system but refuse to appear on my own computer. The best bug reports contain a small sample report along with the data necessary to show the bug’s effects.

However, not everyone is willing or allowed to share sensitive data. If the bug occurs in your HR reports, it’s probably not the best idea to attach the report and data to a public JIRA case. If you live in the EU, disclosing personal data of non-consenting persons is a rather serious matter.

With Pentaho Reporting 4.0, creating good bug reports finally becomes easier.

Select “Extras->Anonymize Report” and your report’s text goes through a randomization process. Every character in a label is replaced by a randomly chosen character, while word length, punctuation and capitalization are preserved.
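To make the idea concrete, here is a minimal Python sketch of this kind of scrambling. It is not the code Pentaho Reporting uses; the function name anonymize_text is made up for illustration, and the choice to swap digits for random digits and to draw replacements from plain ASCII letters is my own assumption.

```python
import random
import string


def anonymize_text(text, rng=None):
    """Scramble letters and digits, keep everything else untouched.

    Hypothetical helper for illustration, not Pentaho Reporting's code.
    Each character maps to exactly one character, so word lengths,
    whitespace and punctuation survive unchanged.
    """
    rng = rng or random.Random()
    out = []
    for ch in text:
        if ch.isupper():
            # Uppercase stays uppercase, so capitalization is preserved.
            out.append(rng.choice(string.ascii_uppercase))
        elif ch.islower():
            out.append(rng.choice(string.ascii_lowercase))
        elif ch.isdigit():
            # Assumption: digits inside labels become random digits.
            out.append(rng.choice(string.digits))
        else:
            # Punctuation, spaces and other symbols are copied as-is.
            out.append(ch)
    return "".join(out)
```

Calling anonymize_text("Total Salary, Q3: Dept. HR") returns a string with the same shape, for example the same commas, colons and spaces in the same places, but with unreadable words. That is why the layout of the report barely changes.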

Then select the report’s query and choose “Extras->Anonymize Query”, and your data undergoes the same process. Numbers stay numbers, but are replaced by randomly chosen numbers of the same magnitude. Text and dates are scrambled too. Once this is finished, remove your old query from your data-source and your report will use the new query.
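As a rough illustration of what “same magnitude” can mean, the following Python sketch replaces a number with a random one that has the same sign and the same order of magnitude. This is only my reading of the behaviour described above, not the actual implementation; the name anonymize_number is hypothetical.

```python
import math
import random


def anonymize_number(value, rng=None):
    """Return a random number with the same sign and order of magnitude.

    Hypothetical helper for illustration. "Same magnitude" is taken to
    mean: same number of digits for integers, same power-of-ten range
    for floating-point values.
    """
    rng = rng or random.Random()
    if value == 0:
        return value
    sign = -1 if value < 0 else 1
    if isinstance(value, int):
        # Count digits directly to avoid floating-point rounding.
        exponent = len(str(abs(value))) - 1
        return sign * rng.randrange(10 ** exponent, 10 ** (exponent + 1))
    exponent = math.floor(math.log10(abs(value)))
    return sign * rng.uniform(10.0 ** exponent, 10.0 ** (exponent + 1))
```

With this sketch a salary of 98765 becomes some other five-digit number, and 1234.5 becomes some value between 1000 and 10000. Column widths and number formats therefore behave much like they did with the real data.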

Note that the query anonymization works on the preview data. If your query has parameters, the preview usually does not return data without proper parameter values. In that case, you need to rewrite your query to remove the parameters before you can anonymize it.

With this selective process we preserve most of the characteristics of the report that are important for the layout calculation, but remove most of the sensitive data that was contained in the report.


About Thomas

After working as all-hands guy and lead developer on Pentaho Reporting for over a decade, I have learned a thing or two about report generation, layouting and general BI practices. I have witnessed the remarkable growth of Pentaho Reporting from a small niche product to an enterprise-class Business Intelligence product. This blog documents my own perspective on Pentaho Reporting's development process and our steps towards upcoming releases.