Visual Comparison Test of PDF & Excel reports?


I have been looking a little more into advanced testing features, such as visually comparing snapshots of a web page between tests, which @kazurayam has demonstrated in some of his GitHub projects, for example kazurayam/VisualTestingInKatalonStudio. That Katalon Studio project performs two kinds of fully automated screenshot-comparison testing for WebUI: Twins mode compares 2 host names at a time, and Chronos mode compares the current view of an AUT to the previous one. It also provides instructions for making your own Katalon Studio project capable of the same screenshot comparison.

That got me thinking about comparing reports. Our website can generate a report as PDF and Excel files, and I was wondering whether it is even possible to compare these, visually or otherwise?


Katalon by itself is used for web testing, so unless your reports are generated and rendered in an HTML page, it cannot be done with existing keywords.

However, you can do pretty much anything with some custom keyword programming in Groovy, as @kazurayam has demonstrated.


You said you want to compare reports in PDF. Katalon Studio does not provide any built-in support for that.

But if you write the program yourself, you can do almost anything. Let me remind you, though: if you are going to write a program, you should do more requirements analysis before talking about which tool to use.

You want to compare 2 PDF files. But why do you want to compare them? What kind of “report” are you talking about? Is it Katalon Studio’s Test Suite report, or some kind of business report such as Annual Report 2018, Apple?

Can you get both files at once from some source storage? Or do you have to get one, store it somewhere, wait for some days, weeks, or months, and then get the updated version of the PDF with the same title? How do you want to see the comparison result: are you satisfied with a message on the command line, do you want to compile another PDF, or do you want to send it by e-mail? You have quite a lot of things to think about.

The final question is: is it worth developing?

Thanks for the replies.

So my idea was to automate the comparison of a report that is generated through a website. The report is literally an audit of what the Test Suite has done, e.g. log in as an Administrator, change some settings, log out.
I can automate the HTML auditing, as it's just a grid. However, since we can also export a PDF and an Excel document of the same audit, I thought it would be great to be able to automate those as well.
I was trying to see if anyone has created something similar for such a scenario.

In terms of developing something to do that I can see where it may be possible using Java and some libraries… If I get time in the next few weeks I will look into it (baby due soon!).

I suppose that your test would do 3 steps:

  1. visit your target website, find links to PDF and Excel files
  2. download the PDF and Excel files to local disk, rather than opening them in the browser
  3. do whatever verification you need on the downloaded PDF/Excel files

You can implement step 1 using WebUI.* keywords, and step 2 using WS.* keywords. You can mix WebUI.* and WS.* keywords in the same test case in Katalon Studio. The following discussion might help you:

Step 3 is something new. Enjoy development!
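As a rough illustration of step 2 outside of the WS.* keywords, here is a minimal plain-Java sketch that saves whatever a URL returns to a local file. The class name `ReportDownloader` and its `download` method are hypothetical names made up for this example. It assumes the report link can be fetched without a login; on a real site you would probably need to pass along the session cookies of the logged-in browser, which this sketch does not do.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper (not a Katalon API): saves the bytes behind a URL to disk. */
final class ReportDownloader {

    private ReportDownloader() {}

    /**
     * Copies the resource at sourceUrl into target, creating parent
     * directories as needed and overwriting any existing file.
     * Returns the number of bytes written.
     * Assumption: the URL is reachable without authentication.
     */
    static long download(URI sourceUrl, Path target) throws IOException {
        Path parent = target.toAbsolutePath().getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        try (InputStream in = sourceUrl.toURL().openStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

Since Katalon test cases and custom keywords are written in Groovy, a class like this could be called from a custom keyword once the link has been located with WebUI.* in step 1.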


We had a similar requirement to do a visual comparison of PDF files downloaded from our application. We stumbled across a very useful library to do this:

This works by doing a pixel-to-pixel comparison of the two files after they have been fully rendered, highlighting differences in various colors. As Kaz has hinted, this is obviously not a built-in Katalon capability, but you can of course pull a jar for this library into Katalon and then use it within the studio that way.
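To make the idea concrete, here is a minimal sketch of a pixel-to-pixel comparison in plain Java. It is not the library mentioned above and does not reproduce its API. It assumes each PDF page has already been rendered to a `BufferedImage` by some PDF renderer (not shown), and the names `PixelDiff`, `Result` and `compare` are hypothetical. Differences are marked in a single color (red) rather than several.

```java
import java.awt.Color;
import java.awt.image.BufferedImage;

/** Hypothetical helper: pixel-by-pixel comparison of two rendered pages. */
final class PixelDiff {

    private PixelDiff() {}

    /** Holds the number of differing pixels and an image that marks them in red. */
    static final class Result {
        final long differingPixels;
        final BufferedImage highlighted;

        Result(long differingPixels, BufferedImage highlighted) {
            this.differingPixels = differingPixels;
            this.highlighted = highlighted;
        }
    }

    /**
     * Compares two images pixel by pixel. Areas that exist in only one
     * image (because the sizes differ) are counted as differences.
     */
    static Result compare(BufferedImage expected, BufferedImage actual) {
        int width = Math.max(expected.getWidth(), actual.getWidth());
        int height = Math.max(expected.getHeight(), actual.getHeight());
        BufferedImage out = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        long differing = 0;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                boolean inExpected = x < expected.getWidth() && y < expected.getHeight();
                boolean inActual = x < actual.getWidth() && y < actual.getHeight();
                if (inExpected && inActual && expected.getRGB(x, y) == actual.getRGB(x, y)) {
                    // Same pixel in both images: copy it through unchanged.
                    out.setRGB(x, y, expected.getRGB(x, y));
                } else {
                    // Differing or missing pixel: highlight it.
                    out.setRGB(x, y, Color.RED.getRGB());
                    differing++;
                }
            }
        }
        return new Result(differing, out);
    }
}
```

A test could then fail when `differingPixels` exceeds some tolerance and save `highlighted` as a PNG for review. An exact match is often too strict for rendered documents, for instance when a report contains a generation timestamp.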


Do you want to compare multiple versions of your website’s report taken at different times (morning, lunch time, afternoon tea time, evening) and find the differences between these versions?

Then I think you will face a big technical challenge. That is, you would need to store these versions in a folder tree like this:

<base>/<TestSuite Id>/<TestSuite Timestamp>/<TestCase Id>/<subpath>/<filename.pdf>

Your step2 script would store downloaded files into this folder tree. The tree format keeps downloaded files manageable/identifiable for the step3 script. You would need a set of access methods to the files in the tree.
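One possible shape for such an access method, as a sketch only: the class below builds a path that follows the folder layout above. The name `ReportStore`, the `resolve` signature and the `yyyyMMdd_HHmmss` timestamp format are all assumptions made for this example, not part of any existing project.

```java
import java.nio.file.Path;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper: maps report metadata onto the folder tree described above. */
final class ReportStore {

    // Assumed timestamp format; it sorts chronologically as plain text.
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMdd_HHmmss");

    private final Path base;

    ReportStore(Path base) {
        this.base = base;
    }

    /**
     * Builds <base>/<TestSuite Id>/<TestSuite Timestamp>/<TestCase Id>/<subpath>/<filename>.
     * An empty subpath is skipped. Nothing is created on disk here.
     */
    Path resolve(String testSuiteId, LocalDateTime suiteTimestamp,
                 String testCaseId, String subpath, String fileName) {
        return base.resolve(testSuiteId)
                .resolve(STAMP.format(suiteTimestamp))
                .resolve(testCaseId)
                .resolve(subpath)
                .resolve(fileName);
    }
}
```

With a layout like this, the step 3 script can find the previous version of a file by listing the timestamp folders under a given Test Suite Id and picking the one just before the current run.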

I struggled with this problem (for screenshot images) over the last 9 months and arrived at a good solution for it. See the following post: