How to verify the text in a PDF

Hi, I am a beginner in Katalon Studio. How can I verify the text in a PDF popup?

Use a Switch To Window Index (or Switch To Window Url or Switch To Window Title) step, then use a Verify Text Present step.

I already did what you showed, but it didn’t work. The text could not be verified.

I am using Switch To Window Url and it is not working for me either. The URL has the form http : // mycompany.test/services/docoment/id-number (the URL does not have a .jpg after the file name). The Switch To Window keyword succeeds, but Verify Text Present fails to find the text.

The PDF is displayed using a Chrome extension, and Katalon cannot ‘see’ the contents of the PDF page, only the HTML attributes used to display the embedded PDF.

Does Katalon have any way to inspect the actual contents of an embedded PDF?

If not, has anyone created a custom keyword to do this? From what I’ve read on Stack Overflow using Selenium, if the file can be downloaded, it can be parsed using PDFBox. I may look into that if it’s the only option.



There is currently no built-in way to inspect an embedded PDF, so I think you’d have to download, parse and then write Java code to verify the text.
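As a rough sketch of the download step, the snippet below copies the PDF from its URL to a local file using only the JDK, then checks that the file starts with the `%PDF-` header before it is handed to a parser. The class name `PdfDownload` and both method names are made up for illustration. It also assumes the URL can be fetched without the browser's session cookies; if the document sits behind a login, the request would need those cookies attached, which is not shown here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

// Hypothetical helper, not part of Katalon or PDFBox.
public class PdfDownload {
    // Every PDF file begins with the ASCII bytes "%PDF-".
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    // Copies whatever the URL serves into target, overwriting any existing file.
    public static Path download(URL source, Path target) throws IOException {
        try (InputStream in = source.openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }

    // Returns true if the file starts with the PDF header. This catches the
    // common case where the server returned an HTML login or error page instead.
    public static boolean looksLikePdf(Path file) throws IOException {
        byte[] head = new byte[PDF_MAGIC.length];
        try (InputStream in = Files.newInputStream(file)) {
            int read = in.read(head);
            return read == PDF_MAGIC.length && Arrays.equals(head, PDF_MAGIC);
        }
    }

    // Usage: java PdfDownload <url> <target-file>
    public static void main(String[] args) throws IOException {
        Path saved = download(URI.create(args[0]).toURL(), Paths.get(args[1]));
        System.out.println(saved + " looks like a PDF: " + looksLikePdf(saved));
    }
}
```

Checking the header first means a failed download shows up as a clear message rather than as a parser exception later on.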


You can try this keyword: PDF Keywords

The most standard way to parse text from a pdf file is through Apache’s PDFBox. The idea would be to download the pdf file and parse the text content out of it (as @ThanhTo said, unfortunately you won’t be able to parse text directly from the pdf viewer in the browser). The syntax looks something like:

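import java.io.File;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

File file = new File("path/to/file");
PDDocument document = PDDocument.load(file); // PDFBox 2.x; 3.x uses Loader.loadPDF(file)
String text = new PDFTextStripper().getText(document); // all pages as one string
document.close(); // release the file handle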
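Once you have the `text` string, the verification itself is a string comparison. Extracted PDF text usually contains line breaks and runs of spaces wherever the layout wraps, so a plain `contains` can miss a phrase that looks continuous on the page. Here is a minimal sketch that collapses whitespace on both sides before comparing; the class `PdfTextCheck` and its methods are hypothetical names, and the sample string stands in for real extractor output.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Katalon or PDFBox.
public class PdfTextCheck {
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Collapses every run of spaces, tabs and line breaks into a single space.
    public static String normalize(String s) {
        return WHITESPACE.matcher(s).replaceAll(" ").trim();
    }

    // True if expected occurs in extracted once whitespace differences are ignored.
    public static boolean containsText(String extracted, String expected) {
        return normalize(extracted).contains(normalize(expected));
    }

    public static void main(String[] args) {
        // Stand-in for the string returned by the text extractor.
        String extracted = "Invoice   No. 12345\nTotal amount\r\n  due: 100.00";
        System.out.println(containsText(extracted, "Total amount due: 100.00")); // prints true
        System.out.println(containsText(extracted, "Invoice No. 99999"));        // prints false
    }
}
```

In a test case, the boolean result could be passed to whatever assertion the test framework provides.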

Thanks, that’s the direction I was going.
I will also try the PDF Viewer plugin.

And of course, I’ll report back here what I find works.


Hi Thanh,
Is there any update on how to extract the value from a PDF that is embedded at a URL, as described above?