Katalon PDF Comparison

Hi Katalon Team,

I am planning to start working with Katalon Studio, but before I start I want to check: does Katalon provide any functionality to compare two PDF files? If not, can we use a plug-in such as “BeyondCompare” with Katalon to compare two PDF files?

Thanks,
Shubham Joshi

Hi @shubhamjoshi12

Katalon doesn’t offer any native method to do what you’re asking for. Sorry about that.

@kazurayam, I vaguely remember you built something related to this, or was it image comparison?

Regards !

You can use Groovy to call BeyondCompare CLI.
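
A rough sketch of what launching Beyond Compare from a test could look like, written in Java (which Groovy also accepts). It assumes the Beyond Compare executable takes the two files as positional arguments and uses no command-line switches. The class name, method names, and the executable path passed in are hypothetical, so check the Beyond Compare command-line reference for your version before relying on it.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

final class BeyondCompareLauncher {
    // Builds the command line: the executable followed by the two files to compare.
    static List<String> command(String executable, String left, String right) {
        return Arrays.asList(executable, left, right);
    }

    // Starts the tool, waits for it to exit, and returns the process exit code.
    static int run(String executable, String left, String right)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command(executable, left, right))
                .inheritIO()
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Expects: <path to Beyond Compare executable> <first file> <second file>
        if (args.length != 3) {
            System.out.println("usage: BeyondCompareLauncher <executable> <left> <right>");
            return;
        }
        System.out.println("exit code: " + run(args[0], args[1], args[2]));
    }
}
```

Keeping the command building separate from the process launch makes it easy to add switches later without touching the code that runs the tool.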

You can also import and use this lib:

Its use is pretty straightforward, and it provides a good visual comparison.

Hello,

try with this util

hi,

I tried it and it works fine, though I only tested PDF files without any images.

The content of my PDF files:

This is first demo pdf page

human
cat
dog
snake
worm
eagle
rabbit

This is second demo pdf page

human
cat
dog
squirrel
worm
eagle
rabbit

I added the .jar to the Drivers folder.

In the test class:
import com.testautomationguru.utility.PDFUtil;

PDFUtil pdfUtil = new PDFUtil();

// returns the pdf content from page number 1
String file_1 = pdfUtil.getText("C:/Users/xxxx/Desktop/data/file1.pdf",1);
String file_2 = pdfUtil.getText("C:/Users/xxxx/Desktop/data/file2.pdf",1);

def diff1 = CustomKeywords.'readPdfFile.verifyPdfContent.findNotMatching'(file_1, file_2)
def diff2 = CustomKeywords.'readPdfFile.verifyPdfContent.findNotMatching'(file_2, file_1)

println "pages have differences file1 have words "+diff1+ " which are not in file2, file2 instead have "+diff2

Output:
pages have differences file1 have words [first, snake] which are not in file2, file2 instead have [second, squirrel]


Custom Keyword
	// Returns the words in sourceStr that do not occur anywhere in anotherStr.
	// Both strings are split on single spaces.
	@Keyword
	public List<String> findNotMatching(String sourceStr, String anotherStr){
		List<String> missingWords = new ArrayList<String>();
		StringTokenizer at = new StringTokenizer(sourceStr, " ");
		while (at.hasMoreTokens()) {
			String token = at.nextToken();
			boolean found = false;
			// Re-tokenize anotherStr for every source word and scan it for a match.
			StringTokenizer bt = new StringTokenizer(anotherStr, " ");
			while (bt.hasMoreTokens()) {
				if (token.equals(bt.nextToken())) {
					found = true;
					break;
				}
			}
			if (!found)
				missingWords.add(token);
		}
		return missingWords;
	}
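
For comparison, here is a minimal set-based sketch of the same idea in plain Java. It is not part of PDFUtil or Katalon, and the class name WordDiff is made up for illustration. Unlike the keyword above, it splits on any whitespace (an assumption, so that words separated only by line breaks are still treated as separate words) and looks each word up in a HashSet instead of re-scanning the second string.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class WordDiff {
    // Returns the words of source that never occur in other, in source order.
    // Assumption: words are separated by any run of whitespace.
    static List<String> findNotMatching(String source, String other) {
        Set<String> otherWords = new HashSet<>(Arrays.asList(other.trim().split("\\s+")));
        List<String> missing = new ArrayList<>();
        for (String word : source.trim().split("\\s+")) {
            if (!word.isEmpty() && !otherWords.contains(word)) {
                missing.add(word);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // The two demo pages from this post, written out as plain strings.
        String page1 = "This is first demo pdf page\nhuman\ncat\ndog\nsnake\nworm\neagle\nrabbit";
        String page2 = "This is second demo pdf page\nhuman\ncat\ndog\nsquirrel\nworm\neagle\nrabbit";
        System.out.println(findNotMatching(page1, page2)); // [first, snake]
        System.out.println(findNotMatching(page2, page1)); // [second, squirrel]
    }
}
```

Like the keyword, this only reports which words are missing. It does not report where in the page they differ, and a word that merely moved to another position is not flagged.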

@Timo_Kuisma

Hi Timo,

I've used:

This is second demo pdf page
human
cat
dog
squirrel
bernard
eagle

This is second demo pdf page
human
cat
dog
squirrel
worm
eagle
rabbit

My output for this doesn't seem to detect the mismatch.

I've tried other libraries, but I think this PDFUtil fits what I need to do.

These are the lines that use your keyword, but it doesn't seem to detect the difference:

String file1 = pdfUtil.getText('c:/temp_pdf/1.pdf', 0)
String file2 = pdfUtil.getText('C:/Users/bernard.tan/BIP_REPORTS/BIP_REPORTS/temp_pdf/2.pdf', 0)

def diff1 = CustomKeywords.'findNotmatching.findNotmatching.findNotMatching'("file1", "file2")
def diff2 = CustomKeywords.'findNotmatching.findNotmatching.findNotMatching'("file2", "file1")

println "pages have differences file1 have words"+ diff1 + " which are not in file2, file2 instead have "+ diff2

I've also tried this method, but PDFUtil doesn't seem to find the difference either:

pdfUtil.compare("c:/temp_pdf/1.pdf", "C:/Users/bernard.tan/BIP_REPORTS/BIP_REPORTS/temp_pdf/2.pdf", 0)

Am I missing something?

Hi,

Here:
String file1 = pdfUtil.getText('c:/temp_pdf/1.pdf', 0)
you typed 0. That number is the page to be verified (1…N).
Change it to 1 and you will be able to see the differences.

Read this page again.
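
One more thing that may be worth checking in the snippet above: the keyword is called with the quoted strings "file1" and "file2", so it compares those two words rather than the text extracted from the PDFs. The sketch below illustrates the difference in plain Java. The missingWords helper is a hypothetical stand-in for the findNotMatching keyword, and the two strings stand in for what pdfUtil.getText(path, 1) might return for the two demo pages.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class LiteralVsVariable {
    // Hypothetical stand-in for the findNotMatching keyword: words of a that are not in b.
    static List<String> missingWords(String a, String b) {
        List<String> bWords = Arrays.asList(b.trim().split("\\s+"));
        List<String> missing = new ArrayList<>();
        for (String word : a.trim().split("\\s+")) {
            if (!bWords.contains(word)) {
                missing.add(word);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Stand-ins for the text extracted from page 1 of each PDF.
        String file1 = "human cat dog squirrel bernard eagle";
        String file2 = "human cat dog squirrel worm eagle rabbit";

        // Quoted names: compares the words "file1" and "file2", not the PDF text.
        System.out.println(missingWords("file1", "file2")); // [file1]

        // Variables: compares the extracted text.
        System.out.println(missingWords(file1, file2)); // [bernard]
        System.out.println(missingWords(file2, file1)); // [worm, rabbit]
    }
}
```

So besides changing the page number to 1, passing file1 and file2 without quotes should let the keyword see the actual page content.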

Thanks, mate.

Will check it out.

Hi Timo,
I have downloaded the PDFUtil .jar file for PDF comparison, but the file is corrupted and I cannot use it.
Can you please provide the .jar that is needed to run this kind of code?

Thank You.

Hello,

What do you mean by corrupted?
Download it again: pdfutil.7z (3.9 MB)

Thank You.
I will check it out.

Hi Timo,

The pdfutil jar above works well in version 6.1.5, but in the latest version, 7.2.1, it gives the error below:

org.apache.pdfbox.pdmodel.font.PDType0Font toUnicode
WARNING: No Unicode mapping for CID+4 (4) in font 23E5617ArialUnicodeMSBold