Hi,
I have a PDF that opens at a URL.
I am able to switch to the URL and read the full content of the PDF.
My requirement:
I need to fetch the value of the Total field from the PDF.
Hello,
Something like this should work:
import org.apache.pdfbox.pdmodel.PDDocument
import org.apache.pdfbox.text.PDFTextStripper

// PDFBox 2.x API; in PDFBox 3.x use org.apache.pdfbox.Loader.loadPDF(file) instead
PDDocument document = PDDocument.load(new File("C:\\Users\\xxxx\\Desktop\\pdf\\pdfcontent.pdf"))
try {
    if (!document.isEncrypted()) {
        // extract the text of the whole document
        PDFTextStripper tStripper = new PDFTextStripper()
        String pdfFileInText = tStripper.getText(document)
        //println("Text:" + pdfFileInText)

        // split the text into lines
        def lines = pdfFileInText.split("\\r?\\n")
        //println("Textlines:" + lines)

        // store every line in its own list (list of lists)
        List<ArrayList<String>> listOfLists = new ArrayList<ArrayList<String>>()
        for (String line : lines) {
            println line
            ArrayList<String> l = new ArrayList<>()
            l.add(line)
            listOfLists.add(l)
        }

        // pick single words by position: line index first, then word index
        def word = listOfLists.get(0).get(0).split(" ")
        println(word[3])
        def word1 = listOfLists.get(1).get(0).split(" ")
        println(word1[5])
        def word2 = listOfLists.get(2).get(0).split(" ")
        println(word2[5])

        List<String> total = new ArrayList<>()
        // start from line 1 because line 0 is the header line;
        // note that the bound size() - 1 also skips the last line
        for (int i = 1; i < listOfLists.size() - 1; i++) {
            def wd = listOfLists.get(i).get(0).split(" ")
            // use the index of the word you need here
            total.add(wd[5])
        }
        println(total)
    }
} finally {
    // always release the file handle
    document.close()
}
Output:
GREEN
YELLOW
[GREEN, YELLOW]
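Since the original question was about the Total field specifically, a label-based lookup may be less fragile than fixed word indexes. The following is only a sketch in plain Java (it should also paste into a Groovy script). `TotalExtractor` and `findTotal` are hypothetical names, not part of PDFBox or Katalon, and it assumes the extracted text has a line that starts with the label "Total", optionally followed by a colon, with the value on the rest of that line.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of PDFBox or Katalon.
class TotalExtractor {

    // Assumption: the label "Total" starts a line (any letter case), is
    // optionally followed by ':', and the value is the rest of that line.
    private static final Pattern TOTAL_LINE =
            Pattern.compile("(?im)^[ \\t]*Total\\b[ \\t]*:?[ \\t]*(\\S.*?)[ \\t]*$");

    // Returns the text after the first "Total" label, or empty if no such line exists.
    static Optional<String> findTotal(String pdfText) {
        Matcher m = TOTAL_LINE.matcher(pdfText);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Made-up sample text standing in for the output of PDFTextStripper.getText(...)
        String sample = "Item Qty Price\nWidget 2 10.00\nTotal: 20.00\n";
        System.out.println(findTotal(sample).orElse("Total not found"));
    }
}
```

You would pass the pdfFileInText string from the script above to findTotal. If your PDF puts the label and the value on different lines, or uses a label such as "Total Amount", the pattern has to be adjusted to match.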
my pdf content is
Thank you for the code.
It worked well.
Hi,
It seems that after the latest Katalon update this code returns an error:
unable to resolve class org.apache.pdfbox.pdmodel.PDDocument.
Any help?
Hi,
Download the correct PDFBox package and add the JAR to the project's Drivers folder:
https://pdfbox.apache.org/download.cgi
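To confirm that the JAR is actually picked up after adding it, a small lookup like the one below can help. It is a minimal sketch: `ClasspathCheck` and `isClassAvailable` are hypothetical names, and it assumes the script runs with the same class loader that would load PDFBox, which may not hold in every Katalon setup.

```java
// Hypothetical diagnostic helper; not part of PDFBox or Katalon.
class ClasspathCheck {

    // Returns true if the named class can be found by this class's loader.
    static boolean isClassAvailable(String className) {
        try {
            // initialize = false: only look the class up, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name taken from the error message above
        String name = "org.apache.pdfbox.pdmodel.PDDocument";
        System.out.println(name
                + (isClassAvailable(name) ? " is on the classpath" : " is NOT on the classpath"));
    }
}
```

If this reports that the class is missing, the JAR is probably not in the Drivers folder or the project needs to be reopened so the library is loaded.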