Can I verify Excel data download from website

I hate to jump in here and bug you, but I’m really close to figuring out how to do this (at least in my mind), however no matter what I seem to try I’m never able to read the text of a cell in my excel file.

I do not have a Java/Groovy programing background (although my hope is to learn) so bear with me (please). So far I’ve been able to use the download manager to successfully grab the last file in a specific directory. I’m able to print out the file I’m expecting is being found, but when I try to pass this into the Excel reader and then output the text of the first cell (that’s the only thing in my file until I figure out how this works) I always get an error. I’m not sure what I’m doing wrong.

I hope I’m just making a stupid mistake here but I know this is a complex topic so I’m already in over my head but I really need to check excel file contents and I feel I’m really close to cracking this nut.

1 Like

OMG guys, I’m an idiot; there was no text in the cell I was referencing (1,1), I did not realize the first row and column are treated as 0,0. After playing around and head shaking I think I’ve got a basic place to start from. I am curious why it would fail to print out the value of an empty cell though? I guess I will have to play around and come up with a solution for dealing with that, although in most cases I’m going to be expected specific values in specific cells. Now all this said, I could still be missing something so if you have an wisdom to pass along please do so.

They key point here is ‘empty cell’. When using Excel, a cell may appear empty in the GUI, but behind the scenes, the cell doesn’t actually exist.

Think of it this way: when you create a new workbook in Excel, it creates a spreadsheet with a bunch of “empty” cells, at least as far as you’re concerned as a human looking at it visually. But of course, a spreadsheet can be of basically infinite size (not really, but hopefully you get my point). So instead of creating an “infinite” number of empty cells, Excel creates a spreadsheet with zero cells, and waits for you, the user, to enter data, at which point a cell “object” is created, with whatever data you give it. In other words, it isn’t until data is actually given to the cell (even if that data is an empty string) that the cell is even created in the first place.

So in your workbook, if you reference a cell that doesn’t technically exist yet (even though visually it looks like an “empty cell”), you will inevitably get a NullPointerException.

2 Likes

Makes complete sense. Thanks @Brandon_Hein for all the info you put on this topic, this was extremely helpful and a great learning experience for me.

1 Like

When I call “waitForDownload()” on Firefox, it works fine and it does wait (although very quickly and subtly) for the file to download. However, the same method call on Chrome gets stuck on waiting for download, despite seeing it appear in the downloads bar at the bottom of the screen.

Do you have any insight as to why it’s getting stuck on Chrome? Do you have any special Desired Capabilities set in Katalon Studio settings? I would hate to rely on a hardcoded “Delay” step to ensure the file is downloaded before I do anything with it.

EDIT: It turns out I set my Chrome download default directory to the wrong place. So the method “waitForDownload” kept waiting and waiting because the file was never sent to the directory I expected it to. Once I fixed the default download directory path, the method executed as expected.

Thanks again for providing that very helpful class!

Hi guys, newbie here.
I’m trying to verify the excel sheet but I got these errors.

mydocument = new exceldocument.ExcelDocument(dmf) FAILED.
Reason:
groovy.lang.GroovyRuntimeException: Ambiguous method overloading for method org.apache.poi.ss.usermodel.WorkbookFactory#create.
Cannot resolve which method to invoke for [null] due to overlapping prototypes between:
[class java.io.File]
[class java.io.InputStream]
[class org.apache.poi.openxml4j.opc.OPCPackage]
[class org.apache.poi.poifs.filesystem.POIFSFileSystem]
at exceldocument.ExcelDocument.(ExcelDocument.groovy:15)
at test.run(test:21)
at com.kms.katalon.core.main.ScriptEngine.run(ScriptEngine.java:194)
at com.kms.katalon.core.main.ScriptEngine.runScriptAsRawText(ScriptEngine.java:119)
at com.kms.katalon.core.main.TestCaseExecutor.runScript(TestCaseExecutor.java:331)
at com.kms.katalon.core.main.TestCaseExecutor.doExecute(TestCaseExecutor.java:322)
at com.kms.katalon.core.main.TestCaseExecutor.processExecutionPhase(TestCaseExecutor.java:301)
at com.kms.katalon.core.main.TestCaseExecutor.accessMainPhase(TestCaseExecutor.java:293)
at com.kms.katalon.core.main.TestCaseExecutor.execute(TestCaseExecutor.java:227)
at com.kms.katalon.core.main.TestCaseMain.runTestCase(TestCaseMain.java:114)
at com.kms.katalon.core.main.TestCaseMain.runTestCase(TestCaseMain.java:105)
at com.kms.katalon.core.main.TestCaseMain$runTestCase$0.call(Unknown Source)
at TempTestCase1567575881974.run(TempTestCase1567575881974.groovy:21)

this is my test case looks like
import static com.kms.katalon.core.checkpoint.CheckpointFactory.findCheckpoint
import static com.kms.katalon.core.testcase.TestCaseFactory.findTestCase
import static com.kms.katalon.core.testdata.TestDataFactory.findTestData
import static com.kms.katalon.core.testobject.ObjectRepository.findTestObject
import com.kms.katalon.core.checkpoint.Checkpoint as Checkpoint
import com.kms.katalon.core.cucumber.keyword.CucumberBuiltinKeywords as CucumberKW
import com.kms.katalon.core.mobile.keyword.MobileBuiltInKeywords as Mobile
import com.kms.katalon.core.model.FailureHandling as FailureHandling
import com.kms.katalon.core.testcase.TestCase as TestCase
import com.kms.katalon.core.testdata.TestData as TestData
import com.kms.katalon.core.testobject.TestObject as TestObject
import com.kms.katalon.core.webservice.keyword.WSBuiltInKeywords as WS
import com.kms.katalon.core.webui.keyword.WebUiBuiltInKeywords as WebUI
import internal.GlobalVariable as GlobalVariable
import downloadmanager.DownloadManager
import exceldocument.ExcelDocument

DownloadManager dm = new DownloadManager()
File dmf = dm.getDownloadedFile('Web Directory 2019-09-04.xls')
println(dmf)
ExcelDocument mydocument = new ExcelDocument(dmf)
println(mydocument.getCellText(0, 0))

Apologies for replying to an old post of yours, but I had some concerns and questions.

I’ve been using this DownloadManager class and it’s been really helpful except there’s been some timing issues and inconsistencies. I’ll click on an object which starts a download, but often times the download will finish before the waitForDownload() method has begun to execute, which leads to an “infinite” while loop due to ‘fileCount’ and ‘expectedFileCount’ being the same.

Do you have any ideas to workaround or fix this timing inconsistency (code-wise)? Some sort of retry mechanism perhaps? Any help in the right direction would be appreciated!

It would not loop forever, rather, it would wait at most 5 minutes:

public static void waitForDownload() {
	File downloadDirectory = new File(downloadPath);
	int fileCount = downloadDirectory.listFiles().length;
	int expectedFileCount = fileCount + 1;
	long startTime = System.currentTimeMillis();
	while(fileCount < expectedFileCount && (System.currentTimeMillis() - startTime) < 300000) {
		fileCount = downloadDirectory.listFiles().length;
		Thread.sleep(1000);
	}
}

The relevant code regarding this is:

(System.currentTimeMillis() - startTime) < 300000

You could mitigate this by either reducing the time down from 300000 ms, or you might even pass the max wait time in as a parameter to the method.

I knew there was an upper time limit so that’s why I put “infinite” in quotation mark since I was too lazy to convert all those seconds to minutes :sweat_smile: Edit: Actually, it didn’t occur to me that once the 5 minutes were up, it would simply grab the file that had downloaded too quickly. Silly me. Appreciate your help!

Spent several minutes trying to understand how your proposed solution would help solve the inconsistency in the ‘fileCount’ values until I actually realized what you were proposing. By decreasing the timer (say 10 seconds), it would wait 10 seconds, then I can grab the last download file.

Although, this does assume that the expected file was downloaded. By comparing the file counts, you ensured that a new file was downloaded and waited long enough for it to do so. But in my case, if I were to decrease the timer, I’d be assuming the expected file had downloaded but can’t be 100% sure since the ‘file count’ values are the same. So I might end up grabbing some miscellaneous file from my Downloads directory. But that’s probably not that serious of an issue since the next lines in my script verify the name of the last downloaded file, so if the expected file wasn’t downloaded, I’ll know then.

You are right, this is a limitation. You’ll notice though that this is why I have two methods to actually retrieve the file:

1.) getLastDownloadedFile()
2.) getDownloadedFile(final String fileName)

So if you have a priori knowledge of the file name, you can use #2 :wink:

Hi Brandon

When i am running my code , i am getting dmf = null? What could i be doing wrong here? Any solution to my problem will be really appreciated.

Thanks
Monty

Hi

I am getting the same error . is it because of any folder permissions where the file is getting downloaded? How did u solve this issue?

BR
Monty

what is the third party library need to download from the below link?
https://poi.apache.org](https://poi.apache.org/

https://poi.apache.org/download.html