KS 8.1.0 stuck in running state (after more than 5 days of a long-haul test) - need help generating a partial report

Hi,
I’m running a data-driven test that makes WinSocket API calls (telnet) to an application (not sure if that is relevant, but disclosing it), and each iteration takes about 30 seconds to finish. The data files have 16K entries and I’m about halfway through processing them (the run needs to be sequential, so it takes a lot of time). At the halfway point the KS screen froze and Katalon’s CPU usage dropped to zero, while the application we are testing against still responds to the interface call (which rules out the application being stuck). KS memory usage is around 2 GB (I have a 64 GB system) and there is plenty of memory available (no pressure on memory). My immediate question: is there a way to capture the results of the test so far, so I don’t need to re-run the tests from the beginning?
Thanks in advance!

  • David

If I understand your issue, the test is continuing but KS itself appears to have locked up?

If that’s the case, I would let it continue – it may recover.

:crossed_fingers:

If KS does eventually crash, I doubt you will get a report out of it.

Having said that, there may be a temporary file being built somewhere – maybe these guys have an idea:

@kazurayam @Brandon_Hein @duyluong @ThanhTo

Feature Request

Devs – can I suggest that this is taken up as a feature request? A means to access on-going results as a test/suite is running? If KS is waiting until the test is complete to flush a report to disk, that’s really not good in long haul tests like this (@duyluong recall I have a test that runs for 13 hours overnight).

Thanks!

The KS test no longer seems to be running (not crashed, but stuck somewhere), judging by the Runs number at the bottom, which is not changing (e.g. “Runs: 8518/16381 Passes: 8476 Failures: 42”). CPU usage of KS (and its child processes) is at zero. I’m just asking whether there is a way to flush a report or results. I’m using KeywordUtil to mark the test as passed or failed, with information that is critical for my testing. Note that this test had been running continuously for multiple days (since Oct 28) and seems to have got stuck yesterday (Nov 4). I have kept it running (no change in the UI) in the hope that it will recover (no luck so far).

Thanks again for the quick response.

  • David

I think there isn’t any way.


See another post:

Your problem might be the same as this one, isn’t it?

You just want to know which data row caused failure, don’t you?

Are you sure about the ‘no pressure’ sentence?
Check the heap size settings in your katalon.ini file (Xms and Xmx).
You may have 64 GB available on the system, but much less for the JVM (most probably it is limited to 2 GB :smiley:).

Are you really sure that 16381 entries in the data file must be processed as a single sequential batch?

Are you sure there is no chance to split this massive data into smaller sections? For example, 160 entries per section, times 103 sections.
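
To make the arithmetic of that suggestion concrete, here is a minimal sketch of splitting an ordered list of rows into fixed-size sections. `DataSections` and `split` are hypothetical names for illustration only; nothing here is a Katalon Studio API, and how each section would then be turned into a data file is left open.

```java
import java.util.ArrayList;
import java.util.List;

final class DataSections {
    // Splits rows into consecutive sections of at most sectionSize rows, keeping the original order.
    static <T> List<List<T>> split(List<T> rows, int sectionSize) {
        if (sectionSize <= 0) {
            throw new IllegalArgumentException("sectionSize must be positive");
        }
        List<List<T>> sections = new ArrayList<>();
        for (int start = 0; start < rows.size(); start += sectionSize) {
            int end = Math.min(start + sectionSize, rows.size());
            sections.add(new ArrayList<>(rows.subList(start, end)));
        }
        return sections;
    }

    public static void main(String[] args) {
        // Stand-in for the 16381 data rows mentioned in the thread.
        List<Integer> rows = new ArrayList<>();
        for (int i = 1; i <= 16381; i++) {
            rows.add(i);
        }
        List<List<Integer>> sections = split(rows, 160);
        int lastSize = sections.get(sections.size() - 1).size();
        System.out.println(sections.size() + " sections, last one has " + lastSize + " rows");
    }
}
```

With 16381 rows and 160 rows per section this gives 102 full sections plus one final section of 61 rows, so a hang costs at most one section's worth of results.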

I’ve reduced mine to 1GB.

If THAT turns out to be the problem, that must be a really :poop: test case.

@david.casiano next time you run the test, turn the heap monitor on first.


At the point that Katalon gets stuck, take a screenshot of the heap monitor.

@bionel Welcome back, stranger :beer:

Hi @david.casiano,

I’m running a data driven test that does call WinSocket APi call (telnet) to an application (not sure if it is relevant but disclosing) and it will take about 30 seconds for 1 iteration to finish.

Do you use a for-loop in this case?
How many iterations should be called to complete your test?
What does your script look like?

It does not need to be processed in one go, but that is the least amount of work :slight_smile: I got a suggestion to separate it into different tests in which only the data varies.

Ok, wrong assumption on my part then. I’m running on a 64-bit system, so I wrongly assumed that the installed JVM and KS would both be 64-bit as well (in which case memory use this small would not be an issue). A 32-bit application would have issues, though.

I don’t have a for-loop in the code. It relies on KS to iterate over the data in the data files through the test suite. Each iteration does:
  • Open connection
  • Do something
  • Close connection

Yes. I want to know the data variables (the values fed in from the data files) for the iteration that failed.

I put that information in the report by passing it as the string parameter of KeywordUtil.markFailed, so I can tell which data failed.

It would be great too if you could do something to show which data iteration fails. My “workaround” for now is to use KeywordUtil to mark the test as passed or failed together with the relevant data. The immediate result view does not show the data, so you have to look into the log details.
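
One way to keep partial results without depending on the final report is to append each iteration's outcome to a plain file as soon as the iteration finishes. The sketch below is only an illustration under assumptions: `IterationLog`, `record`, and the comma-separated layout are hypothetical and not part of Katalon Studio, and it assumes the test script is free to do ordinary Java file I/O next to its KeywordUtil call.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class IterationLog {
    // Appends one line per iteration and closes the file every time,
    // so the line is written out before the next iteration starts.
    // Naive format: values containing commas or line breaks are not escaped.
    static void record(Path logFile, int row, String input, boolean passed, String detail)
            throws IOException {
        String line = row + "," + input + "," + (passed ? "PASS" : "FAIL") + "," + detail;
        try (BufferedWriter writer = Files.newBufferedWriter(
                logFile, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            writer.write(line);
            writer.newLine();
        }
    }

    public static void main(String[] args) throws IOException {
        Path logFile = Files.createTempFile("iterations", ".csv");
        record(logFile, 1, "host-a", true, "ok");
        record(logFile, 2, "host-b", false, "no reply within timeout");
        System.out.println(Files.readAllLines(logFile, StandardCharsets.UTF_8));
    }
}
```

If the run hangs, the file still holds every iteration that completed, including the input values of the failed ones.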

Do you guys have a script to automatically create a test suite for each iteration? Doing this by hand is rather labor-intensive. Note that the data file can change from system to system, so the data is not static.

Thanks in advance!

You wrote “the KS screen froze”. But I think KS is not to blame for the freezing at all. I guess your test case is not designed for failure.

For example, your test case script naively assumes that the telnet server will always send some response to a request, but in fact the server does not necessarily respond, for whatever reason (a bad request from the client, etc.). When the server does not respond, the client (your test case script) waits forever … and 5 days pass.
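
A minimal sketch of how a client can avoid waiting forever, assuming the test talks to the server through a plain `java.net.Socket` (the thread does not show the actual WinSocket wrapper, so `TimedRead` and `respondsWithin` are hypothetical names). `Socket.connect` with a timeout and `Socket.setSoTimeout` are standard JDK calls; a blocked read then throws `SocketTimeoutException` instead of hanging.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

final class TimedRead {
    // Returns true if the server sends at least one byte within timeoutMillis,
    // false if the read times out or the server closes the connection first.
    static boolean respondsWithin(String host, int port, int timeoutMillis) throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis); // bounded connect
            socket.setSoTimeout(timeoutMillis);                               // bounded read
            InputStream in = socket.getInputStream();
            try {
                return in.read() != -1;
            } catch (SocketTimeoutException e) {
                return false; // no reply in time: report it instead of blocking forever
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // A local listening socket that never calls accept() and never replies.
        // On typical systems the OS still completes the connection, so only the read blocks.
        try (ServerSocket silent = new ServerSocket(0)) {
            boolean replied = respondsWithin("127.0.0.1", silent.getLocalPort(), 500);
            System.out.println(replied ? "server replied" : "timed out, marking iteration as failed");
        }
    }
}
```

With a timeout in place, a silent server costs one failed iteration rather than the rest of the run.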

I guess your test case script still needs debugging. If I were you, I would start debugging the test case script while applying increasing amounts of data:

  • 1 row of data
  • 10 rows of data
  • 50 rows of data
  • 100 rows of data
  • 500 rows of data
  • and more

Probably the test case will hang at some stage (the KS screen will look frozen). Then I would try to identify which data input caused the test case to hang; KeywordUtil would be good enough for this purpose. After that I would do in-depth debugging to find a way to make the test case robust enough to avoid the unwanted “freeze”. You shouldn’t leave your test case script “fragile”.

Your workaround is good enough for debugging, isn’t it?

This is exactly what the “Data-Driven testing” feature of Katalon Studio is meant for. You assign a “Data File” to a Test Suite. Katalon Studio automatically iterates over the rows of data, calling a Test Case with Variables supplied from each data row.

The test had been running for 5 days, so it got through almost half of my data.
Based on previous results, it processes one data iteration in around 30 seconds, plus some overhead from the other activity.
That would most likely yield ~2600/day, which would be close to the number where I noticed KS froze. The application I’m testing was still responding; checking that was just a simple way of ruling out that the application under test had stopped working. The test was running on my physical machine, so I check it every 2 hours while I’m working. I first saw it frozen in the morning (it happened during my night).
It is possible that there is a memory leak somewhere: when I start KS it only uses around 1.4 GB, and by the time it froze it was using around 2 GB (after 5 days). The code does open/close, but I might need a try/catch in there just to be defensive.
Doing it 100 rows at a time would be no problem.

Also, as an observation, I’m seeing that KS uses around 1.2 GB when it starts up. Doing nothing, within 10 minutes it grows to 1.4 GB (no tests running). I don’t know if that is a feature of KS (delayed loading or something).

In the katalon.ini file (located in Katalon Studio’s installation directory), you can specify the JVM parameter -Xmx. With it you can allocate a larger maximum heap size. The following is my case on Mac, where the default value of -Xmx is 2048 megabytes = 2 GB.

$ pwd
/Applications/Katalon Studio.app/Contents/Eclipse
$ ls | grep ini
katalon.ini
$ cat katalon.ini
-startup
../Eclipse/plugins/org.eclipse.equinox.launcher_1.5.700.v20200207-2156.jar
--launcher.library
../Eclipse/plugins/org.eclipse.equinox.launcher.cocoa.macosx.x86_64_1.1.1200.v20200508-1552
-data
@noDefault
-vm
../../Contents/Eclipse/jre/Contents/Home/lib/jli/libjli.dylib
-vmargs
-XX:+UseG1GC
-XX:+UseStringDeduplication
-Xms256m
-Dfile.encoding=utf-8
-Xmx2048m
-XstartOnFirstThread
-Dorg.eclipse.swt.internal.carbon.smallFonts

You may try enlarging it, for example:

-Xmx20g
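
To confirm which limit actually applies, the JVM can report its own maximum heap. This is a minimal sketch using the standard `Runtime` API; `HeapCheck` is a hypothetical name, and the values describe whichever JVM runs the code, which is assumed here (not confirmed by the thread) to be the one executing the test.

```java
final class HeapCheck {
    private static final long MIB = 1024L * 1024L;

    // Upper bound the JVM will try to use for the heap (reflects -Xmx when it is set).
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / MIB;
    }

    // Heap currently in use: memory reserved so far minus the free part of it.
    static long usedHeapMiB() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MIB;
    }

    public static void main(String[] args) {
        System.out.println("max heap:  " + maxHeapMiB() + " MiB");
        System.out.println("used heap: " + usedHeapMiB() + " MiB");
    }
}
```

Logging these two numbers every few hundred iterations would show whether usage creeps toward the limit over a multi-day run.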

You may see your test occupies 20GB. Your greatness!

A memory leak, if there is one, is more likely caused by your own test scripts; before doubting Katalon Studio, you should check your code more carefully.

I guess, your test script does not properly close() the network resources (OutputStream, InputStream). Are you sure that your test script explicitly calls .close() for WinSocket API?

If your code forgets to close network resources properly, the Java Garbage Collector will not be able to free the memory used as buffers for network I/O.
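
A minimal sketch of the “open connection, do something, close connection” pattern written so that the close always happens, even when the middle step throws. `FakeConnection` is a hypothetical stand-in for whatever wraps the WinSocket API in the real test (the thread does not show it); the only requirement is that the wrapper implements `AutoCloseable` so that try-with-resources can be used.

```java
final class CloseDemo {
    // Hypothetical stand-in for the real connection wrapper.
    static final class FakeConnection implements AutoCloseable {
        boolean closed = false;

        void send(String command) {
            if (command.isEmpty()) {
                throw new IllegalStateException("bad request");
            }
        }

        @Override
        public void close() {
            closed = true; // a real wrapper would close its socket and streams here
        }
    }

    // Runs one iteration and reports whether the connection ended up closed.
    static boolean closedAfter(String command) {
        FakeConnection connection = new FakeConnection();
        try (FakeConnection c = connection) {
            c.send(command);
        } catch (IllegalStateException e) {
            // The iteration failed, but close() has already run by the time we get here.
        }
        return connection.closed;
    }

    public static void main(String[] args) {
        System.out.println("closed after success: " + closedAfter("status"));
        System.out.println("closed after failure: " + closedAfter(""));
    }
}
```

Both calls report the connection as closed, which is the property that matters when the same code runs thousands of times in a row.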

In a previous Java project, I experienced a memory leak myself.

My application called javax.imageio.ImageIO.read(File) and javax.imageio.ImageIO.write(File) hundreds of times, which raised memory usage to the maximum specified by -Xmx. My app ran very slowly; Garbage Collection ran frequently, but GC was unable to free the heap. In the end, my app looked frozen.

The cause was that javax.imageio.ImageIO.read(File) and javax.imageio.ImageIO.write(File) do NOT implicitly perform a “close()” operation on the I/O streams. These methods left I/O resources (OutputStream, InputStream) unclosed. This API was not meant for repetitive use.

So, I changed my code to use javax.imageio.ImageIO.read(InputStream) and javax.imageio.ImageIO.write(OutputStream), and explicitly close the InputStream/OutputStream as soon as done.
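
The change described above can be sketched roughly as follows. `ImageStreams`, `readImage`, and `writePng` are hypothetical helper names, and the PNG format is an assumption made for the demo; `ImageIO.read(InputStream)` and `ImageIO.write(RenderedImage, String, OutputStream)` are the standard stream-based overloads, and neither closes the stream it is given, so try-with-resources does that.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.imageio.ImageIO;

final class ImageStreams {
    // Reads an image through a stream that is closed as soon as decoding is done.
    static BufferedImage readImage(Path source) throws IOException {
        try (InputStream in = Files.newInputStream(source)) {
            return ImageIO.read(in);
        }
    }

    // Writes an image as PNG through a stream that is closed as soon as encoding is done.
    static void writePng(BufferedImage image, Path target) throws IOException {
        try (OutputStream out = Files.newOutputStream(target)) {
            ImageIO.write(image, "png", out);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("sample", ".png");
        writePng(new BufferedImage(4, 3, BufferedImage.TYPE_INT_RGB), tmp);
        BufferedImage back = readImage(tmp);
        System.out.println(back.getWidth() + "x" + back.getHeight());
        Files.delete(tmp);
    }
}
```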

With this change, the memory-leaky behaviour in my app disappeared. My application ran far faster, with a minimal memory footprint.