A Test Suite Collection forks multiple OS sub-processes

I have made a GitHub project


Problem to solve

In the Katalon User forum, there is a topic “Stopping executing test collection in KRE”. The original poster wrote:

Tests are run manually via command line using KRE. Tests are in a test collection which comprises of around 100s of test suites which each test suite having around 200-300 data variations in data files. The tests would run more than 24 hours and sometimes there is a need to abort the remaining of the tests.
How do we cleanly stop the KRE tests? Observations by killing it via Ctrl-c would result to locked up Reports directory and sometimes tests are still running (by looking at resource monitor).

I presume that he pressed Ctrl-C to kill the OS process in which Katalon Runtime Engine is running.

It seems to me that he expects (wants) everything to terminate gracefully when he kills that process with Ctrl-C, but things do not go that way. Why?

Why?

Katalon Runtime Engine (or similarly, Katalon Studio) runs in a single OS process.
When you run a Test Suite Collection which comprises 2 Test Suites,
the KRE process forks 2 subprocesses, one for each Test Suite.
I am going to present a demonstration of this fact later.

The OS regards these 3 processes as independent.
Killing one of these 3 processes does NOT automatically terminate the other processes.
Therefore, as the original poster wrote, it is quite likely that we observe

“by killing it via Ctrl-c would result to locked up Reports directory and sometimes tests are still running”

Sending Ctrl-C to the parent process is not a solution to the problem.
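
The parent/child layout described above can be reproduced in a few lines of Java. This is only a minimal sketch, not Katalon's actual launcher code: the class name ForkDemo and the use of "java -version" as the child command are assumptions made for illustration. It shows that every child started by a parent JVM is a separate OS process with its own PID.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class ForkDemo {

    // Path of the java executable that runs the current JVM
    static String javaBinary() {
        return System.getProperty("java.home") + File.separator + "bin" + File.separator + "java";
    }

    // Start n child JVMs; each one becomes a separate OS process with its own PID
    static List<Process> spawn(int n) throws IOException {
        List<Process> children = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            ProcessBuilder pb = new ProcessBuilder(javaBinary(), "-version");
            pb.redirectErrorStream(true);
            pb.redirectOutput(ProcessBuilder.Redirect.DISCARD);
            children.add(pb.start());
        }
        return children;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("parent PID: " + ProcessHandle.current().pid());
        for (Process child : spawn(2)) {
            System.out.println("child PID:  " + child.pid());
            child.waitFor();
        }
    }
}
```

Running it prints three different PIDs: one for the parent and one for each child.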

Demonstration

In this demo project, we have

  • a Test Case named “TC1”. See the source here. At startup, it emits a message “I am busy now”. It sleeps for a few seconds (10 to 20 secs). Then it emits a message saying goodbye.
  • a Test Suite named “TS1” which just calls “TC1”.
  • a Test Suite Collection named “TSC”. It will invoke 2 instances of “TS1” in parallel.
  • a bash script named “./watch_ps.sh”. It repeats a shell command at 5-second intervals: it runs “ps -A” to list the running OS processes, keeps only the lines that contain the string “Katalon”, and shortens those lines (just to make them easier to read).
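
For readers who cannot run the bash script, the same watching loop can be approximated in Java. This is a hedged sketch of the behaviour described in the last bullet, not a port of watch_ps.sh: the class name WatchProcesses, the method names and the 100-character width are all assumptions. Note that on some systems the command line of processes owned by other users is not visible, so the listing may be incomplete.

```java
import java.util.List;
import java.util.stream.Collectors;

public class WatchProcesses {

    // Keep at most 'width' characters of a line so the listing stays readable
    static String shorten(String line, int width) {
        return line.length() <= width ? line : line.substring(0, width);
    }

    // One "PID command" line per running process whose command line contains 'needle'
    static List<String> snapshot(String needle, int width) {
        return ProcessHandle.allProcesses()
                .filter(ph -> ph.info().commandLine().orElse("").contains(needle))
                .map(ph -> shorten(ph.pid() + " " + ph.info().commandLine().orElse(""), width))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws InterruptedException {
        while (true) {
            snapshot("Katalon", 100).forEach(System.out::println);
            System.out.println();   // blank line between snapshots
            Thread.sleep(5_000);    // 5-second interval, as in the description above
        }
    }
}
```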

I will run this set of code on macOS from a bash command line.

I will run ./watch_ps.sh first. Then I will run the Test Suite Collection TSC.

When I did it, I saw the following output in the console.

$ ./watch_ps.sh
 5991 ??         4:05.17 /Applications/Katalon Studio.app/Contents/MacOS/katalon

 5991 ??         4:05.38 /Applications/Katalon Studio.app/Contents/MacOS/katalon

 5991 ??         4:08.07 /Applications/Katalon Studio.app/Contents/MacOS/katalon

 5991 ??         4:12.97 /Applications/Katalon Studio.app/Contents/MacOS/katalon

 5991 ??         4:14.14 /Applications/Katalon Studio.app/Contents/MacOS/katalon

 5991 ??         4:15.87 /Applications/Katalon Studio.app/Contents/MacOS/katalon
 7418 ??         0:08.62 /Applications/Katalon Studio.app/Contents/Eclipse/jre/Contents/Home/bin/java -Dgroovy

 5991 ??         4:18.52 /Applications/Katalon Studio.app/Contents/MacOS/katalon
 7418 ??         0:09.66 /Applications/Katalon Studio.app/Contents/Eclipse/jre/Contents/Home/bin/java -Dgroovy
 7424 ??         0:05.53 /Applications/Katalon Studio.app/Contents/Eclipse/jre/Contents/Home/bin/java -Dgroovy

 5991 ??         4:19.68 /Applications/Katalon Studio.app/Contents/MacOS/katalon
 7418 ??         0:10.22 /Applications/Katalon Studio.app/Contents/Eclipse/jre/Contents/Home/bin/java -Dgroovy
 7424 ??         0:09.44 /Applications/Katalon Studio.app/Contents/Eclipse/jre/Contents/Home/bin/java -Dgroovy

 5991 ??         4:20.97 /Applications/Katalon Studio.app/Contents/MacOS/katalon
 7424 ??         0:09.45 /Applications/Katalon Studio.app/Contents/Eclipse/jre/Contents/Home/bin/java -Dgroovy

 5991 ??         4:21.86 /Applications/Katalon Studio.app/Contents/MacOS/katalon

 5991 ??         4:22.54 /Applications/Katalon Studio.app/Contents/MacOS/katalon

Here you can find 2 types of OS processes.

One is a process where /Applications/Katalon Studio.app/Contents/MacOS/katalon is being executed. This is the process where Katalon Studio is running.

The other is a process where /Applications/Katalon Studio.app/Contents/Eclipse/jre/Contents/Home/bin/java -Dgroovy is being executed. This is a process where a Test Suite is running.

In the console, you can find 1 process of Katalon Studio plus 2 processes of Test Suites being executed in parallel.

If I have a Test Suite Collection with a parallelism of 8, then I will see 8 sub-processes in the ps command output.

Is a Test Suite Collection multi-threaded? — Not at all.

Sometimes, people in the Katalon Forum talk about “parallel execution”. Some people seem to have a misunderstanding.
They seem to think that a Test Suite Collection runs a set of multiple “threads”, one for each Test Suite.
You should not confuse “multi-threading” with “multi-processing”.
These 2 models behave completely differently.
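
To make the difference concrete, here is a small Java sketch of the thread side of the comparison. The class name ThreadsShareOneProcess and its members are hypothetical, chosen only for this illustration. A thread lives inside its parent's OS process, so it reports the same PID and can update the same in-memory counter.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

public class ThreadsShareOneProcess {

    // Shared state on the heap: every thread of this JVM sees the same counter
    static final AtomicInteger counter = new AtomicInteger();

    // Start a worker thread, let it bump the shared counter, and return the PID it observed
    static long pidSeenByWorker() throws InterruptedException {
        AtomicLong seen = new AtomicLong(-1);
        Thread worker = new Thread(() -> {
            counter.incrementAndGet();
            seen.set(ProcessHandle.current().pid());
        });
        worker.start();
        worker.join();
        return seen.get();
    }

    public static void main(String[] args) throws InterruptedException {
        long mainPid = ProcessHandle.current().pid();
        long workerPid = pidSeenByWorker();
        System.out.println("main thread PID:   " + mainPid);
        System.out.println("worker thread PID: " + workerPid);   // the same number
        System.out.println("shared counter:    " + counter.get());
    }
}
```

A Test Suite launched by a Test Suite Collection, by contrast, shows up in the ps output with its own PID, as in the console output above, and shares no memory with its siblings.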

Conclusion

When you run a Test Suite Collection with 2 Test Suites contained, you will have 3 OS processes running independently.
Killing one of them by Ctrl+C will not automatically terminate this set of processes gracefully.

I did an experiment on my Mac. While a Test Suite Collection was running, I killed the parent process in which Katalon Studio was running.

The following screenshot shows what I saw.

When I killed the parent process of Katalon Studio, the 2 subprocesses of Test Suites stopped together.

It was a surprise for me. This is not what I expected to see.

I am just reporting what I saw.

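I looked at the documentation of the kill command and other resources. On Linux a signal can be delivered to every process of a Process Group: kill with a negative PID does this, and Ctrl-C in a terminal sends SIGINT to the whole foreground process group. Mac is probably the same. This may be why the child processes of the Test Suites were killed together when I killed the Katalon Studio process, though I have not confirmed the mechanism.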

I do not know how the processes are controlled on Windows; could be quite different from Linux/Mac.

Actually the main issue is not killing the processes, but as the OP mentioned:

killing it via Ctrl-c would result to locked up Reports directory

This happens because, when the process is killed, the tests are not stopped gracefully, so the data needed for reporting is not flushed / saved into the final report.

and sometimes tests are still running

that is a secondary issue which seems to occur only in certain conditions.

So, the solution is in the development team's hands.
A ‘stop’ method seems to exist already, since the Terminate Execution Conditionally feature is implemented and most probably triggers it.
So the only thing missing is most probably a shutdown hook that calls the same stop method.
That should work for both Ctrl-C (SIGINT) and a standard kill (SIGTERM), at least on Linux and Mac, which are POSIX compliant.
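
A minimal sketch of what such a hook could look like follows. The stop() method here is a hypothetical stand-in; I do not know the name or signature of the real method inside Katalon. Only Runtime.getRuntime().addShutdownHook(...) is standard Java API.

```java
public class ShutdownHookSketch {

    private static volatile boolean stopped = false;

    // Hypothetical stand-in for the engine's existing 'stop' routine:
    // flush report data, release file locks, and so on.
    static synchronized void stop() {
        if (stopped) {
            return;  // make repeated calls harmless
        }
        stopped = true;
        System.out.println("flushing reports and releasing locks...");
    }

    static boolean isStopped() {
        return stopped;
    }

    public static void main(String[] args) throws InterruptedException {
        // The hook runs on normal exit, on Ctrl-C (SIGINT) and on kill <PID> (SIGTERM),
        // but not on kill -9 (SIGKILL).
        Runtime.getRuntime().addShutdownHook(new Thread(ShutdownHookSketch::stop));

        System.out.println("running; press Ctrl-C to stop");
        Thread.sleep(60_000);  // stands in for a long test run
    }
}
```

The guard inside stop() matters because the same routine may already have been called through the normal termination path before the JVM starts shutting down.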


Yes.

Do you mean

  1. you send SIGINT to the process of KRE by Ctrl+C
  2. KRE should handle the SIGINT; send some message to subprocesses and wait for all the subprocesses to terminate gracefully
  3. once confirmed that all subprocesses have terminated, KRE itself should quit

?

Well, it should be possible.
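
The three steps above could be sketched roughly as follows. This is an assumption-laden illustration, not KRE code: the class GracefulParent, its method names and the 10-second grace period are all hypothetical. Process.destroy() requests normal termination (SIGTERM on typical Linux/macOS JDKs) and Process.destroyForcibly() is the fallback for children that do not exit in time.

```java
import java.io.File;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class GracefulParent {

    // Ask every child to terminate, wait up to 'seconds' for each one,
    // then force-kill stragglers. Returns how many had to be killed forcibly.
    static int stopChildren(List<Process> children, long seconds) throws InterruptedException {
        for (Process child : children) {
            child.destroy();  // polite request to terminate
        }
        int forced = 0;
        for (Process child : children) {
            if (!child.waitFor(seconds, TimeUnit.SECONDS)) {
                child.destroyForcibly();  // last resort
                child.waitFor();
                forced++;
            }
        }
        return forced;
    }

    // Run stopChildren when the parent receives SIGINT/SIGTERM or exits normally
    static void installHook(List<Process> children, long seconds) {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            try {
                stopChildren(children, seconds);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }));
    }

    public static void main(String[] args) throws Exception {
        String java = System.getProperty("java.home") + File.separator + "bin" + File.separator + "java";
        // "java -version" stands in for a Test Suite subprocess
        List<Process> children = List.of(new ProcessBuilder(java, "-version").inheritIO().start());
        installHook(children, 10);
        // ... the parent would supervise its children here; the hook fires when it exits ...
    }
}
```

The JVM does not exit until its shutdown hooks have finished, so the parent quits only after all children are gone, which is step 3.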

c/c

@duyluong
@Jass

Yes. Any sane (POSIX) tty session sends the SIGINT signal to the process when you press CTRL-C.
I am not sure how the command prompt / PowerShell works, but it should do the same (fingers crossed).
SIGINT and SIGTERM are ‘catchable’ in Java through shutdown hooks (SIGKILL, sent by kill -9, is not).
See the references I posted in the originating thread.

From what I read here and there (e.g. https://superuser.com/questions/959364/on-windows-how-can-i-gracefully-ask-a-running-program-to-terminate), it seems that taskkill works similarly:

  • by default it asks the process to terminate (roughly the counterpart of SIGTERM)
  • with the /f flag it terminates the process forcibly (roughly the counterpart of SIGKILL)

and apparently there is also a kill command available:

That being said, the shutdown hooks should also run nicely on Windows, even from Task Manager (as far as I remember there are two options, stop process vs kill process … too lazy to reboot my machine into Windows to check it)