Katalon command line crashes silently when several Jenkins jobs run at the same time

Hello Community,

Here is a summary of my issue:

  • We are currently using Katalon 8.3.5 (we see the same issue with 8.4.0 and 8.4.1) to test our websites and mobile applications (one Katalon project per website/application)
  • We start Katalon from Jenkins jobs, using a shell command like:
katalonc -runMode=console -apiKey="***" -statusDelay=10 -projectPath="${WORKSPACE}/ProjectA.prj" -retry=0 -testSuiteCollectionPath="Test Suites/TestSuiteCollection1" -executionProfile=prod --config -webui.autoUpdateDrivers=true -proxy.system.applyToDesiredCapabilities=false -browserType=chrome -g_myGlobalVariableA=XYZ -g_myGlobalVariableB=XYZ...
  • We have about 20 Jenkins jobs with almost the same command line; only the project and the test suite collection change (one per project/application)
  • When only 1 or 2 jobs are running at the same time, everything works fine
  • But as soon as 3 or more jobs are running at the same time, some of them seem to crash silently (Katalon stops and Jenkins performs its post-build actions)

See the attached logs from Jenkins (the === Begin Build === and === End Build === markers are printed just before and after the Katalon command line):

  • log1.txt (2.2 KB): crash while getting the licence (no execution0.log file generated)
  • log2.txt (2.5 KB): crash while generating the global variables (no execution0.log file generated)
  • log3.txt (4.4 KB): the suite was running and suddenly stopped (one execution0.log file generated for each test case, but containing only script information, no error)

There is no error in the Jenkins error log recorder, and I don’t know why these crashes happen or what to do to fix them.

If you need any other information, feel free to ask

Thank you for your help

I think @anon46315158 might be the guy to help here.

Looks like a lack of hardware resources.
What is your Jenkins setup? Does everything run on a single node (the Jenkins master), or do you have slave executors available?
What hardware resources (CPU and RAM) are available on the master (and on the slaves, if you have any)?
How many concurrent jobs are allowed on the slave nodes, if you have any?
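
For anyone collecting these numbers on the node, a minimal sketch like the one below prints the CPU count and memory figures. It assumes a Linux node that exposes /proc/meminfo; the function name node_summary is made up for illustration.

```shell
#!/bin/sh
# Sketch: summarise CPU and RAM on a Linux build node.
# Assumption: the node exposes /proc/meminfo (standard on Linux).
# node_summary is a hypothetical helper name.
node_summary() {
    meminfo=${1:-/proc/meminfo}
    cpus=$(getconf _NPROCESSORS_ONLN)
    mem_total_kb=$(awk '/^MemTotal:/ {print $2}' "$meminfo")
    # MemAvailable is missing on very old kernels, so fall back to 0.
    mem_avail_kb=$(awk '/^MemAvailable:/ {print $2}' "$meminfo")
    echo "cpus=${cpus} mem_total_mb=$(( ${mem_total_kb:-0} / 1024 )) mem_available_mb=$(( ${mem_avail_kb:-0} / 1024 ))"
}

# Print the summary only where /proc/meminfo is readable.
if [ -r /proc/meminfo ]; then
    node_summary
fi
```

Running it once while the node is idle and once while three or more jobs are executing shows how much headroom is left.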

Another thing to consider is the licences.
Each KRE execution requires a licence. Do you have 20 available?

Thank you for your quick answer @anon46315158 and @Russ_Thomas

Other tools are using our Jenkins, but we have a dedicated slave node for running our tests

The slave node has 10 executors

I have 10 KRE licences, but when this crash happens, I’m sure I have enough available (and when no licence is available, an explicit message is displayed)

For the other question about CPU and RAM, I will check with the OPS team that manages Jenkins and get back to you ASAP

Mhm … I think what you are trying to say is that you have 10 nodes (executors) with the same label, forming a pool dedicated to your project (if it is not set up like that, something is wrong).
Check the node configuration to be sure that each node allows only one concurrent job at a time (it usually defaults to 3).
Apparently you are running your tests directly, either through the Jenkins plugin or with Katalon preinstalled on each node:

katalonc -runMode=console -apiKey="***" -statusDelay=10 -projectPath="${WORKSPACE}/ProjectA.prj" -retry=0 -testSuiteCollectionPath="Test Suites/TestSuiteCollection1" -executionProfile=prod --config -webui.autoUpdateDrivers=true -proxy.system.applyToDesiredCapabilities=false -browserType=chrome -g_myGlobalVariableA=XYZ -g_myGlobalVariableB=XYZ...

If, by bad luck, a node picks up two jobs at the same time, you can get a resource conflict.

Side note: if you are using the Jenkins plugin, it currently has some issues, see:
https://katalonsupport.force.com/katalonhelpcenter/s/article/Katalon-Plugin-issue-on-Jenkins

This I don’t understand.
In one log you have:

Activating...
Start activating offline...
Search for valid offline licenses in folder: /var/jenkins/.katalon/license
The number of valid offline licenses: 0
Start activating online...
Cleaning up workspace
Opening project file: /var/jenkins/workspace/JOB_NAME_1/ProjectNameA.prj

but in another one:

Activating...
Start activating offline...
Project 'ProjectNameA' opened
Start reloading plugins...

So … are you using offline or online licences?

Yes, that’s it (sorry, I’m not an expert with Jenkins)

Yes, only one concurrent job is allowed

No, I’m using the classic Jenkins “build” step to run a shell command

We are using only online licences. My bad: I don’t know how, but I made a mistake when copy/pasting/cleaning the logs. The correct log was:

Activating...
Start activating offline...
Search for valid offline licenses in folder: /var/jenkins/.katalon/license
The number of valid offline licenses: 0
Start activating online...
Cleaning up workspace
Opening project file: /var/jenkins/workspace/JOB_NAME_1/ProjectNameA.prj
Generating global variables...
Project 'ProjectNameA' opened
Start reloading plugins...
...

Well … if you are sure that everything is right, and only one job can execute at a time on a given node … I have no other idea what to check, except for networking issues and perhaps this:

INFO: Java vendor: Oracle Corporation
INFO: Java version: 1.8.0_222

Is that the Oracle Java? Can you try installing OpenJDK on the nodes instead?
I think Katalon doesn’t play nicely with the Oracle one.
The version check should show something like:

[bionel@localhost pyhton]$ java -version
openjdk version "1.8.0_342"
OpenJDK Runtime Environment (build 1.8.0_342-b07)
OpenJDK 64-Bit Server VM (build 25.342-b07, mixed mode)
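
To check this on every node without reading the output by eye, a small sketch like this could be added to a job. The helper name is_openjdk is made up for illustration; it relies only on the fact that OpenJDK builds identify themselves as "openjdk" in the version banner, which java prints on stderr.

```shell
#!/bin/sh
# Sketch: detect whether the java on PATH is an OpenJDK build.
# is_openjdk is a hypothetical helper; it reads `java -version` output on stdin.
is_openjdk() {
    grep -qi 'openjdk'
}

# `java -version` writes to stderr, hence the 2>&1 redirection.
if command -v java >/dev/null 2>&1; then
    if java -version 2>&1 | is_openjdk; then
        echo "java on PATH is OpenJDK"
    else
        echo "java on PATH is not OpenJDK"
    fi
fi
```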

Thank you for your investigation: I will check with the OPS team about the lack of hardware resources and ask whether they can change the Java provider

Alternatively, you can try running your tests in a Docker image, so that you do not depend on the Java version installed on the OS.
However, your nodes must have Docker installed.
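
A rough sketch of what the Docker variant of the job could look like is below. Several things here are assumptions to verify against Katalon's Docker documentation: the image name katalonstudio/katalon (ideally pinned to a tag matching your Katalon version), the katalonc.sh wrapper inside the image, and the /tmp/project mount point. The function name run_katalon_in_docker and the KATALON_API_KEY variable are made up for illustration; the remaining options are taken from the command line in the original post.

```shell
#!/bin/sh
# Sketch: run one test suite collection inside a Katalon Docker container.
# Assumptions (check against Katalon's Docker docs): image name, the
# katalonc.sh wrapper and the /tmp/project mount point.
# run_katalon_in_docker and KATALON_API_KEY are hypothetical names.
run_katalon_in_docker() {
    project_dir=$1   # host directory containing the .prj file
    collection=$2    # e.g. "Test Suites/TestSuiteCollection1"
    # DOCKER can be overridden (e.g. DOCKER=echo) for a dry run.
    "${DOCKER:-docker}" run -t --rm \
        -v "${project_dir}:/tmp/project" \
        "${KATALON_IMAGE:-katalonstudio/katalon}" \
        katalonc.sh -projectPath=/tmp/project \
        -retry=0 -statusDelay=10 \
        -testSuiteCollectionPath="${collection}" \
        -executionProfile=prod -browserType=chrome \
        -apiKey="${KATALON_API_KEY:-}"
}

# Usage in a Jenkins shell step:
#   run_katalon_in_docker "${WORKSPACE}" "Test Suites/TestSuiteCollection1"
```

Note that containers still share the node's CPU and RAM, so this removes the dependency on the host Java but not a hardware limit.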

It was indeed a hardware issue: we doubled the RAM and the CPU on the node, and these crashes no longer happen
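
For anyone hitting the same silent stop, one way to get a hint from the Jenkins console is to print the exit status of the Katalon command between the build markers. The sketch below uses the shell convention that a status above 128 means the process was ended by signal (status minus 128); 137, for example, is SIGKILL, which is what the Linux out-of-memory killer sends. Whether the OOM killer was involved in this thread was not confirmed, so treat that as an assumption. The names describe_exit and run_and_report are made up for illustration.

```shell
#!/bin/sh
# Sketch: report how a wrapped command ended.
# describe_exit and run_and_report are hypothetical helper names.
describe_exit() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "exit 0: finished normally"
    elif [ "$status" -gt 128 ]; then
        # Shell convention: 128 + N means "terminated by signal N".
        echo "exit ${status}: killed by signal $((status - 128))"
    else
        echo "exit ${status}: failed with an error code"
    fi
}

run_and_report() {
    echo "=== Begin Build ==="
    status=0
    "$@" || status=$?   # capture the status even under `set -e`
    echo "=== End Build ==="
    describe_exit "$status"
    return "$status"
}

# Usage in a Jenkins shell step (options shortened here):
#   run_and_report katalonc -runMode=console -projectPath="${WORKSPACE}/ProjectA.prj" ...
```

A line such as "exit 137: killed by signal 9" in the console points to the process being killed from outside rather than to a Katalon error.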