Script to create multiple test suites (clones) with different data iterations

Hi,
We need to split a single test suite into multiple ones to make execution faster. The tests need to loop through, let’s say, 15K rows of data in the data files (CSV). The data may change, so that number could be higher. I’m looking at another data file that seems to have around 50K rows, and others might have more.
In the case above, where we have a single test suite, execution would take weeks (an approximation based on previous runs, which did not complete). With the data spread out at 500 items per test suite, execution was down to a few days instead of possibly weeks. I have to do the split manually, since I’m not aware of any script that automatically spreads the data iteration across multiple test suites.
Has anyone created a script that automatically creates multiple test suites and distributes the data across them?

Example:
TS Big (original) - data iteration is all (1 - 50,000)

Into:
TS1 - data iteration (1-500)
TS2 - data iteration (501-1000)
.
.
.
TS100 - data iteration (49501-50000)

The chunk size of 500 is not fixed and could be changed. All of TS1 - TS100 will be in a Test Collection (TC), so we run that instead of the single test suite TSBig.
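The range arithmetic in the example above can be sketched in a few lines. This is only an illustration: Python and the function name `split_ranges` are assumptions, not something taken from Katalon Studio, and the row numbers are treated as 1-based and inclusive, as in the TS1 - TS100 list.

```python
def split_ranges(total_rows, chunk_size):
    """Return (start, end) pairs, 1-based and inclusive, covering 1..total_rows."""
    if total_rows < 1 or chunk_size < 1:
        raise ValueError("total_rows and chunk_size must be positive")
    return [
        (start, min(start + chunk_size - 1, total_rows))
        for start in range(1, total_rows + 1, chunk_size)
    ]


# 50,000 rows in chunks of 500, as in the example above.
ranges = split_ranges(50000, 500)
for index, (start, end) in enumerate(ranges[:2], start=1):
    print(f"TS{index} - data iteration ({start}-{end})")
print(f"TS{len(ranges)} - data iteration ({ranges[-1][0]}-{ranges[-1][1]})")
```

The last chunk is simply shorter when the row count is not a multiple of the chunk size, so a changing data file only means re-running the split with a new total.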

Thanks in advance!

No, this is not what you need to do. You want to split the magnum data file into smaller chunks first. Once you have them, the next step is to think about how to apply the chunks to a single common Test Suite.

I think there is no solution for you out of the box. You need to learn how to write a script that splits the magnum data into chunks yourself.

I don’t think splitting the data file is needed, if that is what you mean by splitting the magnum data, since KS already has the functionality to tell a TS which data to use (a nice, forward-thinking approach). The main thing needed is a way to automatically clone the test suite into multiple ones, so that report generation is done per batch instead of as a single gigantic report. (It would be a great feature if KS could support over 10K rows of data in a data-driven scenario.) It works well if you are dealing with fewer than 100 items in your data file.
It is good to have one big report (1 TS for all the data), but I noticed that KS has difficulty processing it and slows down a lot over time. Chunking it into multiple TS and then lumping them into a test collection works magnificently and improves performance a lot! Turning off some logging and closing the log viewer also works magically! Thank you for all your tips on making the runs faster by disabling a bunch of stuff.

With an easy GUI operation, you can copy & paste an existing Test Suite to create as many copies as you want. What more do you want?

Yes, that is what I am doing. It gets tiring if you have to do it hundreds of times. Note also that KS takes its sweet time to paste, save, open and rename, and you have to change the data to be iterated. I have to constantly close other TS windows to make it load faster. I also have to manually set the data to be iterated and name each suite accordingly, so people know the data range for each test. You also need to put them in a collection (which is the easiest job of all). Imagine having to deal with multiple data sets per instance (e.g. different sets of data for each project). Right now, I only have 2 tests to deal with so far…

It would be great if KS could handle bigger data when doing data binding (with an option to flush or create results by batch, or something else that won’t degrade execution performance).

Most of the tests are already in the KS projects (e.g. keywords, utility functions and other glue code) so I have to use KS for this kind of work. Putting it outside of KS will be costly at this time.

Thanks in advance!

I will pass this requirement to you, @duyluong

Instead of duplicating your test scripts, you can simply use the ‘data iteration’ feature.
In my example, I have a test suite where I added the same test case twice (yes, you can do that).
The first one is set to run rows 1 - 10, the second is set to rows 11 - 20.
This way you don’t need to duplicate any script, or to split your source file into smaller chunks and create new test data for each one.

Alternatively, you can create multiple test suites and wrap all of them in a test collection.

I have found out a way to clone an existing Test Suite to make multiple Test Suites. I won’t do it in Katalon Studio. I will do it using my favourite text editor: Emacs + shell commands (cd, ls, cp, rm, etc).

In my Katalon Studio project, I have a Test Suite named “TS1”. In Emacs, in the <projectDir>/Test Suites directory, I found 2 files:

$ cd "<projectDir>/Test Suites"
$ ls -la
drwxr-xr-x   9 kazurayam staff     288 2021-12-11 21:03 .
drwxr-xr-x  33 kazurayam staff    1056 2021-11-25 20:46 ..
-rw-r--r--   1 kazurayam staff    2291 2021-11-21 10:47 TS1.groovy
-rw-r--r--   1 kazurayam staff    1309 2021-11-21 10:47 TS1.ts

The TS1.ts file is like this:

<?xml version="1.0" encoding="UTF-8"?>
<TestSuiteEntity>
   <description></description>
   <name>TS1</name>
   <tag></tag>
   <isRerun>false</isRerun>
   <mailRecipient></mailRecipient>
   <numberOfRerun>0</numberOfRerun>
   <pageLoadTimeout>30</pageLoadTimeout>
   <pageLoadTimeoutDefault>true</pageLoadTimeoutDefault>
   <rerunFailedTestCasesOnly>false</rerunFailedTestCasesOnly>
   <rerunImmediately>false</rerunImmediately>
   <testSuiteGuid>46c94a64-4227-420b-aff5-c9c599046c6d</testSuiteGuid>
   <testCaseLink>
      <guid>957f713a-5f1f-41fa-b3b0-f510a83e1f83</guid>
      <isReuseDriver>false</isReuseDriver>
      <isRun>true</isRun>
      <testCaseId>Test Cases/printID</testCaseId>
      <testDataLink>
         <combinationType>ONE</combinationType>
         <id>97b0971a-983c-4f61-a0a2-c551d14d5efa</id>
         <iterationEntity>
            <iterationType>ALL</iterationType>
            <value></value>
         </iterationEntity>
         <testDataId>Data Files/IDs</testDataId>
      </testDataLink>
      <variableLink>
         <testDataLinkId>97b0971a-983c-4f61-a0a2-c551d14d5efa</testDataLinkId>
         <type>DATA_COLUMN</type>
         <value>ID</value>
         <variableId>342bf1c4-85fb-4aba-a2d1-af4e28f5704e</variableId>
      </variableLink>
   </testCaseLink>
</TestSuiteEntity>

You should be able to guess what it is. The *.ts file holds the serialised setup information of a Test Suite. I suppose that the TS1.ts file contains everything about the Test Suite.

The TS1.ts file is just text in XML format. It is easy to copy and easy to edit. So I tried to create one more Test Suite, based on the original TS1.

I created a sub-directory named product, where I copied the original TS1.groovy to product/TS1_1-99.groovy and the original TS1.ts to product/TS1_1-99.ts. I edited the TS1_1-99.ts file so that it reads only lines #1-99 from the data source. I also edited the <testSuiteGuid> a bit so that it is unique (different from the original), and I edited the <name> of the test suite.

$ ls -la
total 4
drwxr-xr-x   4 kazurayam staff     128 2021-12-11 21:16 .
drwxr-xr-x   9 kazurayam staff     288 2021-12-11 21:03 ..
-rw-r--r--   1 kazurayam staff    2291 2021-12-11 20:58 TS1_1-99.groovy
-rw-r--r--   1 kazurayam staff    1320 2021-12-11 21:06 TS1_1-99.ts

The TS1_1-99.ts file looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<TestSuiteEntity>
   <description></description>
   <name>TS1_1-99</name>
   <tag></tag>
   <isRerun>false</isRerun>
   <mailRecipient></mailRecipient>
   <numberOfRerun>0</numberOfRerun>
   <pageLoadTimeout>30</pageLoadTimeout>
   <pageLoadTimeoutDefault>true</pageLoadTimeoutDefault>
   <rerunFailedTestCasesOnly>false</rerunFailedTestCasesOnly>
   <rerunImmediately>false</rerunImmediately>
   <testSuiteGuid>2e76c940-3d67-4d4b-b851-30baca84ff5d</testSuiteGuid>
   <testCaseLink>
      <guid>7841d637-cc22-43ea-9c01-671ac983c487</guid>
      <isReuseDriver>false</isReuseDriver>
      <isRun>true</isRun>
      <testCaseId>Test Cases/printID</testCaseId>
      <testDataLink>
         <combinationType>ONE</combinationType>
         <id>e0c417fb-1acb-41bd-9525-d4db1bd9c15d</id>
         <iterationEntity>
            <iterationType>RANGE</iterationType>
            <value>1-99</value>
         </iterationEntity>
         <testDataId>Data Files/IDs</testDataId>
      </testDataLink>
      <variableLink>
         <testDataLinkId>e0c417fb-1acb-41bd-9525-d4db1bd9c15d</testDataLinkId>
         <type>DATA_COLUMN</type>
         <value>ID</value>
         <variableId>342bf1c4-85fb-4aba-a2d1-af4e28f5704e</variableId>
      </variableLink>
   </testCaseLink>
</TestSuiteEntity>

Please note the following portion:

         <iterationEntity>
            <iterationType>RANGE</iterationType>
            <value>1-99</value>
         </iterationEntity>
          

This instructs TS1_1-99 to extract lines #1-99 from the external data file and pass them to the linked Test Case printID. I edited this portion manually.

Then I started Katalon Studio and opened the project. I could find the new Test Suite product/TS1_1-99 in the GUI. I executed it, and it ran as I expected: it processed lines #1-99 out of the 5000 lines in the data file. This experiment proved the following: if a user creates 2 text files (testSuiteName.groovy, testSuiteName.ts) in the “Test Suites” directory with good enough content, then Katalon Studio will happily accept them and run them as a Test Suite.

You should be able to automate this text conversion in any scripting language. Once you have developed such a text converter, you would be able to generate hundreds of Test Suites in a minute.
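A minimal sketch of such a text converter follows, written in Python as one possible scripting language. Everything named here (`make_suite_xml`, `clone_suite`, the trimmed `SAMPLE_TS`) is hypothetical. It assumes the *.ts layout shown above and mirrors the manual edits: a new <name>, a RANGE iteration, and fresh values for <testSuiteGuid>, <guid> and <id> (with <testDataLinkId> kept in sync). Whether Katalon Studio needs anything beyond that is not confirmed in this thread.

```python
import re
import uuid
from pathlib import Path

# Trimmed stand-in for TS1.ts; a real run would read the full file instead.
SAMPLE_TS = """<TestSuiteEntity>
   <name>TS1</name>
   <testSuiteGuid>46c94a64-4227-420b-aff5-c9c599046c6d</testSuiteGuid>
   <testCaseLink>
      <guid>957f713a-5f1f-41fa-b3b0-f510a83e1f83</guid>
      <testDataLink>
         <id>97b0971a-983c-4f61-a0a2-c551d14d5efa</id>
         <iterationEntity>
            <iterationType>ALL</iterationType>
            <value></value>
         </iterationEntity>
      </testDataLink>
      <variableLink>
         <testDataLinkId>97b0971a-983c-4f61-a0a2-c551d14d5efa</testDataLinkId>
         <value>ID</value>
      </variableLink>
   </testCaseLink>
</TestSuiteEntity>
"""


def make_suite_xml(template_xml, suite_name, first_row, last_row):
    """Return a copy of a *.ts document bound to rows first_row..last_row."""
    # Rename the suite; the first <name> element is taken to be the suite's own.
    xml = re.sub(r"<name>.*?</name>",
                 lambda m: f"<name>{suite_name}</name>", template_xml, count=1)
    # Switch the data binding to an explicit row range, keeping the layout.
    xml = re.sub(
        r"(<iterationType>).*?(</iterationType>\s*<value>).*?(</value>)",
        lambda m: f"{m.group(1)}RANGE{m.group(2)}{first_row}-{last_row}{m.group(3)}",
        xml,
    )
    # Replace every occurrence of each old identifier with a fresh UUID, so
    # that <testDataLinkId> keeps pointing at the renamed <id>.
    old_ids = {value for _tag, value in
               re.findall(r"<(testSuiteGuid|guid|id)>([^<]+)</\1>", xml)}
    for old_id in old_ids:
        xml = xml.replace(old_id, str(uuid.uuid4()))
    return xml


def clone_suite(template_dir, template_name, out_dir, total_rows, chunk_size):
    """Write one .ts/.groovy pair per chunk of rows; return the suite names."""
    template_dir, out_dir = Path(template_dir), Path(out_dir)
    template_xml = (template_dir / f"{template_name}.ts").read_text(encoding="utf-8")
    groovy = (template_dir / f"{template_name}.groovy").read_text(encoding="utf-8")
    out_dir.mkdir(parents=True, exist_ok=True)
    names = []
    for first in range(1, total_rows + 1, chunk_size):
        last = min(first + chunk_size - 1, total_rows)
        name = f"{template_name}_{first}-{last}"
        xml = make_suite_xml(template_xml, name, first, last)
        (out_dir / f"{name}.ts").write_text(xml, encoding="utf-8")
        (out_dir / f"{name}.groovy").write_text(groovy, encoding="utf-8")
        names.append(name)
    return names


# Demo on the inline sample; the UUIDs differ on every run.
print(make_suite_xml(SAMPLE_TS, "TS1_1-99", 1, 99))
```

Against a real project, a call like `clone_suite("<projectDir>/Test Suites", "TS1", "<projectDir>/Test Suites/product", 5000, 500)` would correspond to repeating the manual copy-and-edit step once per chunk. Try it on a copy of the project first.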

By ‘good content’, here @kazurayam wants to say:

  • The groovy file must be valid, and its name must match the <name>Test Suite Name</name> entry in the XML (you can simply duplicate a valid one).
  • The IDs / GUIDs throughout must be valid and most probably unique. They look like version 4 UUIDs. Such IDs can easily be produced with the java.util.UUID.randomUUID() method in Java (for other languages, feel free to search the relevant docs)
    (see also: Universally unique identifier - Wikipedia)

For the rest, use your imagination, no matter what approach you choose (multiple test suites calling the same test case, the same test case multiple times in the same suite, or a combination of both).
There are plenty of libraries available, in whatever language you like, to read and produce valid XML.
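As an illustration of the ‘good content’ points, here is a small checker built on Python’s standard `uuid` and `xml.etree.ElementTree` modules. The function names `is_uuid4` and `check_suite` and the rules they enforce are assumptions drawn from the list above (name matches the file, IDs are version 4 UUIDs and unique, each <testDataLinkId> refers to an existing <id>); this is not a documented Katalon Studio requirement.

```python
import uuid
import xml.etree.ElementTree as ET

# Trimmed stand-in for a generated TS1_1-99.ts file.
SAMPLE = """<TestSuiteEntity>
   <name>TS1_1-99</name>
   <testSuiteGuid>2e76c940-3d67-4d4b-b851-30baca84ff5d</testSuiteGuid>
   <testCaseLink>
      <guid>7841d637-cc22-43ea-9c01-671ac983c487</guid>
      <testDataLink>
         <id>e0c417fb-1acb-41bd-9525-d4db1bd9c15d</id>
      </testDataLink>
      <variableLink>
         <testDataLinkId>e0c417fb-1acb-41bd-9525-d4db1bd9c15d</testDataLinkId>
      </variableLink>
   </testCaseLink>
</TestSuiteEntity>"""


def is_uuid4(text):
    """True if text is a version 4 UUID written in the canonical form."""
    if not text:
        return False
    try:
        parsed = uuid.UUID(text)
    except ValueError:
        return False
    return parsed.version == 4 and str(parsed) == text.lower()


def check_suite(xml_text, file_stem):
    """Return a list of problems found in a *.ts document (empty if none)."""
    problems = []
    root = ET.fromstring(xml_text)
    # The suite name inside the XML should match the file name without suffix.
    if root.findtext("name") != file_stem:
        problems.append(f"<name> does not match file name {file_stem!r}")
    # Every identifier should be a version 4 UUID and appear only once.
    seen = set()
    for tag in ("testSuiteGuid", "guid", "id"):
        for element in root.iter(tag):
            value = element.text
            if not is_uuid4(value):
                problems.append(f"<{tag}> is not a version 4 UUID: {value!r}")
            elif value in seen:
                problems.append(f"<{tag}> repeats an identifier: {value}")
            seen.add(value)
    # Each variable link should refer to a data link defined in the same file.
    link_ids = {element.text for element in root.iter("id")}
    for ref in root.iter("testDataLinkId"):
        if ref.text not in link_ids:
            problems.append(f"<testDataLinkId> has no matching <id>: {ref.text!r}")
    return problems


print(check_suite(SAMPLE, "TS1_1-99"))
print(uuid.uuid4())  # a fresh random (version 4) UUID for a new suite
```

Running a check like this over generated files before opening the project is cheaper than finding a broken suite in the GUI.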

A similar technique can also be used to create test cases ‘on the fly’, but there it is a bit more complex.
The XML files are under the ‘Test Cases’ folder (the .tc files), but the TC scripts are under the Scripts folder, each in its own folder.
The folder name must match the name entry in the XML, but it is not yet clear to me how the script file should be named (perhaps the digit sequence in Script1637062943055.groovy is a timestamp?)

This is actually what I’m doing. The test script (test case) is the same; I’m only using the test suite as a mechanism for binding the test case to the data. I’m not sure whether the other code I have to deal with contains additional code in the Test Suite. If it does, things will be different. Some other code might have pre/post steps for the test case.

The only things you have to care about for a test suite are the ones mentioned above:
the skeletal groovy script and the .ts (XML) file.
Produce them properly and it should just work.

Let me assume that you have managed to split the original “TS Big” of 50,000 iterations into 100 pieces. Then I would like to ask @david.casiano how you will execute the bulk of Test Suites.

Do you have 100 machines to distribute the Test Suites? Do you have 100 licenses of Katalon Runtime Engine?

Running Test Suites in parallel does not necessarily make them run faster. Provided with limited machine resources, it could become worse. If you execute 100 test suites on a single machine, I would guess, it will take far longer hours (or days) than the original “TS Big”. I would guess, splitting the “TS Big” into 2 and running them in parallel on a machine would be no faster than the original single Test Suite. You should examine the time taken with 2 Test Suites to verify how much “parallelism” is effective for your case.

By the way, I once tried running multiple threads in Katalon Studio, hoping that I could execute my test scripts multiple times with different parameters as quickly as possible. I could make my test scripts run in a multi-threaded fashion, but it was not fast at all.

I noticed that WebUI.openBrowser() opens the Chrome browser with a single Chrome user profile. If my 2 threads open 2 Chrome windows, both windows do open, but only one of them runs and the other halts, because the 2 Chromes compete for a single Chrome user profile. Resource contention occurred inside Chrome. Therefore my threads could not run in parallel; they blocked each other and could only run sequentially. I learned that if I want to run multiple threads for Katalon tests, I need as many machines as the degree of parallelism I want to achieve. If I want 2 processes, I need 2 boxes. If I want 100 processes, I need 100 boxes. As simple as that. I learned that parallelism is difficult to make use of.

I think you are going too far.
Most probably the reason behind the request is memory/runtime efficiency (and yeah, you know, the logging tricks).
With this in mind, the multiple-suites approach may be more suitable for such large data
(compared with a single suite having multiple instances of the same test case, or multiple test cases).

… provided we already know how the threads are scoped at runtime …

I just want to hear from @david.casiano.

If he has only 2 machines on hand on which he can run Test Suites, then he would need only 2 Test Suites. He would be able to create 2 Test Suites manually using the Katalon Studio GUI. He would not need any script that generates 100 Test Suites.

But @david.casiano said he wants a script that generates 100 Test Suites. Therefore I presume that he can afford 100 machines and 100 KRE licenses.

which would be an awful approach :smiling_imp:
ok, let’s grab popcorn and wait for David’s input …

Processing 50K of data in Web UI is awful enough. 1K is too much for me.

If I were asked to verify 50K rows of given Excel data somehow, I would mix 2 approaches. First, I would write a Groovy script that makes a DB query to fetch a result set, and compare the Excel data against that result set. This would finish in a few seconds. Then, I would make a small subset of the original data (say 0.5K rows) and verify those through Web UI interactions. This should finish in an hour.
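The first half of that idea, comparing the file against a query result, can be sketched as below. This is a rough stand-in, not the Groovy script mentioned above: it uses Python, CSV text instead of Excel, and an in-memory SQLite table. The table name `entities`, its columns and the sample rows are all hypothetical.

```python
import csv
import io
import sqlite3


def diff_against_db(csv_text, connection, query):
    """Compare CSV data rows with a query result, both as sets of string tuples."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    expected = {tuple(row) for row in reader}
    actual = {tuple(str(value) for value in row)
              for row in connection.execute(query)}
    # Rows only in the file, then rows only in the database.
    return sorted(expected - actual), sorted(actual - expected)


# Stand-in database: an in-memory SQLite table (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entities (id TEXT, status TEXT)")
conn.executemany("INSERT INTO entities VALUES (?, ?)",
                 [("1", "active"), ("2", "closed"), ("4", "active")])

csv_text = "ID,STATUS\n1,active\n2,closed\n3,active\n"
missing_in_db, unexpected_in_db = diff_against_db(
    csv_text, conn, "SELECT id, status FROM entities")
print("in file but not in DB:", missing_in_db)
print("in DB but not in file:", unexpected_in_db)
```

A set comparison like this ignores row order and duplicate rows, which is usually acceptable for a quick bulk check. The rows it flags are good candidates for the smaller Web UI subset.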

I will do the same.
Simply create a relational db, upload your data and so on.
But this is about engineering :slight_smile:

We only have one machine (a VM, actually) for this work. The work validates results going from one system to another and requires a lot of entities to change. One row in the CSV (data file) represents one data change in the system, and there are thousands of those (the biggest I’ve seen is 50K). Reports are important, since the use case has to make sure that the data is correct in the system. Currently, that project has only a single KS license, for the sole purpose of using the code already created for KS. The main selling point is that KS is data driven. Until now I had not used KS with more than 10 entries in the data files, and that worked great.
Thanks!

Take note that the reason I have to separate the test suites in the first place is to distribute the data into small chunks, which makes KS faster and means reports are still created if KS happens to sleep. The code is the same, but the data to iterate over is sequential.
Thanks!