How to extract a specific set of words from a string?

I’m trying to write a script that does the following:

  • Extract the text from a text view (via Mobile.getText())
  • Use a VariableCollection to compare against and find specific sets of words in the extracted text (John Smith, 26 July 2023, Location Name Here, etc.)

My issue right now is that I don’t know the specific keyword/method that lets me look for sets of words. I tried verifyMatch (as shown in the screenshot), but that compares the VariableCollection to the entire text rather than searching within it. I have considered using something like ‘split’, but I don’t know exactly how to apply it in this context.

screenshot of the code

Please give us a test fixture. Please specify concrete values to work with:

  1. the text from which you want to search a set of words
  2. the set of words to search

And please specify what you want as the result of the processing.
A single boolean value (true/false)?
Or something like:

{ "result": [
  {"word": "a", "found": true, "line#": 1, "pos": 1},
  {"word": "quick", "found": true, "line#": 1, "pos": 3},
  {"word": "brown", "found": true, "line#": 1, "pos": 9},
  {"word": "fox", "found": true, "line#": 1, "pos": 15},
  {"word": "jumps", "found": true, "line#": 1, "pos": 19},
  {"word": "over", "found": true, "line#": 1, "pos": 25},
  {"word": "a", "found": true, "line#": 1, "pos": 30},
  {"word": "lazy", "found": true, "line#": 1, "pos": 32},
  {"word": "cat", "found": false, "line#": 0, "pos": 0}
] }

What do you want if a word appears twice or more in the given text?
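To make the richer result shape concrete, here is a minimal Groovy sketch that reports every occurrence of each word with its line and position. The helper name findOccurrences and the key name line are hypothetical (not from this thread), and matching on word boundaries is an assumption about what "word" should mean here.

```groovy
import java.util.regex.Matcher
import java.util.regex.Pattern

// Hypothetical helper: one entry per occurrence, or a single "not found" entry.
List<Map> findOccurrences(String text, List<String> words) {
    List<Map> result = []
    List<String> lines = text.readLines()
    for (String word : words) {
        int before = result.size()
        // \b keeps "a" from matching inside "lazy"; Pattern.quote treats the word literally.
        Pattern p = Pattern.compile('\\b' + Pattern.quote(word) + '\\b')
        lines.eachWithIndex { String line, int i ->
            Matcher m = p.matcher(line)
            while (m.find()) {
                // Line and position are reported 1-based.
                result << [word: word, found: true, line: i + 1, pos: m.start() + 1]
            }
        }
        if (result.size() == before) {
            result << [word: word, found: false, line: 0, pos: 0]
        }
    }
    return result
}

String sample = "a quick brown fox jumps over a lazy dog"
List<Map> hits = findOccurrences(sample, ["a", "fox", "cat"])
hits.each { println it }
```

With this sample, "a" is reported twice (positions 1 and 30), "fox" once (position 15), and "cat" as not found. A word that appears several times simply yields several entries, which is one possible answer to the question above.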

The value of the getText() result is a notification message that a user receives after booking a class via an app. The body of the message itself wouldn’t have any changes except the values I listed in the screenshot. The words in bold below are what I’m trying to extract and compare to the VariableCollections (values based on the user’s booked class).

Here’s an example of the text (non-bolded words never change):
“You’ve successfully booked the 9:00 pm Chair Yoga class with Kimberly Yong. It’s at Yoga - Asia Square Tower 2 on Sat 29 July 2023. Please arrive 15 minutes earlier. Check-in closes 5 minutes before class.”

The result I’m looking for is a boolean. Every time the script verifies that one of the specific words (the words in bold) exists in the getText() result, it should return true and add 1 to a counter.

Edit: Here’s a screenshot of the values of the VariableCollections:

I made a Test Case script that scans a text for the given words.

import com.kms.katalon.core.util.KeywordUtil

// Returns true if the word (or phrase) occurs anywhere in the text.
boolean wordIsContained(String text, String word) {
	return text.contains(word)
}

// Splits the given words into those found in the text and those not found.
Tuple2<List<String>, List<String>> scanTextForWords(String text, List<String> words) {
	List<String> found = []
	List<String> notFound = []
	for (String w : words) {
		if (wordIsContained(text, w)) {
			found << w
		} else {
			notFound << w
		}
	}
	return new Tuple2<List<String>, List<String>>(found, notFound)
}

String text = """
You’ve successfully booked the 9:00 pm Chair Yoga class with Kimberly Yong. 
It’s at Yoga - Asia Square Tower 2 on Sat 29 July 2023. 
Please arrive 15 minutes earlier. Check-in closes 5 minutes before class.
"""

List<String> words = [
	"9:00PM",
	"Chair Yoga",
	"60min",
	"Sat 29 July 2023",
	"Kimberly Yong",
	"Yoga - Asia Square Tower 2"
]

Tuple2<List<String>, List<String>> result = scanTextForWords(text, words)
// get(0) is the list of found words, get(1) the list of words not found.
Boolean allFound = (result.get(0).size() == words.size())

println "found: ${result.get(0)}"
println "notFound: ${result.get(1)}"
println "allFound: ${allFound}"

if (!allFound) {
	KeywordUtil.markFailed("some words are not found in the text")
}

When I ran this, I got the following output in the console:

2023-07-26 13:41:44.234 INFO  c.k.katalon.core.main.TestCaseExecutor   - --------------------
2023-07-26 13:41:44.238 INFO  c.k.katalon.core.main.TestCaseExecutor   - START Test Cases/TC2
found: [Chair Yoga, Sat 29 July 2023, Kimberly Yong, Yoga - Asia Square Tower 2]
notFound: [9:00PM, 60min]
allFound: false
2023-07-26 13:41:45.164 ERROR com.kms.katalon.core.util.KeywordUtil    - ❌ some words are not found in the text
2023-07-26 13:41:45.190 ERROR c.k.katalon.core.main.TestCaseExecutor   - ❌ Test Cases/TC2 FAILED.
com.kms.katalon.core.exception.StepFailedException: some words are not found in the text
	at com.kms.katalon.core.util.KeywordUtil.markFailed(
	at com.kms.katalon.core.util.KeywordUtil$ Source)
	at com.kms.katalon.core.main.ScriptEngine.runScriptAsRawText(
	at com.kms.katalon.core.main.TestCaseExecutor.runScript(
	at com.kms.katalon.core.main.TestCaseExecutor.doExecute(
	at com.kms.katalon.core.main.TestCaseExecutor.processExecutionPhase(
	at com.kms.katalon.core.main.TestCaseExecutor.accessMainPhase(
	at com.kms.katalon.core.main.TestCaseExecutor.execute(
	at com.kms.katalon.core.main.TestCaseMain.runTestCase(
	at com.kms.katalon.core.main.TestCaseMain.runTestCase(
	at com.kms.katalon.core.main.TestCaseMain$runTestCase$ Source)

2023-07-26 13:41:45.208 INFO  c.k.katalon.core.main.TestCaseExecutor   - END Test Cases/TC2
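
The log above lists 9:00PM and 60min as not found: the message spells the time as "9:00 pm", and "60min" does not occur in it at all. If a per-value boolean plus a counter is wanted, as described in the question, a variant could be sketched as follows. The normalize helper is hypothetical, and ignoring case and whitespace is an assumption that may be too lenient for other messages.

```groovy
// Sample text from this thread; in a real test it would come from Mobile.getText().
String text = "You’ve successfully booked the 9:00 pm Chair Yoga class with Kimberly Yong. " +
        "It’s at Yoga - Asia Square Tower 2 on Sat 29 July 2023. " +
        "Please arrive 15 minutes earlier. Check-in closes 5 minutes before class."

List<String> words = [
        "9:00PM", "Chair Yoga", "60min",
        "Sat 29 July 2023", "Kimberly Yong", "Yoga - Asia Square Tower 2"
]

// Hypothetical helper: lower-case and drop all whitespace so "9:00PM" can match "9:00 pm".
String normalize(String s) {
    return s.toLowerCase().replaceAll(/\s+/, '')
}

Map<String, Boolean> checks = [:]   // value -> was it present?
int counter = 0                     // number of values that were present
String haystack = normalize(text)
for (String w : words) {
    boolean present = haystack.contains(normalize(w))
    checks[w] = present
    if (present) {
        counter++
    }
}

println checks
println "matched ${counter} of ${words.size()}"
```

With the sample text, five of the six values are reported as present and only 60min is missing.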


Use a regex with the verifyMatch keyword.
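
A minimal sketch of the regex idea, using java.util.regex directly so that it runs without Katalon. The pattern, the group names, and the expected map are assumptions for illustration; they rely on the fixed wording of the sample message quoted earlier in this thread. I believe verifyMatch can take a flag that treats the expected value as a regular expression, but please check the keyword reference for your Katalon version.

```groovy
import java.util.regex.Matcher
import java.util.regex.Pattern

// Sample text from this thread; in a real test it would come from Mobile.getText().
String text = "You’ve successfully booked the 9:00 pm Chair Yoga class with Kimberly Yong. " +
        "It’s at Yoga - Asia Square Tower 2 on Sat 29 July 2023. " +
        "Please arrive 15 minutes earlier. Check-in closes 5 minutes before class."

// Fixed wording is literal; each variable part is a named capturing group.
// "It.s" lets the dot stand in for either a straight or a curly apostrophe.
Pattern template = Pattern.compile(
        'booked the (?<time>\\d{1,2}:\\d{2} [ap]m) (?<className>.+?) class with (?<instructor>.+?)\\. ' +
        'It.s at (?<location>.+?) on (?<date>.+?)\\. Please arrive')

Matcher m = template.matcher(text)
assert m.find() : 'the text does not follow the expected template'

Map<String, String> actual = [
        time      : m.group('time'),
        className : m.group('className'),
        instructor: m.group('instructor'),
        location  : m.group('location'),
        date      : m.group('date')
]

// Hypothetical expected values; in the real test they would come from the test variables.
Map<String, String> expected = [
        className : 'Chair Yoga',
        instructor: 'Kimberly Yong',
        location  : 'Yoga - Asia Square Tower 2',
        date      : 'Sat 29 July 2023'
]

// Count how many expected values equal the extracted ones.
int counter = expected.findAll { key, value -> actual[key] == value }.size()
boolean allMatched = (counter == expected.size())

println actual
println "matched ${counter} of ${expected.size()}, allMatched: ${allMatched}"
```

Extracting the values first and comparing afterwards makes it easy to report which field differs, rather than getting only a single pass/fail for the whole message.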

And you can remove “==true” from the if conditions.
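
For illustration, a small sketch of that simplification. The code in the screenshot is not reproduced here, so the variable names and values below are hypothetical.

```groovy
String text = "Chair Yoga class with Kimberly Yong"
int counter = 0

// Redundant: contains() already returns a boolean.
if (text.contains("Chair Yoga") == true) {
    counter++
}

// Equivalent and shorter.
if (text.contains("Kimberly Yong")) {
    counter++
}

println "counter: ${counter}"
```

Both conditions behave the same way; the second form is just easier to read.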



Thanks for the help, y’all. @kazurayam @bhavesh.maru

I’ll try both of your methods and use whichever ones fit my script better.

Regex has worked for me in the past.