Question about Testing Page Titles with Non-Printable Characters

Hi,

I’ve been having some difficulty testing web page titles that contain non-printable characters. An example can be found here: https://www.torontohousing.ca/residents .

When I look at the web page code in Chrome Developer I see this:


To find out what these characters were, I used the tool at https://www.soscisurvey.de/tools/view-chars.php , which showed tab and enter (newline) characters.

The standard code I use to check a page title is:

'Check of the Title of the Webpage is correct'
assert WebUI.verifyMatch(Test_Page_Title, 'Residents', false)

Can anyone suggest how to handle this?

The big picture is that I am crawling through this website checking page titles and content, which are read from an Excel file (just in case that matters). Thank you.

WebUI.verifyMatch(Test_Page_Title, '.*Residents.*', true)

Here is a link to that same regex with the same source text proving it should work: https://regex101.com/r/mYWrv4/1
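One caveat worth keeping in mind when comparing regex101 with Java/Groovy: in java.util.regex the dot does not match line terminators unless the `(?s)` flag is set, and a whole-string match has to consume the entire input. The sketch below uses plain Groovy only. The sample title string is made up to resemble the page source, and whether `verifyMatch` does a whole-string match internally is an assumption, not something confirmed in this thread.

```groovy
// Hypothetical raw title, padded with CR/LF and tabs like the page source.
String rawTitle = "\r\n\t\tResidents\r\n"

// Groovy's ==~ operator is a whole-string match (Matcher.matches()).
// Without (?s), '.' cannot cross the CR/LF characters, so this is false.
boolean plainDot = rawTitle ==~ /.*Residents.*/

// With the DOTALL flag (?s), '.' also matches line terminators.
boolean dotAll = rawTitle ==~ /(?s).*Residents.*/

// Trimming first avoids the problem entirely.
boolean trimmedEquals = rawTitle.trim() == 'Residents'

println "plainDot=${plainDot} dotAll=${dotAll} trimmedEquals=${trimmedEquals}"
```

If the title really does arrive with line breaks around it, prefixing the pattern with `(?s)` or trimming the title before comparing are two ways around it.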

@Russ_Thomas

Thank you for your feedback. The solution works if I literally code each page title in the Katalon editor.

But if I try to read the title from an Excel file, it fails:

The contents of the Excel file look like this (where the column ‘Test_Page_Title’ stores the page title):

Code to check the page title from the Excel File

'Check of the Title of the Webpage is correct'
assert WebUI.verifyMatch(WebUI.getWindowTitle(), Test_Page_Title, true)

Should I use a different syntax when the title is stored in an Excel file?

Try this… get the text from excel, then…

println(stuffFromExcel)

I would expect Excel to preserve the text but maybe not.

If you know that the title will have “stuff” in it, you could put the printable characters in Excel and then put the reg ex aspect within the verifyMatch method, like:

WebUI.verifyMatch(WebUI.getWindowTitle(), ".*" + Test_Page_Title + ".*", true)

Test_Page_Title would contain only “Residents”.
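A minimal sketch of that idea in plain Groovy, with two small additions that are my own assumptions rather than part of the suggestion: the Excel value is trimmed, and it is passed through `Pattern.quote` so that characters like `(` or `?` in a title are treated literally. `buildTitlePattern` is a hypothetical helper name.

```groovy
import java.util.regex.Pattern

// Hypothetical helper: wrap a plain-text title from the spreadsheet in ".*"
// so only the printable part has to be stored in the Excel cell.
String buildTitlePattern(String expected) {
    // (?s) lets '.' match line breaks; Pattern.quote escapes regex metacharacters.
    return '(?s).*' + Pattern.quote(expected.trim()) + '.*'
}

// Usage with made-up values standing in for the Excel data.
println('Residents' ==~ buildTitlePattern(' Residents\n'))
println('FAQ (new) | Site' ==~ buildTitlePattern('FAQ (new)'))
```

The resulting string could then be passed as the second argument of `WebUI.verifyMatch(..., true)`.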


@grylion54 @Russ_Thomas

Hi Russ and Mike,

Thanks for the suggestions. This morning I did change the script line to:

'Get the value of the Test_Page_Title before any manipulation'
println "This is the page title being tested: " + (Test_Page_Title)

'Check of the Title of the Webpage is correct'
WebUI.verifyMatch(WebUI.getWindowTitle(), ".*" + Test_Page_Title + ".*", true)

I’m not sure where to look for the println output.

I can see this in the Execution logs:

08-19-2020 10:22:42 AM verifyMatch(getWindowTitle(), ".*" + Test_Page_Title + ".*", true)

Elapsed time: 0.114s

Unable to verify match between actual text 'Residents' and expected text '.*Resident​​s.*' using 
regular expression (Root cause: com.kms.katalon.core.exception.StepFailedException: Actual text 
'Residents' and expected text '.*Resident​​s.*' are not matched using regular expression

So it seems that the ‘.*’ suggested by Mike is added to the string. But then it is not treated like a regular expression when it is processed.

On a side note I also tried changing the line as follows but with the same results:

'Check of the Title of the Webpage is correct'
Test_Page_Title = ".*" + Test_Page_Title + ".*"
WebUI.verifyMatch(WebUI.getWindowTitle(), Test_Page_Title, true)

Error in the log:

=============== ROOT CAUSE =====================

For trouble shooting, please visit: https://docs.katalon.com/katalon-studio/docs/troubleshoot-common-execution-exceptions-web-test.html

08-19-2020 11:32:16 AM verifyMatch(getWindowTitle(), Test_Page_Title, true)

Elapsed time: 0.031s

Unable to verify match between actual text ‘Residents’ and expected text ‘.Resident​​s.’ using regular expression (Root cause: com.kms.katalon.core.exception.StepFailedException: Actual text ‘Residents’ and expected text ‘.Resident​​s.’ are not matched using regular expression
at com.kms.katalon.core.keyword.builtin.VerifyMatchKeyword$_verifyMatch_closure1.doCall(VerifyMatchKeyword.groovy:57)
at

You will notice that the string is now ‘.Resident.’ which shows periods in the string . Perhaps this is a clue on the processing.

Russ may have information on why the Reg Ex is not working on the page like I would expect it to, however, since your actual text is “Residents” which does not seem to have the unprintable characters, then perhaps you should just try to compare directly (without Reg Ex).

Test_Page_Title = "Residents"
WebUI.verifyMatch(WebUI.getWindowTitle(), Test_Page_Title, false)

As a note, I use the method of adding the dot asterisk onto my Reg Ex comparisons for date timestamps to compare dates (down to the milliseconds). It works for me (or my version of KS).

Oh, one last thought. Did you copy and paste your text from this forum? If you did, can you replace all the smart quotes (have a curve or curl to them) to be straight quotes.

You can find the output of a “println” statement within the console log. If you want to see the output in the normal output, then add a comment statement also.

WebUI.comment("This is the page title being tested: " + (Test_Page_Title))

There is an issue with Regular Expressions in Katalon but I’ve long forgotten the details (@kazurayam solved it as I recall).

But getting back to the problem at hand, you could try retrieving the value, trimming it, then comparing it. Also, try using String.contains(). Messy though - this should be easy. Wait until @kazurayam drops by…
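A rough sketch of the trim-then-compare and `String.contains()` ideas, in plain Groovy with no Katalon calls. The helper names and sample strings are made up for illustration. Note that `trim()` only strips characters at or below U+0020 (spaces, tabs, CR, LF), so it will not remove every kind of invisible character.

```groovy
// Hypothetical helpers for comparing titles without regular expressions.
String normalize(String s) {
    return s == null ? '' : s.trim()
}

boolean sameTitle(String actual, String expected) {
    return normalize(actual) == normalize(expected)
}

boolean titleContains(String actual, String expected) {
    return normalize(actual).contains(normalize(expected))
}

// Usage with made-up values: padded actual title, padded spreadsheet value.
println sameTitle('\r\n\t\tResidents\r\n', 'Residents ')
println titleContains('Residents | Toronto Housing', ' Residents')
```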

Ah… here you go…

I tried to reproduce the problem by creating a Test Case script on my side.

import com.kms.katalon.core.webui.keyword.WebUiBuiltInKeywords as WebUI

WebUI.openBrowser('')
WebUI.navigateToUrl('https://www.torontohousing.ca/residents')
def title = WebUI.getWindowTitle()
WebUI.verifyMatch(title, '.*Residents.*', true)
WebUI.closeBrowser()

This code worked just fine. The test case succeeded. Therefore I am sure that WebUI.verifyMatch is not the issue. There must be another cause that we have not spotted yet.

I suspect that Test_Page_Title, which is supposed to be a regexp pattern, may not be what you expect. For example, what happens if the data in the Excel cell contains stray whitespace? WebUI.verifyMatch() will fail.
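To illustrate the point, here is a small plain-Groovy sketch showing how a stray character in the expected value breaks the match, and one possible way to clean the cell before using it. `cleanCell` is a hypothetical helper, and the sample values (a trailing newline, zero-width spaces) are assumptions about what a cell might contain, not data taken from the real spreadsheet.

```groovy
// Hypothetical helper: remove invisible Unicode "format" characters
// (category Cf, which includes U+200B zero-width space), then trim
// ordinary leading/trailing whitespace.
String cleanCell(String cell) {
    return cell.replaceAll(/\p{Cf}/, '').trim()
}

String actualTitle = 'Residents'

// A trailing newline in the cell ends up inside the pattern and must be matched.
boolean dirtyMatches = actualTitle ==~ ('.*' + 'Residents\n' + '.*')

// After cleaning, the same kind of value matches as expected.
boolean cleanMatches = actualTitle ==~ ('.*' + cleanCell('Resident\u200B\u200Bs\n') + '.*')

println "dirtyMatches=${dirtyMatches} cleanMatches=${cleanMatches}"
```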

You should look at the Console tab rather than the Log View tab:

By the way, my experiment above revealed that the title of https://www.torontohousing.ca/residents is Residents, with no CR+LF characters prepended or appended.

@Randy.Ramkissoon

You misunderstood. You looked at the view rendered by Chrome Developer and found CR and LF there. That is not necessarily equivalent to the raw <title>Residents</title> node of your AUT. The Chrome Developer silently added CR + LF as cosmetics.

I think you’ll find it’s in the source. The bad thing is, Firefox DevTools Inspector silently removes it. (something I’ve long complained about with Mozilla).

Chrome view source:

Firefox view source:

Chrome Inspector:

Firefox Inspector:

Lastly, Chrome/Firefox consoles, side-by-side (Mozilla should really be ashamed of themselves)


Ah, yes, I saw CR+LF in the source.

I was surprised to find that a line in my test case script

def title = WebUI.getWindowTitle()

returned Residents without CR+LF. I was not aware that getWindowTitle() silently trims whitespace.

That’s a bug. Please post it! (@ThanhTo @duyluong)

Aside: Mozilla’s issue, for your information, is that the inspector does not inspect source-original text - it inspects the DOM which can be, after much dicking around in JS etc, very different. I know this because of the run-in I mentioned above - this was related to the display of non-printing entities in the inspector (things like &equiv; &nbsp; &emsp; etc). Chrome displays them in the inspector. Firefox did not but now manages to draw a black block where they sit - not ideal but better than before. They basically had no way to map original source onto rendered dom output.

If you have nothing better to do:


@Randy.Ramkissoon

Sorry for derailing your thread. I’m going to revert to my earlier answer…

We should go back to the original question by @Randy.Ramkissoon

Randy, please check what the data read from Excel looks like.

@kazurayam @Russ_Thomas @grylion54

Hi Everyone,

I’m just reading through all the interesting responses (thank you again for your help). I think your observations and descriptions of the different functions’ behavior will be useful to everyone when planning an approach to writing test scripts. I’m going to try to read more about string manipulation (e.g. “String.contains()”) today. It has been on my mind, as SQL has ltrim, rtrim and trim functions.

Below are answers to the questions you all asked:

This is what was in the console for the println:

2020-08-20 08:43:23.547 DEBUG .2. Crawl Torontohousing Site Map - File - 3: println("This is the page title being tested: " + Test_Page_Title)
This is the page title being tested: Resident​​s

This code, which Russ suggested and which uses a regex, did work:

WebUI.verifyMatch(Test_Page_Title, '.*Residents.*', true)

Remaining Problem:

The challenge seems to be applying the regex logic when you are comparing a page title in the attached Excel file with the actual page title retrieved via the “WebUI.getWindowTitle()” function.

Current code example, which uses the sample Excel file and the regex logic from Mike’s suggestion:

WebUI.enableSmartWait()

WebUI.openBrowser('')

WebUI.maximizeWindow()

WebUI.navigateToUrl(Test_Page_URL)

WebUI.delay(5)

assert WebUI.getUrl() == Test_Page_URL

println "This is the page title being tested: " + (Test_Page_Title)

WebUI.verifyMatch(WebUI.getWindowTitle(), ".*" + Test_Page_Title + ".*", true)

WebUI.closeBrowser()

The error this produces when referencing the ‘Residents’ page at https://www.torontohousing.ca/residents :

2020-08-20 09:48:09.362 DEBUG .2. Crawl Torontohousing Site Map - File - 4: verifyMatch(getWindowTitle(), ".*" + Test_Page_Title + ".*", true)
2020-08-20 09:48:09.471 ERROR c.k.k.core.keyword.internal.KeywordMain - :x: Unable to verify match between actual text ‘Residents’ and expected text ‘.Resident​​s.’ using regular expression (Root cause: com.kms.katalon.core.exception.StepFailedException: Actual text ‘Residents’ and expected text ‘.Resident​​s.’ are not matched using regular expression
at com.kms.katalon.core.keyword.builtin.VerifyMatchKeyword$_verifyMatch_closure1.doCall(VerifyMatchKeyword.groovy:57)
at com.kms.katalon.core.keyword.builtin.VerifyMatchKeyword$_verifyMatch_closure1.call(VerifyMatchKeyword.groovy)
at com.kms.katalon.core.keyword.internal.KeywordMain.runKeyword(KeywordMain.groovy:68)
at com.kms.katalon.core.keyword.builtin.VerifyMatchKeyword.verifyMatch(VerifyMatchKeyword.groovy:60)
at com.kms.katalon.core.keyword.builtin.VerifyMatchKeyword.execute(VerifyMatchKeyword.groovy:45)
at com.kms.katalon.core.keyword.internal.KeywordExecutor.executeKeywordForPlatform(KeywordExecutor.groovy:73)
at com.kms.katalon.core.keyword.BuiltinKeywords.verifyMatch(BuiltinKeywords.groovy:73)
at 2. Crawl Torontohousing Site Map - File.run(2. Crawl Torontohousing Site Map - File:81)
at com.kms.katalon.core.main.ScriptEngine.run(ScriptEngine.java:194)
at com.kms.katalon.core.main.ScriptEngine.runScriptAsRawText(ScriptEngine.java:119)
at com.kms.katalon.core.main.TestCaseExecutor.runScript(TestCaseExecutor.java:339)
at com.kms.katalon.core.main.TestCaseExecutor.doExecute(TestCaseExecutor.java:330)
at com.kms.katalon.core.main.TestCaseExecutor.processExecutionPhase(TestCaseExecutor.java:309)
at com.kms.katalon.core.main.TestCaseExecutor.accessMainPhase(TestCaseExecutor.java:301)
at com.kms.katalon.core.main.TestCaseExecutor.execute(TestCaseExecutor.java:235)
at com.kms.katalon.core.main.TestSuiteExecutor.accessTestCaseMainPhase(TestSuiteExecutor.java:191)
at com.kms.katalon.core.main.TestSuiteExecutor.accessTestSuiteMainPhase(TestSuiteExecutor.java:141)
at com.kms.katalon.core.main.TestSuiteExecutor.execute(TestSuiteExecutor.java:90)
at com.kms.katalon.core.main.TestCaseMain.startTestSuite(TestCaseMain.java:157)
at com.kms.katalon.core.main.TestCaseMain$startTestSuite$0.call(Unknown Source)
at TempTestSuite1597931259844.run(TempTestSuite1597931259844.groovy:39)

TorontoHousing-Site-example.zip (11.5 KB)

Hopefully this will give a more thorough view of the issue being troubleshot.

Part of the dilemma is where best to put the regex text. Putting it in the Excel file did not work; perhaps Excel requires a leading single quotation mark to indicate that the cell holds text, which might throw things off.

String reTitle = ".*" + Test_Page_Title + ".*"
WebUI.verifyMatch(WebUI.getWindowTitle(), reTitle, true)

Let me know what that does. Although, based on @kazurayam’s findings (400 years ago) I suspect it’ll fail.

But all is not lost, I can code some JS for you to fix all this BS.

I want to see what this does.

println(">>>" + Test_Page_Title + "<<<")
String reTitle = ".*" + Test_Page_Title + ".*" 
WebUI.verifyMatch(WebUI.getWindowTitle(), reTitle, true)

I suspect Test_Page_Title might not be exactly "Residents". It might be

Residents
\n

because I see trailing \n characters unintentionally typed in cells of my Excel files quite often.
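If the `>>>`/`<<<` markers are not enough to spot the problem (some characters take up no visible space at all), another option is to print the code point of every character in the value. This is only a sketch in plain Groovy. `describeChars` is a hypothetical helper, and the sample inputs are invented, not taken from the actual spreadsheet.

```groovy
// Hypothetical helper: list each character of a string as U+XXXX so that
// newlines, tabs and zero-width characters become visible in the console.
String describeChars(String s) {
    List<String> parts = []
    for (int i = 0; i < s.length(); i++) {
        parts << String.format('U+%04X', (int) s.charAt(i))
    }
    return parts.join(' ')
}

// Usage with made-up values: a trailing newline and a zero-width space.
println describeChars('s\n')
println describeChars('t\u200Bs')
```

In a test case, the same helper could be called on Test_Page_Title and on the value returned by getWindowTitle(), and the two outputs compared side by side in the Console tab.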
