How do I decode a base64 encoded URL from a test email using Katalon

Katalon Studio Version:
KSE 8.4.0, Build 208
Windows 10 Enterprise (64-bit)
Chrome: 107.0.5304.62 (Official Build) (32-bit)

Hi folks, does anyone have any insight into how to decode a URL from an email file with Katalon?

Setup:
I have a test email file where the URL ‘https://www.google.com’ has been encoded to base64:
aHR0cHM6Ly93d3cuZ29vZ2xlLmNvbQ==

Test steps:

  1. Open the test.eml file in text mode
  2. Decode the base64 encoded URL
  3. Store the URL (https://www.google.com) in GlobalVariable.DecodedUrl for use in a called test case

I found Base64 (Katalon Studio API Specification)
But I am not sure how to implement it with Katalon.

The test.eml file:
X-Sender: “Do Not Reply” donotreply@anywhere.com
X-Receiver: xyz.com
MIME-Version: 1.0
From: “Do Not Reply” donotreply@anywhere.com
To: xyz.com
Date: 26 Oct 2022 14:43:15 -0700
Subject: Hello world, what is my URL?
Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: base64

Hello world, what is my URL?
aHR0cHM6Ly93d3cuZ29vZ2xlLmNvbQ==


Before talking about base64, your test case should be able to do the following:

  1. open the test.eml file
  2. parse the content of the file, pick up the string “aHR0cHM6Ly93d3cuZ29vZ2xlLmNvbQ==”, and store it into a variable. Make sure the variable value does not contain leading/trailing white space (including new lines).
  3. println the variable; just to make sure what you have got.

Please make a test case that does this; and show your code.

Provided that you can make it, we will be able to start discussing how to use the com.kms.katalon.core.util.internal.Base64 class.


Thanks @kazurayam, good plan!
I will post back here when done.

Cheers,
Dave

Hi @kazurayam ,
I have created the following test case and am seeing the expected value for “GlobalVariable.DecodedUrl”

import static com.kms.katalon.core.checkpoint.CheckpointFactory.findCheckpoint
import static com.kms.katalon.core.testcase.TestCaseFactory.findTestCase
import static com.kms.katalon.core.testdata.TestDataFactory.findTestData
import static com.kms.katalon.core.testobject.ObjectRepository.findTestObject
import static com.kms.katalon.core.testobject.ObjectRepository.findWindowsObject
import com.kms.katalon.core.checkpoint.Checkpoint as Checkpoint
import com.kms.katalon.core.cucumber.keyword.CucumberBuiltinKeywords as CucumberKW
import com.kms.katalon.core.mobile.keyword.MobileBuiltInKeywords as Mobile
import com.kms.katalon.core.model.FailureHandling as FailureHandling
import com.kms.katalon.core.testcase.TestCase as TestCase
import com.kms.katalon.core.testdata.TestData as TestData
import com.kms.katalon.core.testng.keyword.TestNGBuiltinKeywords as TestNGKW
import com.kms.katalon.core.testobject.TestObject as TestObject
import com.kms.katalon.core.webservice.keyword.WSBuiltInKeywords as WS
import com.kms.katalon.core.webui.keyword.WebUiBuiltInKeywords as WebUI
import com.kms.katalon.core.windows.keyword.WindowsBuiltinKeywords as Windows
import internal.GlobalVariable as GlobalVariable
import org.openqa.selenium.Keys as Keys

File emlfile = new File("C:\\Users\\devers\\Downloads\\TestFiles\\test.eml")
def emlFileContents = emlfile.getText("UTF-8")
GlobalVariable.DecodedUrl = emlFileContents
println("base64EncodedUrl: " + GlobalVariable.DecodedUrl)

TokenUrl = emlFileContents
TokenUrl = TokenUrl.replace(' ', '')
int TokenUrlLength = TokenUrl.length()
println("TokenUrlLength: " + TokenUrlLength)
int startPos = (TokenUrl.lastIndexOf("base64") + 6)
println("startPos: " + startPos)
int endPos = TokenUrl.lastIndexOf("")
println("endPos: " + endPos)
TokenUrl = TokenUrl.substring(startPos, endPos)
GlobalVariable.DecodedUrl = TokenUrl
println("base64EncodedUrl: " + GlobalVariable.DecodedUrl)

Output after code is run:


Hi @kazurayam,
I also have this question…

We have several *.eml files (100+) in a folder named similar to the following:
79edddc6-ce98-4eff-b8ea-414e392bce1f.eml
f503182a-d0ef-444d-a52b-658b5d10de81.eml
f503182a-d0ef-444d-x52b-658b5d10de81.eml
+100

My question is how do I open say the first *.eml file in the list with Katalon?

@Dave_Evers

Your 1st question was how to use Katalon’s built-in class to decode a Base64-encoded string. The following shows how to do it.

import com.kms.katalon.core.util.internal.Base64
import internal.GlobalVariable

String httpBody = "aHR0cHM6Ly93d3cuZ29vZ2xlLmNvbQ=="

GlobalVariable.DecodedUrl = Base64.decode(httpBody)
println "DecodedUrl ${GlobalVariable.DecodedUrl}"

Too easy, isn’t it?

I think that your real problem was how to parse a file that contains a serialized Email message. How to use Base64 was not what you were wondering about; you were wondering how to pick up the body part of an Email message serialized in a file.
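Picking up the body part can be sketched roughly as follows. This is only a minimal illustration, assuming a single-part message whose body is nothing but the Base64 text; it uses the standard java.util.Base64 rather than Katalon's class, and the class name EmlBodyDecoder and its methods are hypothetical names made up for this example. It relies on the fact that the headers and the body of a message are separated by the first empty line.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of Katalon.
final class EmlBodyDecoder {

    /** Returns the text after the first empty line, without surrounding white space. */
    static String extractBody(String rawMessage) {
        // Normalize CRLF and lone CR to LF so the search works whatever OS saved the file.
        String normalized = rawMessage.replace("\r\n", "\n").replace('\r', '\n');
        int separator = normalized.indexOf("\n\n");
        if (separator < 0) {
            throw new IllegalArgumentException("no empty line between headers and body");
        }
        // trim() drops leading/trailing spaces and new lines.
        return normalized.substring(separator + 2).trim();
    }

    /** Decodes the Base64 body into a UTF-8 string. */
    static String decodeBody(String rawMessage) {
        byte[] bytes = Base64.getDecoder().decode(extractBody(rawMessage));
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String raw = "Subject: Hello world, what is my URL?\r\n"
                + "Content-Transfer-Encoding: base64\r\n"
                + "\r\n"
                + "aHR0cHM6Ly93d3cuZ29vZ2xlLmNvbQ==\r\n";
        System.out.println(decodeBody(raw));
    }
}
```

In a Katalon test case the same two methods could be called with the text read from the file, and the result assigned to GlobalVariable.DecodedUrl.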


OMG, very cool!
Thanks @kazurayam it works like a charm!

@Dave_Evers

Your second question was how to open the 1st *.eml file in a folder.
The following code shows how to:

import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.Paths
import java.util.stream.Collectors

import com.kms.katalon.core.configuration.RunConfiguration

Path projectDir = Paths.get(RunConfiguration.getProjectDir())
Path dataDir = projectDir.resolve("Include/data")

List<Path> emlFiles = 
	Files.list(dataDir)
		.filter({ p -> p.toString().endsWith(".eml")})
		.collect(Collectors.toList())

// print path of all eml files
emlFiles.forEach  { p ->
	println p.toString()
}

// open the 1st eml file, print its contents
String content = emlFiles.get(0).toFile().text
println content

This gave me, for example:

/Users/kazurayam/katalon-workspace/test/Include/data/f503182a-d0ef-444d-a52b-658b5d10de81.eml
/Users/kazurayam/katalon-workspace/test/Include/data/79edddc6-ce98-4eff-b8ea-414e392bce1f.eml
... the content of a .eml file

Here I used the following techniques:


Hi @kazurayam,
Thanks so much for this, I will give this a try and post back here so that others can learn from your excellent direction.

Best regards,
Dave

Hi @kazurayam,
Sorry to bother you yet again, but I am seeing ‘Illegal base64 character d’ error messages when trying to decode my captured Base64 encoded data. If you have time, could you please take a look?

Thanks ~ Dave

import static com.kms.katalon.core.checkpoint.CheckpointFactory.findCheckpoint
import static com.kms.katalon.core.testcase.TestCaseFactory.findTestCase
import static com.kms.katalon.core.testdata.TestDataFactory.findTestData
import static com.kms.katalon.core.testobject.ObjectRepository.findTestObject
import static com.kms.katalon.core.testobject.ObjectRepository.findWindowsObject
import com.kms.katalon.core.checkpoint.Checkpoint as Checkpoint
import com.kms.katalon.core.cucumber.keyword.CucumberBuiltinKeywords as CucumberKW
import com.kms.katalon.core.mobile.keyword.MobileBuiltInKeywords as Mobile
import com.kms.katalon.core.model.FailureHandling as FailureHandling
import com.kms.katalon.core.testcase.TestCase as TestCase
import com.kms.katalon.core.testdata.TestData as TestData
import com.kms.katalon.core.testng.keyword.TestNGBuiltinKeywords as TestNGKW
import com.kms.katalon.core.testobject.TestObject as TestObject
import com.kms.katalon.core.webservice.keyword.WSBuiltInKeywords as WS
import com.kms.katalon.core.webui.keyword.WebUiBuiltInKeywords as WebUI
import com.kms.katalon.core.windows.keyword.WindowsBuiltinKeywords as Windows
import org.openqa.selenium.Keys as Keys
import com.kms.katalon.core.util.internal.Base64
import internal.GlobalVariable as GlobalVariable

File emlfile = new File("C://Users//devers//Downloads//TestFiles//test.eml")
def emlFileContents = emlfile.getText("UTF-8")
println("emlFileContents: " + emlFileContents)
GlobalVariable.DecodedUrl = emlFileContents
println("GlobalVariable.DecodedUrl: " + GlobalVariable.DecodedUrl)

// The following works to capture the Base64 encoded "aHR0cHM6Ly93d3cuZ29vZ2xlLmNvbQ==" from the test.eml file
// However when I attempt to decode the result I keep getting 'Illegal base64 character d' error messages
// Can you see where I am going wrong? I am using substring to capture the base64 encoded data
// Would there be a better solution for splitting out the data so there are no spaces?
// The following are my print results... It seems like I have a space but am not sure how to remove it.
// EncodedUrl:
// 
// aHR0cHM6Ly93d3cuZ29vZ2xlLmNvbQ==
// 2022-10-27 16:26:58.191 INFO  c.k.katalon.core.main.TestCaseExecutor

EncodedUrl = emlFileContents
EncodedUrl = EncodedUrl.replace(' ', '')
int EncodedUrlLength = EncodedUrl.length()
println("EncodedUrlLength: " + EncodedUrlLength)
int startPos = (EncodedUrl.lastIndexOf("base64") + 6)
println("startPos: " + startPos)
int endPos = EncodedUrl.lastIndexOf("")
println("endPos: " + endPos)
EncodedUrl = EncodedUrl.substring(startPos, endPos)
EncodedUrl = EncodedUrl.replace(' ', '')
println("EncodedUrl: " + EncodedUrl)

//Failed attempted decoding
String httpBody = EncodedUrl 
GlobalVariable.DecodedUrl = Base64.decode(httpBody)
println "DecodedUrl ${GlobalVariable.DecodedUrl}"

//Your code works fine when I use it...
//String httpBody = "aHR0cHM6Ly93d3cuZ29vZ2xlLmNvbQ=="
//GlobalVariable.DecodedUrl = Base64.decode(httpBody)
//println "DecodedUrl ${GlobalVariable.DecodedUrl}"

Results and error:

2022-10-27 16:31:06.346 INFO  c.k.katalon.core.main.TestCaseExecutor   - --------------------
2022-10-27 16:31:06.353 INFO  c.k.katalon.core.main.TestCaseExecutor   - START Test Cases/A1_PreFeatureCases/01 How_to_cases/25 Katalon_tips/36 Read from tst.eml file
2022-10-27 16:31:07.140 DEBUG testcase.36 Read from tst.eml file       - 1: emlfile = new java.io.File(C://Users//devers//Downloads//TestFiles//test.eml)
2022-10-27 16:31:07.147 DEBUG testcase.36 Read from tst.eml file       - 2: emlFileContents = emlfile.getText("UTF-8")
2022-10-27 16:31:07.150 DEBUG testcase.36 Read from tst.eml file       - 3: println("emlFileContents: " + emlFileContents)
emlFileContents: X-Sender: “Do Not Reply” donotreply@anywhere.com
X-Receiver: xyz.com
MIME-Version: 1.0
From: “Do Not Reply” donotreply@anywhere.com
To: xyz.com
Date: 26 Oct 2022 14:43:15 -0700
Subject: Hello world, what is my URL?
Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: base64

aHR0cHM6Ly93d3cuZ29vZ2xlLmNvbQ==
2022-10-27 16:31:07.153 DEBUG testcase.36 Read from tst.eml file       - 4: DecodedUrl = emlFileContents
2022-10-27 16:31:08.635 DEBUG testcase.36 Read from tst.eml file       - 5: println("GlobalVariable.DecodedUrl: " + DecodedUrl)
GlobalVariable.DecodedUrl: X-Sender: “Do Not Reply” donotreply@anywhere.com
X-Receiver: xyz.com
MIME-Version: 1.0
From: “Do Not Reply” donotreply@anywhere.com
To: xyz.com
Date: 26 Oct 2022 14:43:15 -0700
Subject: Hello world, what is my URL?
Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: base64

aHR0cHM6Ly93d3cuZ29vZ2xlLmNvbQ==
2022-10-27 16:31:08.637 DEBUG testcase.36 Read from tst.eml file       - 6: EncodedUrl = emlFileContents
2022-10-27 16:31:08.639 DEBUG testcase.36 Read from tst.eml file       - 7: EncodedUrl = EncodedUrl.replace(" ", "")
2022-10-27 16:31:08.646 DEBUG testcase.36 Read from tst.eml file       - 8: EncodedUrlLength = EncodedUrl.length()
2022-10-27 16:31:08.647 DEBUG testcase.36 Read from tst.eml file       - 9: println("EncodedUrlLength: " + EncodedUrlLength)
EncodedUrlLength: 306
2022-10-27 16:31:08.648 DEBUG testcase.36 Read from tst.eml file       - 10: startPos = EncodedUrl.lastIndexOf("base64") + 6
2022-10-27 16:31:08.650 DEBUG testcase.36 Read from tst.eml file       - 11: println("startPos: " + startPos)
startPos: 270
2022-10-27 16:31:08.651 DEBUG testcase.36 Read from tst.eml file       - 12: endPos = EncodedUrl.lastIndexOf("")
2022-10-27 16:31:08.652 DEBUG testcase.36 Read from tst.eml file       - 13: println("endPos: " + endPos)
endPos: 306
2022-10-27 16:31:08.653 DEBUG testcase.36 Read from tst.eml file       - 14: EncodedUrl = EncodedUrl.substring(startPos, endPos)
2022-10-27 16:31:08.655 DEBUG testcase.36 Read from tst.eml file       - 15: EncodedUrl = EncodedUrl.replace(" ", "")
2022-10-27 16:31:08.658 DEBUG testcase.36 Read from tst.eml file       - 16: println("EncodedUrl: " + EncodedUrl)
EncodedUrl: 

aHR0cHM6Ly93d3cuZ29vZ2xlLmNvbQ==
2022-10-27 16:31:08.660 DEBUG testcase.36 Read from tst.eml file       - 17: httpBody = EncodedUrl
2022-10-27 16:31:08.661 DEBUG testcase.36 Read from tst.eml file       - 18: DecodedUrl = decode(httpBody)
2022-10-27 16:31:08.686 ERROR c.k.katalon.core.main.TestCaseExecutor   - ❌ Test Cases/A1_PreFeatureCases/01 How_to_cases/25 Katalon_tips/36 Read from tst.eml file FAILED.
Reason:
java.lang.IllegalArgumentException: Illegal base64 character d
	at com.kms.katalon.core.util.internal.Base64.decode(Base64.java:48)
	at com.kms.katalon.core.util.internal.Base64$decode.call(Unknown Source)
	at 36 Read from tst.eml file.run(36 Read from tst.eml file:51)
	at com.kms.katalon.core.main.ScriptEngine.run(ScriptEngine.java:194)
	at com.kms.katalon.core.main.ScriptEngine.runScriptAsRawText(ScriptEngine.java:119)
	at com.kms.katalon.core.main.TestCaseExecutor.runScript(TestCaseExecutor.java:445)
	at com.kms.katalon.core.main.TestCaseExecutor.doExecute(TestCaseExecutor.java:436)
	at com.kms.katalon.core.main.TestCaseExecutor.processExecutionPhase(TestCaseExecutor.java:415)
	at com.kms.katalon.core.main.TestCaseExecutor.accessMainPhase(TestCaseExecutor.java:407)
	at com.kms.katalon.core.main.TestCaseExecutor.execute(TestCaseExecutor.java:284)
	at com.kms.katalon.core.main.TestCaseMain.runTestCase(TestCaseMain.java:142)
	at com.kms.katalon.core.main.TestCaseMain.runTestCase(TestCaseMain.java:133)
	at com.kms.katalon.core.main.TestCaseMain$runTestCase$0.call(Unknown Source)
	at TempTestCase1666913463549.run(TempTestCase1666913463549.groovy:25)

2022-10-27 16:31:08.715 INFO  c.k.katalon.core.main.TestCaseExecutor   - END Test Cases/A1_PreFeatureCases/01 How_to_cases/25 Katalon_tips/36 Read from tst.eml file

No, it is too messy for me to understand, sorry.

And I think you shouldn’t do that. I will explain why as follows.

@Dave_Evers

You should start with studying the specification of Email message format.

Refer to RFC-2822

Internet Message Format

Messages are divided into lines of characters. A line is a series of characters that is delimited with the two characters carriage-return and line-feed; that is, the carriage return (CR) character (ASCII value 13) followed immediately by the line feed (LF) character (ASCII value 10). (The carriage-return/line-feed pair is usually written in this document as “CRLF”.)

Here the document says that CRLF (0x0D 0x0A in hex) must be used as the line separator in Email messages on the wire.

In the networking domain, a “new line” must be represented by a fixed byte sequence (0x0D 0x0A). The protocol defines it regardless of which OS or application generates or consumes the messages across the network. RFC-2822 requires CR LF.

However, a “new line” in files is OS-dependent for historical reasons. Here is a pitfall.

Mac wants a new line to be a single LF (0x0A). Therefore text editors on Mac may translate 0x0D 0x0A into 0x0A when saving a file.

Linux wants the same as Mac.

Windows wants (perhaps) a new line to be the 2 bytes of CRLF. Text editors on Windows may translate 0x0A into 0x0D 0x0A.

Also, Git has an optional feature (core.autocrlf) that, on Windows, translates all CR LF (Windows-style new lines) in files into LF (Unix style) when you commit changes into a Git repository.

I found an interesting article about the history of CR+LF vs LF:

There is a twist in Java also. The following code:

System.out.println("\n");

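This line writes 0x0A 0x0A on Mac, but it should write 0x0A 0x0D 0x0A on Windows (though I haven’t tested it). The “\n” literal itself is always a single LF (0x0A). It is the line terminator appended by println, System.lineSeparator(), that differs: Java checks the type of OS it is currently running on and switches the bytes of that terminator.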
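To see the difference on your own machine, something like the following sketch could be used. The class name LineSeparatorDemo and the helper toHex are hypothetical names for this example only. The first line printed is the same on every OS; the second depends on where the code runs.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: prints the bytes behind "\n" and the platform line separator.
final class LineSeparatorDemo {

    /** Formats each byte of the string as hex, for example "0x0D 0x0A". */
    static String toHex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.US_ASCII)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("0x%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The "\n" literal is a fixed character, independent of the OS.
        System.out.println("\"\\n\" literal:            " + toHex("\n"));
        // The separator that println appends is chosen per platform.
        System.out.println("System.lineSeparator(): " + toHex(System.lineSeparator()));
    }
}
```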

If you save an Email message into a file, each new line in it will be either 0x0D 0x0A or 0x0A, depending on the OS and the application with which you saved the message. Therefore the content of the file may not be a valid Email message as RFC-2822 specifies.
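This is also the likely reason for the ‘Illegal base64 character d’ message: d is hexadecimal for 0x0D, the carriage return, which is not removed by replace(' ', ''). The sketch below reproduces the situation with the standard java.util.Base64 (an assumption; I have not checked how Katalon's Base64 class is implemented). The class name StrayCarriageReturnDemo is hypothetical. The basic decoder rejects a string with CR LF around it, while trimming the string first, or using the MIME decoder that skips line separators, both work.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical demo class: a stray CR/LF breaks the basic decoder.
final class StrayCarriageReturnDemo {

    /** Basic decoder: any character outside the Base64 alphabet is an error. */
    static String decodeStrict(String encoded) {
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    /** MIME decoder: line separators and other non-alphabet characters are ignored. */
    static String decodeLenient(String encoded) {
        return new String(Base64.getMimeDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String withCrLf = "\r\naHR0cHM6Ly93d3cuZ29vZ2xlLmNvbQ==\r\n";
        try {
            decodeStrict(withCrLf);
        } catch (IllegalArgumentException e) {
            System.out.println("strict decoder failed: " + e.getMessage());
        }
        // Removing the surrounding white space first makes the strict decoder happy.
        System.out.println(decodeStrict(withCrLf.trim()));
        // The MIME decoder tolerates the CR LF as it is.
        System.out.println(decodeLenient(withCrLf));
    }
}
```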


Short advice to @Dave_Evers

You shouldn’t make *.eml files at all.

You should not serialize the raw bytes of an Email message into a local file to consume later.

You should use some proven software (such as the JavaMail API) to parse Email messages on the fly into informational pieces. Your application queries the library for the message body, saves it into a file if needed, and just uses it. Your application should not try to parse the Email message (= headers + CRLF + body) by itself.

You should not try to create or edit Email messages as files using text editors.

If your application takes this way, you will get a message body:

aHR0cHM6Ly93d3cuZ29vZ2xlLmNvbQ==

already decoded as

https://www.google.com

because the library that you use will do the Base64 decoding behind the scenes.

Therefore, your original question “How do I decode a base64 encoded string from an email using Katalon” is pointless.

Hi @kazurayam,
I made a few changes to your code so I could load the files from our Network drive.
Which works fine… The issue I am facing is that the 400th record is being picked, not the 1st record.
When I load the folder from Windows it’s sorted by date. I am guessing this is the issue?
Is there any way to tell Katalon to sort the folder on date? Or do you think something else is going on?
There are 955 records in the folder right now which continues to grow daily.

Note: I am not trying to decode any data as you pointed out this should not be done…
I have another use case for opening saved eml files to validate contents.

//import files
import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.Paths
import java.util.stream.Collectors
import com.kms.katalon.core.configuration.RunConfiguration

dataDir = Paths.get('//Test1//MailerPickupDirectory//')
List<Path> emlFiles =
	Files.list(dataDir)
		.filter({ p -> p.toString().endsWith(".eml")})
		.collect(Collectors.toList())

// print path of all eml files
def RecordCnt = 0
emlFiles.forEach  { p ->
	RecordCnt ++
	println("Record Count: " + RecordCnt)
	println p.toString()
}

// open the 1st eml file, print its contents
String content = emlFiles.get(0).toFile().text
println("\nContent: " + content)

@kazurayam,
Thanks for highlighting the risks of working with files.

Katalon has nothing to do with this sorting problem.
Your Groovy code is totally responsible for it. Files.list() returns the entries of a folder in no specific order, so the element at index 0 is not necessarily the file that Windows Explorer shows first.

Please specify in more detail how you want to sort the List<Path> emlFiles.

Do you want to sort them by their file names? Or by the date-time when each file was last modified? Or by a value in its content, such as “Date : 9 Sep 2022 11:27:02 -0700”?

Do you want to sort them in ascending order or descending order?


Hi @kazurayam,

Thanks for your quick reply.

Sorting them by the value of their contents, such as “Date : 9 Sep 2022 11:27:02 -0700” and descending would work the best. So an email from 1:30 would display first in the list and be opened/read from before an email at 1:20.

Thanks,
Dave

@Dave_Evers

Please refer to

for my proposal to your problem.
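The linked proposal is not reproduced in this thread. Independently of it, sorting by the Date header in descending order could be sketched roughly as below. This is only an illustration with hypothetical names (DateHeaderSort, dateOf, newestFirst); it assumes every message has a header line that starts with “Date:” and whose value is in the usual RFC 1123 style, such as “26 Oct 2022 14:43:15 -0700”. Messages are passed in as strings to keep the example short.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper: orders raw messages by their Date header, newest first.
final class DateHeaderSort {

    /** Parses the value of the first "Date:" line found in the header section. */
    static OffsetDateTime dateOf(String rawMessage) {
        for (String line : rawMessage.split("\\r?\\n")) {
            if (line.isEmpty()) {
                break; // an empty line ends the headers
            }
            if (line.regionMatches(true, 0, "Date:", 0, 5)) {
                return OffsetDateTime.parse(line.substring(5).trim(),
                        DateTimeFormatter.RFC_1123_DATE_TIME);
            }
        }
        throw new IllegalArgumentException("no Date header found");
    }

    /** Returns a copy of the list sorted in descending order of the Date header. */
    static List<String> newestFirst(List<String> rawMessages) {
        Comparator<String> oldestFirst = Comparator.comparing(DateHeaderSort::dateOf);
        List<String> sorted = new ArrayList<>(rawMessages);
        sorted.sort(oldestFirst.reversed());
        return sorted;
    }

    public static void main(String[] args) {
        String at1320 = "Date: 26 Oct 2022 13:20:00 -0700\r\nSubject: first\r\n\r\nbody";
        String at1330 = "Date: 26 Oct 2022 13:30:00 -0700\r\nSubject: second\r\n\r\nbody";
        for (String message : newestFirst(Arrays.asList(at1320, at1330))) {
            System.out.println(dateOf(message));
        }
    }
}
```

The header is parsed again on every comparison, which is acceptable for a sketch but wasteful for a folder with hundreds of files; parsing each file once and caching the result would be the obvious refinement.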

Hi @kazurayam, thank you for all the hard work and time you spent on developing a solution for my file filtering request. Hopefully it will benefit others looking to do something similar.

I did the following in my Katalon project:

  1. Added the ‘ComparablePath’ & ‘ComparablePathByContentDateTime’ keywords to the project ‘Keywords’ folder
  2. Added ‘SortingFilesByDateTimeInContent-0.1.0.jar’ to the project ‘Drivers’ folder
  3. Added the ‘test case code’ to a test case
  4. I successfully ran the test case on my local with 7 files loaded in the data folder
     (all files displayed in the expected order)

I however ran into a ‘java.lang.StringIndexOutOfBoundsException: String index out of range: -1’ error when I added an 8th file which contained a base64 encoded Subject. The error was caused by the base64 Subject: VGVzdCBTdWJqZWN0 (once I removed ‘VGVzdCBTdWJqZWN0’ the test case ran with no errors). After seeing this error I realized that I should have asked to have the files sorted based on their last modified date-time rather than depending on the file contents (sorry about this kazurayam). This way the file contents would not have to be checked thus avoiding content errors.

Questions:

  1. Would you have time to update the solution so file filtering would be based on last modified date-time (as you originally suggested)?

  2. How would I change dataDir so it points to our network folder?
    For example our network folder is \\Test1\MailerPickupDirectory

Path dataDir = Paths.get(RunConfiguration.getProjectDir()).resolve("Include/data")

  3. How would I read in the contents of the first file that is returned by the filtering?

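Sorting by last modified date-time: already done. Use the PathComparableByFileLastModified class, which is included in the same jar.

Pointing dataDir to the network folder: alter that line as follows. Each backslash of the path has to be doubled inside a string literal.

Path dataDir = Paths.get("\\\\Test1\\MailerPickupDirectory")

Reading the contents of the first file:

List<PathComparableByFileLastModified> emlFiles = ...

String contentOfFirstFile = emlFiles.get(0).get().toFile().text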
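For readers who do not have the jar, roughly the same ordering can be sketched with java.nio alone. This is not the implementation inside the jar; NewestEmlFirst and its methods are hypothetical names, and the folder is taken from the command line so that no real path is assumed. It lists the *.eml files of a folder, sorts them by last modified time in descending order, and prints the content of the first one.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.FileTime;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: lists *.eml files, most recently modified first.
final class NewestEmlFirst {

    /** Wraps the checked IOException so the method can be used as a sort key. */
    static FileTime lastModified(Path file) {
        try {
            return Files.getLastModifiedTime(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static List<Path> listNewestFirst(Path dir) throws IOException {
        Comparator<Path> oldestFirst = Comparator.comparing(NewestEmlFirst::lastModified);
        // try-with-resources closes the directory handle held by the stream.
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                    .filter(p -> p.getFileName().toString().endsWith(".eml"))
                    .sorted(oldestFirst.reversed())
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass the folder as the first argument; defaults to the current directory.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        List<Path> files = listNewestFirst(dir);
        if (!files.isEmpty()) {
            byte[] bytes = Files.readAllBytes(files.get(0));
            System.out.println(new String(bytes, StandardCharsets.UTF_8));
        }
    }
}
```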