Problem
As a test developer, at some point, you may have to interact with Gmail to check for some email messages…
…but as you soon realize, that is easier said than done.
People have tried this various ways:
- scraping the Gmail website in their browser
- using that Gmail plugin in the Katalon Store
Both methods are flawed:
- Gmail is specifically designed to stop web scrapers and attackers, so good luck getting your Katalon Studio script past that. There’s not really a way to do it without deliberately compromising the security of your email account (why the heck would you want to do that?!)
- That Gmail plugin in the Katalon Studio Store is woefully simplistic. It cannot handle any of the following:
- searching Message Threads. Your latest message may be part of a thread, but that plugin can’t get it.
- searching messages in an inbox. It is ONLY designed to get the latest message, assuming it’s NOT part of a thread. That’s it!
- message creation or deletion
- built-in support for getting things like links (we’re going to be covering that at the end of this blog post)
A third approach: interacting with Gmail through its API
What if I were to tell you that you can access emails programmatically, and that Google has an API for this?
It took me a lot of time to get this set up, as it was way more than simply writing code…
WARNING: It’s going to get pretty…involved…
To get this working
The first step is to configure Google Cloud Console for your desktop project (your Katalon Studio workspace)…
Enable the Gmail API
This was covered in the Gmail Java Quickstart walkthrough
Create the API Credentials
Also covered in the walkthrough, but worth talking about in detail here…
Follow this link to create the OAuth 2.0 Client ID for your desktop app.
Once that is done, download the OAuth Client JSON to your Katalon Studio project. Anywhere is fine for now; just give it a descriptive name like gmail-access-credentials.json or something similar.
While you’re at it, you can create a service account too…
Set up the OAuth Consent Screen
This is also in the walkthrough, but we need to discuss this with a little more detail:
Scopes
You’re going to want to add one or more of the following scopes, all of which can be found under Gmail API (click Add or Remove Scopes on page two of the OAuth consent screen setup)
- .../auth/userinfo.email: See your primary Google Account email address
- .../auth/gmail.readonly: View your email messages and settings (this is what you want for viewing the messages)
- (Optional) .../auth/gmail.compose: Manage drafts and send emails
Test users
Add the email account that you’re trying to manage programmatically as a test user. Also add any other users of your code base (e.g. your personal email).
It’s coding time…!
Dependency management
In my project, I use Gradle to manage dependencies. I recommend you install it to your machine before proceeding. I won’t go over details here, but I would recommend getting SDKMAN! so that you can install it from the command line.
After setting up Gradle on your machine, change your build.gradle
to something like the following:
plugins {
id 'java'
id "com.katalon.gradle-plugin" version "0.1.1"
}
repositories {
mavenCentral()
}
dependencies {
// Google API stuff
implementation 'com.google.api-client:google-api-client:2.0.0'
implementation 'com.google.oauth-client:google-oauth-client-jetty:1.34.1'
implementation 'com.google.apis:google-api-services-gmail:v1-rev20220404-2.0.0'
}
After Gradle is installed on your machine, and your build.gradle
is set up, go to your terminal (cmd, Git BASH, …) and run the command gradle clean katalonCopyDependencies
to install all dependencies.
Enough config! Let’s write some code, dammit!
Next, create a custom keyword for your util class. In my real codebase, it is named SMDEmailUtils
and contains email utilities as well as all the email/Gmail methods I need to get the email-related work done.
This class (name it whatever you want, that is idiomatic to you/your workspace) will be home to your Gmail handling logic.
It should contain the following:
- private static Gmail _GmailInstance, or something similar. We are going to be using a singleton design pattern here…
- …a public static getter method for that instance. This is where we implement the singleton stuff:
public static Gmail GetGmailInstance() {
if (this._GmailInstance == null) {
// Build a new authorized API client service.
final NetHttpTransport httpTransport = GoogleNetHttpTransport.newTrustedTransport();
this._GmailInstance = new Gmail.Builder(httpTransport, this.JSONFactory, getCredentials(httpTransport))
.setApplicationName(this.AppName)
.build();
}
return this._GmailInstance;
}
- private static final JsonFactory JSONFactory = GsonFactory.getDefaultInstance()
- private static final String AppName (set this to whatever you want to call your app on the OAuth Consent Screen)
- private static final List<String> Scopes = [GmailScopes.GMAIL_READONLY];
- private static final String TokensDirectoryPath = "tokens";
- private static final String CredentialsFilePath (set this to the path to that Gmail credentials JSON file that you downloaded)
- a private static method for getting the credentials. It should look like:
/**
* Creates an authorized Credential object.
*
* @param httpTransport The network HTTP Transport.
* @return An authorized Credential object.
* @throws IOException If the credentials.json file cannot be found.
*/
private static Credential getCredentials(final NetHttpTransport httpTransport)
throws IOException {
// Load client secrets. No null check is needed here: the FileInputStream
// constructor itself throws FileNotFoundException if the file is missing.
InputStream is = new FileInputStream(this.CredentialsFilePath);
GoogleClientSecrets clientSecrets =
GoogleClientSecrets.load(this.JSONFactory, new InputStreamReader(is));
// Build flow and trigger user authorization request.
GoogleAuthorizationCodeFlow flow = new GoogleAuthorizationCodeFlow.Builder(
httpTransport, this.JSONFactory, clientSecrets, this.Scopes)
.setDataStoreFactory(new FileDataStoreFactory(new java.io.File(this.TokensDirectoryPath)))
.setAccessType("offline")
.build();
LocalServerReceiver receiver = new LocalServerReceiver.Builder().setPort(8888).build();
return new AuthorizationCodeInstalledApp(flow, receiver).authorize("user");
}
- a method for handling the request to Gmail (or any other Google service, for that matter). This method takes a callback onDoRequest that performs the request, a callback onCheckResponse that validates the response, and a timeOut in seconds. We define that method thus:
public static GenericJson HandleRetryableRequest(Closure<GenericJson> onDoRequest, Closure<Boolean> onCheckResponse, int timeOut) {
long startTime = System.currentTimeSeconds();
int exponent = 0;
while (System.currentTimeSeconds() < startTime + timeOut) {
GenericJson response = onDoRequest();
if (onCheckResponse(response))
return response;
// wait some time to try again, exponential backoff style
sleep(1000 * 2**exponent++);
}
return null;
}
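For readers who want to try the retry loop outside Katalon Studio, here is a minimal sketch of the same idea in plain Java. The class name BackoffRetry, the millisecond units, and the baseDelayMillis parameter are my own assumptions for illustration and are not part of the original keyword. One deliberate difference from the Groovy version: this sketch caps each sleep so it never waits past the deadline.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

class BackoffRetry {
    /**
     * Calls request until accept approves the response or the timeout elapses.
     * Returns null on timeout, mirroring the Groovy method above.
     */
    static <T> T retry(Supplier<T> request, Predicate<T> accept,
                       long timeoutMillis, long baseDelayMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        long delay = baseDelayMillis;
        while (System.currentTimeMillis() < deadline) {
            T response = request.get();
            if (accept.test(response)) {
                return response;
            }
            // Sleep for the current delay, but never past the deadline.
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                break;
            }
            try {
                Thread.sleep(Math.min(delay, remaining));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
            delay *= 2; // exponential backoff: base, 2x base, 4x base, ...
        }
        return null;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Hypothetical request that only "succeeds" on the third attempt.
        Integer result = BackoffRetry.<Integer>retry(() -> ++calls[0], n -> n >= 3, 5_000L, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Running main prints "3 after 3 attempts", after sleeping 10 ms and then 20 ms between tries.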
HandleRetryableRequest keeps retrying the request, with exponential backoff between attempts, until either the response checks out or timeOut seconds have elapsed
(in which case it returns null). Speaking of handling requests, we also need…
- …a method to handle user-caused (4xx status code) HTTP errors. For now, we only care about the end user causing an invalid_grant error (because they have a stale token). Let’s implement it:
private static void handleUserHttpError(HttpResponseException ex) {
// getDetails() can be null when the error response has no parseable body, hence the safe navigation
if ((ex instanceof TokenResponseException) && (ex.getDetails()?.getError() == 'invalid_grant')) {
new File(this.TokensDirectoryPath).deleteDir();
this._GmailInstance = null;
KeywordUtil.logInfo("Token is pending reset. Please sign into your Google Chrome browser with the test email, and be ready to authenticate the app!");
return;
}
throw ex;
}
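One thing worth knowing about the snippet above: deleteDir() is a Groovy extension on java.io.File, so it is not available if you ever port this to plain Java. A rough Java equivalent, assuming the token store is an ordinary directory on disk, could look like the sketch below. The names TokenStore, deleteRecursively and createSampleStore are hypothetical, and the file name in the sample is only illustrative.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

class TokenStore {
    /** Recursively deletes dir. Returns false if it did not exist. */
    static boolean deleteRecursively(Path dir) {
        if (!Files.exists(dir)) {
            return false;
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            // Reverse order puts children before their parent directory.
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return true;
    }

    /** Creates a throwaway directory containing one file, for demonstration only. */
    static Path createSampleStore() {
        try {
            Path dir = Files.createTempDirectory("tokens");
            Files.write(dir.resolve("StoredCredential"), new byte[] {1, 2, 3});
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path dir = createSampleStore();
        System.out.println("deleted: " + deleteRecursively(dir));
        System.out.println("still exists: " + Files.exists(dir));
    }
}
```

Whichever variant you use, the important part is that the cached Gmail instance is reset as well, so that the next call goes through the authorization flow again.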
Next Step: handling the actual Messages
For the sake of this blog post, we’re going to assume that you are interested only in getting the latest message, and that message was sent less than 1 day ago. (We can handle other use cases in the comments below, or in another blog post.)
For that use case, it is worth noting that this latest message may be in a thread, namely the last message in said message thread.
Let’s also assume a ubiquitous use case: you want to extract a link from the email message, and you happen to know the XPath selector for said link.
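To make the "latest message may be the last one in a thread" rule concrete, here is a small stand-alone Java sketch of just the selection logic, with no Gmail calls involved. Msg is a hypothetical stand-in for a Gmail message (an ID plus an internal date in epoch milliseconds), and each inner list stands for one thread with its messages ordered oldest first; both are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class LatestMessagePicker {
    /** Hypothetical stand-in for a Gmail message. */
    static final class Msg {
        final String id;
        final long internalDate; // epoch milliseconds

        Msg(String id, long internalDate) {
            this.id = id;
            this.internalDate = internalDate;
        }
    }

    /** Takes the last message of every thread and returns the newest of those, or null. */
    static Msg latest(List<List<Msg>> threads) {
        return threads.stream()
                .filter(thread -> !thread.isEmpty())
                .map(thread -> thread.get(thread.size() - 1))
                .max(Comparator.comparingLong((Msg m) -> m.internalDate))
                .orElse(null);
    }

    public static void main(String[] args) {
        List<List<Msg>> threads = Arrays.asList(
                Arrays.asList(new Msg("a", 100L), new Msg("b", 300L)), // a two-message thread
                Arrays.asList(new Msg("c", 200L)));                    // a stand-alone message
        System.out.println(latest(threads).id); // prints "b"
    }
}
```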
Let’s start this by getting the boilerplate out the way:
More boilerplate…
public static String GetLatestMessageBody(int timeOut) {
return this.getContent(this.GetLatestMessage(timeOut));
}
public static Message GetLatestMessage(int timeOut) {
// get the latest thread list
ListThreadsResponse response = this.HandleRetryableRequest({
return this.GetGmailInstance()
.users()
.threads()
.list("me")
.setQ("is:unread newer_than:1d")
.setIncludeSpamTrash(true)
.execute();
},
{ ListThreadsResponse res -> return ((res.getThreads() != null) && (!res.getThreads().isEmpty())) },
timeOut);
return response.getThreads()
.collect({ Thread thread ->
return this.GetGmailInstance()
.users()
.threads()
.get("me", thread.getId())
.execute()
}).max { Thread thread -> thread.getMessages().last().getInternalDate() }
.getMessages()
.last();
}
/**
* Copied from https://stackoverflow.com/a/58286921
* @param message
* @return
*/
private static String getContent(Message message) {
StringBuilder stringBuilder = new StringBuilder();
try {
getPlainTextFromMessageParts(message.getPayload().getParts(), stringBuilder);
// NOTE: updated by Mike Warren, this was adapted for message that contain URLs in its body
return new String(Base64.getUrlDecoder().decode(stringBuilder.toString()),
StandardCharsets.UTF_8);
} catch (IllegalArgumentException e) {
// NOTE: updated by Mike Warren
// Base64.Decoder.decode signals malformed input with IllegalArgumentException
Logger.getGlobal().severe("Could not decode message body: ${e.toString()}");
return message.getSnippet();
}
}
/**
* Copied from https://stackoverflow.com/a/58286921
* @param messageParts
* @param stringBuilder
*/
private static void getPlainTextFromMessageParts(List<MessagePart> messageParts, StringBuilder stringBuilder) {
for (MessagePart messagePart : messageParts) {
// NOTE: updated by Mike Warren
if (messagePart.getMimeType().startsWith("text/")) {
stringBuilder.append(messagePart.getBody().getData());
}
if (messagePart.getParts() != null) {
getPlainTextFromMessageParts(messagePart.getParts(), stringBuilder);
}
}
}
/**
* **NOTE**: forked from https://stackoverflow.com/a/2269464/2027839 , and then refactored
*
* Processes HTML, using XPath
*
* @param html
* @param xpath
* @return the result
*/
public static String ProcessHTML(String html, String xpath) {
final String properHTML = this.toProperHTML(html);
final Element document = DocumentBuilderFactory.newInstance()
.newDocumentBuilder()
.parse(new ByteArrayInputStream( properHTML.bytes ))
.documentElement;
return XPathFactory.newInstance()
.newXPath()
.evaluate( xpath, document );
}
private static String toProperHTML(String html) {
// SOURCE: https://stackoverflow.com/a/19125599/2027839
// Escape bare ampersands (those not already written as &amp;) so the XML parser accepts them
String properHTML = html.replaceAll( "(&(?!amp;))", "&amp;" );
if (properHTML.contains('<!DOCTYPE html'))
return properHTML;
if (!properHTML.contains('<html')) {
properHTML = """
<html>
<head></head>
<body>
${properHTML}
</body>
</html>
"""
}
return """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
${properHTML}
""";
}
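If you want to sanity-check an XPath expression before wiring it into Katalon, the same JDK classes can be exercised in a few lines of plain Java. This is only a sketch: the class name XPathLinkDemo and the sample HTML are made up, and the selector is the one used later in this post. The sample is deliberately well-formed, has its ampersand escaped, and carries no DOCTYPE, so the parser has no external DTD to resolve.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

class XPathLinkDemo {
    /** Parses well-formed (X)HTML and returns the string value of the XPath result. */
    static String evaluate(String xhtml, String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));
            return XPathFactory.newInstance().newXPath().evaluate(xpath, doc);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate XPath: " + xpath, e);
        }
    }

    public static void main(String[] args) {
        // Made-up email body; note the ampersand is escaped as &amp;
        String html = "<html><body>"
                + "<a href=\"https://example.com/signup?token=abc&amp;lang=en\">"
                + "<div class=\"sign-mail-btn-text\">Sign up</div></a>"
                + "</body></html>";
        System.out.println(evaluate(html, "//a[.//div[@class = 'sign-mail-btn-text']]/@href"));
        // prints https://example.com/signup?token=abc&lang=en
    }
}
```

Note that the parser turns &amp; back into a plain & in the returned URL, which is why escaping bare ampersands beforehand does not corrupt the link.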
There are several things going on already, let’s highlight them:
- we have created a method for getting the latest Message (even if it is part of a Message Thread)
- we have created a method for getting the content of that Message, as a String (we assume the relevant message parts all have a MIME type starting with text/)
- we have created methods for processing that String as HTML, and querying it with xpath selectors
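One caveat on the content method: the Gmail API returns each part’s body data as a base64url string, and joining several encoded strings before decoding only works when every part’s byte length happens to be a multiple of three. A safer variation, sketched here in plain Java under the assumption that you already have the encoded strings in hand, is to decode each part on its own and append the decoded text. BodyDecoder, decodeParts and encode are hypothetical names used only for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;
import java.util.List;

class BodyDecoder {
    /** Decodes each base64url-encoded part separately and joins the resulting text. */
    static String decodeParts(List<String> encodedParts) {
        StringBuilder text = new StringBuilder();
        for (String data : encodedParts) {
            if (data == null || data.isEmpty()) {
                continue; // container parts carry no body data of their own
            }
            byte[] bytes = Base64.getUrlDecoder().decode(data);
            text.append(new String(bytes, StandardCharsets.UTF_8));
        }
        return text.toString();
    }

    /** Helper for the demo: encodes text the way a part's body data is encoded. */
    static String encode(String text) {
        return Base64.getUrlEncoder().encodeToString(text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList(
                encode("Hi"), // 2 bytes, so the encoded form ends with padding
                encode("<a href=\"https://example.com/?a=1&b=2\">go</a>"));
        System.out.println(decodeParts(parts));
    }
}
```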
Now, let’s conclude this with a real use-case from my actual code base:
Real code for the URL extraction
Here, I am extracting a sign-up link, which can be selected with the XPath "//a[.//div[@class = 'sign-mail-btn-text']]/@href". (The @href at the end gets the actual URL in that link.)
public static String ExtractSignUpLink() {
int retryAttempts = 0;
return ActionHandler.HandleReturnableAction({
return this.ProcessHTML(this.GetLatestMessageBody(30),
"//a[.//div[@class = 'sign-mail-btn-text']]/@href");
}, { boolean success, ex ->
if (success) {
final String url = (String)ex;
if (!SMDStringUtils.IsNullOrEmpty(url)) {
return url;
}
}
if (ex instanceof HttpResponseException) {
final int errorCode = ((HttpResponseException)ex).getStatusCode();
if (ex.isSuccessStatusCode())
KeywordUtil.logInfo("Somehow, we got an exception, that has a success status code, with message: '${ex.getStatusMessage()}'");
if ((errorCode < 400) || (errorCode >= 500))
throw ex;
this.handleUserHttpError(ex);
}
sleep(1000 * 2**retryAttempts++);
}, TimeUnit.MINUTES.toSeconds(15));
}
Note that ActionHandler
is not a built-in class, but one that we write:
public class ActionHandler {
public static Object HandleReturnableAction(Closure onAction, Closure<Object> onDone, long timeOut) {
long startTime = System.currentTimeSeconds();
while (System.currentTimeSeconds() < startTime + timeOut) {
try {
final Object result = onDone(true, onAction());
if (result)
return result;
} catch (Exception ex) {
onDone(false, ex);
}
}
return null;
}
}
What’s going on
We wait for the latest message to contain the link, retrying with exponential backoff and handling any exceptions along the way, for at most 15 minutes.
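To get a feel for what that means in practice, here is a back-of-the-envelope sketch in plain Java that lists when each attempt would start if the delays are 1 s, 2 s, 4 s, and so on. It ignores the time each request takes (including the inner 30-second retry inside GetLatestMessageBody), so treat the numbers as a rough lower bound rather than a measurement. BackoffSchedule and attemptStartTimes are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class BackoffSchedule {
    /** Start times, in seconds, of every attempt that begins before the timeout. */
    static List<Long> attemptStartTimes(long timeoutSeconds) {
        List<Long> starts = new ArrayList<>();
        long elapsed = 0;
        long delay = 1; // first wait is 1 second, then it doubles
        while (elapsed < timeoutSeconds) {
            starts.add(elapsed);
            elapsed += delay;
            delay *= 2;
        }
        return starts;
    }

    public static void main(String[] args) {
        // 15 minutes = 900 seconds
        System.out.println(attemptStartTimes(900));
        // prints [0, 1, 3, 7, 15, 31, 63, 127, 255, 511]
    }
}
```

In other words, under these assumptions a 15-minute budget allows about ten attempts, and the last wait (512 seconds) runs past the deadline before the loop notices.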
Thoughts? Concerns? Want me to cover more use cases?
Let me know in the comments below.
Happy coding!