I need help from the Katalon community to understand whether we can automate browser-based calling.
Below is the high-level flow that I want to automate:
The customer initiates the call - the specialist receives a notification that the customer has started a call - the specialist joins the call (no approval needed) - another customer (from the same organization) or specialist can join the call directly - they talk and resolve the issue - then the call ends.
This uses the Zoom Web SDK under the hood, and everything happens inside the web application.
Yes, you can automate a browser-based calling flow built on Zoom Web SDK using Katalon Studio, but with some important constraints and setup steps due to WebRTC (camera/mic permissions), multi-user orchestration, and the fact that “real” audio/video quality validation isn’t feasible via UI automation.
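On the camera/mic permission point, Chrome has command-line switches that auto-accept the media prompt and feed a synthetic camera and microphone, so the join step does not stall on a permission dialog. Below is a minimal sketch, assuming Chrome is the target browser; the class and method names (FakeMediaFlags, chromeArgs) are made up for illustration, and the audio file path is a placeholder you would supply yourself. The resulting list is what you would hand to Selenium's ChromeOptions.addArguments(...) or put under the Chrome "args" desired capability in your Katalon project settings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: collects the Chrome switches commonly used for WebRTC UI tests.
class FakeMediaFlags {

    // fakeAudioWavPath is optional; pass null to use Chrome's built-in synthetic audio.
    static List<String> chromeArgs(String fakeAudioWavPath) {
        List<String> args = new ArrayList<>();
        // Auto-accept the camera/microphone permission prompt.
        args.add("--use-fake-ui-for-media-stream");
        // Replace real devices with a generated test video/audio feed.
        args.add("--use-fake-device-for-media-stream");
        if (fakeAudioWavPath != null && !fakeAudioWavPath.isEmpty()) {
            // Play a WAV file as the fake microphone input instead of the default beep.
            args.add("--use-file-for-fake-audio-capture=" + fakeAudioWavPath);
        }
        return args;
    }

    public static void main(String[] unused) {
        // Print the switches so they can be copied into a capability config.
        chromeArgs(null).forEach(System.out::println);
    }
}
```

These switches only make the UI flow runnable unattended; they do not tell you anything about real audio/video quality.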
What is feasible to automate
For your flow (Customer initiates call → Specialist gets notification → Specialist joins → Another customer/specialist joins directly → Conversation happens → End call), you can validate the following via Katalon:
Join/Leave states (buttons enabled/disabled, status banners, call timer).
Participant presence/count (UI participant list or badges).
Notifications (in-app toasts/banners, WebSocket-driven DOM updates).
Role-based access (specialist vs. customer: who is allowed to join without approval).
In-call interactions (mute/unmute, camera toggle, screen share button visible/enabled).
Chat (send/receive messages inside the meeting).
End-call flow (participant leaves, others remain or meeting ends for all—whichever applies to your app).
Automating Browser-Based Calling with Katalon: What’s Possible
Problem Analysis
You want to automate a calling workflow built on Zoom Web SDK where customers initiate calls, specialists receive notifications and join, and multiple participants can interact, all within a web application. The key question is: what can Katalon actually automate in this scenario?
The answer is nuanced: Katalon can automate the UI workflow (buttons, notifications, joining) but cannot automate or validate the actual audio/video communication itself.
What Katalon CAN Automate
UI Interactions & Workflow Steps:
Clicking “Start Call” or “Initiate Call” buttons
Waiting for and detecting notification badges/alerts when a call is initiated
Clicking “Join Call” buttons for specialists and other participants
Verifying call status indicators on the page
Clicking “End Call” buttons
Navigating through the calling interface
Multi-Browser Scenarios (With Limitations):
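Katalon can drive multiple browser instances in one test, opened with WebUI.openBrowser() or by instantiating Selenium WebDriver directly; because WebUI keywords act on one active driver at a time, you typically keep a reference to each driver and switch between them with DriverFactory.changeWebDriver()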
You can simulate different participants (customer, specialist, another customer) by running separate browser sessions
Each session can perform its assigned role independently
For complex multi-participant scenarios, consider using Katalon TestCloud’s parallel test execution to run separate test cases for each participant role.
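The hard part of the multi-participant setup is ordering: the specialist must not try to join before the customer has started the call, and the call should not end before everyone is in. Here is a rough sketch of that coordination using plain Java latches. It is not Katalon API; the class name CallOrchestration and the event strings are invented for illustration, and each task body stands in for the real browser steps that role would perform in its own session.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: one thread per participant, latches enforce the call order.
class CallOrchestration {

    static List<String> runScenario() {
        List<String> events = new CopyOnWriteArrayList<>();
        CountDownLatch callStarted = new CountDownLatch(1);   // opened by the customer
        CountDownLatch othersJoined = new CountDownLatch(2);  // specialist + second customer
        ExecutorService pool = Executors.newFixedThreadPool(3);
        List<Future<?>> tasks = new ArrayList<>();
        try {
            tasks.add(pool.submit(() -> {
                events.add("customer:start-call");            // real test: click "Start Call"
                callStarted.countDown();
                awaitOrFail(othersJoined);                     // keep the call open until all joined
                events.add("customer:end-call");              // real test: click "End Call"
            }));
            for (String role : new String[] {"specialist", "second-customer"}) {
                tasks.add(pool.submit(() -> {
                    awaitOrFail(callStarted);                  // do not join before the call exists
                    events.add(role + ":join");               // real test: click "Join Call"
                    othersJoined.countDown();
                }));
            }
            for (Future<?> task : tasks) {
                task.get(30, TimeUnit.SECONDS);                // surfaces failures from any role
            }
        } catch (Exception e) {
            throw new IllegalStateException("Scenario failed", e);
        } finally {
            pool.shutdownNow();
        }
        return events;
    }

    private static void awaitOrFail(CountDownLatch latch) {
        try {
            if (!latch.await(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("Timed out waiting for another participant");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] unused) {
        System.out.println(runScenario());
    }
}
```

The start event is always first and the end event always last; the two joins can arrive in either order, which matches the "can join directly" requirement. If you run the roles as separate parallel test cases instead of threads, the same ordering has to come from something shared, such as waiting on the app's own notification UI.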
Notification Handling:
Use WebUI.waitForElementPresent() with appropriate timeouts to wait for notification badges
Verify notification content if possible (text, icons, etc.)
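WebUI.waitForElementPresent() covers the plain "badge appears" case. For conditions that are not a single element check (for example, "participant count has reached 3"), a small polling helper can be useful. The following is a minimal sketch in plain Java; PollingWait and waitUntil are hypothetical names, not Katalon keywords, and the condition you pass in would wrap whatever UI read you need.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical helper: re-checks a condition until it holds or the timeout expires.
class PollingWait {

    // Returns true as soon as the condition holds, false if the timeout passes first.
    static boolean waitUntil(BooleanSupplier condition, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] unused) {
        long start = System.currentTimeMillis();
        // Stand-in condition: becomes true roughly 300 ms after start.
        boolean seen = waitUntil(
                () -> System.currentTimeMillis() - start > 300,
                Duration.ofSeconds(5),
                Duration.ofMillis(100));
        System.out.println("condition met: " + seen);
    }
}
```

Keep the timeout generous for notifications, since they depend on server push plus rendering, and keep the interval short enough that the test does not lag noticeably behind the UI.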
Zoom Web SDK Specifics:
The Zoom Meeting SDK for web relies on WebAssembly for media processing and is typically embedded in apps built with JavaScript frameworks (React, Angular, Vue.js)
Focus your automation on the UI layer that wraps the SDK, not the SDK internals
Bottom Line: You can absolutely automate the calling workflow UI with Katalon, but actual audio/video validation is outside the scope of browser automation tools. Focus on verifying that the right buttons appear, notifications trigger, and participants can join. Checking the quality of the communication itself would require separate monitoring tools.