Is it possible to automate browser-based calling using Katalon?

I need help from the Katalon community to understand whether we can automate browser-based calling.

Below is the high-level flow that I want to automate:

The customer initiates the call - the specialist receives a notification that the customer has started a call - the specialist joins the call (no approval needed) - another customer (from the same organization) or specialist can join the call directly - they talk and resolve the issue - then the call ends.

This uses the Zoom Web SDK under the hood, and everything happens inside the web application.


Yes, you can automate a browser-based calling flow built on the Zoom Web SDK using Katalon Studio, but there are important constraints and setup steps. They come from WebRTC (camera/mic permissions), multi-user orchestration, and the fact that “real” audio/video quality validation isn’t feasible via UI automation.

What is feasible to automate

For your flow (Customer initiates call → Specialist gets notification → Specialist joins → Another customer/specialist joins directly → Conversation happens → End call), you can validate the following via Katalon:

  1. Join/Leave states (buttons enabled/disabled, status banners, call timer).

  2. Participant presence/count (UI participant list or badges).

  3. Notifications (in-app toast/banners/WS-driven DOM updates).

  4. Role-based access (specialist vs. customer: allowed to join without approval).

  5. In-call interactions (mute/unmute, camera toggle, screen share button visible/enabled).

  6. Chat (send/receive messages inside the meeting).

  7. End-call flow (participant leaves, others remain or meeting ends for all, whichever applies to your app).
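
A couple of the checks above (participant count and mute state) could be sketched as follows. This is only an illustration: the test object IDs `Call/ParticipantCountBadge` and `Call/MuteButton` are hypothetical, and it assumes your app shows the count as plain text and reflects the mute state in an `aria-pressed` attribute, which may not match your UI.

```groovy
import static com.kms.katalon.core.testobject.ObjectRepository.findTestObject
import com.kms.katalon.core.webui.keyword.WebUiBuiltInKeywords as WebUI

// Hypothetical Object Repository entries; replace with your own.
def participantCount = findTestObject('Call/ParticipantCountBadge')
def muteButton = findTestObject('Call/MuteButton')

// Wait for the badge to show up, then check that three participants are listed.
WebUI.waitForElementVisible(participantCount, 15)
WebUI.verifyElementText(participantCount, '3')

// Toggle mute and check the button state.
// Assumption: the app exposes the state through aria-pressed.
WebUI.click(muteButton)
WebUI.verifyElementAttributeValue(muteButton, 'aria-pressed', 'true', 5)
```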

Not feasible via pure browser automation:

  • Measuring audio/video quality, network jitter, echo cancellation.

  • Verifying actual audio is transmitted (we can fake mic/camera for stability).

  • Accessing chrome://webrtc-internals programmatically in a stable CI flow.
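
For the fake mic/camera mentioned above, one possible setup is to start Chrome yourself with the fake-media switches and hand the driver to Katalon. This is a sketch, not a confirmed recipe for your app: it assumes Chrome is the target browser and reuses the placeholder URL from this thread.

```groovy
import org.openqa.selenium.WebDriver
import org.openqa.selenium.chrome.ChromeDriver
import org.openqa.selenium.chrome.ChromeOptions
import com.kms.katalon.core.webui.driver.DriverFactory
import com.kms.katalon.core.webui.keyword.WebUiBuiltInKeywords as WebUI

// Point Selenium at the chromedriver bundled with Katalon.
System.setProperty('webdriver.chrome.driver', DriverFactory.getChromeDriverPath())

ChromeOptions options = new ChromeOptions()
// Auto-accept the camera/microphone permission prompt.
options.addArguments('--use-fake-ui-for-media-stream')
// Feed a synthetic audio/video source instead of real hardware.
options.addArguments('--use-fake-device-for-media-stream')

WebDriver driver = new ChromeDriver(options)
// Hand the driver to Katalon so WebUI keywords act on this browser.
DriverFactory.changeWebDriver(driver)
WebUI.navigateToUrl('https://your-calling-app.com')
```

The same arguments can also be set in the project's desired capabilities for Chrome if you prefer to keep using WebUI.openBrowser().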


Automating Browser-Based Calling with Katalon: What’s Possible

Problem Analysis

You want to automate a calling workflow built on the Zoom Web SDK where customers initiate calls, specialists receive notifications and join, and multiple participants can interact, all within a web application. The key question is: what can Katalon actually automate in this scenario?

The answer is nuanced: Katalon can automate the UI workflow (buttons, notifications, joining) but cannot automate or validate the actual audio/video communication itself.


What Katalon CAN Automate

UI Interactions & Workflow Steps:

  • Clicking “Start Call” or “Initiate Call” buttons
  • Waiting for and detecting notification badges/alerts when a call is initiated
  • Clicking “Join Call” buttons for specialists and other participants
  • Verifying call status indicators on the page
  • Clicking “End Call” buttons
  • Navigating through the calling interface

Multi-Browser Scenarios (With Limitations):

  • Katalon can open multiple browser instances simultaneously using WebUI.openBrowser() or direct Selenium WebDriver instantiation
  • You can simulate different participants (customer, specialist, another customer) by running separate browser sessions
  • Each session can perform its assigned role independently
  • According to Katalon community discussions, you can use DriverFactory.changeWebDriver() to switch between multiple browser instances within a single test

Event & State Verification:

  • Verify that UI elements appear/disappear at the right time
  • Confirm notification badges display when calls are initiated
  • Check that call status changes (e.g., “In Call” → “Call Ended”)
  • Validate page elements are clickable and responsive

What Katalon CANNOT Automate

Audio/Video Streams:

  • Katalon cannot capture, validate, or interact with WebRTC audio/video data
  • WebRTC streams operate at the browser’s media layer, which WebDriver-based tools such as Katalon do not expose directly
  • You cannot verify if audio is actually being transmitted or received between participants

Real-Time Communication Validation:

  • Cannot measure call quality, latency, or packet loss
  • Cannot validate that audio/video is flowing between participants
  • Cannot simulate network conditions affecting the call
  • WebRTC testing requires specialized tools designed specifically for real-time communication validation

Zoom Web SDK Internals:

  • Cannot directly interact with Zoom SDK APIs or internal call state
  • Limited ability to verify call metadata or participant information at the SDK level

Recommended Approach for Your Use Case

Option 1: UI-Level Workflow Testing (Recommended)

Automate the happy path to verify the calling workflow functions correctly:

  1. Browser 1 (Customer): Click “Start Call” button
  2. Browser 2 (Specialist): Wait for notification badge → Click “Join Call”
  3. Browser 3 (Another Participant): Wait for call status → Click “Join Call”
  4. All Browsers: Verify “In Call” status displays
  5. Browser 1: Click “End Call” → Verify call ends for all participants

Example approach:

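// Imports needed when running in Script view
import static com.kms.katalon.core.testobject.ObjectRepository.findTestObject
import com.kms.katalon.core.webui.driver.DriverFactory
import com.kms.katalon.core.webui.keyword.WebUiBuiltInKeywords as WebUI

// Open one browser per participant and keep a handle to each driver
WebUI.openBrowser('')
WebUI.navigateToUrl('https://your-calling-app.com')
def customerDriver = DriverFactory.getWebDriver()

WebUI.openBrowser('')
WebUI.navigateToUrl('https://your-calling-app.com')
def specialistDriver = DriverFactory.getWebDriver()

// Customer initiates call
DriverFactory.changeWebDriver(customerDriver)
WebUI.click(findTestObject('Object Repository/StartCallButton'))

// Specialist receives notification and joins
DriverFactory.changeWebDriver(specialistDriver)
WebUI.waitForElementPresent(findTestObject('Object Repository/CallNotification'), 10)
WebUI.click(findTestObject('Object Repository/JoinCallButton'))

// Verify the specialist is in the call (verifyElementPresent needs a timeout in seconds)
WebUI.verifyElementPresent(findTestObject('Object Repository/InCallStatus'), 10)

// Switch back and verify the customer is in the call too
DriverFactory.changeWebDriver(customerDriver)
WebUI.verifyElementPresent(findTestObject('Object Repository/InCallStatus'), 10)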

Option 2: Hybrid Approach

  • Use Katalon for UI workflow automation
  • Use specialized WebRTC testing tools (like testRTC) for audio/video quality validation if needed
  • Combine results in your test reports

Option 3: API-Level Testing

  • If Zoom Web SDK exposes APIs for call state management, test those endpoints separately
  • Verify call initiation, joining, and termination at the API level
  • Use Katalon for UI verification alongside API tests

Key Considerations

Notification Handling:

  • Use WebUI.waitForElementPresent() with appropriate timeouts to wait for notification badges
  • Verify notification content if possible (text, icons, etc.)
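
As a rough sketch of the notification check described above: the test object reuses the `CallNotification` name from the earlier example, while the 30-second timeout and the expected wording (`started a call`) are assumptions to adapt to your app.

```groovy
import static com.kms.katalon.core.testobject.ObjectRepository.findTestObject
import com.kms.katalon.core.webui.keyword.WebUiBuiltInKeywords as WebUI

def notification = findTestObject('Object Repository/CallNotification')

// The wait keyword returns a boolean, so assert on it explicitly
// to stop the test when the notification never appears.
boolean shown = WebUI.waitForElementVisible(notification, 30)
assert shown : 'Call notification did not appear within 30 seconds'

// Check the notification wording with a regular expression.
// Assumption: the text contains "started a call" on a single line.
String text = WebUI.getText(notification)
WebUI.verifyMatch(text, '.*started a call.*', true)
```

Choose the timeout based on how quickly your app normally delivers the notification, since a value that is too short will make the test flaky.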

Zoom Web SDK Specifics:

  • The Zoom Meeting SDK for web uses WebAssembly and can be embedded in apps built with JavaScript frameworks (React, Angular, Vue.js)
  • Focus your automation on the UI layer that wraps the SDK, not the SDK internals

Bottom Line: You can absolutely automate the calling workflow UI with Katalon, but you’ll need to accept that actual audio/video validation is outside the scope of browser automation tools. Focus on verifying that the right buttons appear, notifications trigger, and participants can join. Checking the actual communication quality would require separate monitoring tools.

Let’s hear from people who have implemented something similar to your use case.

Yeah, I’m waiting to hear from them.