Separating an Array/Grabbing Links from a Table

I’m trying to grab ad links from a table on a web page, but I can’t identify them individually by element, because the table is the element and everything inside it is just an image with a link attached. Is there a way to get all of the links within a specified table (and how would I write that)? And if the data comes back as an array, how do I get it into Excel, or at least split it up so I can get it into Excel?

I’ve done a lot of research and could not find answers to either question, so all help is appreciated.

Can you please post the HTML of that table so we can take a look?

I use the id in the yellow box to find it, and then I need to know how to pull the links out of it, since there are 5 links in there; each one within the table is another ad.

2018-07-26 (2).png

import org.openqa.selenium.By as By
import org.openqa.selenium.WebDriver as WebDriver
import org.openqa.selenium.WebElement as WebElement
import com.kms.katalon.core.webui.driver.DriverFactory as DriverFactory

WebDriver driver = DriverFactory.getWebDriver()
WebElement temp = driver.findElement(By.id('ad_block_0'))
List list = temp.findElement(By.xpath('.//a'))
println list.size()
for( i in list){println i}

FAILED because (of) org.codehaus.groovy.runtime.typehandling.GroovyCastException: Cannot cast object ‘[[[[CChromeDriver: chrome on XP (5b09454836af6a351578be13d9355635)] -> id: ad_block_0]] -> xpath: .//a]’ with class ‘org.openqa.selenium.remote.RemoteWebElement’ to class ‘java.util.List’

It failed after the List command completed, something is wrong with list.size() maybe?

Edit: That doesn’t seem to be the case. I inserted println “Hello World!” after line 4 and it doesn’t print.

Sorry, I’m not the best at Groovy (or anything, actually), so this should be:

import org.openqa.selenium.By as By
import org.openqa.selenium.WebDriver as WebDriver
import org.openqa.selenium.WebElement as WebElement
import com.kms.katalon.core.webui.driver.DriverFactory as DriverFactory

WebDriver driver = DriverFactory.getWebDriver()
WebElement temp = driver.findElement(By.id('ad_block_0'))
List list = temp.findElements(By.xpath('.//a'))
println list.size()
for( i in list){println i.getAttribute('href')}

I caught on to adding that ‘s’ on temp.findElement, but it doesn’t give me any useful info. It prints the same four things. Maybe I just need to play with this, it might be a good stepping stone. thank you

This is what it spits out:

07-26-2018 01:12:59 PM - [START] - Start action : Statement - For (def i : list)

07-26-2018 01:12:59 PM - [START] - Start action : Statement - println(i)

[[[[CChromeDriver: chrome on XP (a1189306e666a838e5f25d55e3619fd5)] -> id: ad_block_0]] -> xpath: .//a]

07-26-2018 01:12:59 PM - [END] - End action : Statement - println(i)

07-26-2018 01:12:59 PM - [START] - Start action : Statement - println(i)

[[[[CChromeDriver: chrome on XP (a1189306e666a838e5f25d55e3619fd5)] -> id: ad_block_0]] -> xpath: .//a]

07-26-2018 01:12:59 PM - [END] - End action : Statement - println(i)

07-26-2018 01:12:59 PM - [START] - Start action : Statement - println(i)

[[[[CChromeDriver: chrome on XP (a1189306e666a838e5f25d55e3619fd5)] -> id: ad_block_0]] -> xpath: .//a]

07-26-2018 01:12:59 PM - [END] - End action : Statement - println(i)

07-26-2018 01:12:59 PM - [START] - Start action : Statement - println(i)

[[[[CChromeDriver: chrome on XP (a1189306e666a838e5f25d55e3619fd5)] -> id: ad_block_0]] -> xpath: .//a]

07-26-2018 01:12:59 PM - [END] - End action : Statement - println(i)

07-26-2018 01:12:59 PM - [END] - End action : Statement - For (def i : list)

07-26-2018 01:12:59 PM - [PASSED] - Test Cases/New COW Ads Idea/Grab All Links

07-26-2018 01:12:59 PM - [END] - End Test Case : Test Cases/New COW Ads Idea/Grab All Links

I added the getAttribute() call. Please try with that.

Can I get those links into variable form somehow? That is pretty perfect

Never mind, I think I got it. I’ll post the final code when done.

import org.openqa.selenium.By as By
import org.openqa.selenium.WebDriver as WebDriver
import org.openqa.selenium.WebElement as WebElement
import com.kms.katalon.core.webui.driver.DriverFactory as DriverFactory

WebDriver driver = DriverFactory.getWebDriver()
WebElement temp = driver.findElement(By.id('ad_block_0'))
List list = temp.findElements(By.xpath('.//a'))
println list.size()
def aHref = []
for( i in list){aHref.add(i.getAttribute('href'))}
println aHref
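
The original question also asked how to get the links into Excel. A minimal sketch, assuming a plain CSV file is acceptable (Excel opens CSV directly): the helper name writeLinksToCsv, the column names, and the example URLs are hypothetical, and the placeholder list stands in for the aHref list built above.

```groovy
// Hypothetical helper: writes one link per row to a CSV file that Excel can open.
File writeLinksToCsv(List<String> links, File out) {
    out.withWriter('UTF-8') { writer ->
        writer.writeLine('index,href')
        links.eachWithIndex { String href, int i ->
            // Quote the value and double any embedded quotes so that a comma
            // inside a URL stays in a single cell.
            String quoted = '"' + href.replace('"', '""') + '"'
            writer.writeLine([i, quoted].join(','))
        }
    }
    return out
}

// Placeholder data standing in for the aHref list collected from the page.
def aHref = ['https://example.com/ad1', 'https://example.com/ad2?x=1,2']

// A temp file is used here; any writable path would work the same way.
File csv = writeLinksToCsv(aHref, File.createTempFile('ad_links', '.csv'))
println csv.readLines()
```

In a real test case the aHref list from the loop would be passed in instead of the placeholder data.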

So I have a new issue now. Where ‘ad_block_0’ is written, I need to loop through the next ones (i.e. ‘ad_block_1’, ‘ad_block_2’, etc.). I already have a loop that pulls each one from my object repository under the name ‘AdNo’, but the reference includes the folder it’s in, so it’s actually ‘Ads/ad_block_0’, and the code doesn’t like it when I replace ‘ad_block_0’ with ‘AdNo’ because of the attached directory. Is there a way to reference AdNo without the directory attached to it? I’ve tried removing ‘By.id’, but it doesn’t help.

Will it be OK to get all links from all ad_block_XYZ at once?

import org.openqa.selenium.By as By
import org.openqa.selenium.WebDriver as WebDriver
import org.openqa.selenium.WebElement as WebElement
import com.kms.katalon.core.webui.driver.DriverFactory as DriverFactory

WebDriver driver = DriverFactory.getWebDriver()
List list = driver.findElements(By.xpath('//*[contains(@id,"ad_block")]//a'))
println list.size()
for( i in list){println i.getAttribute('href')}

I can test it, but not all of the ad_block_#'s are like that. I was testing each individual one so that it would know how to handle it. I’ll update you if this works

Actually, it wouldn’t work because I need to copy and paste information about each one into an excel sheet, which happens after checking it

Edit: I might be able to get away with just using this formula no matter what, though, because it still seems to work with the other elements. I’ll test it and let you know.

There is a way to pass in the test object, but it is closely tied to how your objects are defined.
E.g. if they use an xpath definition (basic -> xpath=‘something’), then one can use:

import org.openqa.selenium.By as By
import org.openqa.selenium.WebDriver as WebDriver
import org.openqa.selenium.WebElement as WebElement
import com.kms.katalon.core.webui.driver.DriverFactory as DriverFactory

WebDriver driver = DriverFactory.getWebDriver()
WebElement temp = driver.findElement(By.xpath(findTestObject('Ads/ad_block_0').findPropertyValue("xpath")))
List list = temp.findElements(By.xpath('.//a'))
println list.size()
for( i in list){println i.getAttribute('href')}

No matter how I edit it, it spits this at me:

Test Cases/COW Ads/test FAILED because (of) org.openqa.selenium.InvalidSelectorException: invalid selector: Unable to locate an element with the xpath expression because of the following error:

SyntaxError: Failed to execute ‘evaluate’ on ‘Document’: The string ‘’ is not a valid XPath expression.

(Session info: chrome=68.0.3440.106)

(Driver info: chromedriver=2.35.528161 (5b82f2d2aae0ca24b877009200ced9065a772e73),platform=Windows NT 10.0.17134 x86_64) (WARNING: The server did not provide any stacktrace information)

Command duration or timeout: 0 milliseconds

For documentation on this error, please visit: http://seleniumhq.org/exceptions/invalid_selector_exception.html

Build info: version: ‘3.7.1’, revision: ‘8a0099a’, time: ‘2017-11-06T21:07:36.161Z’

System info: host: ‘DESKTOP-C47HL4K’, ip: ‘192.168.0.118’, os.name: ‘Windows 10’, os.arch: ‘amd64’, os.version: ‘10.0’, java.version: ‘1.8.0_102’

Driver info: com.kms.katalon.selenium.driver.CChromeDriver

Capabilities {acceptInsecureCerts: false, acceptSslCerts: false, applicationCacheEnabled: false, browserConnectionEnabled: false, browserName: chrome, chrome: {chromedriverVersion: 2.35.528161 (5b82f2d2aae0ca…, userDataDir: C:\Users\uvers\AppData\Loca…}, cssSelectorsEnabled: true, databaseEnabled: false, handlesAlerts: true, hasTouchScreen: false, javascriptEnabled: true, locationContextEnabled: true, mobileEmulationEnabled: false, nativeEvents: true, networkConnectionEnabled: false, pageLoadStrategy: normal, platform: XP, platformName: XP, rotatable: false, setWindowRect: true, takesHeapSnapshot: true, takesScreenshot: true, unexpectedAlertBehaviour: , unhandledPromptBehavior: , version: 68.0.3440.106, webStorageEnabled: true}

Session ID: ebcc345ec1edc0322746eab49baf535f

*** Element info: {Using=xpath, value=}

It fails on the WebElement line. Any ideas? I set it back to what you gave me for reference on the error code

please, post here how your ‘Ads/ad_block_0’ is defined

In my repository, the only property it has is that the ‘id’ is equal to whatever ad_block_# it is. In the code, I refer to it using findTestObject(‘AdNo’) where AdNo is defined by where in the loop it is at (first loop is ad_block_0, second is ad_block_1, and so on). I’ve tried it the original way as well, but it doesn’t work

WebElement temp = driver.findElement(By.xpath(findTestObject(AdNo).findPropertyValue("xpath")))
List list = temp.findElements(By.xpath('.//a'))

OK, in that case we need to use the id from the object’s property definitions, like:

WebElement temp = driver.findElement(By.id(findTestObject('AdNo').findPropertyValue("id")))
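
For the loop over ad_block_0, ad_block_1 and so on, another option might be to build the id strings directly rather than going through the object repository. This is only a sketch under assumptions: the helper names idFromObjectPath and adBlockIds are hypothetical, the ids are assumed to be numbered consecutively from 0, and the number of ad blocks has to be supplied by the caller.

```groovy
// Hypothetical helper: strips the Object Repository folder from a test object
// path, e.g. 'Ads/ad_block_0' -> 'ad_block_0'.
String idFromObjectPath(String objectPath) {
    return objectPath.tokenize('/').last()
}

// Hypothetical helper: builds the ids 'ad_block_0' .. 'ad_block_<count - 1>'.
List<String> adBlockIds(int count) {
    return (0..<count).collect { 'ad_block_' + it }
}

// Inside a Katalon test case, the ids could then drive the lookup, for example:
//   for (String id in adBlockIds(3)) {
//       WebElement temp = driver.findElement(By.id(id))
//       List list = temp.findElements(By.xpath('.//a'))
//   }
println idFromObjectPath('Ads/ad_block_0')
println adBlockIds(3)
```

Either helper yields a bare id that can be handed to By.id, which sidesteps the ‘Ads/’ folder prefix in the repository path.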