How to get data from an XML file using Groovy?

Hi,
I re-edited my question because I made a mistake in my XML. The correct XML is below.

My file, person.xml, contains:

<mydata>
  <person>
    <data firstname="Pierre" Age="25" address="Paris" Date="2020-01-01T00:00:00"/>
    <data firstname="Lucy" Age="30" address="Angers" Date="2020-02-01T00:00:00"/>
  </person>
  <job>
    <value company = "Darty" position="excutant" startdate = "1980-01-01T00;00;00"/>
    <value company = "Auchan" position="Chef equipe" startdate = "1982-01-01T00;00;00"/>
  </job>
</mydata>

I want to get:
firstname="Pierre"
Age="25"
address="Paris"
Date="2020-01-01T00:00:00"

I can get the values from the first part of the XML, which is person,
but I am not able to get the values from the second part, job.
I tried this:

def data = new XmlSlurper().parse(new File("C:/temp/person.xml"))
println "company[0]: ${data.mydata.job.value.company[0]}"
or
println "company[0]: ${data.job.value.company[0]}"

Neither line prints the company. Thank you.

Try the following test case script:

import groovy.xml.*

/**
 * see https://groovy-lang.org/processing-xml.html
 */

def data = """
<person>
<data firstname="Pierre" Age="25" address="Paris" Date="2020-01-01T00:00:00"/>
<data firstname="Lucy" Age="30" address="Angers" Date="2020-02-01T00:00:00"/>
</person>
"""

def slurper = new XmlSlurper()
def person = slurper.parseText(data)
println("firstname: ${person.data[0].@firstname}")
println("Age: ${person.data[0].@Age}")
println("address: ${person.data[0].@address}")
println("Date: ${person.data[0].@Date}")

This will emit:

2020-08-19 12:07:26.783 DEBUG testcase.TC5                             - 4: println(firstname: $person.data[0].firstname)
firstname: Pierre
2020-08-19 12:07:26.877 DEBUG testcase.TC5                             - 5: println(Age: $person.data[0].Age)
Age: 25
2020-08-19 12:07:26.882 DEBUG testcase.TC5                             - 6: println(address: $person.data[0].address)
address: Paris
2020-08-19 12:07:26.885 DEBUG testcase.TC5                             - 7: println(Date: $person.data[0].Date)
Date: 2020-01-01T00:00:00

Or, building on @kazurayam's example, use the code below to print the data of every person:

person.data.each {it ->

    println("firstname: ${it.@firstname}")
    println("Age: ${it.@Age}")
    println("address: ${it.@address}")
    println("Date: ${it.@Date}")

}

output:

firstname: Pierre
Age: 25
address: Paris
Date: 2020-01-01T00:00:00

firstname: Lucy
Age: 30
address: Angers
Date: 2020-02-01T00:00:00

Thank you for your help, and I am sorry:
I made a mistake in my XML data.
The first part, with person, works for me.
The second part, with job, is where I am not able to get the values.
If I try this:
println("company : ${mydata.job.value[0].@company }")
it returns an empty string.

In fact the XML looks like this:

I don't see the XML in my reply, so I will re-edit the question.

This worked for me using XmlParser instead of XmlSlurper:

import groovy.xml.*

/**
 * see https://groovy-lang.org/processing-xml.html
 */

def data = """
<mydata>
  <person>
    <data firstname="Pierre" Age="25" address="Paris" Date="2020-01-01T00:00:00"/>
    <data firstname="Lucy" Age="30" address="Angers" Date="2020-02-01T00:00:00"/>
  </person>
  <job>
    <value company = "Darty" position="excutant" startdate = "1980-01-01T00;00;00"/>
    <value company = "Auchan" position="Chef equipe" startdate = "1982-01-01T00;00;00"/>
  </job>
</mydata>
"""

def parser = new XmlParser()
parsed = parser.parseText(data)

parsed.person.data.each {it ->

    println("firstname: ${it.@firstname}")
    println("Age: ${it.@Age}")
    println("address: ${it.@address}")
    println("Date: ${it.@Date}")
    println()

}

parsed.job.value.each {it ->

    println("Company: ${it.@company}")
    println("Position: ${it.@position}")
    println("startdate: ${it.@startdate}")
    println()

}

output:

firstname: Pierre
Age: 25
address: Paris
Date: 2020-01-01T00:00:00

firstname: Lucy
Age: 30
address: Angers
Date: 2020-02-01T00:00:00

Company: Darty
Position: excutant
startdate: 1980-01-01T00;00;00

Company: Auchan
Position: Chef equipe
startdate: 1982-01-01T00;00;00
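
A side note on why the attempts in the question printed nothing, as a minimal sketch with XmlSlurper on a trimmed copy of the same XML (the variable names here are only illustrative). The object returned by parseText is already the root element <mydata>, so the path must not start with mydata, and company is an attribute, so it needs the @ prefix.

```groovy
import groovy.xml.*

// Trimmed copy of the XML from the question (illustrative data only)
def xml = '''<mydata>
  <job>
    <value company="Darty" position="excutant"/>
    <value company="Auchan" position="Chef equipe"/>
  </job>
</mydata>'''

def root = new XmlSlurper().parseText(xml)

// The parsed object IS the root element
def rootName = root.name()

// Stepping through "mydata" again looks for a child called mydata and finds none
def tooDeep = root.mydata.job.value.size()

// "company" without @ looks for a child element named company, which does not exist
def asElement = root.job.value.company.size()

// With @ the attribute is read; text() turns it into a plain String
def companies = root.job.value.collect { it.@company.text() }

println "root: ${rootName}, tooDeep: ${tooDeep}, asElement: ${asElement}"
println "companies: ${companies}"
```

So for the file in the question, data.job.value[0].@company should be the path to try.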

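Later edit: it works fine with XmlSlurper too. I just got a bit confused, because printing the whole 'parsed' variable looks empty. I first suspected lazy evaluation, but a slurper node prints as its text content, and these elements carry only attributes, so there is nothing to print. Traversing down to the attribute with GPath works fine.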

I have added these two lines to the above code:

println("Incomplete Gpath prints nothing: ${parsed.job.value[0]}")

println("Complete Gpath shows the value: ${parsed.job.value[0].@company}")

Output with the slurper:

Incomplete Gpath prints nothing: 
Complete Gpath shows the value: Darty

When using parser:

Incomplete Gpath prints something: value[attributes={company=Darty, position=excutant, startdate=1980-01-01T00;00;00}; value=[]]
Complete Gpath shows the value: Darty

So, if the behavior of the slurper is too confusing, just use the parser (it is easier to debug).
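
If you would rather stay with the slurper, here is a rough sketch of two ways to look inside a node while debugging. It assumes the node is a single element (as returned by value[0]); the XML is a trimmed copy of the one above and the variable names are made up for the example.

```groovy
import groovy.xml.*

def xml = '''<mydata>
  <job>
    <value company="Darty" position="excutant"/>
  </job>
</mydata>'''

def node = new XmlSlurper().parseText(xml).job.value[0]

// toString() on a slurper node gives its text content; this element has none
def asText = node.toString()

// attributes() returns the attributes of the element as a map
def attrs = node.attributes()

// XmlUtil.serialize renders the node back to XML markup
def markup = XmlUtil.serialize(node)

println "text: '${asText}'"
println "attributes: ${attrs}"
println markup
```

The attribute map is usually enough to see what the slurper actually holds at that point of the path.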

It works, thank you for your help.

I just need to deal with the first line of my file: <?xml version='1.0' encoding='UTF-8'?>
because it causes this error: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
I'm looking into this…

I'm also working on the rest of the solution, the part that mentioned:
println("Incomplete Gpath prints nothing: ${parsed.job.value[0]}")

The following long long thread may be helpful for you.


Possibly:

parser=new XmlSlurper()
parser.setFeature("http://apache.org/xml/features/disallow-doctype-decl", false) 
parser.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
parser.parse(it)

quoted from https://stackoverflow.com/questions/30753884/how-to-work-around-groovys-xmlslurper-refusing-to-parse-html-due-to-doctype-and

Unfortunately this is a valid bug in the Java parser.
The prolog format is valid according to the XML specs, but you have to deal with …

The easiest way is to use a regex that looks for the part starting with <?xml and removes it completely before passing the content to the XML parser.
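
A minimal sketch of that clean-up, assuming the content is already loaded as a String. This error at line 1, column 1 is often caused by a byte-order mark or other stray characters in front of the declaration, so the sketch first cuts everything before the first '<' and then drops the declaration. The sample string and variable names are invented for the example, and the BOM in it is only an assumption about what the real file contains.

```groovy
import groovy.xml.*

// Sample content: a BOM (assumed cause), then the declaration, then the document
def raw = '\uFEFF' + "<?xml version='1.0' encoding='UTF-8'?>\n" +
          '<mydata><job><value company="Darty"/></job></mydata>'

// 1. Cut everything in front of the first '<' (BOM, whitespace, stray characters)
def fromFirstTag = raw.substring(raw.indexOf('<'))

// 2. Remove the XML declaration itself, as suggested above
def cleaned = fromFirstTag.replaceFirst(/^<\?xml[^>]*\?>\s*/, '')

def parsed = new XmlSlurper().parseText(cleaned)
def company = parsed.job.value[0].@company.text()
println "company: ${company}"
```

In many cases step 1 alone is enough, since a declaration at the very start of the content is accepted by the parser.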