
XPath for HTML

Posted: 10 May 2016 16:11
by poiuztr
Hey, I tried to get some information from a website (YouTube) using the evaluateXpathAsString function. I want to use only the HTML I get via a GET request. When I execute the script, I get the following error:

org.xml.sax.SAXParseException: unterminated entity ref (position:ENTITY_REF @1:1024 in java.io.Stringreader@2e17e7bb)

I tried some other websites and got different errors; none of them worked.

Should this function work with HTML?

Re: XPath for HTML

Posted: 10 May 2016 19:46
by Martin
Hi,

No, the XPath implementation only supports well-formed XML. You could try using a converter/cleaner to produce an XML file from a website. This one looks promising: https://infohound.net/tidy/
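
To illustrate why the parser gives up, here is a minimal sketch in Python (not Automagic script, and not the parser Automagic uses). An "unterminated entity ref" error most likely means the page contains a bare "&", for example in a link URL, which HTML parsers tolerate but XML does not allow. The fragment, the helper name xpath_text and the path "./a" are made up for the example, and ElementTree only supports a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical HTML fragment: the bare "&" in the URL is tolerated by HTML
# parsers, but in XML "&" must start an entity such as "&amp;".
html_fragment = '<div><a href="/watch?v=abc&list=xyz">Sample title</a></div>'


def xpath_text(markup, path):
    """Parse markup as XML and return the text of the first match.

    Returns None if the markup is not well-formed XML.
    """
    try:
        root = ET.fromstring(markup)
    except ET.ParseError:
        return None
    return root.findtext(path)


# The raw HTML is rejected by the XML parser because of the bare "&".
raw_result = xpath_text(html_fragment, "./a")

# With the ampersand escaped, the same fragment is well-formed XML
# and the path expression can be evaluated.
fixed_fragment = html_fragment.replace("&list", "&amp;list")
fixed_result = xpath_text(fixed_fragment, "./a")

print(raw_result)
print(fixed_result)
```

Real pages usually have more problems than a single ampersand (unclosed tags, unquoted attributes, inline scripts), which is why a dedicated cleaner is a better bet than patching the text by hand.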

Regards,
Martin

Re: XPath for HTML

Posted: 17 May 2016 13:12
by poiuztr
OK, I managed to work around my issue by using AutoShare's Subject field, where the information I need can be found with a simple substring search. I prefer this solution for now over trying to find a fully automated way to get proper XML from HTTP requests. Anyway, thanks for your help; maybe I will need this for other projects...

Kind regards