XPath for HTML


Moderator: Martin

poiuztr
Posts: 2
Joined: 20 Jul 2013 22:23

XPath for HTML

Post by poiuztr » 10 May 2016 16:11

Hey, I tried to get some information from a website (YouTube) using the evaluateXpathAsString function. I want to use only the HTML I get via a GET request. When I execute the script, I get the following error:

org.xml.sax.SAXParseException: unterminated entity ref (position:ENTITY_REF @1:1024 in java.io.Stringreader@2e17e7bb)

I tried some other websites and got different errors, but none worked.

Should this function work with HTML?

Martin
Posts: 4468
Joined: 09 Nov 2012 14:23

Re: XPath for HTML

Post by Martin » 10 May 2016 19:46

Hi,

No, the XPath functions only support well-formed XML. You could try a converter/cleaner to produce XML from the web page first. This one looks promising: https://infohound.net/tidy/
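To illustrate the idea outside the app, here is a minimal Python sketch (standard library only) of why a strict XML parser rejects typical HTML and how a cleaning step helps. The sample page, the HtmlToTree class and the html_to_xml_root helper are made up for this example; this is not how evaluateXpathAsString or the Tidy service work internally, and real pages may need a more robust cleaner.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# HTML elements that never have a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class HtmlToTree(HTMLParser):
    """Hypothetical helper: reads lenient HTML, builds an ElementTree."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._builder = ET.TreeBuilder()
        self._open = []  # names of the currently open elements

    def handle_starttag(self, tag, attrs):
        # Valueless attributes (e.g. "disabled") arrive with value None.
        self._builder.start(tag, {k: v or "" for k, v in attrs})
        if tag in VOID_TAGS:
            self._builder.end(tag)  # close void elements immediately
        else:
            self._open.append(tag)

    def handle_endtag(self, tag):
        if tag not in self._open:
            return  # stray closing tag, ignore it
        # Close everything that was left open inside this element.
        while self._open:
            name = self._open.pop()
            self._builder.end(name)
            if name == tag:
                break

    def handle_data(self, data):
        if self._open:  # drop text outside the root element
            self._builder.data(data)

    def tree(self):
        self.close()
        while self._open:  # close elements the page never closed
            self._builder.end(self._open.pop())
        return self._builder.close()


def html_to_xml_root(html):
    parser = HtmlToTree()
    parser.feed(html)
    return parser.tree()


# Made-up sample page: bare "&" and unclosed <meta>/<br> are fine in HTML
# but not in XML.
html = """<!DOCTYPE html>
<html><head>
<meta charset="utf-8">
<title>Tom & Jerry</title>
</head><body>
<a href="/watch?v=abc&list=xyz">First<br>video</a>
</body></html>"""

try:
    ET.fromstring(html)  # strict XML parsing of raw HTML
    strict_parse_fails = False
except ET.ParseError:
    strict_parse_fails = True

root = html_to_xml_root(html)
xml_text = ET.tostring(root, encoding="unicode")  # well-formed XML string
title = root.findtext(".//title")  # ElementTree's XPath subset
print(strict_parse_fails, title)
```

The serialized xml_text escapes the ampersands and closes every element, which is the kind of input a strict XPath evaluator expects.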

Regards,
Martin

poiuztr
Posts: 2
Joined: 20 Jul 2013 22:23

Re: XPath for HTML

Post by poiuztr » 17 May 2016 13:12

OK, I managed to work around my issue by using AutoShare's Subject field, where the information I need can be found with a simple substring search. I prefer this solution for now over trying to find a fully automated way to get proper XML from HTTP requests. Anyway, thanks for your help; maybe I will need this for other projects...
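For anyone curious, the substring approach can be sketched roughly like this in Python. The subject text and the quote markers are invented for illustration; the real content of the Subject field depends on what the sharing app sends.

```python
def between(text, start, end):
    """Return the part of text between the first start marker and the
    next end marker, or None if either marker is missing."""
    i = text.find(start)
    if i < 0:
        return None
    i += len(start)
    j = text.find(end, i)
    if j < 0:
        return None
    return text[i:j]


# Hypothetical subject line, not taken from a real share intent.
subject = 'Watch "Some Video" on YouTube'
video_title = between(subject, '"', '"')
print(video_title)
```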

Kind regards
