Discussion:
help..parsing html via XMLHTTPREQUEST.
Alex Wolff
2005-07-12 18:28:02 UTC
Hello,

All the code is at the bottom of this message. I need to programmatically call an
HTML form, check some boxes, submit the form, and finally parse the
returned page. I am thinking I need to use XMLHTTPRequest, load
the resulting HTML as a DOM document, then step through the controls and do my
thing. The problem is that I get an error when I load the form: "XML document
must have top level element". I do not have access to the original form
since it's a private website. When I do "Debug.Print XMLHTTP.ResponseText" I
can see that all the HTML is there. Is there another way of doing what I
need to do? I guess the DOM is complaining that the HTML may not be
compliant?

Thanks in advance!


Dim XMLHTTP As New MSXML2.XMLHTTP40
Dim xmldoc As New MSXML2.DOMDocument40
xmldoc.async = False
xmldoc.resolveExternals = False
XMLHTTP.Open "GET", "http://my.url.com/forms/NC.htm", False
XMLHTTP.send
Debug.Print XMLHTTP.responseText
xmldoc.loadXML (XMLHTTP.responseXML.xml)
If (xmldoc.parseError.errorCode <> 0) Then
    Dim Myerr
    Set Myerr = xmldoc.parseError
    MsgBox ("You Have Error" & Myerr.reason)
Else
    Set ObjNodeList = xmldoc.getElementsByTagName("Message")
    MsgBox (ObjNodeList.Item(0).Text)
End If

End Sub
Marvin Smit
2005-07-25 13:05:42 UTC
Hi,

You are correct. HTML is not, per se, XML. XML has a few
rules that HTML does not adhere to (every element must have a closing
tag, which is one of the major ones).

In this case, the chance that the returned HTML is not well-formed XML
is very high.

You will have to "XML-ify" the HTML you get. You can do this with
"TidyCOM", which lets you parse HTML and get XHTML (the XML-compliant
version of HTML).
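
For context on the error itself: responseXML is only populated when the
response parses as XML. For an ordinary HTML page it comes back empty, so
responseXML.xml is an empty string, and loading an empty string gives
exactly the "top level element" message. If the goal is only to step
through the form controls, a different route from Tidy is to hand
responseText to the HTML parser that ships with Internet Explorer. The
following is a rough sketch of that alternative, not a tested solution.
It assumes MSHTML is installed and can be created late-bound through the
"htmlfile" ProgID; the Sub name ListFormControls is made up for
illustration, and the URL is the one from the original post.

```vb
' Sketch only: parse the fetched page with MSHTML instead of an XML parser.
' ListFormControls is a hypothetical name, not part of any library.
Sub ListFormControls()
    Dim http As New MSXML2.XMLHTTP40
    Dim html As Object          ' late-bound MSHTML document ("htmlfile")
    Dim ctl As Object

    http.Open "GET", "http://my.url.com/forms/NC.htm", False
    http.send

    ' Work from responseText; responseXML is empty for non-XML responses.
    Set html = CreateObject("htmlfile")
    html.Open
    html.write http.responseText
    html.Close

    ' Walk the <input> elements and tick every checkbox found.
    For Each ctl In html.getElementsByTagName("input")
        Debug.Print ctl.Type, ctl.Name, ctl.Value
        If LCase$(ctl.Type) = "checkbox" Then ctl.Checked = True
    Next ctl
End Sub
```

Ticking boxes in this in-memory document does not send anything back to
the server. Submitting would still mean building the form data from the
control names and values and sending it with a second XMLHTTP request.
Writing the markup into the document this way may also run inline
scripts contained in the page.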

Hope this helps,

Marvin Smit

