Home > Archive > MS SQL XML > September 2005 > openxml seemingly fails to process external entity









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author openxml seemingly fails to process external entity
HolmesDM

2005-09-02, 8:24 pm

I am having a problem getting OPENXML to process some xml that contains an
"external entity" in SQL Server 2000.

Background, which you probably already know:
----------
Xml uses "external entities" the way the C language uses "#include": when
an xml processor finds the directive, it refers to an external file whose
contents are to be processed at that point.


I have created a minimal example that illustrates the problem.

1. First, here is some SQL that correctly processes some XML that does
*not* use external entities. The XML is stored in a variable (@testXml), and
a select statement reads this XML and reports some basic facts about the only
data in it, namely, that there is a tag called "DatastreamDownload ":

-- demonstrate validity of simple xml
declare @testXml varchar(8000)
set @testXml ='
<?xml version="1.0" ?>
<DatastreamDownload />
'

declare @idoc int
exec sp_xml_preparedocume
nt @idoc output, @testXml

select *
from openxml( @idoc, '/ DatastreamDownload',
0)

exec sp_xml_removedocumen
t @idoc


Here are the beginnings of the lines I see in SQL Query Analyzer:
id parentid nodetype localname
-------------------- -------------------- ----------- -------------------
0 NULL 1 DatastreamDownload

(1 row(s) affected)


2. I then stored the XML from @testXml in a file, simple.xml. The
following 2 lines show the contents of the file, except that I've added 4
spaces to the beginning of each of the lines in this listing (that is, the
prefixed spaces are not in the file):
<?xml version="1.0" ?>
<DatastreamDownload />


3. I created a second file of XML, simpleEntity.xml, that includes
simple.xml by using a external entity. The following 11 lines show the
contents of the file, except that I've added 4 spaces to the beginning of
each of the lines in this listing (that is, the prefixed spaces are not in
the file):
<?xml version="1.0" ?>
<!DOCTYPE just4ExternalEntity
[
<!ENTITY externalEntity SYSTEM "simple.xml">
]
>


<just4ExternalEntity>
<workingTag />
&externalEntity;
</just4ExternalEntity>
You can verify that this external entity stuff is done properly by opening
this second file in Internet Explorer. I see something like this, except
with better colors:
<?xml version="1.0" ?>
<!DOCTYPE just4ExternalEntity (View Source for full doctype...)>
- <just4ExternalEntity>
<workingTag />
<DatastreamDownload />
</just4ExternalEntity>

4. Finally, I copied the working xml from simpleEntity.xml into a SQL
variable, @xmlDoc. I had to prefix a network path to "simple.xml", to give
the SQL Server a chance of locating simple.xml. I've represented this
network path by "\\unc..." below, although in my actual test I used a valid
(and triple-tested) unc.

Openxml can partially process this code. The first select statement below
reports on the "workingTag" tag, which is defined in the @xmlDoc variable ,
but fails to report on the sibling "DatastreamDownload" tag, which is defined
in the external entity.

-- attempt to read the same xml from a disk file using an external entity
declare @xmlDoc varchar(1000)
set @xmlDoc = -- code from simpleEntity.xml
'
<?xml version="1.0" ?>
<!DOCTYPE just4ExternalEntity
[
<!ENTITY externalEntity SYSTEM "\\unc...\simple.xml">
]
>


<just4ExternalEntity>
<workingTag />
&externalEntity;
</just4ExternalEntity>
'
declare @jdoc int
declare @rc int
exec @rc = sp_xml_preparedocume
nt @jdoc output, @xmlDoc

select @rc as resultCode

-- lists row, as expected
select *
from openxml( @jdoc, '/just4ExternalEntity/workingTag', 0)

-- fails to list row
select *
from openxml( @jdoc, '/just4ExternalEntity/ DatastreamDownload',
0)

exec sp_xml_removedocumen
t @jdoc



Here are the beginnings of the lines I see in SQL Query Analyzer:
resultCode
-----------
0

(1 row(s) affected)

id parentid nodetype localname
-------------------- -------------------- ----------- -----------
7 0 1 workingTag

(1 row(s) affected)

id parentid nodetype localname
-------------------- -------------------- ----------- -----------

(0 row(s) affected)

I was expecting to see 1 row in the last result set above, with
"DatastreamDownload" in the "localname" column.

So openxml is failing to process something that Internet Explorer has no
trouble with. As far as I know (and I'm no expert) external entities are a
standard part of XML, and a standards-conforming XML processor should handle
them. So it seems very likely that the processor used by openxml does handle
them. It's even possible that openxml and IE are using the same processor,
although that would quite a feat of coordination for a company the size of
Microsoft.

Since external entities are standard, and since IE issues a security warning
when I use one via a network unc, I'm suspecting that there is a privilege
problem in getting openxml to read the external entity.

Thanks for any help.
Michael Rys [MSFT]

2005-09-02, 8:24 pm

External entities are not supported with OpenXML for a variety of security
related reasons.

Best is to resolve them on the client/midtier side before sending the XML to
the server.

Best regards
Michael

"HolmesDM" < HolmesDM@discussions
.microsoft.com> wrote in message
news:8AF86E16-03D4-49A7-9A20- 8133C2B0E109@microso
ft.com...
>I am having a problem getting OPENXML to process some xml that contains an
> "external entity" in SQL Server 2000.
>
> Background, which you probably already know:
> ----------
> Xml uses "external entities" the way the C language uses "#include": when
> an xml processor finds the directive, it refers to an external file whose
> contents are to be processed at that point.
>
>
> I have created a minimal example that illustrates the problem.
>
> 1. First, here is some SQL that correctly processes some XML that does
> *not* use external entities. The XML is stored in a variable (@testXml),
> and
> a select statement reads this XML and reports some basic facts about the
> only
> data in it, namely, that there is a tag called "DatastreamDownload ":
>
> -- demonstrate validity of simple xml
> declare @testXml varchar(8000)
> set @testXml ='
> <?xml version="1.0" ?>
> <DatastreamDownload />
> '
>
> declare @idoc int
> exec sp_xml_preparedocume
nt @idoc output, @testXml
>
> select *
> from openxml( @idoc, '/ DatastreamDownload',
0)
>
> exec sp_xml_removedocumen
t @idoc
>
>
> Here are the beginnings of the lines I see in SQL Query Analyzer:
> id parentid nodetype localname
> -------------------- -------------------- ----------- -------------------
> 0 NULL 1
> DatastreamDownload
>
> (1 row(s) affected)
>
>
> 2. I then stored the XML from @testXml in a file, simple.xml. The
> following 2 lines show the contents of the file, except that I've added 4
> spaces to the beginning of each of the lines in this listing (that is, the
> prefixed spaces are not in the file):
> <?xml version="1.0" ?>
> <DatastreamDownload />
>
>
> 3. I created a second file of XML, simpleEntity.xml, that includes
> simple.xml by using a external entity. The following 11 lines show the
> contents of the file, except that I've added 4 spaces to the beginning of
> each of the lines in this listing (that is, the prefixed spaces are not in
> the file):
> <?xml version="1.0" ?>
> <!DOCTYPE just4ExternalEntity
> [
> <!ENTITY externalEntity SYSTEM "simple.xml">
> ]
>
> <just4ExternalEntity>
> <workingTag />
> &externalEntity;
> </just4ExternalEntity>
> You can verify that this external entity stuff is done properly by opening
> this second file in Internet Explorer. I see something like this, except
> with better colors:
> <?xml version="1.0" ?>
> <!DOCTYPE just4ExternalEntity (View Source for full doctype...)>
> - <just4ExternalEntity>
> <workingTag />
> <DatastreamDownload />
> </just4ExternalEntity>
>
> 4. Finally, I copied the working xml from simpleEntity.xml into a SQL
> variable, @xmlDoc. I had to prefix a network path to "simple.xml", to
> give
> the SQL Server a chance of locating simple.xml. I've represented this
> network path by "\\unc..." below, although in my actual test I used a
> valid
> (and triple-tested) unc.
>
> Openxml can partially process this code. The first select statement below
> reports on the "workingTag" tag, which is defined in the @xmlDoc variable
> ,
> but fails to report on the sibling "DatastreamDownload" tag, which is
> defined
> in the external entity.
>
> -- attempt to read the same xml from a disk file using an external
> entity
> declare @xmlDoc varchar(1000)
> set @xmlDoc = -- code from simpleEntity.xml
> '
> <?xml version="1.0" ?>
> <!DOCTYPE just4ExternalEntity
> [
> <!ENTITY externalEntity SYSTEM "\\unc...\simple.xml">
> ]
>
> <just4ExternalEntity>
> <workingTag />
> &externalEntity;
> </just4ExternalEntity>
> '
> declare @jdoc int
> declare @rc int
> exec @rc = sp_xml_preparedocume
nt @jdoc output, @xmlDoc
>
> select @rc as resultCode
>
> -- lists row, as expected
> select *
> from openxml( @jdoc, '/just4ExternalEntity/workingTag', 0)
>
> -- fails to list row
> select *
> from openxml( @jdoc, '/just4ExternalEntity/ DatastreamDownload',

> 0)
>
> exec sp_xml_removedocumen
t @jdoc
>
>
>
> Here are the beginnings of the lines I see in SQL Query Analyzer:
> resultCode
> -----------
> 0
>
> (1 row(s) affected)
>
> id parentid nodetype localname
> -------------------- -------------------- ----------- -----------
> 7 0 1 workingTag
>
> (1 row(s) affected)
>
> id parentid nodetype localname
> -------------------- -------------------- ----------- -----------
>
> (0 row(s) affected)
>
> I was expecting to see 1 row in the last result set above, with
> "DatastreamDownload" in the "localname" column.
>
> So openxml is failing to process something that Internet Explorer has no
> trouble with. As far as I know (and I'm no expert) external entities are
> a
> standard part of XML, and a standards-conforming XML processor should
> handle
> them. So it seems very likely that the processor used by openxml does
> handle
> them. It's even possible that openxml and IE are using the same
> processor,
> although that would quite a feat of coordination for a company the size of
> Microsoft.
>
> Since external entities are standard, and since IE issues a security
> warning
> when I use one via a network unc, I'm suspecting that there is a privilege
> problem in getting openxml to read the external entity.
>
> Thanks for any help.



HolmesDM

2005-09-06, 1:25 pm

Thanks for the definitive answer, even if it's not exactly the one I'd hoped
for.

"Michael Rys [MSFT]" wrote:

> External entities are not supported with OpenXML for a variety of security
> related reasons.
>
> Best is to resolve them on the client/midtier side before sending the XML to
> the server.
>
> Best regards
> Michael
>
> "HolmesDM" < HolmesDM@discussions
.microsoft.com> wrote in message
> news:8AF86E16-03D4-49A7-9A20- 8133C2B0E109@microso
ft.com...
>
>
>

Sponsored Links





Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive | Programming forum archive

Copyright 2008 droptable.com