|
Home > Archive > MS SQL XML > November 2005 > SQL Server XML Query Help Required.
You are viewing an archived Text-only version of the thread.
To view this thread in it's original format and/or if you want to reply to
this thread please [click here]
| Author |
SQL Server XML Query Help Required.
|
|
| jamie.downs@risk.sungard.com 2005-11-09, 1:23 pm |
| Hi All,
If I have the following:
DROP TABLE #Table1
GO
CREATE TABLE #Table1(
CustID INT,
XMLCol XML)
INSERT INTO #Table1(CustID, XMLCol) VALUES (3, '<AdminAttributes>
<Attribute ID="1214">
<ID>1214</ID>
<Value>100000051</Value>
</Attribute>
<Attribute ID="1215">
<ID>1215</ID>
<Value>100000057</Value>
</Attribute>
<Attribute ID="1216">
<ID>1216</ID>
<Value>53048</Value>
</Attribute>
<Attribute ID="1217">
<ID>1217</ID>
<Value>179</Value>
</Attribute>
<Attribute ID="1219">
<ID>1219</ID>
<Value>400</Value>
</Attribute>
<Attribute ID="1224">
<ID>1224</ID>
<Value>4575</Value>
</Attribute>
<Attribute ID="1225">
<ID>1225</ID>
<Value>40239004</Value>
</Attribute>
<Attribute ID="1226">
<ID>1226</ID>
<Value>0</Value>
</Attribute>
<Attribute ID="1227">
<ID>1227</ID>
<Value>0</Value>
</Attribute>
<Attribute ID="1228">
<ID>1228</ID>
<Value>0</Value>
</Attribute>
<Attribute ID="1230">
<ID>1230</ID>
<Value>SONDIS NV</Value>
</Attribute>
<Attribute ID="1232">
<ID>1232</ID>
<Value>50</Value>
</Attribute>
<Attribute ID="1235">
<ID>1235</ID>
<Value>162</Value>
</Attribute>
<Attribute ID="1238">
<ID>1238</ID>
<Value>0</Value>
</Attribute>
<Attribute ID="1536">
<ID>1536</ID>
<Value>0</Value>
</Attribute>
<Attribute ID="1204">
<ID>1204</ID>
<Value>0</Value>
</Attribute>
<Attribute ID="1213">
<ID>1213</ID>
<Value>SONDIS NV</Value>
</Attribute>
<Attribute ID="1218">
<ID>1218</ID>
<Value>0</Value>
</Attribute>
</AdminAttributes>')
I want to extract CustID and values based on specific ID's. Can
someone please suggest the best way of doing it. I have tried the
following but it seems to be rather slow. When I say slow I mean
slower than retriveing the data from a relational structure we
currently have.
SELECT CA2.CustID,
CAST(CustAttributes.query('data(/AdminAttributes[1]/Attribute[ID="1216"]/Value)')
AS VARCHAR(100))AS IndustryID,
CAST(CustAttributes.query('data(/AdminAttributes[1]/Attribute[ID="1217"]/Value)')
AS VARCHAR(100))AS LocationID,
CAST(CustAttributes.query('data(/AdminAttributes[1]/Attribute[ID="1214"]/Value)')
AS VARCHAR(100))AS TypeID,
NULL AS LEFCompPermitted,
CAST(CustAttributes.query('data(/AdminAttributes[1]/Attribute[ID="1215"]/Value)')
AS VARCHAR(100))AS RegTypeID
FROM customertest CA2
I have also tried:
SELECT CA2.CustID,
t2.c.value('Value[1]', 'VARCHAR(100)') AS IndustryID,
t3.c.value('Value[1]', 'VARCHAR(100)') AS LocationID,
t4.c.value('Value[1]', 'VARCHAR(100)') AS TypeID,
t5.c.value('Value[1]', 'VARCHAR(100)') AS RegTypeID
FROM customertest CA2
CROSS APPLY
CustAttributes.nodes('/AdminAttributes/Attribute[ID="1216"]') T2(c) --
T stands for table (c) stands for column
CROSS APPLY
CustAttributes.nodes('/AdminAttributes/Attribute[ID="1217"]') T3(c) --
T stands for table (c) stands for column
CROSS APPLY
CustAttributes.nodes('/AdminAttributes/Attribute[ID="1214"]') T4(c) --
T stands for table (c) stands for column
CROSS APPLY
CustAttributes.nodes('/AdminAttributes/Attribute[ID="1215"]') T5(c) --
T stands for table (c) stands for column
which is even slower.
I would really appreciate some help on this.
Thanks
Jamie
| |
| Babu Krishnaswamy [MSFT] 2005-11-12, 3:23 am |
| Here are some suggestions to improve the performance:
1) First of all your second implementation (nodes/value) should be in theory faster than the query method for scenarios like this.
2) One thing you can try to improve the perf of the 2nd query is to modify the predicates from [ID="1215"] to [ID/text() = "1215"] or [@ID="1215]. This should improve the performance noticeably. There are a few XML performance articles in MSDN explaining
the effect of text().
3) The way you have written the 2nd query might lead to more rows than you expect.
For instance consider the following query:
SELECT CA2.CustID,
t2.c.value('Value[1]', 'VARCHAR(100)') AS IndustryID,
t3.c.value('Value[1]', 'VARCHAR(100)') AS LocationID
FROM #Table1 CA2
CROSS APPLY
XmlCol.nodes('/AdminAttributes/Attribute[ID/text()="1214"]') T2(c)
CROSS APPLY
XmlCol.nodes('/AdminAttributes/Attribute[ID/text()="1215"]') T3(c)
A) When the input data is the following
INSERT INTO #Table1(CustID, XMLCol) VALUES (3, '<AdminAttributes>
<Attribute ID="1214">
<ID>1214</ID>
<Value>100000051</Value>
</Attribute>
<Attribute ID="1215">
<ID>1215</ID>
<Value>100000057</Value>
</Attribute>
</AdminAttributes>')
the output will be 1 row
B) When the input data is the following
INSERT INTO #Table1(CustID, XMLCol) VALUES (3, '<AdminAttributes>
<Attribute ID="1214">
<ID>1214</ID>
<Value>100000051</Value>
</Attribute>
<Attribute ID="1215">
<ID>1215</ID>
<Value>100000057</Value>
</Attribute>
<Attribute ID="1214">
<ID>1214</ID>
<Value>100000051</Value>
</Attribute>
<Attribute ID="1215">
<ID>1215</ID>
<Value>100000057</Value>
</Attribute>
</AdminAttributes>')
The output will be 4 rows. This is the semantics of that query. In order to always to get one row (say the first matching row) and also to improve the performance if u know that the IDs are unique is to specify an ordinal [1] like
XmlCol.nodes('(/AdminAttributes/Attribute[@ID="1216"])[1]') T2(c)
4) Also adding text() in the path expression in the value method will also be useful for the same reason as 2. So the following query might give u a better performance:
SELECT CA2.CustID,
t2.c.value('(Value/text())[1]', 'VARCHAR(100)') AS IndustryID,
t3.c.value('(Value/text())[1]', 'VARCHAR(100)') AS LocationID,
t4.c.value('(Value/text())[1]', 'VARCHAR(100)') AS TypeID,
t5.c.value('(Value/text())[1]', 'VARCHAR(100)') AS RegTypeID
FROM #Table1 CA2
CROSS APPLY
XmlCol.nodes('(/AdminAttributes/Attribute[@ID="1216"])[1]') T2(c) -- T stands for table (c) stands for column
CROSS APPLY
XmlCol.nodes('(/AdminAttributes/Attribute[@ID="1217"])[1]') T3(c) -- T stands for table (c) stands for column
CROSS APPLY
XmlCol.nodes('(/AdminAttributes/Attribute[@ID="1214"])[1]') T4(c) -- T stands for table (c) stands for column
CROSS APPLY
XmlCol.nodes('(/AdminAttributes/Attribute[@ID="1215"])[1]') T5(c) -- T stands for table (c) stands for column
Hope the suggestions help.
By the way, when u say the query is slower than pulling relational data could u give a rough idea of how much slower; in what environment you are running the queries and how u r running the queries?
Thanks
Babu
-----Original Message-----
From: jamie.downs@risk.sungard.com
Posted At: Wednesday, November 09, 2005 9:46 AM
Posted To: microsoft.public.sqlserver.xml
Conversation: SQL Server XML Query Help Required.
Subject: SQL Server XML Query Help Required.
Hi All,
If I have the following:
DROP TABLE #Table1
GO
CREATE TABLE #Table1(
CustID INT,
XMLCol XML)
INSERT INTO #Table1(CustID, XMLCol) VALUES (3, '<AdminAttributes>
<Attribute ID="1214">
<ID>1214</ID>
<Value>100000051</Value>
</Attribute>
<Attribute ID="1215">
<ID>1215</ID>
<Value>100000057</Value>
</Attribute>
<Attribute ID="1216">
<ID>1216</ID>
<Value>53048</Value>
</Attribute>
<Attribute ID="1217">
<ID>1217</ID>
<Value>179</Value>
</Attribute>
<Attribute ID="1219">
<ID>1219</ID>
<Value>400</Value>
</Attribute>
<Attribute ID="1224">
<ID>1224</ID>
<Value>4575</Value>
</Attribute>
<Attribute ID="1225">
<ID>1225</ID>
<Value>40239004</Value>
</Attribute>
<Attribute ID="1226">
<ID>1226</ID>
<Value>0</Value>
</Attribute>
<Attribute ID="1227">
<ID>1227</ID>
<Value>0</Value>
</Attribute>
<Attribute ID="1228">
<ID>1228</ID>
<Value>0</Value>
</Attribute>
<Attribute ID="1230">
<ID>1230</ID>
<Value>SONDIS NV</Value>
</Attribute>
<Attribute ID="1232">
<ID>1232</ID>
<Value>50</Value>
</Attribute>
<Attribute ID="1235">
<ID>1235</ID>
<Value>162</Value>
</Attribute>
<Attribute ID="1238">
<ID>1238</ID>
<Value>0</Value>
</Attribute>
<Attribute ID="1536">
<ID>1536</ID>
<Value>0</Value>
</Attribute>
<Attribute ID="1204">
<ID>1204</ID>
<Value>0</Value>
</Attribute>
<Attribute ID="1213">
<ID>1213</ID>
<Value>SONDIS NV</Value>
</Attribute>
<Attribute ID="1218">
<ID>1218</ID>
<Value>0</Value>
</Attribute>
</AdminAttributes>')
I want to extract CustID and values based on specific ID's. Can
someone please suggest the best way of doing it. I have tried the
following but it seems to be rather slow. When I say slow I mean
slower than retriveing the data from a relational structure we
currently have.
SELECT CA2.CustID,
CAST(CustAttributes.query('data(/AdminAttributes[1]/Attribute[ID="1216"]/Value)')
AS VARCHAR(100))AS IndustryID,
CAST(CustAttributes.query('data(/AdminAttributes[1]/Attribute[ID="1217"]/Value)')
AS VARCHAR(100))AS LocationID,
CAST(CustAttributes.query('data(/AdminAttributes[1]/Attribute[ID="1214"]/Value)')
AS VARCHAR(100))AS TypeID,
NULL AS LEFCompPermitted,
CAST(CustAttributes.query('data(/AdminAttributes[1]/Attribute[ID="1215"]/Value)')
AS VARCHAR(100))AS RegTypeID
FROM customertest CA2
I have also tried:
SELECT CA2.CustID,
t2.c.value('Value[1]', 'VARCHAR(100)') AS IndustryID,
t3.c.value('Value[1]', 'VARCHAR(100)') AS LocationID,
t4.c.value('Value[1]', 'VARCHAR(100)') AS TypeID,
t5.c.value('Value[1]', 'VARCHAR(100)') AS RegTypeID
FROM customertest CA2
CROSS APPLY
CustAttributes.nodes('/AdminAttributes/Attribute[ID="1216"]') T2(c) --
T stands for table (c) stands for column
CROSS APPLY
CustAttributes.nodes('/AdminAttributes/Attribute[ID="1217"]') T3(c) --
T stands for table (c) stands for column
CROSS APPLY
CustAttributes.nodes('/AdminAttributes/Attribute[ID="1214"]') T4(c) --
T stands for table (c) stands for column
CROSS APPLY
CustAttributes.nodes('/AdminAttributes/Attribute[ID="1215"]') T5(c) --
T stands for table (c) stands for column
which is even slower.
I would really appreciate some help on this.
Thanks
Jamie
|
|
|
|
|