Trying to optimize a procedure that parses GML strings to Oracle sdo_geometry.
<gml:Polygon >
<gml:exterior>
<gml:LinearRing>
<gml:posList > [coordinates] </gml:posList>
</gml:LinearRing>
</gml:exterior>
<gml:interior>
<gml:Ring>
<gml:curveMember>
<gml:Curve>
<gml:segments>
<gml:LineStringSegment>
<gml:posList > [coordinates] </gml:posList>
</gml:LineStringSegment>
<gml:Arc>
<gml:posList > [coordinates] </gml:posList>
</gml:Arc>
<gml:LineStringSegment>
<gml:posList > ... </gml:posList>
</gml:LineStringSegment>
</gml:segments>
</gml:Curve>
</gml:curveMember>
</gml:Ring>
</gml:interior>
<gml:interior>
<gml:LinearRing>
<gml:posList > [coordinates] </gml:posList>
</gml:LinearRing>
</gml:interior>
</gml:Polygon>
The polygon consists of 1 exterior and 0-many interior elements.
Our current solution works as follows.
First it extracts all exterior and interior elements and processes those separately
cursor c_exterior(p_xml xmltype) is
select t.*
from xmltable('//exterior' passing p_xml columns exterior xmltype path '/') as t;
cursor c_interior(p_xml xmltype) is
select t.*
from xmltable('//interior' passing p_xml columns interior xmltype path '/') as t;
begin
-- process each exterior/interior ring.
for r_exterior in c_exterior(p_xml)
loop
process(r_exterior.exterior, [other params]);
end loop;
for r_interior in c_interior(p_xml)
loop
process(r_interior.interior, [other params]);
end loop;
The processing of the exterior and interior element is as follows:
select t.*
from xmltable('for $d in //node() where exists($d/posList) return $d' passing p_xml columns
poslist clob path './posList'
,parent varchar2(100) path 'name()') as t;
This gets the list of coordinates and the name of the immediate parent of the poslist element.
Question: Since this is a two step process it may not be optimized for speed. I'm looking for ways to get the information I need from the GML in 1 query. But I can't figure out how to do that. Mainly because of the varying level into the XML of the poslist element and generating the number of the exterior/interior ring.
Information I need:
So from above example:
exterior, 1, gml:LinearRing, [coordinates]
interior, 2, gml:LineStringSegment, [coordinates]
interior, 2, gml:Arc, [coordinates]
interior, 2, gml:LineStringSegment, [coordinates]
interior, 3, gml:LinearRing, [coordinates]
You can use chained XMLTable calls:
select x1.type, x1.ring_num, x2.parent, x2.coordinates
from xmltable(
xmlnamespaces(default 'http://www.opengis.net/gml/'),
'/Polygon/*'
passing p_xml
columns type varchar2(8) path './local-name()',
ring_num for ordinality,
nodes xmltype path '//posList/..'
) x1
cross join xmltable (
xmlnamespaces(default 'http://www.opengis.net/gml/'),
'/*'
passing x1.nodes
columns parent varchar2(21) path './name()',
coordinates varchar2(30) path 'posList'
) x2;
(adjusting output data types and sizes as needed for your real data, of course)
The x1 table gets the exterior/interior nodes, adds an ordinal column for the 'ring number', and includes an XMLType column that has all the nodes that are parents of a posList
node. That is then passed into x2 which extracts the multiple posList
coordinates and parents.
Demo using a CTE to provide the XML document instead of PL/SQL:
with t (p_xml) as (
select xmltype('<gml:Polygon xmlns:gml="http://www.opengis.net/gml/">
<gml:exterior>
<gml:LinearRing>
<gml:posList > [coordinates] </gml:posList>
</gml:LinearRing>
</gml:exterior>
<gml:interior>
<gml:Ring>
<gml:curveMember>
<gml:Curve>
<gml:segments>
<gml:LineStringSegment>
<gml:posList > [coordinates] </gml:posList>
</gml:LineStringSegment>
<gml:Arc>
<gml:posList > [coordinates] </gml:posList>
</gml:Arc>
<gml:LineStringSegment>
<gml:posList > ... </gml:posList>
</gml:LineStringSegment>
</gml:segments>
</gml:Curve>
</gml:curveMember>
</gml:Ring>
</gml:interior>
<gml:interior>
<gml:LinearRing>
<gml:posList > [coordinates] </gml:posList>
</gml:LinearRing>
</gml:interior>
</gml:Polygon>') from dual
)
select x1.type, x1.ring_num, x2.parent, x2.coordinates
from t
cross join xmltable(
xmlnamespaces(default 'http://www.opengis.net/gml/'),
'/Polygon/*'
passing t.p_xml
columns type varchar2(8) path './local-name()',
ring_num for ordinality,
nodes xmltype path '//posList/..'
) x1
cross join xmltable (
xmlnamespaces(default 'http://www.opengis.net/gml/'),
'/*'
passing x1.nodes
columns parent varchar2(21) path './name()',
coordinates varchar2(30) path 'posList'
) x2;
which gets:
TYPE RING_NUM PARENT COORDINATES
-------- ---------- --------------------- ------------------------------
exterior 1 gml:LinearRing [coordinates]
interior 2 gml:LineStringSegment [coordinates]
interior 2 gml:Arc [coordinates]
interior 2 gml:LineStringSegment ...
interior 3 gml:LinearRing [coordinates]
Switching between name()
and local-name()
lets you include or omit the namespace from the value.
You could also do this with a single XMLTable, but getting the 'ring number' is a bit of a pain; here I'm injecting an attribute into the exterior/interior node - which may well be slower than the chained XMLTable approach, but possibly worth testing on your data:
select x.type, x.ring_num, x.parent, x.coordinates
from xmltable(
xmlnamespaces(default 'http://www.opengis.net/gml/'),
'copy $i := /Polygon
modify
for $j in $i/(exterior | interior)
return (insert node attribute pos { count($j/preceding-sibling::*) + 1 } into $j)
return $i//posList'
passing t.p_xml
columns type varchar2(8) path 'local-name(./ancestor::exterior | ./ancestor::interior)',
ring_num number path '(./ancestor::exterior/@pos | ./ancestor::interior/@pos)',
parent varchar2(21) path '../name()',
coordinates varchar2(30) path '.'
) x;
With a CTE for sample data again that gets the same results. I'm not sure if there's another way to get - effectively - the iteration count in the FLWOR loop.