Nobody likes XML, except masochists and Microsoft consultants who charge by the hour.
Sometimes you are forced to deal with XML anyway, such as parsing the response from external APIs.
Parsing XML is per se a nasty business to begin with, but in this particular example, the ugliness set new records.

The only “solution” I came up with is too ugly for production use, but the alternatives were even uglier, so I had no option.

I hope there is someone out there reading this who can present a proper solution to the problem.

Let’s say you have this XML:

<testxml xmlns:ns1="http://www.example.com" xmlns:ns2="http://www.example2.com">
    <ns1:foo>
        <ns2:bar>baz</ns2:bar>
    </ns1:foo>
</testxml>

Using xpath() you extract the content of testxml:

SELECT xpath(
    '/testxml/ns1:foo',
    '<testxml xmlns:ns1="http://www.example.com" xmlns:ns2="http://www.example2.com"><ns1:foo><ns2:bar>baz</ns2:bar></ns1:foo></testxml>'::xml,
    ARRAY[
        ['ns1','http://www.example.com'],
        ['ns2','http://www.example2.com']
    ]
);
-- Result: <ns1:foo><ns2:bar>baz</ns2:bar></ns1:foo>

The returned XML is not valid since its missing the xmlns definitions,
but the PostgreSQL XML data type doesn’t complain, which is OK I guess,
a bit of quirks mode perhaps?

Because of the missing xmlns, it’s impossible to make use of this XML fragment returned.
You cannot extract any subsequent sub-elements in it using XPath.

For instance, this won’t work:

SELECT xpath(
    '/ns1:foo/ns2:bar/text()',
    (xpath(
        '/testxml/ns1:foo',
        '<testxml xmlns:ns1="http://www.example.com" xmlns:ns2="http://www.example2.com"><ns1:foo><ns2:bar>baz</ns2:bar></ns1:foo></testxml>'::xml,
        ARRAY[
            ['ns1','http://www.example.com'],
            ['ns2','http://www.example2.com']
        ]
    ))[1]
);

-- Error:
-- ERROR:  could not create XPath object
-- DETAIL:  namespace error : Namespace prefix ns1 on foo is not defined
-- <ns1:foo>
--         ^
-- namespace error : Namespace prefix ns2 on bar is not defined
--   <ns2:bar>baz</ns2:bar>
--           ^
-- Undefined namespace prefix
-- xmlXPathCompiledEval: evaluation failed

Even if you pass it the NSArray in the outer xpath() call, you don’t get ‘baz’ but nothing at all.

SELECT xpath(
    '/ns1:foo/ns2:bar/text()',
    (xpath(
        '/testxml/ns1:foo',
        '<testxml xmlns:ns1="http://www.example.com" xmlns:ns2="http://www.example2.com"><ns1:foo><ns2:bar>baz</ns2:bar></ns1:foo></testxml>'::xml,
        ARRAY[
            ['ns1','http://www.example.com'],
            ['ns2','http://www.example2.com']
        ]
    ))[1],
    ARRAY[
        ['ns1','http://www.example.com'],
        ['ns2','http://www.example2.com']
    ]
);

-- Returns:
--  xpath 
-- -------
--  {}
-- (1 row)
-- And NOT 'baz' which is what we want.

Therefore, given you have an XML “fragment” where the tags have ns:___ but without the xmlns:___=”URI” part,
is there any way to extract sub-elements, given you know the namespaces?

For instance, by wrapping the XML “fragment” inside an outer XML tag,
and specifying bogous xmlns elements for the namespaces,
I managed to hack together a work-around, probably with hundreds of flaws,
and this cannot possibly be the best way to approach this problem.

SELECT xpath_fragment('/ns1:foo/ns2:bar/text()','<ns1:foo><ns2:bar>baz</ns2:bar></ns1:foo>',ARRAY['ns1','ns2']);

 xpath_fragment 
----------------
 {baz}
(1 row)

Source code of insanely ugly “solution” to the “problem”:

CREATE OR REPLACE FUNCTION xpath_fragment(_XPath text, _XML xml, _NSNames text[]) RETURNS XML[] AS $BODY$
DECLARE
_ text;
_WrappedXML xml;
_NSArray text[][] := ARRAY[]::text[][];
BEGIN

SELECT ('<xml ' || array_to_string(array_agg('xmlns:' || unnest || '=" "'),' ') || '>' || _XML::text || '</xml>')::text INTO _WrappedXML FROM unnest(_NSNames);

FOR _ IN SELECT unnest(_NSNames)
LOOP
    _NSArray := _NSArray || ARRAY[[_, ' ']];
END LOOP;

RETURN xpath('/xml' || _XPath, _WrappedXML, _NSArray);
END;
$BODY$ LANGUAGE plpgsql IMMUTABLE;

Ideas, anyone?