Sep 22, 2017

Getting XML Directly from PostgreSQL

There was a discussion on our company Slack recently about databases and XML, and someone pointed out that PostgreSQL has some nice XML functions. I’m a Postgres fan and knew that it had some XML functions, but I hadn’t dug into them yet. So I decided to play around with the data in my stormy-data project. Directions are included there on how to get Postgres and populate the data via Flyway.

I started with something basic – just give me the states and the comments:

SELECT xmlforest(state_id,comments)
FROM storm_info

Each row is returned like:
<state_id>187</state_id><comments>Mainly D2 drought conditions persisted through January and into February. D3 drought conditions were present across portions of Henry and eastern Dale counties.</comments>
That’s nice, but we don’t really want the state_id; the state name makes more sense. And let’s make sure it has a good tag name:

SELECT xmlforest(state.name AS state,comments)
FROM storm_info
JOIN state ON state.id=storm_info.state_id

The result is:

<state>ALABAMA</state><comments>Mainly D2 drought conditions persisted through January and into February. D3 drought conditions were present across portions of Henry and eastern Dale counties.</comments>

But each row is just an XML snippet. It’s not even a well-formed document, since it doesn’t have a single root tag. So let’s put all the results in one document. Doing that is actually pretty easy with the query_to_xml function:

SELECT query_to_xml('select state.name as state,comments
from storm_info
join state on state.id=storm_info.state_id',TRUE,FALSE,'')

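There are some strange arguments for this function. To break it down:

  1. The actual SQL query as text.
  2. Whether NULL values appear in the output. With TRUE, a null column is written as an empty element marked xsi:nil="true"; with FALSE, the column is simply omitted from that row.
  3. This one is called tableforest. When it is TRUE, each row comes back as its own element with no wrapper, so the result is a fragment (a “forest”) rather than one document. When it is FALSE, all rows are wrapped in a single table element. We want it all in one document, so we turn tableforest off.
  4. The namespace to put the result in. We just want the default namespace, so we pass an empty string.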

The result looks like:

<table xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

<row>
<state>ALABAMA</state>
<comments>Mainly D2 drought conditions persisted through January and into February. D3 drought conditions were present across portions of Henry and eastern Dale counties.</comments>
</row>

<row>
<state>CALIFORNIA</state>
<comments>On January 1, long period swell from the Pacific created hazardous beach conditions in the area with sneaker waves and large breaking surf.</comments>
</row>

</table>
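
To see what the second and third arguments change, here is a sketch of the same query with both flags flipped. The behavior noted in the comments is what the PostgreSQL documentation describes for query_to_xml; the table and column names are the ones used above, and the output is not shown because it depends on your data.

```sql
-- nulls = FALSE: columns whose value is NULL are left out of the row
--   (with TRUE they appear as empty elements marked xsi:nil="true").
-- tableforest = TRUE: no <table> wrapper; the result is a sequence of
--   <row> elements, i.e. an XML fragment rather than a single document.
SELECT query_to_xml('select state.name as state, comments
from storm_info
join state on state.id = storm_info.state_id', FALSE, TRUE, '');
```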

The single-document result isn’t pretty: I don’t like table as the root tag, or row as the tag for each row in the data. But it’s in a document, and you can run a simple XSLT to change those values if needed.
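
If you would rather avoid the XSLT step, one alternative is to build the document in SQL with xmlelement and xmlagg, which let you choose the tag names yourself. This is only a sketch: the tag names storms and storm are placeholders picked for illustration, not something from the stormy-data project.

```sql
-- xmlforest builds the <state> and <comments> elements for one row,
-- the inner xmlelement wraps them in a <storm> element per row,
-- xmlagg concatenates all of those <storm> elements,
-- and the outer xmlelement supplies a single <storms> root.
SELECT xmlelement(name storms,
         xmlagg(
           xmlelement(name storm,
             xmlforest(state.name AS state, comments))))
FROM storm_info
JOIN state ON state.id = storm_info.state_id;
```

The trade-off is that the whole document is assembled as one value in a single result row, so for a large table you may want to add a WHERE clause or a LIMIT in a subquery first.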

About the Author

Mike Hostetler

Principal Technologist

Mike has almost 20 years of experience in technology. He started in networking and Unix administration, and grew into technical support and QA testing. But he has always done some development on the side and decided a few years ago to pursue it full-time. His history of working with users gives Mike a unique perspective on writing software.
