Using XQuery to transform multiple xml nodes to a single xml node with comma separated string value

By Art on July 13, 2012

Not another comma separated string blog post I hear you say! I know that this topic has probably been done to death but I hope that this is a little different as it is about creating a single xml node containing a comma separated string from the values of multiple other xml nodes all within an XQuery expression.

Recently I’ve been working on some xml migrations using XQuery and one of the many types of transformations that I had to do was to take a number of identically named xml nodes within a single xml structure and then convert them to a single xml node that had a comma separated string value of all the original nodes. So imagine this xml structure:

<BookAuthors>
  <Author>John Green</Author>
  <Author>Linda Blue</Author>
  <Author>Peter Red</Author>
  <Author>Jill White</Author>
</BookAuthors>

I needed to transform the xml and create a structure that had a single node containing all four values as a comma separated string. I could have done this in a number of ways but the goal was to keep this all totally self contained within a single XQuery as I was doing a number of other transformations on the same xml structure.

The logic that I came up with in the end was pretty straight forward. It consisted of a flwor statement that concatenated each <Author> node value with a comma, unless the <Author> node was the last node in the sequence. We can test to see if a node is the last node in a sequence by using the function last() as part of a predicate to get the last node and then compare that node by using the is operator. The strings that the flwor statement returns create a sequence of atomic values which then forms the comma separated string value in the new node.

SELECT @XML.query(
<BookAuthors>
  <Authors>
    {
      for $x in /BookAuthors/Author
      return
        if (not( ($x) is (/BookAuthors/Author[last()])[1] )) then
          concat($x, ",")
        else
          string($x)
    }
  </Authors>
</BookAuthors>
)

This query returns the following:

<BookAuthors>
  <Authors>John Green, Linda Blue, Peter Red, Jill White</Authors>
</BookAuthors>

The only downside is that because we are creating a sequence of atomic values during the flwor statment, XQuery concatenates those values and pads them with a space. Note the space after each comma. I couldn’t really find a way around that and for the purposes of my migration, it didn’t matter anyway. It would be nice to know if it is possible to get around the space being added that doesn’t involve a hack like a post query REPLACE().

Enjoy!

Download Script