Grouping


It is often necessary to group elements based on certain attributes or subelements. Consider our XML document that contains the largest cities in America. We might want to output these cities by state as shown below.

Code Sample:

Keys/Demos/CityByStateOutput.html
<?xml version="1.0" encoding="UTF-8"?>
<html>
<head>
<title>Most Populated Cities by State</title>
</head>
<body>
<h2>1. California - 17 Cities</h2>
<ol>
<li>Los Angeles</li>
<li>San Diego</li>
<li>San Jose</li>
---- C O D E   O M I T T E D ----

</ol>
<h2>2. Texas - 12 Cities</h2>
<ol>
<li>Houston</li>
<li>Dallas</li>
<li>San Antonio</li>
---- C O D E   O M I T T E D ----

</ol>
<h2>3. Arizona - 5 Cities</h2>
<ol>
<li>Phoenix</li>
<li>Tucson</li>
<li>Mesa</li>
---- C O D E   O M I T T E D ----

</body>
</html>

In XSLT 1.0, there is no element meant for grouping, but there are at least a couple of ways to get the job done. The most efficient is the Muenchian method, named after Steve Muench, who devised it. This method requires a good understanding of the generate-id() function, discussed below.
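For comparison, one of the other ways is to keep a City only when no earlier City has the same State value. The following is a minimal sketch, not the method this lesson builds on. It assumes each City element carries a State attribute, and the States and State output elements are hypothetical names chosen for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
				xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
	<xsl:output method="xml" indent="yes"/>

	<xsl:template match="/">
		<!-- States and State are hypothetical output names. -->
		<States>
			<!--
				Keep a City only if no City before it in document order
				has the same State attribute.
			-->
			<xsl:for-each select="//City[not(@State = preceding::City/@State)]">
				<State><xsl:value-of select="@State"/></State>
			</xsl:for-each>
		</States>
	</xsl:template>
</xsl:stylesheet>
```

Because the predicate rescans the earlier City elements for every City, the work tends to grow quickly as the document gets larger. A key lookup avoids that rescanning, which is why the key-based method is usually preferred.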

The generate-id() Function

The generate-id() function returns a string that uniquely identifies a node in an XML document. It takes one optional argument, the node-set to identify; if the argument is omitted, the context node is used. The generated id is based on the first node (in document order) in the node-set, and within a single transformation the same id is generated no matter how the node is retrieved. To illustrate this, take a look at the following two files.

Code Sample:

Keys/Demos/GenerateId.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="GenerateId.xsl"?>
<Songs Source="http://classicrock.about.com/library/misc/ptop500.htm">
	<Song Artist="Led Zeppelin">Stairway To Heaven</Song>
	<Song Artist="Rolling Stones">Satisfaction</Song>
	<Song Artist="Derek And The Dominoes">Layla</Song>
---- C O D E   O M I T T E D ----

</Songs>

Code Sample:

Keys/Demos/GenerateId.xsl
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
			 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
	<xsl:output method="xml" indent="yes"/>
	<xsl:key name="keySong" match="Song" use="@Artist"/>
	<xsl:template match="/">
		<IDs>
			<ID>
				<xsl:value-of select="Songs/Song[2]"/>
				<xsl:text>: </xsl:text>
				<xsl:value-of select="generate-id(Songs/Song[2])"/>
			</ID>
			<ID>
				<xsl:value-of select="Songs/Song[.='Satisfaction']"/>
				<xsl:text>: </xsl:text>
				<xsl:value-of select="generate-id(Songs/Song[.='Satisfaction'])"/>
			</ID>
			<ID>
				<xsl:value-of select="Songs/Song[@Artist='Rolling Stones']"/>
				<xsl:text>: </xsl:text>
				<xsl:value-of 
					select="generate-id(Songs/Song[@Artist='Rolling Stones'])"/>
			</ID>
			<ID>
				<xsl:value-of select="key('keySong','Rolling Stones')"/>
				<xsl:text>: </xsl:text>
				<xsl:value-of select="generate-id(key('keySong','Rolling Stones'))"/>
			</ID>
			<ID>
				<xsl:value-of select="Songs/Song[starts-with(@Artist,'R')]"/>
				<xsl:text>: </xsl:text>
				<xsl:value-of 
					select="generate-id(Songs/Song[starts-with(@Artist,'R')])"/>
			</ID>
		</IDs>
	</xsl:template>
</xsl:stylesheet>

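The generate-id() function is passed five different XPath expressions and returns the same id each time. Again, this is because the first node in each node-set is always the same Song element. The id string itself (IDAMAK4B in the output below) is chosen by the XSLT processor, so a different processor may produce a different value.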

Code Sample:

Keys/Demos/GenerateIdOutput.xml
<?xml version="1.0" encoding="UTF-8"?>
<IDs>
<ID>Satisfaction: IDAMAK4B</ID>
<ID>Satisfaction: IDAMAK4B</ID>
<ID>Satisfaction: IDAMAK4B</ID>
<ID>Satisfaction: IDAMAK4B</ID>
<ID>Satisfaction: IDAMAK4B</ID>
</IDs>

As shown below, we can use this knowledge to find the first City element with a specific State attribute.

Code Sample:

Keys/Demos/BiggestCitiesByState.xsl
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
				xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
	<xsl:output method="xml" indent="yes"/>
	<xsl:key name="keyCities" match="City" use="@State"/>
	
	<xsl:template match="/">
		<html>
			<head>
				<title>Most Populated Cities by State</title>
			</head>
			<body>
				<ol>
				<xsl:for-each 
					select="//City[generate-id(.)=generate-id(key('keyCities',@State))]">
					<!--
						The first generate-id() above returns an id for the current City.
						The second generate-id() above returns an id for the first 
							City with a State attribute that is the same as the current 
							City's State attribute.
						These values will only be equal for the first City element 
							with a given State attribute.
						So, this for-each visits only the first City for each
							State.
					-->
					<li>
						<xsl:value-of select="."/>, <xsl:value-of select="@State"/>
					</li>
				</xsl:for-each>
				</ol>
			</body>
		</html>
	</xsl:template>
	
</xsl:stylesheet>

As indicated in the comment:

  1. The first generate-id() returns an id for the current City.
  2. The second generate-id() returns an id for the first City whose State attribute is the same as the current City's State attribute.
  3. These two values are equal only for the first City element with a given State attribute, so the for-each visits exactly one City per state.
  4. Because the XML document lists the cities in descending order by population, that City is the most populated city in its state.
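The same "first City in its group" test is often written without generate-id(), by checking that the union of the current node and the first node returned by the key contains exactly one node. The following is a minimal sketch of that variant. It reuses the keyCities key from BiggestCitiesByState.xsl and assumes the same City elements with State attributes.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
				xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
	<xsl:output method="xml" indent="yes"/>
	<xsl:key name="keyCities" match="City" use="@State"/>

	<xsl:template match="/">
		<ol>
			<!--
				key('keyCities',@State)[1] is the first City for this State.
				The union (|) of that node and the current City holds one
				node only when they are the same node.
			-->
			<xsl:for-each
				select="//City[count(. | key('keyCities',@State)[1]) = 1]">
				<li>
					<xsl:value-of select="."/>, <xsl:value-of select="@State"/>
				</li>
			</xsl:for-each>
		</ol>
	</xsl:template>
</xsl:stylesheet>
```

Both forms select the same City elements. Which one reads better is largely a matter of taste.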

Using generate-id for Grouping

Okay. Now let's get back to grouping. We've learned how to use generate-id() and keys to pick just one city from every state. Another way to look at this is that we've learned how to retrieve each distinct state once and only once. To group by state, we need to loop through the City elements for each state and return all the cities in that state. But we don't need to go through the whole node tree again, because we have a key that indexes all City elements by State. Take a look at the following code.

Code Sample:

Keys/Demos/GroupCities.xsl
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
				xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
	<xsl:output method="xml" indent="yes"/>
	<xsl:key name="keyCities" match="City" use="@State"/>
	<xsl:key name="keyState" match="State" use="@Abbr"/>
	
	<xsl:template match="/">
		<html>
			<head>
				<title>Most Populated Cities by State</title>
			</head>
			<body>
				<xsl:for-each 
					select="//City[generate-id(.)=generate-id(key('keyCities',@State))]">
					<!--
						The first generate-id() above returns an id for the current City.
						The second generate-id() above returns an id for the first 
							City with a State attribute that is the same as the current 
							City's State attribute.
						These values will only be equal for the first City element 
							with a given State attribute.
						So, this for-each visits only the first City for each
							State.
					-->
					<xsl:sort select="count(key('keyCities',@State))" 
						order="descending" data-type="number"/>
					<h2>
						<xsl:value-of select="position()"/>
						<xsl:text>. </xsl:text>
						<xsl:value-of select="key('keyState',@State)"/>
						<xsl:text> - </xsl:text>
						<xsl:value-of select="count(key('keyCities',@State))"/>
						<xsl:text> Cities</xsl:text>
					</h2>
					<ol>
						<xsl:for-each select="key('keyCities',@State)">
							<li><xsl:value-of select="."/></li>
						</xsl:for-each>
						<!--
							This nested for-each loops through the City elements that have
								the same value for the State attribute as the current City in
								the outer for-each loop.
						-->
					</ol>
				</xsl:for-each>
			</body>
		</html>
	</xsl:template>
	
</xsl:stylesheet>

Remember that the key() function returns a node-set. We loop through this node-set with the following code:

<xsl:for-each select="key('keyCities',@State)">
	<li><xsl:value-of select="."/></li>
</xsl:for-each>

We also sort the outer loop so that the states with the most cities come first. The sort key is the number of City elements returned by the key() function for the current state:

<xsl:sort select="count(key('keyCities',@State))" order="descending" data-type="number"/>
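When two states have the same number of cities, a single sort key leaves them in document order. The following is a minimal sketch of adding a second xsl:sort as a tie-breaker. It assumes, as GroupCities.xsl does, that each State element has an Abbr attribute and contains the state's name. The States and State output elements and the Cities attribute are hypothetical names used only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
				xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
	<xsl:output method="xml" indent="yes"/>
	<xsl:key name="keyCities" match="City" use="@State"/>
	<xsl:key name="keyState" match="State" use="@Abbr"/>

	<xsl:template match="/">
		<!-- States, State and Cities are hypothetical output names. -->
		<States>
			<xsl:for-each
				select="//City[generate-id(.)=generate-id(key('keyCities',@State))]">
				<!-- Primary key: number of cities in the state, largest first. -->
				<xsl:sort select="count(key('keyCities',@State))"
					order="descending" data-type="number"/>
				<!-- Secondary key (tie-breaker): state name, alphabetical. -->
				<xsl:sort select="key('keyState',@State)"/>
				<State Cities="{count(key('keyCities',@State))}">
					<xsl:value-of select="key('keyState',@State)"/>
				</State>
			</xsl:for-each>
		</States>
	</xsl:template>
</xsl:stylesheet>
```

The xsl:sort elements are applied in the order they appear, so the second key only matters when the first key compares equal.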