Stripping (HTML) tags in XSLT

As there doesn’t seem to be any built-in function in XLST for stripping tags from strings (e.g., to remove all markup from a piece of HTML-formatted text), people came up with a recursive template-based solution, which has been posted several times on the web (e.g., here). However, I found this approach hard to use when the string to be cleaned from all tags already is stored in a variable or is created by using a xsl:value-of statement. Therefore, I transformed the existing template-based solution into a function-based one, which is a bit shorter and easier to use. Here it is:

<xsl:function name="util:strip-tags">
  <xsl:param name="text"/>
  <xsl:choose>
    <xsl:when test="contains($text, '&lt;')">
      <xsl:value-of select="concat(substring-before($text, '&lt;'),
        util:strip-tags(substring-after($text, '&gt;')))"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$text"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:function>

Note: Don’t forget to declare a namespace for this function (called util in the above code).

This entry was posted in XML. Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *

*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>