Configuring the Share HTML processing black/white list

Blog Post created by kevinr1 on Jun 19, 2012
Alfresco Share has a number of features to protect against XSS (Cross Site Scripting) attacks, session hijacking and similar. One of the most aggressive features is the automatic processing of 3rd party HTML to 'sanitise' or 'strip' out unwanted HTML tags and attributes before rendering in the page. By 3rd party HTML, I mean any HTML content displayed in Share that is sourced from a node content stream - such as a Wiki page, Blog post or Discussion post. In other words, any content that may be user edited or could come from any source (not just Share itself!).



This is a well-tested feature that handles all commonly known XSS attack holes and many less well-known ones - including all the attack vectors listed here: http://ha.ckers.org/xss.html



One of the downsides to this is that some otherwise useful HTML attributes and elements are stripped, mainly to work around issues in legacy browsers such as IE6 and IE7. Consider the STYLE attribute - not a problem attribute, you would assume; how could setting a STYLE cause an XSS attack?! Well, in IE8, Firefox, Safari, Chrome etc. it can't. But in IE6/7 Microsoft in their wisdom allowed JavaScript to be inserted into a STYLE attribute (called 'CSS Expressions' - a better name would have been 'CSS Hacks'). This is a potential XSS hole that only affects those legacy browsers - but the HTML stripping process cannot rely on the browser's user agent string (which of course could be faked), so it must always assume the worst and strip those STYLE attributes.
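
To make the risk concrete, here is a small hand-written sketch of the kind of markup a CSS Expression attack relies on. The element, its text and the alert call are made up for illustration; IE6/7 would evaluate the expression as JavaScript, while the other browsers mentioned above would simply ignore it.

```xml
<!-- Illustrative only: a STYLE attribute carrying a CSS Expression -->
<!-- IE6/7 evaluate the text inside expression(...) as JavaScript -->
<div style="width: expression(alert('XSS'));">Looks like harmless content</div>
```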



For the majority of Alfresco users who discarded IE6 (or even just IE...) long ago, why should they be punished with this limitation? And it is an annoying limitation, as TinyMCE and most of the other in-line editors that can be used with Alfresco rely on STYLE attributes to apply formatting to much of their generated content.
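
As a rough illustration of what gets lost, the fragment below shows the kind of markup an in-line editor typically generates, followed by what is left once the STYLE attributes have been stripped. Both fragments are hand-written examples rather than captured output, so the exact result of the stripping process may differ in detail.

```xml
<!-- Example editor output (hypothetical content) -->
<p style="text-align: center;"><span style="color: #ff0000;">Important notice</span></p>

<!-- Roughly what remains after the STYLE attributes are removed -->
<p><span>Important notice</span></p>
```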



In Alfresco 3.4.9/4.0.2 and onwards, it is now possible to fully configure the black/white list of HTML tags and attributes that the HTML stripping process will use.



This is the default configuration that is applied OOTB (out of the box):

      <!-- the set of HTML tags considered safe for rendering when mixing with existing client-side output -->
      <!-- NOTE: define all tags in UPPER CASE only -->
      <property name='tagWhiteList'>
         <set>
            <value>!DOCTYPE</value>
            <value>HTML</value>
            <value>HEAD</value>
            <value>BODY</value>
            <value>META</value>
            <value>BASE</value>
            <value>TITLE</value>
            <value>LINK</value>
            <value>CENTER</value>
            <value>EM</value>
            <value>STRONG</value>
            <value>SUP</value>
            <value>SUB</value>
            <value>P</value>
            <value>B</value>
            <value>I</value>
            <value>U</value>
            <value>BR</value>
            <value>UL</value>
            <value>OL</value>
            <value>LI</value>
            <value>H1</value>
            <value>H2</value>
            <value>H3</value>
            <value>H4</value>
            <value>H5</value>
            <value>H6</value>
            <value>SPAN</value>
            <value>DIV</value>
            <value>A</value>
            <value>IMG</value>
            <value>FONT</value>
            <value>TABLE</value>
            <value>THEAD</value>
            <value>TBODY</value>
            <value>TR</value>
            <value>TH</value>
            <value>TD</value>
            <value>HR</value>
            <value>DT</value>
            <value>DL</value>
            <value>DT</value>
            <value>PRE</value>
            <value>BLOCKQUOTE</value>
            <value>BUTTON</value>
            <value>CODE</value>
            <value>FORM</value>
            <value>OPTION</value>
            <value>SELECT</value>
            <value>TEXTAREA</value>
         </set>
      </property>

      <!-- The set of HTML tag attributes that are to be removed before rendering -->
      <!-- NOTE: define all attributes in UPPER CASE only -->
      <!-- IMPORTANT: JavaScript event handler attributes starting with 'on' are always removed -->
      <property name='attributeBlackList'>
         <set>
            <value>STYLE</value>
         </set>
      </property>

      <!-- The set of HTML tag attributes that are considered for sanitisation i.e. script content removed -->
      <!-- NOTE: define all attributes in UPPER CASE only -->
      <property name='attributeGreyList'>
         <set>
            <value>SRC</value>
            <value>DYNSRC</value>
            <value>LOWSRC</value>
            <value>HREF</value>
            <value>BACKGROUND</value>
         </set>
      </property>


As you can see, it's quite a list. The important config for STYLE attribute processing is here:

      <property name='attributeBlackList'>
         <set>
            <value>STYLE</value>
         </set>
      </property>


So simply override the black list in the stringutils bean in your custom-slingshot-application-context.xml file - generally found in \tomcat\shared\classes\alfresco\web-extension - as detailed in previous blog posts:

<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE beans PUBLIC '-//SPRING//DTD BEAN//EN' 'http://www.springframework.org/dtd/spring-beans-2.0.dtd'>

<beans>

   <!-- Override HTML processing black list -->
   <bean id='webframework.webscripts.stringutils' parent='webframework.webscripts.stringutils.abstract'
         class='org.springframework.extensions.webscripts.ui.common.StringUtils'>
      <property name='attributeBlackList'>
         <set></set>
      </property>
   </bean>

</beans>


Restart the Share web-application and STYLE attributes will no longer be removed by Share.
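
Emptying the black list is the simplest case. A property set on the overriding bean replaces the inherited value wholesale, so changing the white list this way would normally mean repeating every tag you still want. One possible alternative is Spring's collection merging, sketched below. It assumes the default tagWhiteList is defined on the parent bean webframework.webscripts.stringutils.abstract, which is not confirmed here, and CAPTION is just an example tag chosen for illustration.

```xml
<!-- Sketch: extend, rather than replace, the inherited tag white list -->
<bean id='webframework.webscripts.stringutils' parent='webframework.webscripts.stringutils.abstract'
      class='org.springframework.extensions.webscripts.ui.common.StringUtils'>
   <property name='tagWhiteList'>
      <!-- merge='true' asks Spring to combine this set with the parent's set -->
      <set merge='true'>
         <!-- CAPTION is an example tag; define tags in UPPER CASE only -->
         <value>CAPTION</value>
      </set>
   </property>
</bean>
```

A bean like this would sit inside the same beans element as the black list override, and Share needs a restart for it to take effect.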
