After adding the libraries to the test project one at a time and testing after each one, I found that the library causing the issue was:
xmlbeans-2.3.0.jar
Once I removed this library from the userlib folder, the sanitize action worked fine.
I believe this is a legacy library from the Excel Importer, which is no longer used.
Thanks for the support.
Simon
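For anyone hitting the same problem: the one-at-a-time elimination described above can probably be shortened by first checking which jars in userlib bundle the class from the error message. The following is only a minimal sketch, not part of CommunityCommons or Mendix. The class name JarClassFinder, the method jarsContaining and the default folder name "userlib" are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.zip.ZipFile;

// Hypothetical helper: lists the jars in a folder that contain a given class.
class JarClassFinder {

    // Returns the sorted file names of all *.jar files in dir that contain className.
    static List<String> jarsContaining(Path dir, String className) throws IOException {
        // A class a.b.C is stored in a jar as the entry a/b/C.class
        String entry = className.replace('.', '/') + ".class";
        List<String> hits = new ArrayList<>();
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(dir, "*.jar")) {
            for (Path jar : jars) {
                try (ZipFile zip = new ZipFile(jar.toFile())) {
                    if (zip.getEntry(entry) != null) {
                        hits.add(jar.getFileName().toString());
                    }
                }
            }
        }
        Collections.sort(hits);
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Assumed defaults: a folder named "userlib" and the class from the stack trace.
        Path userlib = Paths.get(args.length > 0 ? args[0] : "userlib");
        String className = args.length > 1 ? args[1] : "org.w3c.dom.Document";
        for (String jar : jarsContaining(userlib, className)) {
            System.out.println(jar);
        }
    }
}
```

If more than one jar is printed, those jars are reasonable first candidates to remove in a test project. Files with other extensions, such as the *.RequiredLib markers, are skipped by the "*.jar" filter.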
Have you tried the Render XHTML option on the attribute?
I took a look at the issue and it seems to be throwing the following error:
Uncaught fatal error from thread [MxRuntimeSystem-akka.actor.action-dispatcher-23] shutting down ActorSystem [MxRuntimeSystem]
--------
java.lang.NoClassDefFoundError: org/w3c/dom/Document
at org.owasp.validator.html.Policy.getTopLevelElement(Policy.java:281)
at org.owasp.validator.html.Policy.getTopLevelElement(Policy.java:264)
at org.owasp.validator.html.Policy.getInstance(Policy.java:198)
at org.owasp.validator.html.Policy.getInstance(Policy.java:180)
at org.owasp.validator.html.Policy.getInstance(Policy.java:154)
at communitycommons.StringUtils.XSSSanitize(StringUtils.java:260)
at communitycommons.StringUtils.XSSSanitize(StringUtils.java:245)
at communitycommons.actions.XSSSanitize.executeAction(XSSSanitize.java:54)
at communitycommons.actions.XSSSanitize.executeAction(XSSSanitize.java:1)
at com.mendix.systemwideinterfaces.core.UserAction.execute(SourceFile:53)
at com.mendix.core.actionmanagement.CoreAction.doCall(SourceFile:291)
at com.mendix.core.actionmanagement.CoreAction.call(SourceFile:276)
at com.mendix.core.actionmanagement.ActionManager.executeSync(SourceFile:205)
at com.mendix.core.component.InternalCore.execute(SourceFile:259)
at com.mendix.hs.execute(SourceFile:42)
at com.mendix.ib.a(SourceFile:47)
at com.mendix.ia.a(SourceFile:193)
at com.mendix.ia.executeAction(SourceFile:148)
at com.mendix.systemwideinterfaces.core.UserAction.execute(SourceFile:53)
at com.mendix.core.actionmanagement.CoreAction.doCall(SourceFile:291)
at com.mendix.core.actionmanagement.CoreAction.call(SourceFile:276)
at com.mendix.core.actionmanagement.ActionManager.executeSync(SourceFile:205)
at com.mendix.core.component.InternalCore.execute(SourceFile:259)
at com.mendix.iO.a(SourceFile:135)
at com.mendix.pa$g.apply$mcV$sp(SourceFile:292)
at com.mendix.pa$g.apply(SourceFile:283)
at com.mendix.pa$g.apply(SourceFile:283)
at com.mendix.core.session.Worker$$anonfun$receive$3$$anonfun$2.apply(SourceFile:148)
at scala.util.Try$.apply(Try.scala:161)
at com.mendix.core.session.Worker$$anonfun$receive$3.applyOrElse(SourceFile:146)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
at akka.actor.ActorCell.invoke(ActorCell.scala:456)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.ClassNotFoundException: org.w3c.dom.Document not found by project [87]
at org.apache.felix.framework.BundleWiringImpl.findClassOrResourceByDelegation(BundleWiringImpl.java:1532)
at org.apache.felix.framework.BundleWiringImpl.access$400(BundleWiringImpl.java:75)
at org.apache.felix.framework.BundleWiringImpl$BundleClassLoader.loadClass(BundleWiringImpl.java:1955)
at java.lang.ClassLoader.loadClass(Unknown Source)
at org.owasp.validator.html.Policy.getTopLevelElement(Policy.java:281)
at org.owasp.validator.html.Policy.getTopLevelElement(Policy.java:264)
at org.owasp.validator.html.Policy.getInstance(Policy.java:198)
at org.owasp.validator.html.Policy.getInstance(Policy.java:180)
at org.owasp.validator.html.Policy.getInstance(Policy.java:154)
at communitycommons.StringUtils.XSSSanitize(StringUtils.java:260)
at communitycommons.StringUtils.XSSSanitize(StringUtils.java:245)
at communitycommons.actions.XSSSanitize.executeAction(XSSSanitize.java:54)
at communitycommons.actions.XSSSanitize.executeAction(XSSSanitize.java:1)
at com.mendix.systemwideinterfaces.core.UserAction.execute(SourceFile:53)
at com.mendix.core.actionmanagement.CoreAction.doCall(SourceFile:291)
at com.mendix.core.actionmanagement.CoreAction.call(SourceFile:276)
at com.mendix.core.actionmanagement.ActionManager.executeSync(SourceFile:205)
at com.mendix.core.component.InternalCore.execute(SourceFile:259)
at com.mendix.hs.execute(SourceFile:42)
at com.mendix.ib.a(SourceFile:47)
at com.mendix.ia.a(SourceFile:193)
at com.mendix.ia.executeAction(SourceFile:148)
at com.mendix.systemwideinterfaces.core.UserAction.execute(SourceFile:53)
at com.mendix.core.actionmanagement.CoreAction.doCall(SourceFile:291)
at com.mendix.core.actionmanagement.CoreAction.call(SourceFile:276)
at com.mendix.core.actionmanagement.ActionManager.executeSync(SourceFile:205)
at com.mendix.core.component.InternalCore.execute(SourceFile:259)
at com.mendix.iO.a(SourceFile:135)
at com.mendix.pa$g.apply$mcV$sp(SourceFile:292)
at com.mendix.pa$g.apply(SourceFile:283)
at com.mendix.pa$g.apply(SourceFile:283)
at com.mendix.core.session.Worker$$anonfun$receive$3$$anonfun$2.apply(SourceFile:148)
at scala.util.Try$.apply(Try.scala:161)
at com.mendix.core.session.Worker$$anonfun$receive$3.applyOrElse(SourceFile:146)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
at akka.actor.ActorCell.invoke(ActorCell.scala:456)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
This seems to happen when the following line in StringUtils.java is reached:
Policy p = Policy.getInstance(filename);
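A side note on why the log says "Uncaught fatal error": NoClassDefFoundError is a java.lang.Error (a subclass of LinkageError), so a catch (Exception e) block does not catch it. The sketch below shows one way such a call could be guarded so that a missing class surfaces as an ordinary exception. LinkageGuard and callGuarded are hypothetical names and not part of CommunityCommons; whether this keeps the runtime from shutting down in this particular setup is an assumption I have not verified.

```java
import java.util.concurrent.Callable;

// Hypothetical helper: turns class-loading errors into a regular exception.
class LinkageGuard {

    // Runs body and converts any LinkageError (including NoClassDefFoundError)
    // into an IllegalStateException that keeps the original error as its cause.
    static <T> T callGuarded(Callable<T> body) throws Exception {
        try {
            return body.call();
        } catch (LinkageError e) {
            throw new IllegalStateException(
                    "Required class could not be loaded: " + e.getMessage(), e);
        }
    }
}
```

In a copy of StringUtils.java the policy could then be loaded with something like callGuarded(() -> Policy.getInstance(filename)). This only makes the failure easier to handle and report; the underlying classpath problem still has to be fixed.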
I checked all the jars in userlib and they seem to be fine and up to date. I removed a couple of outdated ones and re-downloaded Community Commons. The following files are in the userlib folder:
"commons-fileupload-1.2.1.jar"
"commons-httpclient-3.1.jar"
"commons-io-2.3.jar"
"commons-io-2.3.jar.ExcelImporter.RequiredLib"
"commons-lang-2.5.jar"
"commons-lang3-3.1.jar"
"commons-logging-1.1.jar"
"commons-net-3.1.jar"
"commons-pool-1.6.jar"
"communitycommons.txt"
"dom4j-1.6.1.jar"
"dom4j-1.6.1.jar.ExcelImporter.RequiredLib"
"fontbox-1.8.5.jar"
"geronimo-stax-api_1.0_spec-1.0.jar"
"guava-12.0.jar"
"httpclient-4.1.1.jar"
"httpcore-4.1.jar"
"itext-2.0.6.jar"
"jempbox-1.8.5.jar"
"joda-time-1.6.2.jar"
"jsch-0.1.48.jar"
"mail.jar"
"mxwinsso.jar"
"nekohtml.jar"
"nekohtml.txt"
"org.apache.commons.fileupload-1.2.1.jar"
"org.apache.commons.io-2.3.0.jar"
"org.apache.servicemix.bundles.commons-codec-1.3.0.jar"
"pdfbox-1.8.5.jar"
"poi-3.8-20120326.jar"
"poi-3.8-20120326.jar.ExcelImporter.RequiredLib"
"poi-ooxml-3.6-20091214.jar"
"poi-ooxml-3.6-20091214.jar.ExcelImporter.RequiredLib"
"poi-ooxml-3.8-20120326.jar"
"poi-ooxml-schemas-3.6-20091214.jar"
"poi-ooxml-schemas-3.6-20091214.jar.ExcelImporter.RequiredLib"
"poi-ooxml-schemas-3.8-20120326.jar"
"recaptcha4j-0.0.7.jar"
"replication.jar"
"replication.jar.ExcelImporter.RequiredLib"
"serializer.jar"
"servlet-api-3.0.jar"
"smack.jar"
"spring.jar"
"spring_license.txt"
"spring-ldap-1.2.1.jar"
"spring-ldap-1.2.1_license.txt"
"stax-api-1.0.1.jar"
"webservices-rt.jar"
"wsdl4j-1.5.2.jar"
"xalan-2.7.1.jar"
"xalan-2.7.1.jar.ExcelImporter.RequiredLib"
"xbean.jar"
"xbean.jar.ExcelImporter.RequiredLib"
"xercesImpl.jar"
"xml-apis-ext.jar"
"xmlbeans-2.3.0.jar"
"XmlSchema-1.0.2.jar"
"activation-1.1-osgi.jar"
"antisamy-1.5.3.jar"
"avalon-framework-4.2.0.jar"
"axiom-api-1.2.12.jar"
"axiom-dom-1.2.12.jar"
"axiom-impl-1.2.12.jar"
"axis2-adb-1.0.jar"
"axis2-kernel-1.0.jar"
"batik-all-1.7.jar"
"com.google.guava-14.0.1.jar"
"com.springsource.org.apache.batik.css-1.7.0.jar"
"com.springsource.org.apache.commons.lang-2.5.0.jar"
"commons-codec-1.3.jar"
"commons-collections-3.2.1.jar"
"commons-dbcp-1.4.jar"
"commons-email-1.3.1.jar"
Have a look at the "From rich text to PDF sample" in the App Store.
CommunityCommons provides several sanitize policies, so I suggest trying some of the others. Since your HTML is not entered by a user, you could use a less strict one. If you really trust the service provider, you could use the anythinggoes policy, which just converts the HTML to XHTML.
The class org/w3c/dom/Document is in xml-apis.jar. AntiSamy loads classes as it needs them, and which ones it needs depends on the HTML content. If I'm not mistaken, xml-apis.jar used to be in the CommunityCommons userlib.
You could try downloading an Apache Batik distribution from http://xmlgraphics.apache.org/batik/download.html
The jar is in the lib folder of that distribution. Copy it into your userlib and see if it helps. It is still odd that it works in one situation and not in the other test. Are you sure the HTML is the same in both?
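To see whether copying the jar made a difference, you could check from inside a Java action whether the class is visible to the project's class loader. This is a minimal sketch under my own assumptions: ClassVisibilityCheck and isVisible are hypothetical names, and in a real project you would pass the class loader of one of your own action classes rather than the one used in main.

```java
// Hypothetical helper: reports whether a class can be loaded by a given class loader.
class ClassVisibilityCheck {

    // Returns true if loader can load className, without initializing the class.
    static boolean isVisible(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = ClassVisibilityCheck.class.getClassLoader();
        System.out.println("org.w3c.dom.Document visible: "
                + isVisible("org.w3c.dom.Document", loader));
    }
}
```

Run as a plain Java program this mostly tells you about the JDK itself. The interesting result is the one obtained inside the running project, where the stack trace shows the lookup going through the Felix bundle class loader.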