Unterminated <br> tags were the cause in my case. However, replacing them with <br/> did not work, because those are removed while the PDF content is generated. I replaced them with <p> tags instead, which seems to work just fine.
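For anyone who wants to automate that replacement, here is a minimal sketch of one way to do it. It is not necessarily what I did by hand. It assumes the rich text is a flat fragment whose lines are separated only by <br> tags and is not already wrapped in <p> or other block elements (otherwise the result would nest paragraphs). The class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

final class BrToParagraphs {
    // Matches <br>, <br/> and <br /> in any letter case.
    // Assumption: br tags carrying attributes do not occur in the content.
    private static final Pattern BR = Pattern.compile("(?i)<br\\s*/?>");

    // Splits the fragment at every br tag and wraps each non-empty piece in <p>...</p>.
    static String convert(String html) {
        StringBuilder out = new StringBuilder();
        for (String part : BR.split(html)) {
            String text = part.trim();
            if (!text.isEmpty()) {
                out.append("<p>").append(text).append("</p>");
            }
        }
        return out.toString();
    }
}
```

Inside a Java action this could be called on the string attribute just before the document is generated. Empty lines (two br tags in a row) are dropped in this sketch, which may or may not be what you want.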
The rich text / HTML content comes from the widget.
The document generator should work well with HTML content generated within Mendix components. Either the widget or the exporter needs to handle this, or there should be a common module that 'fixes' the widget's output so the document generator can work with it; then we would only have to call that module.
Too big for a comment so a new reply:
I have been experimenting with XSSSanitize in CommunityCommons. As a bonus, the output of the sanitizer is XHTML rather than HTML, so 'fixing' the content is no longer necessary.
Unfortunately, none of the supplied policy files match the functionality of the widget, especially the option to embed images in the text.
So I made my own.
I registered a ticket about this. I am happy to share my solution, and I hope it can be included in CommunityCommons.
Please find more info in this topic.
Maybe there are unterminated <p> or <br> tags in the HTML. Browsers allow this for 'normal' HTML, but XHTML requires all tags to be terminated; tags without content must be written like <p/> and <br/>.
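A quick way to test whether a piece of content has this problem is to run it through a plain XML parser. The sketch below uses only the standard JDK parser; the class and method names are hypothetical, and it checks well-formedness only, not whether the tags are valid XHTML. The fragment is wrapped in a dummy root element because rich text usually has several top-level tags.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class XhtmlCheck {
    // Returns true when the fragment parses as well-formed XML,
    // i.e. every tag is terminated and properly nested.
    static boolean isWellFormed(String fragment) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            // Wrap in a dummy root so fragments with several top-level tags can be parsed.
            String wrapped = "<root>" + fragment + "</root>";
            builder.parse(new InputSource(new StringReader(wrapped)));
            return true;
        } catch (SAXException e) {
            return false; // unterminated or badly nested tag, undeclared entity, ...
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this, `<p>one<br>two</p>` is rejected and `<p>one<br/>two</p>` is accepted. Note that HTML entities such as `&nbsp;` are also rejected by a plain XML parser, so a failure does not always point to an unterminated tag.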
For our project a similar requirement is scheduled for this week, but I will include a quick test to check whether our content has the same problems.
If I do have the same issue, I will create a ticket with a test project.
Maybe you can create a ticket too; that draws more attention to it.
I found a library, commons-text-1.10.0, here: https://commons.apache.org/proper/commons-text/download_text.cgi
I downloaded the jar file and put it in my userlib folder. Then I created a Java action that I use to unescape the HTML:
// BEGIN USER CODE
return StringEscapeUtils.unescapeHtml4(this.InputHTML_Unescaped);
// END USER CODE
where the class is imported like this at the top of the file:
import org.apache.commons.text.StringEscapeUtils;