After some more research, it turned out that the Rich Text Editor (even the latest version) translates the < character incorrectly, and that is what causes the issue. I tested with a form that displays the email in a text area (so I could read the HTML code), and the replacement did not occur. Once I add the RTE to the form (while leaving the text area in place), the < is replaced in both the RTE and the text area, since both refer to the same attribute.
In my code I now replace the affected markup before it is posted to the form, so the text is displayed properly to the user. Not a nice solution, but it does work.
According to the documentation, TinyMCE filters Word content when it is pasted. This looks like a bug in that feature.
BTW, the HTML created by MS Word can be pretty complex and huge.
Maybe you could remove the MS formatting with a Java action. There are Java libraries available to do this.
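To give an idea of what such a cleanup could look like, here is a minimal sketch in plain Java with no external library. The class name WordHtmlCleaner and the method clean are hypothetical, and the patterns only cover the most common Word residue (conditional comments, o:/v:/w: tags, Mso classes, inline styles). Regular expressions are not a real HTML parser, so treat this as a starting point and not as a complete filter.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Mendix or TinyMCE.
public class WordHtmlCleaner {

    // Word conditional comments, e.g. <!--[if gte mso 9]> ... <![endif]-->
    private static final Pattern CONDITIONAL_COMMENTS =
            Pattern.compile("<!--\\[if[\\s\\S]*?<!\\[endif\\]-->", Pattern.CASE_INSENSITIVE);

    // Office namespace tags such as <o:p>, </o:p>, <v:shape ...>, <w:WordDocument>
    private static final Pattern OFFICE_TAGS =
            Pattern.compile("</?[ovw]:[^>]*>", Pattern.CASE_INSENSITIVE);

    // class attributes whose value starts with "Mso", e.g. class="MsoNormal"
    private static final Pattern MSO_CLASS =
            Pattern.compile("\\s+class\\s*=\\s*(\"Mso[^\"]*\"|'Mso[^']*'|Mso\\w*)",
                    Pattern.CASE_INSENSITIVE);

    // Quoted inline style attributes. Note: this drops ALL inline styles,
    // not only the ones Word added.
    private static final Pattern STYLE_ATTR =
            Pattern.compile("\\s+style\\s*=\\s*(\"[^\"]*\"|'[^']*')", Pattern.CASE_INSENSITIVE);

    public static String clean(String html) {
        if (html == null) {
            return null;
        }
        String out = CONDITIONAL_COMMENTS.matcher(html).replaceAll("");
        out = OFFICE_TAGS.matcher(out).replaceAll("");
        out = MSO_CLASS.matcher(out).replaceAll("");
        out = STYLE_ATTR.matcher(out).replaceAll("");
        return out;
    }
}
```

In a Java action you would call WordHtmlCleaner.clean(...) on the string parameter and return the result, so the cleaned HTML can be stored in the attribute before the RTE shows it.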
The string utils section of the Community Commons module contains an HTMLToPlaintext Java action. From the documentation: "Use this function to convert HTML text to plain text. It will preserve linebreaks but strip all other markup, including html entity decoding."
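If you want to see roughly what that kind of conversion involves, the sketch below is my own simplified illustration and not the Community Commons implementation. The class name HtmlToPlainTextSketch is hypothetical, and it only decodes a handful of entities.

```java
// Simplified illustration only; the real Community Commons action may behave differently.
public class HtmlToPlainTextSketch {

    public static String toPlainText(String html) {
        if (html == null) {
            return null;
        }
        String text = html
                // turn explicit line breaks into newlines
                .replaceAll("(?i)<br\\s*/?>", "\n")
                // end of a block element also ends a line
                .replaceAll("(?i)</(p|div|li|tr|h[1-6])\\s*>", "\n")
                // strip every remaining tag
                .replaceAll("<[^>]*>", "");
        // decode a few common entities; &amp; goes last so that
        // "&amp;lt;" becomes "&lt;" and is not decoded twice
        text = text.replace("&nbsp;", " ")
                .replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&amp;", "&");
        return text.trim();
    }
}
```

For anything beyond a quick test, the existing Community Commons action is probably the safer choice, since it is already available as a Java action in the modeler.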
Regards,
Ronald