HTML to String

0
I have a String field that contains HTML text. I want to remove all HTML markup so that only the line breaks (the 'Enters') remain in the converted String value. Does anyone know how to convert this String?
asked
5 answers
3

Perhaps you can use a Java action. See this Stack Overflow thread.

answered
2

In the Community Commons module you can use the HTMLToPlainText function.


Regards,
Tariq.

answered
0

This is actually much harder than it seems. I suggest looking at the link David gave you and picking one of the libraries suggested there. Don't try to code this all yourself, or you will most likely overlook many corner cases.

answered
0

Hi,

Why don't you use a regular expression?

Below is JavaScript code; I'm sure the same approach can be used in Mendix with a regular expression.

If you need to remove more from your text, you can update the regular expression to suit your case.

This link explains how to use regular expressions in Mendix: https://docs.mendix.com/refguide/string-function-calls/#13-replaceall

let data = "<p>first <h1> second <br> third </p>";
data = data.replace(/<\/?[^>]+(>|$)/g, '');
console.log(data);
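
Since Mendix Java actions are written in Java, here is a rough Java counterpart of the JavaScript snippet above. It is only a sketch: the class name TagRegexDemo and the method name stripTags are made up for illustration, and the pattern is the same one as in the JavaScript code, not something Mendix provides.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Mendix or Community Commons.
final class TagRegexDemo {

    // Same pattern as the JavaScript snippet: "<", an optional "/",
    // one or more characters that are not ">", then ">" or end of input.
    private static final Pattern TAG = Pattern.compile("</?[^>]+(>|$)");

    // Removes everything the pattern matches and leaves the rest untouched.
    static String stripTags(String html) {
        return TAG.matcher(html).replaceAll("");
    }
}
```

With the sample input above, stripTags returns "first  second  third " (the spaces around the removed tags stay). Note that the pattern treats any "<" as the start of a tag, so a text such as "1 < 2" is cut down to "1 ". That is one of the corner cases to keep in mind when stripping HTML with a regular expression.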
answered
-4

I've personally used this in a Java Action in which the HTML is stripped with the following expression:

Content.replaceAll("<br/>", "\n").replaceAll("&nbsp;", "").replaceAll("</p>","\n").replaceAll("<[^>]*>", "")

with Content being the String which is to be stripped. As far as I can remember it hasn't caused any issues so far, so it might be a decent starting point for this. (Though depending on the potential input, you might also have to deal with entities such as rsquo, lsquo etc.)
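
As a rough sketch of how the expression above could be extended inside a Java action, the version below also accepts the `<br>` and `<br />` spellings and translates a handful of named entities, including the rsquo and lsquo ones mentioned above. The class name HtmlToText and the method name toPlainText are hypothetical, the entity list is deliberately incomplete, and replacing the nbsp entity with a space (instead of removing it) is an assumption of this sketch.

```java
// Hypothetical helper for illustration; not part of Mendix or Community Commons.
final class HtmlToText {

    static String toPlainText(String content) {
        if (content == null) {
            return "";
        }
        return content
                // <br>, <br/> and <br /> in any letter case become line breaks
                .replaceAll("(?i)<br\\s*/?>", "\n")
                // closing paragraph tags become line breaks as well
                .replaceAll("(?i)</p\\s*>", "\n")
                // every remaining tag is dropped
                .replaceAll("<[^>]*>", "")
                // a few named entities (assumption: a space for nbsp);
                // extend this list for the input you expect
                .replace("&nbsp;", " ")
                .replace("&rsquo;", "\u2019")
                .replace("&lsquo;", "\u2018")
                .replace("&lt;", "<")
                .replace("&gt;", ">")
                // decoded last so that "&amp;lt;" ends up as "&lt;", not "<"
                .replace("&amp;", "&");
    }
}
```

Tags are removed before the entities are decoded, so an escaped "&lt;b&gt;" in the text survives as the literal characters "<b>" instead of being stripped as a tag. For anything beyond simple input, the library route suggested in the other answers is probably the safer choice.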

Just noticed, by the way: where the code says replaceAll(" ", ""), the string being replaced is the nbsp entity, which is however being interpreted by the forum :p

answered