How to normalize a string with special characters (Efraïm to Efraim for example)
Is there a simple way (other than calling replaceAll() for every possibility) to normalize a string with special characters into a string without them? I think I've found the right Java to handle this, but I'm not sure how to implement it within Mendix.

http://www.programcreek.com/java-api-examples/java.text.Normalizer

/**
 * Turns html encoded text into plain text.
 *
 * Replaces &ouml; type of expressions into ¨<br/>
 * Removes accents<br/>
 * Replaces multiple whitespaces with a single space.<br/>
 *
 * @param text
 * @return
 */
public static String cleanText(String text) {
    text = unicodeTrim(text);

    // replace all multiple whitespaces by a single space
    Matcher matcher = WHITESPACE_PATTERN.matcher(text);
    text = matcher.replaceAll(" ");

    // turn accented characters into normalized form. Turns ö into o"
    text = Normalizer.normalize(text, Normalizer.Form.NFD);

    // removes the marks found in the previous line.
    text = REMOVE_ACCENT_PATTERN.matcher(text).replaceAll("");

    // lowercase everything
    text = text.toLowerCase();
    return text;
}

Note that the snippet relies on unicodeTrim(), WHITESPACE_PATTERN and REMOVE_ACCENT_PATTERN, which are defined elsewhere on that page and not shown here. It also lowercases the result, which I don't need for my case.
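For reference, here is a minimal self-contained sketch of just the accent-removal part, without the helpers the quoted snippet depends on. The class name AccentStripper, the method name stripAccents and the COMBINING_MARKS pattern are made up for illustration; only java.text.Normalizer and java.util.regex.Pattern come from the JDK. In Mendix a helper like this would presumably be called from a custom Java action, but that wiring is an assumption and is not shown.

```java
import java.text.Normalizer;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative, not from any Mendix module.
public class AccentStripper {

    // Matches Unicode combining marks (category M), which is what NFD
    // decomposition splits off from an accented base letter.
    private static final Pattern COMBINING_MARKS = Pattern.compile("\\p{M}+");

    public static String stripAccents(String input) {
        if (input == null) {
            return null;
        }
        // NFD turns "ï" (U+00EF) into "i" followed by U+0308 (combining diaeresis).
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // Drop the combining marks and keep the base letters; case is left unchanged.
        return COMBINING_MARKS.matcher(decomposed).replaceAll("");
    }

    public static void main(String[] args) {
        // "Efra\u00EFm" is "Efraïm", written with an escape to avoid source-encoding issues.
        System.out.println(stripAccents("Efra\u00EFm")); // prints "Efraim"
    }
}
```

One caveat with this approach: it only handles characters that have a canonical decomposition. Letters such as ø or ß are not a base letter plus a mark, so they pass through unchanged and would still need an explicit replacement.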