Sanitize action gives extra spaces between the words in PDF
3
Hi all,

I am using the Sanitize action to convert HTML to XHTML and display it in a PDF document. I am using the CommunityCommons.XSSPolicy.anythinggoes option because I trust the HTML that is being passed in. The issue is that I am getting extra spaces between words (see the PDF screenshot below). I can't figure out why this is happening or how to stop it.

The HTML that we receive via the web service:

<![CDATA[<p><b>About your lifestyle and health…</b></p><p>Does your work/pastimes involve travel to war zones or areas of overseas conflict?</p><p><b>No</b></p><p>Do you take part in motor car / bike racing, sports involving flying, mountain climbing or any extreme sports eg. BASE jumping?</p><p><b>No</b></p><p>Have you ever been advised to reduce alcohol intake or ever used non-prescribed drugs such as cannabis, ecstasy or heroin?</p><p><b>No</b></p><p><b>More about your health…</b></p><p>During the last 6 months have you noticed any symptoms or signs for which it would be considered reasonable to seek medical advice? 
</p><p><b>No</b></p><p>Are you having or waiting for any medical investigations, tests, hospital admissions or surgery?</p><p><b>No</b></p><p>Have you used cigarettes, e-cigarettes, cigars, pipes, or nicotine replacements in the last 12 months including occasional use?</p><p><b>No</b></p><p><b>In the last <strong>3 years</strong>, have you…</b></p><p>Attended any healthcare professional for a medical condition, symptom, illness or injury requiring three or more consultations?</p><p><b>No</b></p><p>Taken or been advised to take any form of drug treatment, medication, chemotherapy, radiotherapy, or any other types of therapy including counselling, that lasted more than 4 weeks?</p><p><b>No</b></p><p>Been unfit to work due to sickness for more than 4 consecutive weeks?</p><p><b>No</b></p>]]>

We then strip the CDATA wrapper using a change object with the expression below:

replaceAll(replaceAll($NiftyDb/QandA_HTML ,'<!\[CDATA\[',''),'\]\]>','')

After that we use the Sanitize action, and the XHTML that comes out is shown below (I have placed it as an image because when I copy the XHTML, it renders as text in the preview).

Can anyone please help me with this? Thanks in advance!
asked
Mohammed Siddiqui - 'Old Account'
3 answers
1
Hi Mohammed,
It looks like there is an extra space at the beginning of every new line in the sanitized output. If you remove these spaces, the problem should be solved.
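One way to remove those spaces is a small Java helper, for example inside a custom Java action that runs after Sanitize and before the PDF is generated. This is only a sketch: the class and method names are hypothetical, and it assumes the unwanted whitespace consists of spaces or tabs at the start of a line.

```java
public class LineStartSpaceRemover {

    /**
     * Removes spaces and tabs at the start of every line.
     * (?m) enables MULTILINE mode, so ^ matches at the start of the input
     * and directly after each line break.
     */
    public static String removeLeadingSpaces(String xhtml) {
        return xhtml.replaceAll("(?m)^[ \\t]+", "");
    }
}
```

Note that this touches every line, so it would also alter whitespace-sensitive content such as text inside a pre element. The line breaks themselves are left in place.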
answered
Sebastiaan van der Plaat
0
You stated that you already trust the HTML you receive from the service. It is already valid XHTML. I tried creating a PDF from it without Sanitize, and it works fine without any extra spaces.
This does not answer your question, but I think it does solve your problem, as long as you are sure that the service will always produce content in which all tags are terminated, as XHTML requires.
I will try to find out whether AntiSamy has some additional setting that causes it to insert a new line.
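If you skip Sanitize, you may still want to guard against the service sending content with unterminated tags. A rough way to check this is to run the fragment through the standard JDK XML parser. The sketch below makes some assumptions: the class and method names are hypothetical, and the fragment is wrapped in a div only because an XML document needs a single root element.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class XhtmlFragmentCheck {

    /** Returns true if the fragment parses as well-formed XML once wrapped in one root element. */
    public static boolean isWellFormed(String fragment) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Reject DOCTYPE declarations; the fragment should not contain any.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader("<div>" + fragment + "</div>")));
            return true;
        } catch (SAXException e) {
            // Unterminated or badly nested tags end up here.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("XML parser could not be set up or read the input", e);
        }
    }
}
```

This only checks well-formedness, not validity against an XHTML DTD. HTML named entities such as &nbsp; are not defined in plain XML, so a fragment that uses them would be reported as not well-formed by this check.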