
remove html tags from text

Posted: Sat Feb 25, 2012 5:44 pm
by Hans Nieuwenhuis
Hi,

I need to remove HTML tags from a text string.

In other solutions, I use the getAsPlainText method in combination with a hidden text field (HTML area) on a form.

But this solution runs as a batch server, so I guess I cannot use this "trick" there?

Regards,

Re: remove html tags from text

Posted: Mon Feb 27, 2012 10:34 am
by Joas
You can use a regular expression:
Code:
var _plainText = _htmlText.replace(/<\/{0,1}\w+>/g, "");
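One caveat worth noting: the pattern above only matches tags made up purely of word characters, so tags with attributes or self-closing tags are left in place. Below is a minimal sketch of a broader pattern, written in plain JavaScript with no Servoy-specific calls. The helper name `stripTags` and the sample string are made up for illustration. It is still a regex heuristic, not an HTML parser, so a literal ">" inside an attribute value would confuse it.

```javascript
// Hypothetical helper (not part of Servoy): removes anything that looks like a tag.
function stripTags(html) {
    // "<", then any run of characters that are not ">", then ">".
    return html.replace(/<[^>]*>/g, "");
}

var sample = '<p class="intro">Hello <b>world</b><br/></p>';

// The narrower pattern skips <p class="intro"> and <br/> because of the
// space, quotes and slash between the tag name and the closing ">".
var narrow = sample.replace(/<\/{0,1}\w+>/g, "");

// The broader pattern removes every tag in the sample.
var broad = stripTags(sample);
```

With this sample, `narrow` still contains the opening `<p class="intro">` and the `<br/>`, while `broad` is just the text.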

Re: remove html tags from text

Posted: Mon Feb 27, 2012 11:55 am
by Hans Nieuwenhuis
Thanks,

But in my sample it still leaves the &nbsp; in the text.

Maybe there are more issues like that one?

I googled some extra info and now I use this:

Code:
data.replace(/<\/{0,1}\w+>/g,'').replace(/&[^;]+?;/g,'');
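Deleting every entity also deletes characters the text may need: "Tom&nbsp;&amp;&nbsp;Jerry" would come out as "TomJerry". A possible alternative is to translate the common entities instead of dropping them. The sketch below is only an illustration under assumptions: `htmlToPlainText` and the `ENTITIES` table are hypothetical names, the table is a small subset rather than a complete list, and only decimal numeric references such as `&#169;` are handled.

```javascript
// Hypothetical lookup table; a small illustrative subset, not a complete list.
var ENTITIES = { nbsp: " ", amp: "&", lt: "<", gt: ">", quot: '"' };

// Hypothetical helper: strips tags, then translates entities in a single pass.
function htmlToPlainText(html) {
    return html
        .replace(/<[^>]*>/g, "")
        .replace(/&(#\d+|\w+);/g, function (match, name) {
            if (name.charAt(0) === "#") {
                // Decimal numeric reference, e.g. &#169;
                return String.fromCharCode(parseInt(name.substring(1), 10));
            }
            // Named entity: translate if known, otherwise drop it.
            return ENTITIES.hasOwnProperty(name) ? ENTITIES[name] : "";
        });
}

var result = htmlToPlainText("<p>Tom&nbsp;&amp;&nbsp;Jerry &#169; 2012</p>");
```

Unknown named entities are still dropped, which matches the behaviour of the one-liner above for anything not in the table.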

Re: remove html tags from text

Posted: Mon Feb 27, 2012 8:33 pm
by jcarlos
Why don't you try the SmartDoc plugin? (https://www.servoyforge.net/projects/smartdoc)

The part of SmartDoc that does the extraction is Tika. SmartDoc does more than extract text, so it might be too much just for this. But if I were you, I'd check it out. I am sure you'll find it very useful, and it can also serve you well in many other solutions.

JC