get plain text from mail body

Postby Hans Nieuwenhuis » Tue Jul 17, 2012 10:30 am

Hi,

I am using the Exchange plugin from It2Be (by the way, a nice plugin!) to interact with Exchange 2010.

When I get the body of a mail, it can contain markup (HTML, RTF, Word, ...).
Is there a way to get the plain text from it?

Regards,
Hans Nieuwenhuis
Betagraphics
http://www.deltics.nl
http://www.betagraphics.nl

Servoy Version 7.3.1
Java version 1.7.0.x
Database Oracle 11g
Hans Nieuwenhuis
 
Posts: 1026
Joined: Thu Apr 12, 2007 12:36 pm
Location: Hengelo, The Netherlands

Re: get plain text from mail body

Postby mboegem » Tue Jul 17, 2012 10:49 am

Hi Hans,

I haven't had time to play around with the current plugin, but I used the previous Exchange plugin.
That one had two properties on the mail object:
- plainMsg
- htmlMsg
Marc Boegem
Solutiative / JBS Group, Partner
• Servoy Certified Developer
• Servoy Valued Professional
• Freelance Developer

mboegem
 
Posts: 1742
Joined: Sun Oct 14, 2007 1:34 pm
Location: Amsterdam

Re: get plain text from mail body

Postby Hans Nieuwenhuis » Tue Jul 17, 2012 10:55 am

The new ewsj plugin (Exchange 2010) does not have these properties.

Regards,
Hans Nieuwenhuis

Re: get plain text from mail body

Postby Hans Nieuwenhuis » Tue Jul 17, 2012 11:09 am

Marc,

I saw this code in an earlier post of yours:

Code:
var htmlEditorKit = new Packages.javax.swing.text.html.HTMLEditorKit();
var htmlDocument = htmlEditorKit.createDefaultDocument();

var reader = new java.io.StringReader('myHtmlTextString');
htmlEditorKit.read(reader, htmlDocument, 0);

var result = htmlDocument.getText(0, htmlDocument.getLength());

return utils.stringTrim(result);


If I use this on the mail body, I get an error: Exception Object: javax.swing.text.ChangedCharSetException

Any ideas ?

Regards,
Hans Nieuwenhuis

Re: get plain text from mail body

Postby mboegem » Tue Jul 17, 2012 11:19 am

Hi Hans,

I'm not sure that approach will work in every situation.
HTML keeps getting more advanced, and Java's HTML parser doesn't keep up with all the possibilities.

Anyway, I looked into the plugin and saw that you can get and set the body type.

So: does this work?
Code:
myMailObject.bodyType = plugins.it2be_exchangews.type.js_getBody_TEXT();
var _plainText = myMailObject.body;

Re: get plain text from mail body

Postby Hans Nieuwenhuis » Tue Jul 17, 2012 11:23 am

Yes, but bodyType is just a flag (a boolean or integer) with value 0 or 1:

plain text: bodyType = 0
HTML: bodyType = 1

Regards,

Hans

Re: get plain text from mail body

Postby Hans Nieuwenhuis » Tue Jul 17, 2012 11:54 am

Well,

I also discussed this with It2Be, but it looks like I'll have to find a regex that strips out all the markup.

I used this regex to strip the HTML:

Code:
replace(/<[a-zA-Z\/][^>]*>/g,'').replace(/&[^;]+?;/g,'')


but the font definitions are left behind. For this input:

Code:
<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 12 (filtered medium)">
<!--[if !mso]><style>v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style><![endif]--><style><!--
/* Font Definitions */
@font-face
   {font-family:"Cambria Math";
   panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
   {font-family:Calibri;
   panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
   {font-family:Tahoma;
   panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
   {margin:0cm;
   margin-bottom:.0001pt;


This results in:

Code:
l xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:off
ice:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://sche
mas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">


undefined
undefined
undefined
undefinedundefinedv\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
undefinedundefinedundefinedundefinedundefinedundefinedundefined
undefined
undefinedundefinedundefinedundefined
undefined
undefined
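
The font definitions survive because the regex above removes only the tags themselves, not the text between <style> and </style> or inside <!-- --> comments. A minimal sketch of a fuller regex-based approach follows, in plain JavaScript. The name stripHtml and the sample string are hypothetical and not part of Servoy or the It2Be plugin, and a regex is not a full HTML parser, so unusual markup may still slip through.

```javascript
// Hypothetical helper (assumption): regex-based HTML-to-text, not a real parser.
function stripHtml(html) {
    return String(html)
        // Drop comments first, including Word's <!--[if ...]> ... <![endif]--> blocks.
        .replace(/<!--[\s\S]*?-->/g, '')
        // Drop declarations such as <!DOCTYPE ...> and <![endif]>.
        .replace(/<![^>]*>/g, '')
        // Drop <style> and <script> elements together with their content.
        .replace(/<(style|script)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, '')
        // Replace the remaining tags with a space so adjacent words stay separated.
        .replace(/<\/?[a-zA-Z][^>]*>/g, ' ')
        // Decode a few common entities instead of deleting them (&amp; last).
        .replace(/&nbsp;/g, ' ')
        .replace(/&lt;/g, '<')
        .replace(/&gt;/g, '>')
        .replace(/&quot;/g, '"')
        .replace(/&amp;/g, '&')
        // Collapse the whitespace left behind and trim both ends.
        .replace(/\s+/g, ' ')
        .replace(/^\s+|\s+$/g, '');
}

// Made-up sample in the style of a Word-generated mail body.
var sample =
    '<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8">' +
    '<!--[if !mso]><style>v\\:* {behavior:url(#default#VML);}</style><![endif]-->' +
    '<style><!--\n@font-face\n   {font-family:Calibri;}\n--></style></head>' +
    '<body><p class="MsoNormal">Hello&nbsp;<b>world</b> &amp; all</p></body></html>';

var plain = stripHtml(sample);
```

Replacing tags with a space rather than an empty string keeps text from neighbouring block elements apart, at the cost of a stray space when a tag sits in the middle of a word.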

Re: get plain text from mail body

Postby Hans Nieuwenhuis » Tue Jul 17, 2012 3:11 pm

I got it working using the code from Marc Boegem's earlier post.

I just added the line that sets IgnoreCharsetDirective, and then it worked.

Code:
var htmlEditorKit = new Packages.javax.swing.text.html.HTMLEditorKit();
var htmlDocument = htmlEditorKit.createDefaultDocument();

// Ignore the charset given in the <meta> tag; otherwise read() throws ChangedCharSetException
htmlDocument.putProperty("IgnoreCharsetDirective", true);
var reader = new java.io.StringReader(_mail.body);
htmlEditorKit.read(reader, htmlDocument, 0);
// Extract the document's text content without the markup
var result = htmlDocument.getText(0, htmlDocument.getLength());
_noHtml = utils.stringTrim(result);


Regards, and thanks Marc!
Hans Nieuwenhuis

